Formatted text

Formatted text, styled text or rich text, as opposed to plain text, is text that carries styling information beyond the minimum of semantic elements: colours, styles (boldface, italic), sizes, and special features (such as hyperlinks).

Terminology

Formatted text cannot simply be equated with binary files, nor treated as the opposite of ASCII text. Formatted text is not necessarily binary: it may be text-only, as in HTML, RTF or enriched text files, and it may even be ASCII-only. Conversely, a plain text file may be non-ASCII (for example, Unicode text encoded as UTF-8). Text-only formatted text is achieved with markup that is itself textual, whereas some formatted-text editors, such as Microsoft Word, save in a binary format.
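The distinction can be checked directly. The following Python sketch is only an illustration: the sample strings and the helper name is_ascii are assumptions made for this example. It shows a piece of formatted text that is pure ASCII and a piece of plain text that is not.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte lies in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


# Formatted text: HTML markup, yet every byte is ASCII.
formatted = "<p>I am <b>not</b> making this up.</p>".encode("utf-8")

# Plain text: no styling at all, yet not ASCII once encoded as UTF-8.
plain = "naïve café".encode("utf-8")

print(is_ascii(formatted))  # True
print(is_ascii(plain))      # False
```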

Beginning of formatted text

Formatted text has its genesis in the first interactive systems, where users made up for the lack of formatting in ASCII by using certain symbols as substitutes. Emphasis, for example, could be achieved in ASCII in a number of ways:

  • Capitalization: I am NOT making this up.
  • Surrounding with underscores: I am _not_ making this up.
  • Surrounding with asterisks: I am *not* making this up.
  • Spacing: I am n o t making this up.

Surrounding by underscores was also used for book titles: Look it up in _The_C_Programming_Language_.
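Because these conventions are regular, they can be translated mechanically into markup. The sketch below is a minimal illustration in Python, not an established converter: the function name ascii_emphasis_to_html is hypothetical, and mapping asterisks to bold and underscores to italics is an arbitrary choice made for the example.

```python
import re

# A "word" here is a run of ASCII letters and digits (an assumption of this sketch).
_WORD = r"[A-Za-z0-9]+"


def ascii_emphasis_to_html(text: str) -> str:
    """Turn *word* into <b>word</b> and _some_words_ into <i>some words</i>."""
    # Asterisk convention: a single word wrapped in asterisks.
    text = re.sub(r"\*(\w+)\*", r"<b>\1</b>", text)

    def italicise(match: re.Match) -> str:
        # Underscores inside the span stand in for spaces, as in book titles.
        return "<i>" + match.group(1).replace("_", " ") + "</i>"

    # Underscore convention: the span must not touch other word characters,
    # so identifiers such as snake_case_name are left alone.
    pattern = rf"(?<!\w)_({_WORD}(?:_{_WORD})*)_(?!\w)"
    return re.sub(pattern, italicise, text)


print(ascii_emphasis_to_html("I am *not* making this up."))
print(ascii_emphasis_to_html("Look it up in _The_C_Programming_Language_."))
```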

Markup languages

Formatting can be marked by tags distinguished from the body text by special characters, such as angle brackets in HTML. For example, this text:

The dog is classified as Canis lupus familiaris in taxonomy.

is marked up in HTML thus:

<p>The dog is classified as <i>Canis lupus familiaris</i> in taxonomy.</p>

The italicised text is enclosed by an opening and a closing italics tag. In LaTeX, the text would be marked up like this:

The dog is classified as \textit{Canis lupus familiaris} in taxonomy.

Documents in a markup language can be written and edited with any text editor; no special software is needed.
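Since the markup is itself text, a program can separate it from the body text again. As a rough sketch, the Python below uses the standard library's html.parser module to recover the plain sentence from the HTML example and to record which part was italicised. The class name TextExtractor and its attributes are assumptions made for this illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the body text, and separately the text inside <i> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []   # every piece of body text, in order
        self.italic = []   # only the pieces found inside <i>...</i>
        self._in_italic = False

    def handle_starttag(self, tag, attrs):
        if tag == "i":
            self._in_italic = True

    def handle_endtag(self, tag):
        if tag == "i":
            self._in_italic = False

    def handle_data(self, data):
        self.chunks.append(data)
        if self._in_italic:
            self.italic.append(data)


parser = TextExtractor()
parser.feed("<p>The dog is classified as <i>Canis lupus familiaris</i> in taxonomy.</p>")
parser.close()

plain = "".join(parser.chunks)
print(plain)          # the sentence without any tags
print(parser.italic)  # the italicised span(s)
```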

Formatted document files

Since the introduction of WYSIWYG word processors such as MacWrite, in which the typist applies formatting visually rather than by inserting textual markup, word processors have tended to save to binary files. Opening such a file in a text editor reveals the text interspersed with binary characters, either around the formatted areas (e.g. in WordPerfect) or separately, at the beginning or end of the file (e.g. in Microsoft Word).

Formatted text documents stored in binary files have two disadvantages, however: formatting scope and secrecy. In a markup language the extent of formatting is marked explicitly, whereas WYSIWYG formatting is stateful: pressing the boldface button, for example, stays in effect until it is cancelled. This can lead to formatting mistakes and maintenance problems. As for secrecy, the file formats of formatted text documents tend to be proprietary and undocumented, which makes it hard for third parties to write compatible software and can force unnecessary upgrades when the format changes between versions.

WordStar was a popular word processor that did not use binary files with hidden characters.

OpenOffice.org Writer saves files in an XML format. The resulting file is nevertheless binary, because the XML is stored compressed inside a ZIP archive.
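The idea of textual XML wrapped in a binary container can be sketched with Python's standard zipfile module. This is an illustration only: the archive built here is not a valid OpenDocument file, and the <text> element is a made-up placeholder. The member name content.xml mirrors the name OpenDocument packages use for the document body.

```python
import io
import zipfile

# Build a small ZIP archive in memory holding one XML member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    # Hypothetical XML content, for illustration only.
    archive.writestr("content.xml", "<text>I am not making this up.</text>")

raw = buffer.getvalue()

# The container is binary: it begins with the ZIP signature bytes "PK".
print(raw[:2])

# Reading the member back recovers the textual markup unchanged.
with zipfile.ZipFile(io.BytesIO(raw)) as archive:
    xml = archive.read("content.xml").decode("utf-8")
print(xml)
```

Opening the archive in a text editor would show compressed bytes, while unpacking it first yields markup that any text editor can read.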

PDF is another formatted text file format that is usually binary (it compresses the text and stores graphics and fonts in binary form). It is generally an end-user format, exported from an application such as Microsoft Word or OpenOffice.org Writer, and is not normally meant to be edited once produced.


Wikimedia Foundation. 2010.
