Space (punctuation)

In writing, a space ( ) is a blank area that is devoid of content, which separates words, letters, numbers, and punctuation. Conventions for interword and intersentence spaces vary among languages, and in some cases the spacing rules are quite complex.

The Latin alphabet, used for English, was originally written in "scripta continua", without any word separators. Later, interpuncts (centred dots) were added to make reading easier; these were replaced with spaces from roughly AD 600–800 onward. In typesetting, spaces have historically come in several widths, with particular widths used for specific typographic purposes, such as separating words, separating sentences, or separating punctuation from words. Following the invention of the typewriter and the subsequent overlap of designer style preferences and computer-technology limitations, much of this reader-centric variation has been lost in normal use.

In computer representation of text, spaces of various sizes, styles, or language characteristics (different "space characters") are indicated with unique code points.

Use of the space in natural languages

Spaces between words

Modern English uses a space to separate words, but not all languages follow this practice. Spaces were not used to separate words in Latin until roughly AD 600–800. Ancient Hebrew and Arabic did use spaces, partly to compensate in clarity for the lack of vowels. Traditionally, the CJK languages were written without spaces: modern Chinese and Japanese (except when written with little or no kanji) still are, but modern Korean uses spaces.

Spaces between sentences

There are three main conventions relating to the number of spaces used to separate sentences within the same paragraph:
* one widened space, typically two to three times wider than an inter-word space (traditional typography)
* two spaces (English spacing or American typewriter spacing)
* one space (French spacing)

"Double spacing" can also refer to a style of line spacing: the insertion of a full additional empty line between lines of text. This is commonly used for text which may incorporate later markup or modifications, such as proof-readers' copies, legal documents, or academic assignments for correction.

Space characters and digital typography

The variable-width general-purpose space

In computer character encodings, there is a normal general-purpose space (Unicode character U+0020; 32 decimal) whose width will vary according to the design of the typeface. Typical values range from 1/5 em to 1/3 em (in digital typography an em is equal to the nominal size of the font, so for a 10-point font the space will probably be between 2 and 3.3 points). Sophisticated fonts may have differently sized spaces for bold, italic, and small-caps faces, and compositors will often manually adjust the width of the space depending on the size and prominence of the text.
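
As a rough check of the arithmetic above, the following Python sketch converts a space width given as a fraction of an em into points. The function name `space_width_pt` is hypothetical and used only for illustration; real fonts record the width of the space glyph in their own metrics.

```python
from fractions import Fraction

def space_width_pt(font_size_pt, em_fraction):
    """Width in points of a space that is `em_fraction` of an em wide."""
    # In digital typography one em equals the nominal font size.
    return float(font_size_pt * em_fraction)

# The typical range quoted above, for a 10-point font.
narrow = space_width_pt(10, Fraction(1, 5))
wide = space_width_pt(10, Fraction(1, 3))
print(f"{narrow:.1f} pt to {wide:.1f} pt")
```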

In addition to this general-purpose space, it is possible to encode a space of a specific width. See the table below for a complete list.

In monospaced proofreading copy, only em- and en-spaces are represented using this character (which is called an "em-quad" or an "en-quad"), while other types of spaces are represented with a number sign.

Breaking and non-breaking spaces

When rendered, the generic Unicode space is often considered insignificant when appearing at the end of a line of text, or when part of a sequence of whitespace characters, so it may be omitted or "collapsed" in such circumstances. The non-breaking space, U+00A0 (160 decimal), renders the same as a normal space but is expressly non-collapsible. It is often used to prevent line wrapping or to indent text, though best World Wide Web practice prescribes using CSS for the latter purpose.
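
The effect on line wrapping can be tried with Python's standard `textwrap` module, which treats only ASCII whitespace as a place where a line may break. This is a sketch of one tool's behaviour, not of how any particular browser or word processor wraps text; the sample sentence and the width of 8 are arbitrary choices.

```python
import textwrap

NBSP = "\u00a0"  # NO-BREAK SPACE

plain = "It is 10 km"             # ordinary space between "10" and "km"
glued = "It is 10" + NBSP + "km"  # non-breaking space keeps them together

plain_lines = textwrap.wrap(plain, width=8)
glued_lines = textwrap.wrap(glued, width=8)

print(plain_lines)
print(glued_lines)
```

With the ordinary space the number and its unit can end up on different lines; with the non-breaking space they move to the next line together.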

Hair spaces around dashes

Typically, both en dashes and em dashes are set continuous with the text (as illustrated by use in the Chicago Manual of Style, 6.80, 6.83–86). However, an em dash can optionally be surrounded with a so-called hair space, U+200A (8202 decimal). This space should be much thinner than a normal space, and is seldom used on its own. It can be written in HTML by using the numeric character reference &#8202; or &#x200A;. Very few user agents are able to render a hair space correctly: in most cases the result is an unwanted symbol or a question mark on the screen, depending on the font and renderer capabilities.
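
The decimal and hexadecimal numeric character references for U+200A can be checked with Python's standard `html` module. The helper `dash_with_hair_spaces` is a hypothetical name introduced only for this sketch; whether the result displays well still depends on the font and renderer.

```python
import html

HAIR_SPACE = "\u200a"  # HAIR SPACE, 8202 decimal
EM_DASH = "\u2014"     # EM DASH

def dash_with_hair_spaces(left, right):
    """Join two text fragments with an em dash set off by hair spaces."""
    return f"{left}{HAIR_SPACE}{EM_DASH}{HAIR_SPACE}{right}"

# Both HTML numeric character references decode to the same character.
from_decimal = html.unescape("&#8202;")
from_hex = html.unescape("&#x200A;")

joined = dash_with_hair_spaces("word", "word")
print([f"U+{ord(ch):04X}" for ch in joined[4:7]])
```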

Table of spaces

Unicode defines several space characters with specific semantics and rendering characteristics, as shown in the table below. Depending on the browser and fonts used to view this table, not all spaces may display properly:

Unicode also provides some visible characters to stand in for space when necessary in the "Control Pictures" block: the Symbol For Space ␠ (U+2420), the Blank Symbol ␢ (U+2422), and the Open Box ␣ (U+2423). The interpunct · is also often used to represent a space in word processing programs such as Microsoft Word.
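
Because the display of these characters depends on fonts, it can be easier to list them by code point and name. The sketch below uses Python's standard `unicodedata` module and assumes that "space characters" means the Unicode general category Zs (space separators); the exact list printed depends on the Unicode version bundled with the Python interpreter, and the function name `space_separators` is hypothetical.

```python
import sys
import unicodedata

def space_separators():
    """Return every character whose general category is Zs."""
    return [
        chr(cp)
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Zs"
    ]

spaces = space_separators()
for ch in spaces:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# The visible stand-ins from the Control Pictures block.
for ch in "\u2420\u2422\u2423":
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```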

Use of the space in computing

In programming language syntax, spaces are frequently used to explicitly separate tokens. Aside from this use, spaces and other whitespace characters are usually ignored by modern programming languages. Exceptions are Haskell, ABC, and Python, which use the amount of whitespace in indentation to indicate the bounds of a block, and a whimsical language called Whitespace, where whitespace is the only meaningful syntactical element.
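
Both behaviours can be observed in Python itself with the standard `tokenize` module. In this sketch, which is an illustration rather than a description of any compiler's internals, extra spaces between tokens leave the token stream unchanged, while leading spaces on a line produce explicit INDENT and DEDENT tokens. The helper name `tokens` is hypothetical.

```python
import io
import tokenize

def tokens(source):
    """Return (token type name, token text) pairs for Python source."""
    reader = io.StringIO(source).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(reader)
    ]

# Spaces between tokens only separate them; their number does not matter.
tight = tokens("y=1\n")
loose = tokens("y   =   1\n")
print(tight == loose)

# Indentation, by contrast, is turned into block-delimiting tokens.
block = tokens("if x:\n    y = 1\nz = 2\n")
block_types = [name for name, _ in block]
print(block_types)
```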

Text editors, word processors, and desktop publishing software differ in how they represent whitespace on the screen, and how they represent spaces at the ends of lines longer than the screen or column width. In some cases, spaces are shown simply as blank space; in other cases they may be represented by an interpunct or other symbols. Many different characters (described above) could be used to produce spaces, and non-character functions (such as margins and tab settings) can also affect whitespace.

Space characters in markup languages

Generalised markup languages, such as SGML, do not treat space characters differently from other characters.

However, special-purpose markup languages may do so. In particular, web markup languages such as XML and HTML treat whitespace characters, including space characters, specially, for programmers' convenience. One or more space characters read by conforming display-time processors of those markup languages are collapsed to zero or one space, depending on their semantic context. For example, double (or more) spaces within text are collapsed to a single space, and spaces which appear on either side of the "=" that separates an attribute name from its value have no effect on the interpretation of the document. Element end tags can contain trailing spaces, and empty-element tags in XML can contain spaces before the "/>".

In XML attribute values, each whitespace character (space, tab, carriage return, or line feed) is replaced by a space when the document is read by a parser; for attributes whose declared type is not CDATA, leading and trailing spaces are then discarded and sequences of spaces are reduced to a single space. [http://www.w3.org/TR/REC-xml/#AVNormalize] Whitespace in XML element content is not changed in this way by the parser, but an application receiving information from the parser may choose to apply similar rules to element content. An XML document author can use the xml:space="preserve" attribute on an element to signal to downstream applications that whitespace in that element's content should be preserved.
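
The difference between attribute values and element content can be seen with Python's standard `xml.etree.ElementTree` parser. The element and attribute names below (`root`, `item`, `note`, `code`) are made up for the example, and the sketch shows only what this one parser hands to the application.

```python
import xml.etree.ElementTree as ET

# Namespace that the reserved "xml:" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

doc = ET.fromstring(
    "<root>"
    '<item note="first\nsecond">first\nsecond</item>'
    '<code xml:space="preserve">  keep  </code>'
    "</root>"
)
item = doc.find("item")
code = doc.find("code")

attr_value = item.get("note")  # line feed normalized by the parser
text_value = item.text         # element content passed through unchanged
space_hint = code.get("{%s}space" % XML_NS)

print(repr(attr_value))
print(repr(text_value))
print(space_hint, repr(code.text))
```

The parser reports the xml:space attribute like any other attribute; acting on it is left to the application.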

In most HTML elements, a sequence of whitespace characters is treated as a single "inter-word separator", which may manifest as a single space character when rendering text in a language that normally inserts such space between words. [http://www.w3.org/TR/html4/struct/text.html#h-9.1] Conforming HTML renderers are required to apply a more literal treatment of whitespace within a few prescribed elements, such as the pre tag and any element for which CSS has been used to apply pre-like whitespace processing. In such elements, space characters will not be "collapsed" into inter-word separators.

In both XML and HTML, the non-breaking space character, along with other non-"standard" spaces, is not treated as collapsible "whitespace", so it is not subject to the rules above.
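
A simplified model of this collapsing can be written with a regular expression. The sketch below assumes that the collapsible characters are space, tab, line feed, form feed, and carriage return, and it ignores element context such as pre; the name `collapse_whitespace` is hypothetical, and real renderers apply more detailed rules.

```python
import re

# ASCII whitespace that this simplified model treats as collapsible.
# U+00A0 is deliberately absent from the character class.
COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")

def collapse_whitespace(text):
    """Replace each run of collapsible whitespace with one space."""
    return COLLAPSIBLE.sub(" ", text)

collapsed = collapse_whitespace("two  words\n\tand   more")
kept = collapse_whitespace("10\u00a0\u00a0km")

print(repr(collapsed))
print(repr(kept))
```

An explicit character class is used instead of `\s` because, for Python text strings, `\s` also matches the non-breaking space, which would defeat the point of the example.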

See also

* Hard space
* Hyphenation
* Internal field separator
* Non-breaking space

External links

* [http://www.cs.tut.fi/~jkorpela/chars/spaces.html Unicode spaces], by Jukka "Yucca" Korpela.
* [http://www.cs.sfu.ca/~ggbaker/reference/characters/ Commonly confused characters]


Wikimedia Foundation. 2010.
