ISO/IEC 8859-2

ISO 8859-2, more formally cited as ISO/IEC 8859-2 or less formally as Latin-2, is part 2 of ISO/IEC 8859, a standard character encoding defined by ISO. It encodes what it refers to as Latin alphabet no. 2, consisting of 191 characters from the Latin script, each encoded as a single 8-bit code value.

ISO_8859-2:1987, more commonly known by its preferred MIME name ISO-8859-2 (note the extra hyphen), is the IANA charset name for this standard when it is used together with the control codes from ISO/IEC 6429 for the C0 (0x00-0x1F) and C1 (0x80-0x9F) ranges. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. This character set also has the aliases ISO_8859-2, latin2, l2 and csISOLatin2.
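As an illustration of how these names behave in practice, the following minimal sketch checks that Python's standard `codecs` module resolves the names listed above to one and the same codec, and that bytes in the C1 range decode to the control code points with the same numeric value. The variable names are chosen for this example only, and the behaviour shown is that of Python's bundled codec, not a statement about every implementation.

```python
import codecs

# Charset names taken from the paragraph above; Python matches them
# case-insensitively and treats hyphens and underscores alike.
NAMES = ["ISO-8859-2", "ISO_8859-2", "latin2", "l2", "csISOLatin2"]

# Name that Python itself reports for its ISO 8859-2 codec.
canonical = codecs.lookup("iso8859_2").name

# Resolve every alias and record which codec it lands on.
resolved = {name: codecs.lookup(name).name for name in NAMES}
for name, target in resolved.items():
    print(f"{name:12} -> {target}")

# Bytes 0x80-0x9F are passed through as C1 control code points
# (U+0080-U+009F); no escape sequences are interpreted.
c1_text = bytes(range(0x80, 0xA0)).decode("iso8859_2")
c1_code_points = [ord(ch) for ch in c1_text]
```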

This encoding shares a lot of assignments with windows-1250 but is not a strict subset of it (unlike the case with windows-1252 and ISO 8859-1).
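The relationship between the two encodings can be examined byte by byte. The sketch below, which assumes Python's bundled `iso8859_2` and `cp1250` codecs as stand-ins for the two standards, lists the code values in the range A0-FF that decode to different characters, and separately checks whether every ISO 8859-2 character in that range can still be encoded in windows-1250 at some position.

```python
ISO = "iso8859_2"
WIN = "cp1250"  # Python's name for windows-1250

# Code values in A0-FF whose meaning differs between the two encodings.
differing = []
for value in range(0xA0, 0x100):
    raw = bytes([value])
    iso_char = raw.decode(ISO)
    win_char = raw.decode(WIN)
    if iso_char != win_char:
        differing.append((value, iso_char, win_char))

for value, iso_char, win_char in differing:
    print(f"0x{value:02X}: ISO 8859-2 {iso_char!r}  windows-1250 {win_char!r}")


def encodable_in_windows_1250(ch):
    """Return True if the character exists anywhere in windows-1250."""
    try:
        ch.encode(WIN)
        return True
    except UnicodeEncodeError:
        return False


# ISO 8859-2 characters (A0-FF) that windows-1250 cannot represent at all.
iso_chars = bytes(range(0xA0, 0x100)).decode(ISO)
missing = [ch for ch in iso_chars if not encodable_in_windows_1250(ch)]
print("same positions:", 0x60 - len(differing), "of", 0x60)
print("missing from windows-1250:", missing)
```

Text that stays within the shared positions round-trips unchanged between the two encodings; text that uses any of the differing positions has to be transcoded rather than relabelled.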

These code values can be used in almost any data interchange system to communicate in the following European languages: Bosnian, Croatian, Czech, Hungarian, Polish, Romanian (but see the next paragraph), Serbian (in Latin transcription), Serbo-Croatian, Slovak, Slovenian, Upper Sorbian and Lower Sorbian. Furthermore, it is suitable for representing some western European languages such as German or Finnish (except for the letter å, which is used in Swedish and Finnish). When used alone, these latter languages are nominally encoded in ISO 8859-1, but the code points they need are shared with ISO 8859-2, which is an important consideration for multilingual documents.
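Language coverage of this kind can be tested by attempting to encode sample text. The following sketch uses Python's `iso8859_2` codec; the sample sentences and the helper `fits_latin2` are illustrative choices made for this example, not part of the standard.

```python
def fits_latin2(text):
    """Return True if every character of text exists in ISO 8859-2."""
    try:
        text.encode("iso8859_2")
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample text for a few of the languages named above.
samples = {
    "Polish": "Zażółć gęślą jaźń",
    "Czech": "Příliš žluťoučký kůň",
    "Hungarian": "árvíztűrő tükörfúrógép",
    "German": "Grüße aus Köln",
}
coverage = {language: fits_latin2(text) for language, text in samples.items()}
for language, ok in coverage.items():
    print(f"{language}: {'fits' if ok else 'does not fit'}")

# The letter å is the exception mentioned in the text.
a_ring_fits = fits_latin2("å")

# German text uses code points shared with ISO 8859-1, so the encoded
# bytes are identical under both encodings.
german = samples["German"]
same_bytes = german.encode("iso8859_2") == german.encode("iso8859_1")
```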

It may be argued that ISO 8859-2 is not really suitable for Romanian, because it lacks the letters s and t with comma below and contains s and t with cedilla instead. These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as a distinction between separate characters; fonts intended for use with Romanian should, therefore, have glyphs with comma below at those code points.
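In current Unicode the comma-below letters have their own code points (U+0218, U+0219, U+021A, U+021B), so Romanian text written with them cannot be encoded in ISO 8859-2 directly. One possible workaround, sketched below, is to substitute the cedilla letters before encoding and to rely on the font for the comma-below appearance. The table `COMMA_TO_CEDILLA` and the function `encode_romanian` are hypothetical names introduced for this example.

```python
# Map comma-below letters to the cedilla letters that ISO 8859-2 contains.
COMMA_TO_CEDILLA = str.maketrans({
    "\u0218": "\u015E",  # S with comma below -> S with cedilla
    "\u0219": "\u015F",  # s with comma below -> s with cedilla
    "\u021A": "\u0162",  # T with comma below -> T with cedilla
    "\u021B": "\u0163",  # t with comma below -> t with cedilla
})


def encode_romanian(text):
    """Encode Romanian text as ISO 8859-2, substituting cedilla forms."""
    return text.translate(COMMA_TO_CEDILLA).encode("iso8859_2")


word = "\u0219tiin\u021b\u0103"  # Romanian word written with comma-below letters

# Direct encoding fails because the comma-below letters are absent.
try:
    word.encode("iso8859_2")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False

encoded = encode_romanian(word)
print(direct_ok, encoded)
```

The substitution loses information: after decoding, the text contains cedilla letters, and mapping them back to comma-below forms is only safe when the text is known to be Romanian.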

Code page layout

In the code page table, characters are shown together with their corresponding Unicode code points. Note that code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-2. Code 20 is the regular SPACE character, and A0 is the NON-BREAKING SPACE. Code AD is a SOFT HYPHEN, which compliant web browsers should display only where a line is actually broken at that position.
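The upper half of the layout can be regenerated from any conforming codec. The sketch below uses Python's `iso8859_2` codec to print code values A0-FF with their Unicode code points, and counts the graphic positions (20-7E and A0-FF) to arrive at the 191 characters mentioned in the introduction. The helper `cell` and the labels NBSP and SHY are conventions of this example.

```python
# Invisible characters are shown by label instead of by glyph.
LABELS = {0xA0: "NBSP", 0xAD: "SHY"}


def cell(value):
    """Describe one code value as '<character> U+XXXX'."""
    ch = bytes([value]).decode("iso8859_2")
    shown = LABELS.get(value, ch)
    return f"{shown} U+{ord(ch):04X}"


# One printed row per high nibble, A through F.
for high in range(0xA, 0x10):
    row = [cell((high << 4) | low) for low in range(16)]
    print(f"{high:X}x  " + "  ".join(row))

# Graphic positions: 20-7E (95 values) plus A0-FF (96 values).
graphic_values = list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))
print(len(graphic_values), "graphic characters")
```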

External links

* [http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=28246&ICS1=35&ICS2=40&ICS3= ISO 8859-2:1999]
* [http://www.ecma-international.org/publications/standards/Ecma-094.htm Standard ECMA-94] : 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 "2nd edition (June 1986)"
* [http://www.itscj.ipsj.or.jp/ISO-IR/101.pdf ISO-IR 101] Right-Hand Part of Latin Alphabet No.2 "(February 1, 1986)"
* [http://nl.ijs.si/gnusl/cee/iso8859-2.html ISO 8859-2 (Latin 2) Resources]


Wikimedia Foundation. 2010.
