Half-width kana
Half-width kana (半角カナ) are katakana displayed at half the width of their fullwidth forms. The term refers to the katakana character portion of the character set specified by JIS X 0201.
Although the official name is JIS X 0201 katakana, half-width kana is the commonly known name, and that term is used in this article.
ASCII is defined as a 7-bit character set and has room for 128 characters. However, since this standard was designed for the United States, it does not contain the characters and symbols (for example, the ¥ yen currency symbol) needed to represent Japanese. JIS X 0201 was developed in 1969. Because computers at that time did not have the computational power and memory necessary to process the thousands of Kanji (Chinese-derived characters) that exist in written Japanese, Japanese text was, as a simplification, represented in katakana alone.
According to Lunde, half-width kana were "...the first Japanese characters encoded on computers because they are used for Japanese telegrams. As single-byte characters..." (Lunde 1999, pp. 144-145).
To make katakana fit into the space available, some compromises were made: the diacritical marks dakuten and handakuten are treated as separate characters instead of being part of the preceding character. This led to the so-called "half-width kana". These compromises still cause problems for computer programs today, and the characters are frequently considered visually unattractive.
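The effect of treating dakuten and handakuten as separate characters can be illustrated with a short Python sketch. It assumes the text has already been decoded to Unicode, where half-width kana live in the Halfwidth and Fullwidth Forms block, and it uses NFKC normalization to fold a base character plus its separate mark into one composed fullwidth character. The helper name compose_halfwidth is hypothetical. Note that NFKC also rewrites other compatibility characters (for example fullwidth Latin letters), so this is one possible approach rather than a general-purpose converter.

```python
import unicodedata


def compose_halfwidth(text: str) -> str:
    """Fold half-width kana (and other compatibility forms) into composed forms.

    Hypothetical helper: NFKC is one way to do this, and it also changes
    other compatibility characters, not only half-width kana.
    """
    return unicodedata.normalize("NFKC", text)


# Half-width "ga" is two code points: KA (U+FF76) plus a separate dakuten (U+FF9E).
halfwidth_ga = "\uff76\uff9e"
fullwidth_ga = compose_halfwidth(halfwidth_ga)

print(len(halfwidth_ga), len(fullwidth_ga))
```

A program that counts characters, truncates strings, or searches for text has to be aware that the half-width spelling is one character longer than the fullwidth one.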
Rows: leading 4 bits. Columns: trailing 4 bits.
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
0
1
2
3
4
5
6
7
8
9
a       ｡  ｢  ｣  ､  ･  ｦ  ｧ  ｨ  ｩ  ｪ  ｫ  ｬ  ｭ  ｮ  ｯ
b    ｰ  ｱ  ｲ  ｳ  ｴ  ｵ  ｶ  ｷ  ｸ  ｹ  ｺ  ｻ  ｼ  ｽ  ｾ  ｿ
c    ﾀ  ﾁ  ﾂ  ﾃ  ﾄ  ﾅ  ﾆ  ﾇ  ﾈ  ﾉ  ﾊ  ﾋ  ﾌ  ﾍ  ﾎ  ﾏ
d    ﾐ  ﾑ  ﾒ  ﾓ  ﾔ  ﾕ  ﾖ  ﾗ  ﾘ  ﾙ  ﾚ  ﾛ  ﾜ  ﾝ  ﾞ  ﾟ
e
f
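The code chart can be read as a simple arithmetic mapping. The following Python sketch splits a byte into its leading and trailing 4 bits and converts a katakana byte (0xA1 through 0xDF in the chart) to the corresponding Unicode half-width character. The Unicode side is an assumption not stated in the article: it relies on the Halfwidth and Fullwidth Forms block starting its half-width kana at U+FF61 in the same order as the chart. The function and constant names are hypothetical.

```python
KATAKANA_FIRST_BYTE = 0xA1  # first katakana-area cell in the chart (row a, column 1)
KATAKANA_LAST_BYTE = 0xDF   # last katakana-area cell in the chart (row d, column f)
HALFWIDTH_BASE = 0xFF61     # assumed Unicode code point matching byte 0xA1


def split_nibbles(byte: int) -> tuple[int, int]:
    """Return (leading 4 bits, trailing 4 bits), i.e. (row, column) in the chart."""
    return byte >> 4, byte & 0x0F


def jisx0201_katakana_to_unicode(byte: int) -> str:
    """Map one JIS X 0201 katakana byte to a Unicode half-width character."""
    if not KATAKANA_FIRST_BYTE <= byte <= KATAKANA_LAST_BYTE:
        raise ValueError(f"0x{byte:02x} is outside the katakana area")
    return chr(HALFWIDTH_BASE + (byte - KATAKANA_FIRST_BYTE))


row, column = split_nibbles(0xB1)
print(f"row {row:x}, column {column:x}:", jisx0201_katakana_to_unicode(0xB1))
```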
Half-width kana on the Internet
SMTP and NNTP (the protocols used to deliver e-mail and Usenet, respectively) were formerly able to transmit only 7-bit data, so the convention was to use ISO-2022-JP for sending e-mail in Japanese.
Because ISO-2022-JP does not contain half-width kana, they cannot properly be included in such a message; when half-width kana were included by accident, the message could become garbled during transmission.
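The absence of half-width kana from ISO-2022-JP can be observed with Python's built-in iso2022_jp codec, which refuses to encode them. The sketch below checks whether a string fits the encoding and, as one possible workaround that the article itself does not prescribe, folds half-width kana to fullwidth with NFKC before encoding. Both helper names are hypothetical, and NFKC changes other compatibility characters too.

```python
import unicodedata


def fits_iso2022jp(text: str) -> bool:
    """Report whether text can be encoded as ISO-2022-JP without changes."""
    try:
        text.encode("iso2022_jp")
    except UnicodeEncodeError:
        return False
    return True


def encode_for_7bit_mail(text: str) -> bytes:
    """Fold half-width kana to fullwidth (assumed workaround), then encode."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.encode("iso2022_jp")


halfwidth = "\uff76\uff80\uff76\uff85"  # "katakana" written in half-width kana
print(fits_iso2022jp(halfwidth))
print(encode_for_7bit_mail(halfwidth))
```

The encoded result contains only 7-bit bytes, which is what made ISO-2022-JP suitable for the older transports.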
This is no longer such a problem, since most e-mail servers today use ESMTP and hence accept 8-bit characters. Alternatively, an encoding such as Base64 can be used and declared in the message using MIME.
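As a rough illustration of the Base64 route, the following sketch builds a MIME message with Python's standard email package. The choice of UTF-8 as the charset is an assumption made for the example; the article does not name one. With that charset the package selects Base64 as the transfer encoding, so the serialized message is pure ASCII even though the body contains half-width kana. The helper name build_message is hypothetical.

```python
from email.mime.text import MIMEText


def build_message(body: str) -> MIMEText:
    """Wrap body text in a MIME part; UTF-8 is an assumed charset for this sketch."""
    return MIMEText(body, "plain", "utf-8")


body = "\uff8a\uff9d\uff76\uff78 \uff76\uff85"  # "hankaku kana" in half-width kana
msg = build_message(body)

# The headers declare the charset and the transfer encoding to the receiver.
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

A receiving program reverses the Base64 step and then decodes the bytes using the declared charset, so the half-width kana survive a 7-bit transport unchanged.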
The problems that exist in e-mail do not exist with Web pages, since HTTP accepts 8-bit characters.
A problem that does exist is that computer programs have difficulty determining whether to treat a byte sequence as Shift JIS, EUC-JP, or UTF-8; hence the character encoding should be specified with an HTTP response header or a meta tag.
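The ambiguity can be made concrete with a short Python sketch. The same two half-width kana produce different byte sequences in each encoding, and the Shift JIS bytes are also plausible EUC-JP input that decodes to something else entirely. The second half shows one way to read the charset from a Content-Type header value using the standard email package; the helper name charset_from_content_type and the sample header value are hypothetical.

```python
from email.message import Message

text = "\uff71\uff72"  # half-width A and I

# One text, three different byte sequences.
as_sjis = text.encode("shift_jis")
as_eucjp = text.encode("euc_jp")
as_utf8 = text.encode("utf-8")

# Decoding with the wrong guess yields different text instead of an obvious failure.
misread = as_sjis.decode("euc_jp", errors="replace")


def charset_from_content_type(value: str):
    """Extract the charset parameter from a Content-Type header value, if present."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


declared = charset_from_content_type("text/html; charset=Shift_JIS")
print(as_sjis, as_eucjp, as_utf8, declared)
```

With the declared charset in hand, a program can decode the bytes directly rather than guessing among the three candidates.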
Misunderstanding of JIS X 0201
In fact, JIS X 0201 katakana is not the same thing as half-width katakana. The standard does not define the width of characters; it defines only the code representation of katakana characters. The term "half-width" is a remnant of older devices that displayed single-byte characters at half the width of double-byte ones. In the JIS X 0201 standard itself, the katakana characters in the code chart are printed at normal width, not half width.
However, the misunderstanding that the standard defines "half-width" characters is widespread. People who know the standard will often say "so-called half-width kana."
See also: Halfwidth and Fullwidth Forms
Reference: Lunde, Ken. CJKV Information Processing. 1st ed. O'Reilly, 1999, pp. 144-145.
Wikimedia Foundation. 2010.