- Half-width kana
Half-width kana (半角カナ) are katakana characters displayed at half the width of the fullwidth form. The term refers to the katakana portion of the character set specified by JIS X 0201. Although the official name is JIS X 0201 katakana, half-width kana is the commonly known name, and that term is used in this article.
History
ASCII is defined as a 7-bit character set and has room for 128 characters. However, since this standard was designed for the United States, it does not contain the characters and symbols (for example, the ¥ yen currency symbol) needed to represent Japanese. JIS X 0201 was developed in 1969. Computers at that time did not have the computational power and memory needed to process the thousands of Kanji (Chinese-based) characters that exist in written Japanese, so as a simplification, Kanji characters were represented by katakana. Half-width kana were developed as "...the first Japanese characters encoded on computers because they are used for Japanese telegrams. As single-byte characters..." (Lunde 1999).
To make katakana fit into the space available, some compromises were made: the diacritical marks dakuten and handakuten are treated as separate characters instead of being part of the preceding character. This led to the so-called "half-width kana". These compromises still cause problems for computer programs today, and the characters are frequently considered visually unattractive.
Half-width table
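The table that follows is indexed by the leading and trailing 4 bits of each byte. As a minimal sketch of how that layout works in practice, the Python below builds a byte from a row and a column and decodes it. It assumes Python's built-in `shift_jis` codec, which decodes the single bytes 0xA1-0xDF as Unicode half-width forms (U+FF61-U+FF9F); Unicode and NFKC normalization are not part of JIS X 0201 and appear here only for illustration, and the helper names are hypothetical.

```python
import unicodedata


def jisx0201_byte(leading: int, trailing: int) -> int:
    """Combine a table row (leading 4 bits) and column (trailing 4 bits)."""
    return (leading << 4) | trailing


def decode_kana(data: bytes) -> str:
    # Assumption: the shift_jis codec treats bytes 0xA1-0xDF as
    # single-byte half-width kana.
    return data.decode("shift_jis")


# Row b, column 1 of the table holds the half-width form of the kana "a".
a = decode_kana(bytes([jisx0201_byte(0xB, 0x1)]))

# A voiced "ga" takes two bytes: "ka" (0xB6) followed by the separate
# dakuten character (0xDE).
ga = decode_kana(bytes([0xB6, 0xDE]))

# NFKC normalization (a Unicode feature) folds the pair into a single
# full-width character.
ga_full = unicodedata.normalize("NFKC", ga)

print(hex(ord(a)), len(ga), len(ga_full))
```

The two-character result for the voiced kana is the practical cost of storing the dakuten separately: string lengths, searches, and comparisons differ from the full-width form until the text is normalized.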
Rows give the leading 4 bits and columns give the trailing 4 bits. Rows 0-9, e, and f are blank in this table.

   0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
a     。 「 」 、 ・ ヲ ァ ィ ゥ ェ ォ ャ ュ ョ ッ
b  ー ア イ ウ エ オ カ キ ク ケ コ サ シ ス セ ソ
c  タ チ ツ テ ト ナ ニ ヌ ネ ノ ハ ヒ フ ヘ ホ マ
d  ミ ム メ モ ヤ ユ ヨ ラ リ ル レ ロ ワ ン ゙ ゚

Half-width kana on the Internet
E-mail
Since the SMTP and NNTP protocols (used to deliver e-mail and Usenet, respectively) were formerly able to transmit only 7-bit data, the convention was to use ISO-2022-JP for sending e-mail in Japanese. ISO-2022-JP does not contain half-width kana, so they cannot properly be included in a message; when half-width kana were accidentally included, the message could become garbled during transmission.
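The gap in ISO-2022-JP can be observed directly. The following is a minimal sketch, assuming Python's built-in `iso2022_jp` codec, which rejects half-width kana. The function name `encode_for_7bit_mail` is hypothetical, and folding to full-width with NFKC is one possible workaround rather than something the standard prescribes.

```python
import unicodedata


def encode_for_7bit_mail(text: str) -> bytes:
    """Encode text as ISO-2022-JP, folding half-width kana if needed."""
    try:
        return text.encode("iso2022_jp")
    except UnicodeEncodeError:
        # Half-width kana have no representation in ISO-2022-JP.
        # NFKC replaces them with full-width equivalents. Note that it
        # also rewrites other compatibility characters (for example
        # fullwidth Latin letters), so it is not a lossless step.
        folded = unicodedata.normalize("NFKC", text)
        return folded.encode("iso2022_jp")


half_width = "\uff76\uff85"  # half-width "ka" and "na"
encoded = encode_for_7bit_mail(half_width)

# Every byte of the result fits in 7 bits, as old SMTP/NNTP required.
print(all(b < 0x80 for b in encoded))
```

Text that still cannot be encoded after folding raises `UnicodeEncodeError` from the second call, which is usually preferable to silently sending garbled bytes.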
This is much less of a problem now, since most e-mail servers today use ESMTP and therefore accept 8-bit characters. Alternatively, an encoding such as Base64 can be used and declared in the message using MIME.
Web pages
The problems that exist in e-mail do not exist with Web pages, since HTTP accepts 8-bit characters.
A problem that does exist is that computer programs have difficulty determining whether to treat a character as Shift JIS, EUC-JP, or UTF-8; hence the character encoding should be specified with an HTTP response header or a meta tag.
Misunderstanding of JIS X 0201
In fact, JIS X 0201 katakana is not half-width katakana. The standard does not define the width of characters; it defines only the code representation of katakana characters. The term "half-width" is a holdover from older devices that displayed single-byte characters at half the width of double-byte ones. In the JIS X 0201 standard itself, the katakana characters in the code chart are printed at normal width, not half width.
However, the misunderstanding that the standard defines "half-width" characters is widespread. People who know the standard will often say "so-called half-width kana."
See also
* Fullwidth form
* Halfwidth and Fullwidth Forms
References
Lunde, Ken. CJKV Information Processing. 1st ed. O'Reilly, 1999, pp. 144-145.
Wikimedia Foundation. 2010.