Vietnamese language

Tiếng Việt
Pronunciation [tiə̌ŋ viə̀ˀt] (Northern)
[tiə̌n jiə̀k] (Southern)
Spoken in  Vietnam
Vietnamese diaspora
Region Southeast Asia
Ethnicity Kinh/Gin people
Native speakers 69 million (1999 census)
(incl. 3 million abroad)
Total: 80 million[citation needed]
Language family
  • Vietic
    • Viet–Muong
      • Vietnamese
Writing system Vietnamese variant (quốc ngữ) of Latin alphabet
Official status
Regulated by No official regulation
Language codes
ISO 639-1 vi
ISO 639-2 vie
ISO 639-3 vie
Linguasphere 46-EBA
Extent of Vietnamese

Vietnamese (tiếng Việt, or less commonly Việt ngữ[1]) is the national and official language of Vietnam. It is the mother tongue of 86% of Vietnam's population, and of about three million overseas Vietnamese. It is also spoken as a second language by many ethnic minorities of Vietnam. It is part of the Austro-Asiatic language family, of which it has the most speakers by a significant margin (several times more than the other Austro-Asiatic languages put together).[citation needed] Much of Vietnamese vocabulary has been borrowed from Chinese, and the language was formerly written using the Chinese writing system, albeit in a modified form and with vernacular pronunciation. As a byproduct of French colonial rule, the language displays some influence from French, and the Vietnamese writing system (quốc ngữ) in use today is an adapted version of the Latin alphabet, with additional diacritics for tones and certain letters.


Geographic distribution

As the national language of the majority ethnic group, Vietnamese is spoken throughout Vietnam by the Vietnamese people, as well as by ethnic minorities. It is also spoken in overseas Vietnamese communities, most notably in the United States, where it has more than one million speakers and is the seventh most-spoken language (it is 3rd in Texas, 4th in Arkansas and Louisiana, and 5th in California[2]). In Australia, it is the sixth most-spoken language.[citation needed]

According to the Ethnologue, Vietnamese is also spoken by substantial numbers of people in Cambodia, Canada, China, Côte d'Ivoire, Czech Republic, Finland, France, Germany, Laos, Martinique, the Netherlands, New Caledonia, Norway, the Philippines, the Russian Federation, Senegal, Taiwan, Thailand, the United Kingdom, and Vanuatu.[3]

Genealogical classification

At first, because Vietnamese has tones and shares a large vocabulary with Chinese, it was grouped into Sino-Tibetan. Later, it was found that the tones of Vietnamese appeared very recently (André-Georges Haudricourt, 1954) and that the Chinese-like vocabulary was borrowed from Han Chinese during their shared history (1992); these two aspects had nothing to do with the origin of Vietnamese. Vietnamese was then classified into the Kam-Tai subfamily of Daic together with Zhuang (including Nung and Tày in North Vietnam) and Thai, after the surface influences of Chinese were removed. Nevertheless, the Daic features were also borrowed from Zhuang during their long history as neighbors (André-Georges Haudricourt) and are not original features of Vietnamese. Finally, after more studies were done, Vietnamese was classified into the Austro-Asiatic linguistic family, Mon-Khmer subfamily, Viet-Muong branch (1992). The Kinh are the largest population group in Vietnam. According to a 2006 study by Fudan University, their language belongs to Mon-Khmer linguistically, but there is no final word on its origin.

Henri Maspero maintained that the Vietnamese language was of Thai origin, and the Reverend Father Souvignet traced it to the Indo-Malay group. A.G. Haudricourt refuted Maspero's thesis and concluded that Vietnamese is properly placed in the Austro-Asiatic family. None of these theories quite explains the origin of the Vietnamese language. One thing, however, remains certain: Vietnamese is not a pure language. It seems to be a blend of several languages, ancient and modern, encountered throughout history following successive contacts between foreign peoples and the people of Vietnam.

Language policy

While spoken by the Vietnamese people for millennia, written Vietnamese did not become the official administrative language of Vietnam until the 20th century. For most of its history, the entity now known as Vietnam used written classical Chinese. In the 13th century, however, the country invented Chữ nôm, a writing system making use of Chinese characters with phonetic elements in order to better suit the tones associated with the Vietnamese language. Chữ nôm proved so much more efficient than classical Chinese characters that it was used extensively in the 17th and 18th centuries for poetry and literature. Chữ nôm was used for administrative purposes during the brief Hồ and Tây Sơn Dynasties. Under French colonial rule, French superseded Chinese in administration. It was not until independence from France that Vietnamese was used officially. It is the language of instruction in schools and universities and is the language for official business.


The words in orange belong to the Vietnamese native lexical stock while the ones in green belong to the Sino-Vietnamese vocabulary.

As in many other Asian countries, close ties with China over thousands of years mean that much of the Vietnamese lexicon relating to science and politics is derived from Chinese. At least 60% of the lexical stock has Chinese roots, not including naturalized word borrowings from China, although many compound words are Sino-Vietnamese, composed of native Vietnamese words combined with Chinese borrowings. One can usually tell a native Vietnamese word from a Chinese borrowing by checking whether it can be reduplicated or whether its meaning stays the same when the tone is shifted. As a result of French occupation, Vietnamese has also borrowed many words from French, for example cà phê (from French café). Nowadays, many new words are being added to the lexicon due to heavy Western cultural influence; these are usually borrowed from English, for example TV (though usually written as tivi). Sometimes these borrowings are calques translated literally into Vietnamese (for example, software is calqued as phần mềm, which literally means "soft part").



Like other southeast Asian languages, Vietnamese has a comparatively large number of vowels. Below is a vowel diagram of Hanoi Vietnamese.

           Front   Central          Back
High       i [i]   ư [ɨ]            u [u]
Upper Mid  ê [e]   â [ə] / ơ [əː]   ô [o]
Lower Mid  e [ɛ]   -                o [ɔ]
Low        -       ă [a] / a [aː]   -

Front, central, and low vowels (i, ê, e, ư, â, ơ, ă, a) are unrounded, whereas the back vowels (u, ô, o) are rounded. The vowels â [ə] and ă [a] are pronounced very short, much shorter than the other vowels. Thus, ơ and â are basically pronounced the same except that ơ [əː][4] is long while â [ə] is short – the same applies to the low vowels long a [aː] and short ă [a].[5]

In addition to single vowels (or monophthongs), Vietnamese has diphthongs[6] and triphthongs. The diphthongs consist of a main vowel component followed by a shorter semivowel offglide to a high front position [ɪ], a high back position [ʊ], or a central position [ə].[7]

Vowel nucleus  Diphthong with front offglide  Diphthong with back offglide  Diphthong with centering offglide  Triphthong with front offglide  Triphthong with back offglide
i  -  iu [iʊ̯]  ia~iê~yê [iə̯]  -  iêu [iə̯ʊ̯]
ê  -  êu [eʊ̯]  -  -  -
e  -  eo [ɛʊ̯]  -  -  -
ư  ưi [ɨɪ̯]  ưu [ɨʊ̯]  ưa~ươ [ɨə̯]  ươi [ɨə̯ɪ̯]  ươu [ɨə̯ʊ̯]
â  ây [əɪ̯]  âu [əʊ̯]  -  -  -
ơ  ơi [əːɪ̯]  -  -  -  -
ă  ay [aɪ̯]  au [aʊ̯]  -  -  -
a  ai [aːɪ̯]  ao [aːʊ̯]  -  -  -
u  ui [uɪ̯]  -  ua~uô [uə̯]  uôi [uə̯ɪ̯]  -
ô  ôi [oɪ̯]  -  -  -  -
o  oi [ɔɪ̯]  -  -  -  -

The centering diphthongs are formed with only the three high vowels (i, ư, u) as the main vowel. They are generally spelled ia, ưa, ua when they end a word and iê, ươ, uô, respectively, when they are followed by a consonant. There are also restrictions on the high offglides: the high front offglide cannot occur after a front vowel (i, ê, e) nucleus and the high back offglide cannot occur after a back vowel (u, ô, o) nucleus.[8]
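
The spelling alternation of the centering diphthongs can be sketched in Python. This is only an illustration under simplifying assumptions: the function name `spell_centering` and its parameters are invented for this example, and the yê variant and any other context-dependent spellings are ignored.

```python
import unicodedata

def _nfc(text: str) -> str:
    """Normalize to precomposed form so that lookups are encoding-independent."""
    return unicodedata.normalize("NFC", text)

# Spellings of the three centering diphthongs, keyed by main vowel:
# (form used at the end of a word, form used before a final consonant).
CENTERING = {
    _nfc("i"): ("ia", _nfc("iê")),
    _nfc("ư"): (_nfc("ưa"), _nfc("ươ")),
    _nfc("u"): ("ua", _nfc("uô")),
}

def spell_centering(nucleus: str, coda: str = "") -> str:
    """Spell a centering diphthong, optionally followed by a final consonant."""
    open_form, closed_form = CENTERING[_nfc(nucleus)]
    return closed_form + coda if coda else open_form

print(spell_centering("u"))        # word-final spelling
print(spell_centering("u", "ng"))  # spelling before a final consonant
```

The lookup is keyed by the main vowel because, per the description above, only the three high vowels form centering diphthongs; any other nucleus raises a `KeyError`.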

The correspondence between the orthography and pronunciation is complicated. For example, the offglide [ɪ̯] is usually written as i; however, it may also be represented with y. In addition, in the diphthongs [aɪ̯] and [aːɪ̯] the letters y and i also indicate the pronunciation of the main vowel: ay = ă + [ɪ̯], ai = a + [ɪ̯]. Thus, tay "hand" is [taɪ̯] while tai "ear" is [taːɪ̯]. Similarly, u and o indicate different pronunciations of the main vowel: au = ă + [ʊ̯], ao = a + [ʊ̯]. Thus, thau "brass" is [tʰaʊ̯] while thao "raw silk" is [tʰaːʊ̯].

The four triphthongs are formed by adding front and back offglides to the centering diphthongs. Similarly to the restrictions involving diphthongs, a triphthong with front nucleus cannot have a front offglide (after the centering glide) and a triphthong with a back nucleus cannot have a back offglide.

With regard to the front and back offglides [ɪ̯, ʊ̯], many phonological descriptions analyze these as consonant glides /j, w/. Thus, a word such as đâu "where", phonetically [ɗəʊ̯], would be phonemicized as /ɗəw/.


Pitch contours and duration of the six Northern Vietnamese tones as uttered by a male speaker (not from Hanoi). Fundamental frequency is plotted over time. From Nguyễn & Edmondson (1998).

Vietnamese vowels are all pronounced with an inherent tone.[9] Tones differ in length (duration), pitch contour, pitch height, and phonation.

Tone is indicated by diacritics written above or below the vowel (most of the tone diacritics appear above the vowel; however, the nặng tone dot diacritic goes below the vowel).[10] The six tones in the northern varieties (including Hanoi), with their self-referential Vietnamese names, are:

Name  Description  Diacritic  Example  Sample vowel
ngang 'level'  mid level  (no mark)  ma 'ghost'  a
huyền 'hanging'  low falling (often breathy)  ` (grave accent)  mà 'but'  à
sắc 'sharp'  high rising  ´ (acute accent)  má 'cheek, mother (southern)'  á
hỏi 'asking'  mid dipping-rising   ̉ (hook)  mả 'tomb, grave'  ả
ngã 'tumbling'  high breaking-rising  ˜ (tilde)  mã 'horse (Sino-Vietnamese), code'  ã
nặng 'heavy'  low falling constricted (short length)   ̣ (dot below)  mạ 'rice seedling'  ạ
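
Because each tone other than ngang has its own diacritic, the tone of a written syllable can be read off its Unicode decomposition. The following Python sketch illustrates the idea; the function name `tone_of` is an assumption made for this example, and the code treats its input as a single well-formed syllable.

```python
import unicodedata

# Combining marks that encode tone. The breve, circumflex and horn are
# vowel-quality marks and are deliberately absent from this table.
TONE_MARKS = {
    "\u0300": "huyền",  # grave accent
    "\u0301": "sắc",    # acute accent
    "\u0309": "hỏi",    # hook above
    "\u0303": "ngã",    # tilde
    "\u0323": "nặng",   # dot below
}

def tone_of(syllable: str) -> str:
    """Return the tone name of one orthographic syllable."""
    # NFD splits each letter into a base character plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "ngang"  # no tone mark present

for word in ["ma", "mà", "má", "mả", "mã", "mạ"]:
    print(word, tone_of(word))
```

Decomposing first means that letters carrying two diacritics, such as the ế in tiếng, are handled without a separate table of precomposed characters.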

Other dialects of Vietnamese have fewer tones (typically only five). See the language variation section for a brief survey of tonal differences among dialects.

In Vietnamese poetry, tones are classed into two groups:

Tone group Tones within tone group
bằng "level, flat" ngang and huyền
trắc "oblique, sharp" sắc, hỏi, ngã, and nặng

Words with tones belonging to a particular tone group must occur in certain positions within the poetic verse.
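
The two-way grouping can be computed directly from the spelling. The Python sketch below labels each syllable of a line as bằng (B) or trắc (T); the function names and the B/T labels are assumptions made for this illustration, and no actual verse-form rules are checked.

```python
import unicodedata

# Combining marks of the four trắc tones: sắc, hỏi, ngã and nặng.
# Syllables without any of these marks (ngang, huyền) count as bằng.
TRAC_MARKS = {"\u0301", "\u0309", "\u0303", "\u0323"}

def tone_group(syllable: str) -> str:
    """Return 'T' for a trắc syllable and 'B' for a bằng syllable."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return "T" if any(ch in TRAC_MARKS for ch in decomposed) else "B"

def tone_pattern(line: str) -> str:
    """Map a whitespace-separated line of syllables to a B/T pattern."""
    return " ".join(tone_group(s) for s in line.split())

print(tone_pattern("Con chó này chẳng bao giờ sủa cả"))
```

A checker for a particular verse form could compare such a pattern against the positions that the form requires, which is beyond this sketch.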


The consonants that occur in Vietnamese are listed below in the Vietnamese orthography with the phonetic pronunciation to the right.

                      Labial     Alveolar   Retroflex   Palatal      Velar        Glottal
Stop, voiceless       p [p]      t [t]      tr [tʂ~ʈ]   ch [c~tɕ]    c/k/q [k]    -
Stop, aspirated       -          th [tʰ]    -           -            -            -
Stop, voiced          b [ɓ]      đ [ɗ]      -           -            -            -
Fricative, voiceless  ph [f]     x [s]      s [ʂ]       -            kh [x]       h [h]
Fricative, voiced     v [v]      gi [z]     r [ʐ~ɹ]     d [z~j]      g/gh [ɣ]     -
Nasal                 m [m]      n [n]      -           nh [ɲ]       ng/ngh [ŋ]   -
Approximant           u/o [w]    l [l]      -           y/i [j]      -            -

Some consonant sounds are written with only one letter (like "p"), others with a two-letter digraph (like "ph"), and others with more than one letter or digraph (the velar stop is written variously as "c", "k", or "q").
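
Since several consonants are spelled with two or three letters, splitting a written syllable into its initial consonant and the remainder calls for a longest-match approach. The Python sketch below is a simplified illustration: the function name `split_onset` is an assumption, the spelling list is taken from the consonant table above, and the greedy match always treats "gi" as a digraph, which a fuller treatment would need to refine.

```python
import unicodedata

# Initial-consonant spellings from the table above, sorted longest first so
# that "ngh" is tried before "ng", and "ng" before "n".
ONSETS = sorted(
    ["ngh", "ng", "nh", "gh", "gi", "kh", "ph", "th", "tr", "ch",
     "b", "c", "d", "đ", "g", "h", "k", "l", "m", "n", "p", "q",
     "r", "s", "t", "v", "x"],
    key=len,
    reverse=True,
)

def split_onset(syllable: str) -> tuple[str, str]:
    """Split a syllable into (initial consonant spelling, remainder)."""
    s = unicodedata.normalize("NFC", syllable.lower())
    for onset in ONSETS:
        if s.startswith(onset):
            return onset, s[len(onset):]
    return "", s  # the syllable starts with a vowel letter

for word in ["thích", "ngựa", "phần", "ăn"]:
    print(word, split_onset(word))
```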

Not all dialects of Vietnamese have the same consonant in a given word (although all dialects use the same spelling in the written language). See the language variation section for further elaboration.

Syllable-final orthographic ch and nh in Hanoi Vietnamese have been analyzed in two ways. One analysis treats final ch, nh as the phonemes /c, ɲ/, contrasting with syllable-final t, c /t, k/ and n, ng /n, ŋ/, and identifies final ch with syllable-initial ch /c/. The other treats final ch and nh as predictable allophonic variants of the velar phonemes /k/ and /ŋ/ that occur after the upper front vowels i /i/ and ê /e/. (See Vietnamese phonology: Analysis of final ch, nh for further details.)

Language variation

There are various mutually intelligible regional varieties (or dialects), the main four being:[11]

Dialect region Localities Names under French colonization
Northern Vietnamese Hanoi, Haiphong, and various provincial forms Tonkinese
North-central (or Area IV) Vietnamese Nghệ An (Vinh, Thanh Chương), Thanh Hoá, Quảng Bình, Hà Tĩnh High Annamese
Central Vietnamese Huế, Quảng Nam Low Annamese
Southern Vietnamese Saigon, Mekong (Far West) Cochinchinese
The first article of the Universal Declaration of Human Rights spoken by Nghiem Mai Phuong, native speaker of a northern variety.
Ho Chi Minh reading his Declaration of Independence. Ho Chi Minh is from Nghe An Province, speaking a northern-central variety.

Vietnamese has traditionally been divided into three dialect regions: North, Central, and South. However, Michel Ferlus and Nguyễn Tài Cẩn offer evidence for considering a North-Central region separate from Central. The term Haut-Annam refers to dialects spoken from northern Nghệ An Province to southern (former) Thừa Thiên Province that preserve archaic features (like consonant clusters and undiphthongized vowels) that have been lost in other modern dialects.

These dialect regions differ mostly in their sound systems (see below), but also in vocabulary (including basic vocabulary, non-basic vocabulary, and grammatical words) and grammar.[12] The North-central and Central regional varieties, which differ considerably in vocabulary, are generally less intelligible to Northern and Southern speakers. There is less internal variation within the Southern region than in the other regions because of its relatively late settlement by Vietnamese speakers (around the end of the 15th century). The North-central region is particularly conservative. Along the coastal areas, regional variation has been neutralized to a certain extent, while more mountainous regions preserve more variation. As for sociolinguistic attitudes, the North-central varieties are often felt to be "peculiar" or "difficult to understand" by speakers of other dialects.

Large movements of people between North and South, beginning in the mid-20th century and continuing to this day, have resulted in a significant number of Southern residents speaking in the Northern accent or dialect and, to a lesser extent, Northern residents speaking in the Southern accent or dialect. Following the Geneva Accords of 1954, which called for the temporary division of the country, almost a million northerners (mainly from Hanoi and the surrounding Red River Delta areas) moved south (mainly to Saigon, now Ho Chi Minh City, and the surrounding areas) as part of Operation Passage to Freedom. About a third of that number made the move in the reverse direction.

Following the reunification of Vietnam in 1975–76, Northern and North-Central speakers from the densely populated Red River Delta and the traditionally poorer provinces of Nghe An, Ha Tinh and Quang Binh have continued to move South to look for better economic opportunities. Additionally, government and military personnel are posted to various locations throughout the country, often away from their home regions. More recently, the growth of the free market system has resulted in business people and tourists traveling to distant parts of Vietnam. These movements have resulted in some small blending of the dialects but, more significantly, have made the Northern dialect more easily understood in the South and vice versa. Most Southerners also sing modern or popular Vietnamese songs in the Northern accent. This is true in Vietnam as well as in the overseas Vietnamese communities.

Regional variation in grammatical words[13]
Northern Central Southern English gloss
này ni or nì nầy "this"
thế này ri vầy "thus, this way"
ấy nớ, đó "that"
thế, thế ấy rứa, rứa tê vậy đó "thus, so, that way"
kia đó "that yonder"
kìa tề đó "that yonder (far away)"
đâu đâu "where"
nào nào "which"
sao, thế nào răng sao "how, why"
tôi tui tui "I, me (polite)"
tao tau tao, qua "I, me (arrogant, familiar)"
chúng tôi bầy tui tụi tui "we, us (but not you, polite)"
chúng tao bầy choa tụi tao "we, us (but not you, arrogant, familiar)"
mày mi mầy "you (thou) (arrogant, familiar)"
chúng mày bây, bọn bây tụi mầy "you guys, y'all (arrogant, familiar)"
hắn, nghỉ "he/him, she/her, it (arrogant, familiar)"
chúng nó bọn hắn tụi nó "they/them (arrogant, familiar)"
ông ấy ông nớ ổng "he/him, that gentleman, sir"
bà ấy mệ nớ, mụ nớ, bà nớ bả "she/her, that lady, madam"
cô ấy o nớ cổ "she/her, that unmarried young lady"
chị ấy ả nớ chỉ "she/her, that young lady"
anh ấy eng nớ ảnh "he/him, that young man (of equal status)"

The syllable-initial ch and tr digraphs are pronounced distinctly in North-central, Central, and Southern varieties, but are merged in Northern varieties (i.e. they are both pronounced the same way). The North-central varieties preserve three distinct pronunciations for d, gi, and r whereas the North has a three-way merger and the Central and South have a merger of d and gi while keeping r distinct. At the end of syllables, palatals ch and nh have merged with alveolars t and n, which, in turn, have also partially merged with velars c and ng in Central and Southern varieties.
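
The initial-consonant mergers described in this paragraph can be encoded as a small lookup. The Python sketch below covers only the mergers stated in the paragraph (ch/tr and d/gi/r); the names `MERGERS` and `same_initial` are assumptions made for this illustration, and the region labels follow the four dialect regions listed earlier.

```python
# Groups of syllable-initial spellings that are pronounced alike in each
# region, restricted to the mergers described in the text.
MERGERS = {
    "Northern": [{"ch", "tr"}, {"d", "gi", "r"}],
    "North-central": [],            # ch/tr distinct; d, gi, r all distinct
    "Central": [{"d", "gi"}],       # r stays distinct
    "Southern": [{"d", "gi"}],      # r stays distinct
}

def same_initial(region: str, a: str, b: str) -> bool:
    """Return True if spellings a and b are pronounced alike in the region."""
    if a == b:
        return True
    return any(a in group and b in group for group in MERGERS[region])

print(same_initial("Northern", "ch", "tr"))   # merged in the North
print(same_initial("Southern", "d", "r"))     # kept apart in the South
```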

Regional consonant correspondences
Syllable position Orthography Northern North-central Central Southern
syllable-initial x [s] [s] [s] [s]
s [ʂ] [ʂ] [ʂ]
ch [tɕ] [tɕ] [tɕ] [tɕ]
tr [tʂ] [tʂ] [tʂ]
r [z] [ɹ] [ɹ] [ɹ]
d [ɟ] [j] [j]
gi [z]
v [14] [v] [v]
syllable-final c [k] [k] [k] [k]
t [t] [t]
after e
[k, t]
after ê
[t] [k, t]
after i
ch [c] [c]
ng [ŋ] [ŋ] [ŋ] [ŋ]
n [n] [n]
after i, ê
[n] [n]
nh [ɲ] [ɲ]

In addition to the regional variation described above, there is also a merger of l and n in certain rural varieties:

l, n variation
Orthography "Mainstream" varieties Rural varieties
n [n] [n]
l [l]

Variation between l and n can be found even in mainstream Vietnamese in certain words. For example, the numeral "five" appears as năm by itself and in compound numerals like năm mươi "fifty" but appears as lăm in mười lăm "fifteen". (See Vietnamese syntax: Cardinal numerals.) In some northern varieties, this numeral appears with an initial nh instead of l: hai mươi nhăm "twenty-five" vs. mainstream hai mươi lăm.[15]

The consonant clusters that were originally present in Middle Vietnamese (of the 17th century) have been lost in almost all modern Vietnamese varieties (but retained in other closely related Vietic languages). However, some speech communities have preserved some of these archaic clusters: "sky" is blời with a cluster in Hảo Nho (Yên Mô prefecture, Ninh Binh Province) but trời in Southern Vietnamese and giời in Hanoi Vietnamese (initial single consonants /ʈᶳ, z/, respectively).


Generally, the Northern varieties have six tones while those in other regions have five tones. The hỏi and ngã tones are distinct in North and some North-central varieties (although often with different pitch contours) but have merged in Central, Southern, and some North-central varieties (also with different pitch contours). Some North-central varieties (such as Hà Tĩnh Vietnamese) have a merger of the ngã and nặng tones while keeping the hỏi tone distinct. Still other North-central varieties have a three-way merger of hỏi, ngã, and nặng resulting in a four-tone system. In addition, there are several phonetic differences (mostly in pitch contour and phonation type) in the tones among dialects.

Regional tone correspondences
Tone  Northern  North-central (Vinh)  North-central (Thanh Chương)  North-central (Hà Tĩnh)  Central  Southern
ngang ˧ 33 ˧˥ 35 ˧˥ 35 ˧˥ 35, ˧˥˧ 353 ˧˥ 35 ˧ 33
huyền ˨˩̤ 21̤ ˧ 33 ˧ 33 ˧ 33 ˧ 33 ˨˩ 21
sắc ˧˥ 35 ˩ 11 ˩ 11, ˩˧̰ 13̰ ˩˧̰ 13̰ ˩˧̰ 13̰ ˧˥ 35
hỏi ˧˩˧̰ 31̰3 ˧˩ 31 ˧˩ 31 ˧˩̰ʔ 31̰ʔ ˧˩˨ 312 ˨˩˦ 214
ngã ˧ʔ˥ 3ʔ5 ˩˧̰ 13̰ ˨̰ 22̰
nặng ˨˩̰ʔ 21̰ʔ ˨ 22 ˨̰ 22̰ ˨̰ 22̰ ˨˩˨ 212

The table above shows the pitch contour of each tone using Chao tone number notation (where 1 = lowest pitch, 5 = highest pitch); glottalization (creaky, stiff, harsh) is indicated with the ⟨◌̰⟩ symbol; breathy voice with ⟨◌̤⟩; glottal stop with ⟨ʔ⟩; sub-dialectal variants are separated with commas. (See also the tone section above.)
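
The table pairs each Chao tone number with the corresponding tone letters, for example 35 with ˧˥. The conversion between the two is mechanical, as the following Python sketch shows; the function name `chao_to_letters` is an assumption for this illustration, and phonation symbols such as the creaky-voice tilde or ⟨ʔ⟩ are not handled.

```python
def chao_to_letters(digits: str) -> str:
    """Convert a string of Chao tone numbers (1-5) to IPA tone letters."""
    letters = []
    for d in digits:
        level = int(d)
        if not 1 <= level <= 5:
            raise ValueError(f"pitch level out of range: {d!r}")
        # The tone letters occupy U+02E5 (level 5, highest pitch) through
        # U+02E9 (level 1, lowest pitch), so the code point falls as the
        # level rises.
        letters.append(chr(0x02EA - level))
    return "".join(letters)

for contour in ["33", "35", "21", "214"]:
    print(contour, chao_to_letters(contour))
```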


Vietnamese, like many languages in Southeast Asia, is an analytic (or isolating) language. Vietnamese does not use morphological marking of case, gender, number or tense (and, as a result, has no finite/nonfinite distinction).[16] Also like other languages in the region, Vietnamese syntax conforms to subject–verb–object word order, is head-initial (displaying modified-modifier ordering), and has a noun classifier system. Additionally, it is pro-drop, wh-in-situ, and allows verb serialization.

Some Vietnamese sentences with English word glosses and translations are provided below.

Mai là sinh viên.
Mai be student
"Mai is a student." (College student)
Giáp rất cao.
Giap very tall
"Giap is very tall."
Người đó là anh nó.
person that be brother he
"That person is his brother."
Con chó này chẳng bao giờ sủa cả.
classifier dog this not ever bark at.all
"This dog never barks at all."
Nó chỉ ăn cơm Việt Nam thôi.
he only eat rice.colloquial Vietnam only
"He only eats Vietnamese food."
Cái thằng chồng em nó chẳng ra gì.
focus classifier husband I (as wife) he not turn.out what
"That husband of mine, he is good for nothing."
Tôi thích con ngựa đen.
I (generic) like classifier horse black
"I like the black horse."
Tôi thích cái con ngựa đen.
I (generic) like focus classifier horse black
"I like that black horse."

Writing system

Currently, the written language uses the Vietnamese alphabet (quốc ngữ or "national script", literally "national language"), based on the Latin alphabet. Originally a Romanization of Vietnamese, it was codified in the 17th century by a French Jesuit missionary named Alexandre de Rhodes (1591–1660), based on works of earlier Portuguese missionaries (Gaspar do Amaral and António Barbosa). The use of the script was gradually extended from its initial domain in Christian writing to become more popular among the general public.

Under French colonial rule, the script became official and was required for all public documents from 1910, by decree of the French Résident Supérieur of the protectorate of Tonkin. By the middle of the 20th century, virtually all writing was done in quốc ngữ.

Changes in the script were made by French scholars and administrators and by conferences held after independence during 1954–1974. The script now reflects a so-called Middle Vietnamese dialect that has vowels and final consonants most similar to northern dialects and initial consonants most similar to southern dialects (Nguyễn 1996). This Middle Vietnamese is presumably close to the Hanoi variety as spoken sometime after 1600 but before the present. (This is not unlike how English orthography is based on the Chancery Standard of late Middle English, with many spellings retained even after significant phonetic change.)

Before adopting Roman script under French rule, Vietnamese used two ideographic writing systems:

  • Literary Chinese chữ nho characters (scholar's characters,

