A language tree, or family tree with languages substituted for real family members, has the form of a node-link diagram of a logical tree structure. Additional linguistics terminology derives from it. Such a diagram contains branch points, or nodes, from which the daughter languages descend by different links. The nodes are proto-languages or common languages. The concept of descent of a language means that a linked language was created by a process of gradual modification over time (historically centuries or millennia) of the language at the next-earliest node.[1] Modification is detected or hypothesized in comparative linguistics by comparing features in one language that appear similar to parallel features in another. A common ancestor is then assumed for the feature, rightly or wrongly, if a rule can be found to explain the modification.
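The node-link structure described above can be sketched in a few lines of Python. This is only an illustration, not a standard linguistic tool: the class `LanguageNode`, the helpers `path_to` and `common_ancestor`, and the deliberately simplified sample tree (which skips intermediate nodes) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LanguageNode:
    """A node in the tree: a proto-language or one of its descendants."""
    name: str
    daughters: List["LanguageNode"] = field(default_factory=list)

    def add(self, daughter: "LanguageNode") -> "LanguageNode":
        """Link a daughter language to this node and return the daughter."""
        self.daughters.append(daughter)
        return daughter


def path_to(root: LanguageNode, target: str) -> Optional[List[str]]:
    """Return the line of descent from root down to target, or None."""
    if root.name == target:
        return [root.name]
    for daughter in root.daughters:
        sub = path_to(daughter, target)
        if sub is not None:
            return [root.name] + sub
    return None


def common_ancestor(root: LanguageNode, a: str, b: str) -> Optional[str]:
    """Return the latest node from which both a and b descend."""
    path_a, path_b = path_to(root, a), path_to(root, b)
    if path_a is None or path_b is None:
        return None
    last = None
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        last = x
    return last


# Simplified sample tree (an assumption for illustration only).
pie = LanguageNode("Proto-Indo-European")
germanic = pie.add(LanguageNode("Proto-Germanic"))
germanic.add(LanguageNode("English"))
germanic.add(LanguageNode("German"))
italic = pie.add(LanguageNode("Proto-Italic"))
italic.add(LanguageNode("Latin"))

print(common_ancestor(pie, "English", "German"))
print(common_ancestor(pie, "English", "Latin"))
```

In this sketch every language has exactly one parent node, which is the defining restriction of the tree model.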


History of the model

The confusion of Babel

Family tree of Biblical tribes

Historical linguistics is believed not to have been possible in Europe from the dominance of Christianity in the late Roman Empire until the Age of Enlightenment, because of literal adherence to Genesis 11:1-9, which offers an explanation of why languages differ: "And the whole earth was of one language, and of one speech." The descendants of Noah gathered in the land of Shinar and built the Tower of Babel to reach to heaven. In response to their overreaching, the Lord decided to "confound their language, that they may not understand one another's speech" and "scattered them abroad from thence upon the face of the earth." If languages were given by God, then they did not evolve, and there is no point in comparing them. However, the belief that Genesis rules out the evolution of language is self-contradictory, because the languages that arose at Babel would naturally have evolved thereafter. In fact, the only implication of the story of Babel for historical linguistics is the possibility of more than one "master language".

The Christian philosopher Saint Augustine of Hippo supposed that each of the descendants of Noah founded a nation and that each nation was given its own language: Assyrian for Assur, Hebrew for Heber, and so on.[2] In all he identified 72 nations, tribal founders and languages. The confusion and dispersion occurred in the time of Peleg, son of Heber, son of Shem, son of Noah.

St. Augustine then offers a hypothesis not unlike those of later historical linguists: that the family of Heber "preserved that language not unreasonably believed to have been the common language of the race ... thenceforth named Hebrew." Most of the 72 languages, however, date to many generations after Heber. St. Augustine solves this first problem, treating it much as a problem of historical linguistics, by supposing that Heber, who lived 430 years, was still alive when God assigned the 72.

Ursprache, the language of paradise

St. Augustine's hypothesis stood without major question for over a thousand years. Then, in a series of tracts published in 1684 that expressed skepticism concerning various beliefs, especially Biblical ones, Sir Thomas Browne wrote:[3]

"Though the earth were widely peopled before the flood ... yet whether, after a large dispersion, and the space of sixteen hundred years, men maintained so uniform a language in all parts, ... may very well be doubted."

Garden of Eden, home of the Ursprache

By then, discovery of the New World and the Far East had brought knowledge of new languages far beyond the 72 calculated by St. Augustine. Citing the Native American languages, Browne suggests that the "confusion of tongues at first fell only upon those present in Sinaar at the work of Babel ...." For those "about the foot of the hills, whereabout the ark rested ... their primitive language might in time branch out into several parts of Europe and Asia ...."[4] This is an inkling of a tree. In Browne's view, the differences in language are to be accounted for by simplification from an aboriginal language larger than Hebrew. He suggests ancient Chinese, from which the others descended by "confusion, admixtion and corruption."[5] Later he invokes "commixture and alteration."[6]

Browne reports a number of reconstructive activities by the scholars of the times:[7]

"The learned Casaubon conceiveth that a dialogue might be composed in Saxon, only of such words as are derivable from the Greek ... Verstegan made no doubt that he could contrive a letter that might be understood by the English, Dutch, and East Frislander ... And if, as the learned Buxhornius contendeth, the Scythian language as the mother tongue runs throughout the nations of Europe, and even as far as Persia, the community on many words, between so many nations, hath more reasonable traduction and were rather derivable from the common tongue diffused through them all, than from any particular nation, which hath also borrowed and holdeth but at second hand."

The confusion at the Tower of Babel was thus removed as an obstacle by setting it aside. Attempts to find similarities in all languages were leading to the gradual uncovering of an ancient master language from which all the other languages derive. Browne undoubtedly did his writing and thinking well before 1684. In that same revolutionary century in Britain, James Howell, a royalist, was imprisoned in the Fleet during the most troubled years, 1642–1651, ostensibly for debt, although the imprisonment kept him from performing his duties as secretary to the Privy Council. From prison he published Volume II of Epistolae Ho-Elianae, quasi-fictional letters to various important persons in the realm that contain valid historical information. In Letter LVIII the metaphor of a tree of languages appears fully developed, falling short only of a professional linguist's view:[8]

"I will now hoist sail for the Netherlands, whose language is the same dialect with the English, and was so from the beginning, being both of them derived from the high Dutch [Howell is wrong here]: The Danish also is but a branch of the same tree ... Now the High Dutch or Teutonick Tongue, is one of the prime and most spacious Maternal Languages of Europe ... it was the language of the Goths and Vandals, and continueth yet of the greatest part of Poland and Hungary, who have a Dialect of hers for their vulgar tongue ... Some of her writers would make this world believe that she was the language spoken in paradise."

The search for the language of paradise was on among all the linguists of Europe. Those who wrote in Latin called it the lingua prima, the lingua primaeva or the lingua primigenia. In English it was the Adamic language; in German, the Ursprache or the hebräische Ursprache if one believed it was Hebrew. This mysterious language had the aura of purity and incorruption about it, and those qualities were the standards used to select candidates. The term Ursprache was in use well before the Neogrammarians adopted it for their proto-languages. The gap between the widely divergent families of languages remained unclosed.

The first Indo-Europeanists

Possible homeland of the Indo-Europeans, western Kazakhstan

On February 2, 1786, Sir William Jones delivered his Third Anniversary Discourse to the Asiatic Society as its president on the topic of the Hindus. In it he applied the logic of the tree model to three languages, Greek, Latin and Sanskrit, but for the first time in history on purely linguistic grounds, noting "a stronger affinity, both in the roots of the verbs and in the forms of grammar, than could possibly have been produced by accident; ...." He went on to postulate that they sprang from "some common source, which, perhaps, no longer exists." To them he added Gothic, Celtic and Persian as belonging "to the same family."[9]

Jones did not name his "common source" nor develop the idea further, but it was taken up by the linguists of the times. In the (London) Quarterly Review of late 1813-1814, Thomas Young published a review of Johann Christoph Adelung's Mithridates, oder allgemeine Sprachenkunde ("Mithridates, or a General History of Languages"), Volume I of which had come out in 1806, followed by Volumes II and III in 1809-1812, continued by Johann Severin Vater. Adelung's work described some 500 "languages and dialects" and hypothesized a universal descent from the language of paradise, located in Kashmir, central to the total range of the 500. Young begins by pointing out Adelung's indebtedness to Conrad Gesner's Mithridates, de Differentiis Linguarum of 1555 and other subsequent catalogues of languages and alphabets.[10]

Kashmir (red), Adelung's location of Eden.

Young undertakes to present Adelung's classification. The monosyllabic type is the most ancient and primitive, spoken in Asia, to the east of Eden, in the direction of Adam's exit from Eden. Then follows Jones' group, still without a name but attributed to Jones: "Another ancient and extensive class of languages united by a greater number of resemblances than can well be altogether accidental." For this class he offers a name,[11] "Indoeuropean," the first known linguistic use of the word, but not its first known use. The British East India Company was using "Indo-European commerce" to mean the trade of commodities between India and Europe.[12] The only evidence Young cites for the ancestral group is a set of the most similar words: mother, father, etc.

Adelung's additional classes were the Tataric, the African and the American, which depend on geography and a presumed descent from Eden. Young does not share Adelung's enthusiasm for the language of paradise and brands it as mainly speculative.

Young's designation, successful in English, was only one of several candidates proposed between 1810 and 1867: indo-germanique (Conrad Malte-Brun, 1810), japetisk (Rasmus Christian Rask, 1815), Indo-Germanisch (Julius Klaproth, 1823), indisch-teutsch (F. Schmitthenner, 1826), sanskritisch (Wilhelm von Humboldt, 1827), indokeltisch (A. F. Pott, 1840), arioeuropeo (Graziadio Isaia Ascoli, 1854), Aryan (Max Müller, 1861) and aryaque (H. Chavée, 1867). These men were all polyglots and prodigies in languages. Klaproth, author of the successful German-language candidate, Indo-Germanisch, who criticised Jones for his uncritical method, knew Chinese, Japanese, Tibetan and a number of other languages with their scripts. The concept of a Biblical Ursprache appealed to their imagination. As hope of finding it gradually died, they fell back on the strengthening concept of a common Indo-European spoken by nomadic tribes on the plains of Eurasia. Although they made a good case that this language can be deduced by the methods of comparative linguistics, that is not in fact how they arrived at it. It was the one case in which their efforts to find the Ursprache succeeded.

The neogrammarians

In its strictest formulation the model is due to the Neogrammarians. It builds on earlier conceptions of William Jones, Franz Bopp and August Schleicher, adding the exceptionlessness of the sound laws and the regularity of the process.

The phylogenetic tree

The old metaphor was given an entirely new meaning under the old name by Joseph Harold Greenberg in a series of essays beginning about 1950. Since the adoption of the family tree metaphor by the linguists, the concept of evolution had been proposed by Charles Darwin and was generally accepted in biology. Taxonomy, the classification of living things, had already been invented by Carl Linnaeus. It used a binomial nomenclature to assign a species name and a genus name to every known living organism. These were arranged in a biological hierarchy under several phyla, or most general groups, branching ultimately to the various species. The basis for this biological classification was the observed shared physical features of the species.

Darwin, however, reviving another ancient metaphor, the tree of life, hypothesized that the groups of the Linnaean classification (today's taxa) descended in a tree structure over time from simplest to most complex. The Linnaean hierarchical tree was synchronic; Darwin envisioned a diachronic process of common descent. Where Linnaeus had conceived ranks, which were consistent with the great chain of being adopted by the rationalists, Darwin conceived lineages. Over the decades after Darwin it became clear that the ranks of Linnaeus' hierarchy did not correspond exactly to the lineages. It became the prime goal of taxonomy to discover the lineages and alter the classification to reflect them, which it did under the overall guidance of the Nomenclature Codes, rule books kept by international organizations to authorize and publish proposals to reclassify species and other taxa. The new approach was called phylogeny, the "generation of phyla," which devised a new tree metaphor, the phylogenetic tree. One unit in the tree together with all its offspring units was a clade, and the discovery of clades was cladistics.

Greenberg's classification of African language families

Greenberg began writing at a time when phylogenetic systematics lacked the tools available to it later: the computer (computational systematics) and DNA sequencing (molecular systematics). To discover a cladistic relationship, researchers relied on as large a number of morphological similarities among species as could be defined and tabulated. Statistically, the greater the number of similarities, the more likely the species were to be in the same clade. This approach appealed to Greenberg, who was interested in discovering linguistic universals. Altering the tree model to make the family tree a phylogenetic tree, he said:[13]

"Any language consists of thousands of forms with both sound and meaning ... any sound whatever can express any meaning whatever. Therefore, if two languages agree in a considerable number of such items ... we necessarily draw a conclusion of common historical origin. Such genetic classifications are not arbitrary ... the analogy here to biological classification is extremely close ... just as in biology we classify species in the same genus or high unit because the resemblances are such as to suggest a hypothesis of common descent, so with genetic hypotheses in language."

In this analogy, a language family is like a clade, the proto-language is like an ancestor taxon, the language tree is like a phylogenetic tree, and languages and dialects are like species and varieties. Greenberg formulated large tables of characteristics of hitherto neglected languages of Africa, the Americas, Indonesia and northern Eurasia and typed them according to their similarities. He called this approach typological classification, arrived at by descriptive linguistics rather than by comparative linguistics.[14]
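The idea of tabulating characteristics and grouping by the number of similarities can be sketched as a simple count of shared features. This is a toy illustration and not Greenberg's actual procedure: the languages "Lang A" to "Lang C", the feature labels `f1` to `f7`, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical table: each language is mapped to the set of
# characteristics recorded for it. The labels carry no real data.
features = {
    "Lang A": {"f1", "f2", "f3", "f4"},
    "Lang B": {"f1", "f2", "f3", "f5"},
    "Lang C": {"f4", "f6", "f7"},
}


def shared_counts(table):
    """Count the characteristics shared by every pair of languages."""
    return {
        (a, b): len(table[a] & table[b])
        for a, b in combinations(sorted(table), 2)
    }


def closest_pair(table):
    """Return the pair of languages with the most shared characteristics."""
    counts = shared_counts(table)
    return max(counts, key=counts.get)


print(shared_counts(features))
print(closest_pair(features))
```

A raw count like this treats every characteristic as equally informative and cannot tell inherited resemblances from borrowed or accidental ones, so it is at best a first step toward a hypothesis of common descent.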

Limitations of the model

One limitation of the Tree Model is that it requires a classification based on languages, or, more generally, on language varieties. Since a variety represents an abstraction over linguistic features, information can be lost when data (e.g., a map of isoglosses) are translated into a tree. Dialect continua are one example: they pose a problem for the concept of language variety similar to the one that ring species pose for the concept of species in biology.

An additional limitation of the Tree Model involves mixed and hybrid languages, as well as language mixing in general, since the Tree Model allows only for divergences. For example, according to Zuckermann (2009:63),[15] "Israeli", his term for Modern Hebrew, which he regards as a Semito-European hybrid, "demonstrates that the reality of linguistic genesis is far more complex than a simple family tree system allows. 'Revived' languages are unlikely to have a single parent."
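The restriction to divergence can be stated structurally: in a tree, every language has at most one parent. The following sketch, with hypothetical language names and a hypothetical helper `multi_parent_languages`, flags the languages in a set of descent links that a tree cannot represent.

```python
def multi_parent_languages(links):
    """Return the languages that have more than one parent.

    `links` is a list of (parent, daughter) pairs. A tree allows each
    daughter only one parent, so any language returned here cannot be
    placed in a single family tree without discarding a link.
    """
    parents = {}
    for parent, daughter in links:
        parents.setdefault(daughter, set()).add(parent)
    return sorted(d for d, ps in parents.items() if len(ps) > 1)


# Hypothetical descent links; "Mixed" draws on two unrelated lineages.
links = [
    ("Proto-A", "A1"),
    ("Proto-A", "A2"),
    ("Proto-B", "B1"),
    ("A1", "Mixed"),
    ("B1", "Mixed"),
]

print(multi_parent_languages(links))
```

Representing such a language requires a more general graph in which lineages can merge as well as split.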



