English has changed on both sides of the Atlantic, and had been changing well before the voyage of the Mayflower. What grew into standard contemporary English was simply the dialect spoken around London, the political and economic center of England, in the seventeenth century. In the centuries preceding, it had undergone a number of major changes, as you can see in these versions of the Lord’s Prayer:
CONTEMPORARY ENGLISH: Our Father, who is in heaven, may your name be kept holy. May your kingdom come into being. May your will be followed on earth, just as it is in heaven. Give us this day our food for the day. And forgive us our offenses, just as we forgive those who have offended us. And do not bring us to the test. But free us from evil. For the kingdom, the power, and the glory are yours forever. Amen.
EARLY MODERN ENGLISH (c. 1600): Our father which art in heaven, hallowed be thy Name. Thy kingdom come. Thy will be done, on earth, as it is in heaven. Give us this day our daily bread. And forgive us our trespasses, as we forgive those who trespass against us. And lead us not into temptation, but deliver us from evil. For thine is the kingdom, and the power, and the glory, for ever, amen.
MIDDLE ENGLISH (c. 1400): Oure fadir that art in heuenes halowid be thi name, thi kyngdom come to, be thi wille don in erthe es in heuene, yeue to us this day oure bread ouir other substance, & foryeue to us oure dettis, as we forgeuen to oure dettouris, & lede us not in to temptacion: but delyuer us from yuel, amen.
OLD ENGLISH (c. 1000): Faeder ure thu the eart on heofonum, si thin nama gehalgod. Tobecume thin rice. Gewurthe thin willa on eorthan swa swa on heofonum. Urne gedaeghwamlican hlaf syle us to daeg. And forgyf us ure gyltas, swa swa we forgyfath urum gyltendum. And ne gelaed thu us on costnunge ac alys us of yfele. Sothlice.
The roots of English are in northern Germany near Denmark, which was inhabited early in the first millennium by pagan tribes called the Angles, the Saxons, and the Jutes. After the armies of the collapsing Roman Empire left Britain in the fifth century, these tribes invaded what was to become England (Angle-land) and displaced the indigenous Celts there into Scotland, Ireland, Wales, and Cornwall. Linguistically, the defeat was total; English has virtually no traces of Celtic. Vikings invaded in the ninth to eleventh centuries, but their language, Old Norse, was similar enough to Anglo-Saxon that aside from many borrowings, the language, Old English, did not change much.
In 1066 William the Conqueror invaded Britain, bringing with him the Norman dialect of French, which became the language of the ruling classes. When King John of the Anglo-Norman kingdom lost Normandy shortly after 1200, English reestablished itself as the exclusive language of England, though with a marked influence of French that lasts to this day in the form of thousands of words and a variety of grammatical quirks that go with them. This “Latinate” vocabulary—including such words as donate, vibrate, and desist—has a more restricted syntax; for example, you can say give the museum a painting but not donate the museum a painting, shake it up but not vibrate it up. The vocabulary also has its own sound pattern: Latinate words are largely polysyllabic with stress on the second syllable, such as desist, construct, and transmit, whereas their Anglo-Saxon synonyms stop, build, and send are single syllables. The Latinate words also trigger many of the sound changes that make English morphology and spelling so idiosyncratic, like electric-electricity and nation-national. Because Latinate words are longer, and are more formal because of their ancestry in the government, church, and schools of the Norman conquerors, overusing them produces the stuffy prose universally deplored by style manuals, such as The adolescents who had effectuated forcible entry into the domicile were apprehended versus We caught the kids who broke into the house. Orwell captured the flabbiness of Latinate English in his translation of a passage from Ecclesiastes into modern institutionalese:
I returned and saw under the sun, that the race is not to the swift, nor the battle to the strong, neither yet bread to the wise, nor yet riches to men of understanding, nor yet favour to men of skill; but time and chance happeneth to them all.
Objective consideration of contemporary phenomena compels the conclusion that success or failure in competitive activities exhibits no tendency to be commensurate with innate capacity, but that a considerable element of the unpredictable must invariably be taken into account.
English changed noticeably in the Middle English period (1100–1450) in which Chaucer lived. Originally all syllables were enunciated, including those now represented in spelling by “silent” letters. For example, make would have been pronounced with two syllables. But the final syllables became reduced to the generic schwa like the a in allow and in many cases they were eliminated entirely. Since the final syllables contained the case markers, overt case began to vanish, and the word order became fixed to eliminate the resulting ambiguity. For the same reason, prepositions and auxiliaries like of and do and will and have were bled of their original meanings and given important grammatical duties. Thus many of the signatures of modern English syntax were the result of a chain of effects beginning with a simple shift in pronunciation.
The period of Early Modern English, the language of Shakespeare and the King James Bible, lasted from 1450 to 1700. It began with the Great Vowel Shift, a revolution in the pronunciation of long vowels whose causes remain mysterious. (Perhaps it was to compensate for the fact that long vowels sounded too similar to short vowels in the monosyllables that were now prevalent; or perhaps it was a way for the upper classes to differentiate themselves from the lower classes once Norman French became obsolete.) Before the vowel shift, mouse had been pronounced “mooce”; the old “oo” turned into a diphthong. The gap left by the departed “oo” was filled by raising what used to be an “oh” sound; what we pronounce as goose had, before the Great Vowel Shift, been pronounced “goce.” That vacuum, in turn, was filled by the “o” vowel (as in hot, only drawn out), giving us broken from what had previously been pronounced more like “brocken.” In a similar rotation, the “ee” vowel turned into a diphthong; like had been pronounced “leek.” This dragged in the vowel “eh” to replace it; our geese was originally pronounced “gace.” And that gap was filled when the long version of ah was raised, resulting in name from what used to be pronounced “nahma.” The spelling never bothered to track these shifts, which is why the letter a is pronounced one way in cam and another way in came, where it had formerly been just a longer version of the a in cam. This is also why vowels are rendered differently in English spelling than in all the other European alphabets and in “phonetic” spelling.
Incidentally, fifteenth-century Englishmen did not wake up one day and suddenly pronounce their vowels differently, like a switch to Daylight Savings Time. To the people living through it, the Great Vowel Shift probably felt like the current trend in the Chicago area to pronounce hot like hat, or the growing popularity of that strange surfer dialect in which dude is pronounced something like “diiihh-hoooood.”
What happens if we try to go back farther in time? The languages of the Angles and the Saxons did not come out of thin air; they evolved from Proto-Germanic, the language of a tribe that occupied much of northern Europe in the first millennium B.C. The western branch of the tribe split into groups that gave us not only Anglo-Saxon, but German and its offshoot Yiddish, and Dutch and its offshoot Afrikaans. The northern branch settled Scandinavia and came to speak Swedish, Danish, Norwegian, and Icelandic. The similarities in vocabulary among these languages are visible in an instant, and there are many similarities in grammar as well, such as forms of the past-tense ending -ed.
The ancestors of the Germanic tribes left no clear mark in written history or the archeological record. But they did leave a special mark on the territory they occupied. That mark was discerned in 1786 by Sir William Jones, a British judge stationed in India, in one of the most extraordinary discoveries in all scholarship. Jones had taken up the study of Sanskrit, a long-dead language, and noted:
The Sanskrit language, whatever may be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps no longer exists; there is a similar reason, though not quite so forcible, for supposing that both the Gothic [Germanic] and the Celtic, though blended with a very different idiom, had the same origin as the Sanskrit; and the old Persian might be added to the same family…
Here are the kinds of affinities that impressed Jones:
ENGLISH: brother, mead, is, thou bearest, he bears
GREEK: phrater, methu, esti, phereis, pherei
LATIN: frater, -, est, fers, fert
OLD SLAVIC: bratre, mid, yeste, berasi, beretu
OLD IRISH: brathir, mith, is, -, beri
SANSKRIT: bhrater, medhu, asti, bharasi, bharati
Such similarities in vocabulary and grammar are seen in an immense number of modern languages. Among others, they embrace Germanic, Greek, Romance (French, Spanish, Italian, Portuguese, Romanian), Slavic (Russian, Czech, Polish, Bulgarian, Serbo-Croatian), Celtic (Gaelic, Irish, Welsh, Breton), and Indo-Iranian (Persian, Afghan, Kurdish, Sanskrit, Hindi, Bengali, and the Romany language of the Gypsies). Subsequent scholars were able to add Anatolian (extinct languages spoken in Turkey, including Hittite), Armenian, Baltic (Lithuanian and Latvian), and Tocharian (two extinct languages spoken in China). The similarities are so pervasive that linguists have reconstructed a grammar and a large dictionary for a hypothetical common ancestor language, Proto-Indo-European, and a set of systematic rules by which the daughter languages changed. For example, Jacob Grimm (one of the two Grimm brothers, famous as collectors of fairy tales) discovered the rule by which p and t in Proto-Indo-European became f and th in Germanic, as one can see in comparing Latin pater and Sanskrit piter with English father.
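The rule Grimm found can be pictured as a lookup table applied to every sound in a word. The sketch below is a toy, assuming ordinary spellings stand in for sounds and covering only the two correspondences just mentioned. The names GRIMM_RULES and apply_sound_shift are invented for illustration, and real reconstruction works on phonetic forms with many more rules and exceptions.

```python
from typing import Mapping

# Toy rule table: only the two correspondences named in the text (p -> f, t -> th).
# Letters stand in for sounds here, which is an illustrative simplification.
GRIMM_RULES = {"p": "f", "t": "th"}


def apply_sound_shift(word: str, rules: Mapping[str, str] = GRIMM_RULES) -> str:
    """Replace each letter that has a rule; leave every other letter unchanged."""
    return "".join(rules.get(letter, letter) for letter in word)


print(apply_sound_shift("pater"))  # prints "father"
```

That a two-entry table turns the Latin spelling into the English one is a lucky simplification, but it shows why such rules are called systematic: the same substitution is applied everywhere, not word by word.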
The implications are mind-boggling. Some ancient tribe must have taken over most of Europe, Turkey, Iran, Afghanistan, Pakistan, northern India, western Russia, and parts of China. The idea has excited the imagination of a century of linguists and archeologists, though even today no one really knows who the Indo-Europeans were. Ingenious scholars have made guesses from the reconstructed vocabulary. Words for metals, wheeled vehicles, farm implements, and domesticated animals and plants suggest that the Indo-Europeans were a late Neolithic people. The ecological distributions of the natural objects for which there are Proto-Indo-European words—elm and willow, for example, but not olive or palm—have been used to place the speakers somewhere in the territory from inland northern Europe to southern Russia. Combined with words for patriarch, fort, horse, and weapons, the reconstructions led to an image of a powerful conquering tribe spilling out of an ancestral homeland on horseback to overrun most of Europe and Asia. The word “Aryan” became associated with the Indo-Europeans, and the Nazis claimed them as ancestors. More sanely, archeologists have linked them to artifacts of the Kurgan culture in the southern Russian steppes from around 3500 B.C., a band of tribes that first harnessed the horse for military purposes.
Recently the archeologist Colin Renfrew has argued that the Indo-European takeover was a victory not of the chariot but of the cradle. His controversial theory is that the Indo-Europeans lived in Anatolia (part of modern Turkey) on the flanks of the Fertile Crescent region around 7000 B.C., where they were among the world’s first farmers. Farming is a method for mass-producing human beings by turning land into bodies. Farmers’ daughters and sons need more land, and even if they moved just a mile or two from their parents, they would quickly engulf the less fecund hunter-gatherers standing in their way. Archeologists agree that farming spread in a wave that began in Turkey around 8500 B.C. and reached Ireland and Scandinavia by 2500 B.C. Geneticists recently discovered that a certain set of genes is most concentrated among modern people in Turkey and becomes progressively diluted as one moves through the Balkans to northern Europe. This supports the theory originally proposed by the human geneticist Luca Cavalli-Sforza that farming spread by the movement of farmers, as their offspring interbred with indigenous hunter-gatherers, rather than by the movement of farming techniques, as a fad adopted by the hunter-gatherers. Whether these people were the Indo-Europeans, and whether they spread into Iran, India, and China by a similar process, is still not known. It is an awesome possibility. Every time we use a word like brother, or form the past tense of an irregular verb like break-broke or drink-drank, we would be using the preserved speech patterns of the instigators of the most important event in human history, the spread of agriculture.
Most of the other human languages on earth can also be grouped into phyla descending from ancient tribes of astoundingly successful farmers, conquerors, explorers, or nomads. Not all of Europe is Indo-European. Finnish, Hungarian, and Estonian are Uralic languages, which together with Lappish, Samoyed, and other languages are the remnants of a vast nation based in central Russia about 7,000 years ago. Altaic is generally thought to include the main languages of Turkey, Mongolia, the Islamic republics of the former USSR, and much of central Asia and Siberia. The earliest ancestors are uncertain, but later ones include a sixth-century empire as well as the Mongolian empire of Genghis Khan and the Manchu dynasty. Basque is an orphan, presumably from an island of aboriginal Europeans that resisted the Indo-European tidal wave.
Afro-Asiatic (or Hamito-Semitic), including Arabic, Hebrew, Maltese, Berber, and many Ethiopian and Egyptian languages, dominates Saharan Africa and much of the Middle East. The rest of Africa is divided among three groups. Khoisan includes the !Kung and other groups (formerly called “Hottentots” and “Bushmen”), whose ancestors once occupied most of sub-Saharan Africa. The Niger-Congo phylum includes the Bantu family, spoken by farmers from western Africa who pushed the Khoisan into their current small enclaves in southern and southeastern Africa. The third phylum, Nilo-Saharan, occupies three large patches in the southern Saharan region.
In Asia, Dravidian languages such as Tamil dominate southern India and are found in pockets to the north. Dravidian speakers must therefore be the descendants of a people who occupied most of the Indian subcontinent before the incursion of the Indo-Europeans. Some 40 languages between the Black Sea and the Caspian Sea belong to the family called Caucasian (not to be confused with the informal racial term for the typically light-skinned people of Europe and Asia). Sino-Tibetan includes Chinese, Burmese, and Tibetan. Austronesian, having nothing to do with Australia (Austr- means “south”), includes the languages of Madagascar off the coast of Africa, Indonesia, Malaysia, the Philippines, New Zealand (Maori), Micronesia, Melanesia, and Polynesia, all the way to Hawaii—the record of people with extraordinary wanderlust and seafaring skill. Vietnamese and Khmer (the language of Cambodia) fall into Austro-Asiatic. The 200 aboriginal languages of Australia belong to a family of their own, and the 800 of New Guinea belong to a family as well, or perhaps to a small number of families. Japanese and Korean look like linguistic orphans, though a few linguists lump one or both with Altaic.
What about the Americas? Joseph Greenberg, whom we met earlier as the founder of the study of language universals, also classifies languages into phyla. He played a large role in unifying the 1,500 African languages into their four groups. Recently he has claimed that the 200 language stocks of native Americans can be grouped into only three phyla, each descending from a group of migrants who came over the Bering land bridge from Asia beginning 12,000 years ago or earlier. The Eskimos and Aleuts were the most recent immigrants. They were preceded by the Na-Dene, who occupied most of Alaska and northwestern Canada and embrace some of the languages of the American Southwest such as Navajo and Apache. This much is widely accepted. But Greenberg has also proposed that all the other languages, from Hudson Bay to Tierra del Fuego, belong to a single phylum, Amerind. The sweeping idea that America was settled by only three migrations has received some support from recent studies by Cavalli-Sforza and others of modern natives’ genes and tooth patterns, which fall into groups corresponding roughly to the three language phyla.
At this point we enter a territory of fierce controversy but potentially large rewards. Greenberg’s hypothesis has been furiously attacked by other scholars of American languages. Comparative linguistics is an impeccably precise domain of scholarship, where radical divergences between related languages over centuries or a few millennia can with great confidence be traced back step by step to a common ancestor. Linguists raised in this tradition are appalled by Greenberg’s unorthodox method of lumping together dozens of languages based on rough similarities in vocabulary, rather than carefully tracing sound-changes and reconstructing proto-languages. As an experimental psycholinguist who deals with the noisy data of reaction times and speech errors, I have no problem with Greenberg’s use of many loose correspondences, or even with the fact that some of his data contain random errors. What bothers me more is his reliance on gut feelings of similarity rather than on actual statistics that control for the number of correspondences that might be expected by chance. A charitable observer can always spot similarities in large vocabulary lists, but that does not imply that they descended from a common lexical ancestor. It could be a coincidence, like the fact that the word for “blow” is pneu in Greek and pniw in Klamath (an American Indian language spoken in Oregon), or the fact that the word for “dog” in the Australian aboriginal language Mbabaram happens to be dog. (Another serious problem, which Greenberg’s critics do point out, is that languages can resemble each other because of lateral borrowing rather than vertical inheritance, as in the recent exchanges that led to her negligées and le weekend.)
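To see what controlling for chance might involve, one can do a back-of-the-envelope binomial calculation. The sketch below is illustrative only: the list size and the per-meaning chance of an accidental resemblance (p_match) are hypothetical numbers, not figures from Greenberg's data or from his critics, and the calculation assumes that resemblances for different meanings arise independently. The function names are invented for this example.

```python
from math import comb


def expected_chance_matches(n_meanings: int, p_match: float) -> float:
    """Expected number of accidental look-alikes among n_meanings word pairs."""
    return n_meanings * p_match


def prob_at_least(n_meanings: int, p_match: float, k: int) -> float:
    """Probability of k or more accidental matches (binomial upper tail)."""
    # Sum the probabilities of exactly 0, 1, ..., k-1 matches, then subtract from 1.
    below_k = sum(
        comb(n_meanings, i) * p_match**i * (1 - p_match) ** (n_meanings - i)
        for i in range(k)
    )
    return 1.0 - below_k


# Hypothetical inputs: a 1,000-meaning word list and a 1% chance that any
# given pair of unrelated words happens to look alike.
n_meanings, p_match = 1000, 0.01
print(expected_chance_matches(n_meanings, p_match))
print(prob_at_least(n_meanings, p_match, 5))
```

Under these made-up numbers, a handful of look-alikes is close to guaranteed even for unrelated languages, so a claim of common descent would need to show more matches than such a baseline predicts.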