The Language Instinct: How the Mind Creates Language
In the year of our Lord 1957, the linguist Martin Joos reviewed the preceding three decades of research in linguistics and concluded that God had actually gone much farther in confounding the language of Noah’s descendants. Whereas the God of Genesis was said to be content with mere mutual unintelligibility, Joos declared that “languages could differ from each other without limit and in unpredictable ways.” That same year, the Chomskyan revolution began with the publication of Syntactic Structures, and the next three decades took us back to the literal biblical account. According to Chomsky, a visiting Martian scientist would surely conclude that aside from their mutually unintelligible vocabularies, Earthlings speak a single language.
Even by the standards of theological debates, these interpretations are strikingly different. Where did they come from? The 4,000 to 6,000 languages of the planet do look impressively different from English and from one another. Here are the most conspicuous ways in which languages can differ from what we are used to in English:
English is an “isolating” language, which builds sentences by rearranging immutable word-sized units, like Dog bites man and Man bites dog. Other languages express who did what to whom by modifying nouns with case affixes, or by modifying the verb with affixes that agree with its role-players in number, gender, and person. One example is Latin, an “inflecting” language in which each affix contains several pieces of information; another is Kivunjo, an “agglutinating” language in which each affix conveys one piece of information and many affixes are strung together, as in the eight-part verb in Chapter 5.
English is a “fixed-word-order” language where each phrase has a fixed position. “Free-word-order” languages allow phrase order to vary. In an extreme case like the Australian aboriginal language Warlpiri, words from different phrases can be scrambled together: This man speared a kangaroo can be expressed as Man this kangaroo speared, Man kangaroo speared this, and any of the other four orders, all completely synonymous.
English is an “accusative” language, where the subject of an intransitive verb, like she in She ran, is treated identically to the subject of a transitive verb, like she in She kissed Larry, and different from the object of the transitive verb, like her in Larry kissed her. “Ergative” languages like Basque and many Australian languages have a different scheme for collapsing these three roles. The subject of an intransitive verb and the object of a transitive verb are identical, and the subject of the transitive is the one that behaves differently. It is as if we were to say Ran her to mean “She ran.”
English is a “subject-prominent” language in which all sentences must have a subject (even if there is nothing for the subject to refer to, as in It is raining or There is a unicorn in the garden). In “topic-prominent” languages like Japanese, sentences have a special position that is filled by the current topic of the conversation, as in This place, planting wheat is good or California, climate is good.
English is an “SVO” language, with the order subject-verb-object (Dog bites man). Japanese is subject-object-verb (SOV: Dog man bites); Modern Irish (Gaelic) is verb-subject-object (VSO: Bites dog man).
In English, a noun can name a thing in any construction: a banana; two bananas; any banana; all the bananas. In “classifier” languages, nouns fall into gender classes like human, animal, inanimate, one-dimensional, two-dimensional, cluster, tool, food, and so on. In many constructions, the name for the class, not the noun itself, must be used—for example, three hammers would be referred to as three tools, to wit hammer.
And, of course, a glance at a grammar for any particular language will reveal dozens or hundreds of idiosyncrasies.
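The basic-order contrast in the list above (SVO, SOV, VSO) can be made concrete with a toy sketch. The names `ORDERS` and `linearize` are invented here for illustration; the code only reshuffles three labelled words and makes no claim about how any real grammar is represented.

```python
# Illustrative only: a toy linearizer for the three basic orders named in the text.
ORDERS = {
    "SVO": ("subject", "verb", "object"),   # English
    "SOV": ("subject", "object", "verb"),   # Japanese
    "VSO": ("verb", "subject", "object"),   # Modern Irish
}

def linearize(clause, order):
    """Arrange a clause's three role-players in the given basic order."""
    return " ".join(clause[role] for role in ORDERS[order])

clause = {"subject": "dog", "verb": "bites", "object": "man"}
for name in ORDERS:
    print(name, "->", linearize(clause, name))
```

The same three role-players appear in every output; only their arrangement changes.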
On the other hand, one can also hear striking universals through the babble. In 1963 the linguist Joseph Greenberg examined a sample of 30 far-flung languages from five continents, including Serbian, Italian, Basque, Finnish, Swahili, Nubian, Maasai, Berber, Turkish, Hebrew, Hindi, Japanese, Burmese, Malay, Maori, Mayan, and Quechua (a descendant of the language of the Incas). Greenberg was not working in the Chomskyan school; he just wanted to see if any interesting properties of grammar could be found in all these languages. In his first investigation, which focused on the order of words and morphemes, he found no fewer than forty-five universals.
Since then, many other surveys have been conducted, involving scores of languages from every part of the world, and literally hundreds of universal patterns have been documented. Some hold absolutely. For example, no language forms questions by reversing the order of words within a sentence, like Built Jack that house the this is? Some are statistical: subjects normally precede objects in almost all languages, and verbs and their objects tend to be adjacent. Thus most languages have SVO or SOV order; fewer have VSO; VOS and OVS are rare (less than 1%); and OSV may be nonexistent (there are a few candidates, but not all linguists agree that they are OSV). The largest number of universals involve implications: if a language has X, it will also have Y. We came across a typical example of an implicational universal in Chapter 4: if the basic order of a language is SOV, it will usually have question words at the end of the sentence, and postpositions; if it is SVO, it will have question words at the beginning, and prepositions. Universal implications are found in all aspects of language, from phonology (for instance, if a language has nasal vowels, it will have non-nasal vowels) to word meanings (if a language has a word for “purple,” it will have a word for “red”; if a language has a word for “leg,” it will have a word for “arm”).
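An implicational universal of the form "if a language has X, it will also have Y" can be checked mechanically against a sample. The sketch below is a minimal illustration, assuming a tiny hand-entered table; `Language`, `SAMPLE`, and `exceptions` are hypothetical names, and four languages are nowhere near a real typological survey.

```python
from typing import NamedTuple

class Language(NamedTuple):
    name: str
    order: str        # basic order, e.g. "SVO" or "SOV"
    adposition: str   # "preposition" or "postposition"

# Hand-entered toy sample (an assumption for illustration, not survey data).
SAMPLE = [
    Language("English", "SVO", "preposition"),
    Language("Japanese", "SOV", "postposition"),
    Language("Turkish", "SOV", "postposition"),
    Language("Hindi", "SOV", "postposition"),
]

def exceptions(languages, if_order, then_adposition):
    """Return names of languages that have X (the order) but lack Y (the adposition type)."""
    return [lang.name for lang in languages
            if lang.order == if_order and lang.adposition != then_adposition]

print(exceptions(SAMPLE, "SOV", "postposition"))
print(exceptions(SAMPLE, "SVO", "preposition"))
```

A language that lacks X can never count as an exception, so the check says nothing about any one language taken alone; it only becomes informative across a sample.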
If lists of universals show that languages do not vary freely, do they imply that languages are restricted by the structure of the brain? Not directly. First one must rule out two alternative explanations.
One possibility is that language originated only once, and all existing languages are the descendants of that proto-language and retain some of its features. These features would be similar across the languages for the same reason that alphabetical order is similar across the Hebrew, Greek, Roman, and Cyrillic alphabets. There is nothing special about alphabetical order; it was just the order that the Canaanites invented, and all Western alphabets came from theirs. No linguist accepts this as an explanation for language universals. For one thing, there can be radical breaks in language transmission across the generations, the most extreme being creolization, but universals hold of all languages including creoles. Moreover, simple logic shows that a universal implication, like “If a language has SVO order, then it has prepositions, but if it has SOV order, then it has postpositions,” cannot be transmitted from parent to child the way words are. An implication, by its very logic, is not a fact about English: children could learn that English is SVO and has prepositions, but nothing could show them that if a language is SVO, then it must have prepositions. A universal implication is a fact about all languages, visible only from the vantage point of a comparative linguist. If a language changes from SOV to SVO over the course of history and its postpositions flip to prepositions, there has to be some explanation of what keeps these two developments in sync.
Also, if universals were simply what is passed down through the generations, we would expect that the major differences between kinds of language should correlate with the branches of the linguistic family tree, just as the difference between two cultures generally correlates with how long ago they separated. As humanity’s original language differentiated over time, some branches might become SOV and others SVO; within each of these branches some might have agglutinated words, others isolated words. But this is not so. Beyond a time depth of about a thousand years, history and typology often do not correlate well at all. Languages can change from grammatical type to type relatively quickly, and can cycle among a few types over and over; aside from vocabulary, they do not progressively differentiate and diverge. For example, English has changed from a free-word-order, highly inflected, topic-prominent language, as its sister German remains to this day, to a fixed-word-order, poorly inflected, subject-prominent language, all in less than a millennium. Many language families contain close to the full gamut of variations seen across the world in particular aspects of grammar. The absence of a strong correlation between the grammatical properties of languages and their place in the family tree of languages suggests that language universals are not just the properties that happen to have survived from the hypothetical mother of all languages.
The second counterexplanation that one must rule out before attributing a universal of language to a universal language instinct is that languages might reflect universals of thought or of mental information processing that are not specific to language. As we saw in Chapter 3, universals of color vocabulary probably come from universals of color vision. Perhaps subjects precede objects because the subject of an action verb denotes the causal agent (as in Dog bites man); putting the subject first mirrors the cause coming before the effect. Perhaps head-first or head-last ordering is consistent across all the phrases in a language because it enforces a consistent branching direction, right or left, in the language’s phrase structure trees, avoiding difficult-to-understand onion constructions. For example, Japanese is SOV and has modifiers to the left; this gives it constructions like “modifier-S O V” with the modifier on the outside rather than “S-modifier O V” with the modifier embedded inside.
But these functional explanations are often tenuous, and for many universals they do not work at all. For example, Greenberg noted that if a language has both derivational suffixes (which create new words from old ones) and inflectional suffixes (which modify a word to fit its role in the sentence), then the derivational suffixes are always closer to the stem than the inflectional ones. In Chapter 5 we saw this principle in English in the difference between the grammatical Darwinisms and the ungrammatical Darwinsism. It is hard to think of how this law could be a consequence of any universal principle of thought or memory: why would the concept of two ideologies based on one Darwin be thinkable, but the concept of one ideology based on two Darwins (say, Charles and Erasmus) not be thinkable (unless one reasons in a circle and declares that the mind must find -ism to be more cognitively basic than the plural, because that’s the order we see in language)? And remember Peter Gordon’s experiments showing that children say mice-eater but never rats-eater, despite the conceptual similarity of rats and mice and despite the absence of either kind of compound in parents’ speech. His results corroborate the suggestion that this particular universal is caused by the way that morphological rules are computed in the brain, with inflection applying to the products of derivation but not vice versa.
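The ordering described above, with inflection applying to the products of derivation but not vice versa, can be pictured as a two-stage pipeline. This is a toy sketch under loud assumptions: `derive`, `inflect_plural`, and `build_word` are invented names, the "rules" are bare string concatenation, and nothing here is a claim about how the brain actually computes morphology.

```python
def derive(stem, suffix):
    """Derivation: make a new word from an old one (toy concatenation)."""
    return stem + suffix

def inflect_plural(word):
    """Inflection: mark a finished word as plural (toy rule, regular -s only)."""
    return word + "s"

def build_word(stem, derivational=(), plural=False):
    """Apply all derivational suffixes first, then at most one inflection."""
    word = stem
    for suffix in derivational:
        word = derive(word, suffix)
    if plural:
        word = inflect_plural(word)  # runs only after derivation is complete
    return word

print(build_word("Darwin", ["ism"], plural=True))
print(build_word("Darwin", ["ian", "ism"], plural=True))
```

Because the plural step runs only on the output of the derivational loop, the pipeline can produce Darwinisms but has no path that yields Darwinsism.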
In any case, Greenbergisms are not the best place to look for a neurologically given Universal Grammar that existed before Babel. It is the organization of grammar as a whole, not some laundry list of facts, that we should be looking at. Arguing about the possible causes of something like SVO order misses the forest for the trees. What is most striking of all is that we can look at a randomly picked language and find things that can sensibly be called subjects, objects, and verbs to begin with. After all, if we were asked to look for the order of subject, object, and verb in musical notation, or in the computer programming language FORTRAN, or in Morse code, or in arithmetic, we would protest that the very idea is nonsensical. It would be like assembling a representative collection of the world’s cultures from the six continents and trying to survey the colors of their hockey team jerseys or the form of their harakiri rituals. We should be impressed, first and foremost, that research on universals of grammar is even possible!
When linguists claim to find the same kinds of linguistic gadgets in language after language, it is not just because they expect languages to have subjects and so they label as a “subject” the first kind of phrase they see that resembles an English subject. Rather, if a linguist examining a language for the first time calls a phrase a “subject” using one criterion based on English subjects—say, denoting the agent role of action verbs—the linguist soon discovers that other criteria, like agreeing with the verb in person and number and occurring before the object, will be true of that phrase as well. It is these correlations among the properties of a linguistic thingamabob across languages that make it scientifically meaningful to talk about subjects and objects and nouns and verbs and auxiliaries and inflections—and not just Word Class #2,783 and Word Class #1,491—in languages from Abaza to Zyrian.
Chomsky’s claim that from a Martian’s-eye-view all humans speak a single language is based on the discovery that the same symbol-manipulating machinery, without exception, underlies the world’s languages. Linguists have long known that the basic design features of language are found everywhere. Many were documented in 1960 by the non-Chomskyan linguist C. F. Hockett in a comparison between human languages and animal communication systems (Hockett was not acquainted with Martian). Languages use the mouth-to-ear channel as long as the users have intact hearing (manual and facial gestures, of course, are the substitute channel used by the deaf). A common grammatical code, neutral between production and comprehension, allows speakers to produce any linguistic message they can understand, and vice versa. Words have stable meanings, linked to them by arbitrary convention. Speech sounds are treated discontinuously; a sound that is acoustically halfway between bat and pat does not mean something halfway between batting and patting. Languages can convey meanings that are abstract and remote in time or space from the speaker. Linguistic forms are infinite in number, because they are created by a discrete combinatorial system. Languages all show a duality of patterning in which one rule system is used to order phonemes within morphemes, independent of meaning, and another is used to order morphemes within words and phrases, specifying their meaning.
Chomskyan linguistics, in combination with Greenbergian surveys, allows us to go well beyond this basic spec sheet. It is safe to say that the grammatical machinery we used for English in Chapters 4–6 is used in all the world’s languages. All languages have a vocabulary in the thousands or tens of thousands, sorted into part-of-speech categories including noun and verb. Words are organized into phrases according to the X-bar system (nouns are found inside N-bars, which are found inside noun phrases, and so on). The higher levels of phrase structure include auxiliaries (INFL), which signify tense, modality, aspect, and negation. Nouns are marked for case and assigned semantic roles by the mental dictionary entry of the verb or other predicate. Phrases can be moved from their deep-structure positions, leaving a gap or “trace,” by a structure-dependent movement rule, thereby forming questions, relative clauses, passives, and other widespread constructions. New word structures can be created and modified by derivational and inflectional rules. Inflectional rules primarily mark nouns for case and number, and mark verbs for tense, aspect, mood, voice, negation, and agreement with subjects and objects in number, gender, and person. The phonological forms of words are defined by metrical and syllable trees and separate tiers of features like voicing, tone, and manner and place of articulation, and are subsequently adjusted by ordered phonological rules. Though many of these arrangements are in some sense useful, their details, found in language after language but not in any artificial system like FORTRAN or musical notation, give a strong impression that a Universal Grammar, not reducible to history or cognition, underlies the human language instinct.
God did not have to do much to confound the language of Noah’s descendants. In addition to vocabulary—whether the word for “mouse” is mouse or souris—a few properties of language are simply not specified in Universal Grammar and can vary as parameters. For example, it is up to each language to choose whether the order of elements within a phrase is head-first or head-last (eat sushi and to Chicago versus sushi eat and Chicago to) and whether a subject is mandatory in all sentences or can be omitted when the speaker desires. Furthermore, a particular grammatical widget often does a great deal of important work in one language and hums away unobtrusively in the corner of another. The overall impression is that Universal Grammar is like an archetypal body plan found across vast numbers of animals in a phylum. For example, among all the amphibians, reptiles, birds, and mammals, there is a common body architecture, with a segmented backbone, four jointed limbs, a tail, a skull, and so on. The various parts can be grotesquely distorted or stunted across animals: a bat’s wing is a hand, a horse trots on its middle toes, whales’ forelimbs have become flippers and their hindlimbs have shrunken to invisible nubs, and the tiny hammer, anvil, and stirrup of the mammalian middle ear are jaw parts of reptiles. But from newts to elephants, a common topology of the body plan—the shin bone connected to the thigh bone, the thigh bone connected to the hip bone—can be discerned. Many of the differences are caused by minor variations in the relative timing and rate of growth of the parts during embryonic development. Differences among languages are similar. There seems to be a common plan of syntactic, morphological, and phonological rules and principles, with a small set of varying parameters, like a checklist of options. Once set, a parameter can have far-reaching changes on the superficial appearance of the language.
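The idea of a single parameter with far-reaching surface effects can be sketched with the head-first versus head-last choice mentioned above. The function `build_phrase` and its `head_first` flag are assumptions made up for this illustration; real phrase structure involves far more than joining two strings.

```python
def build_phrase(head, complement, head_first):
    """Order a head and its complement according to one parameter setting."""
    return f"{head} {complement}" if head_first else f"{complement} {head}"

# The same rule serves a verb phrase and a prepositional phrase.
phrases = [("eat", "sushi"), ("to", "Chicago")]

for head_first in (True, False):
    print([build_phrase(head, comp, head_first) for head, comp in phrases])
```

One flag flips every phrase type at once, giving eat sushi and to Chicago under one setting and sushi eat and Chicago to under the other.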
If there is a single plan just beneath the surfaces of the world’s languages, then any basic property of one language should be found in all the others. Let’s reexamine the six supposedly un-English language traits that opened the chapter. A closer look shows that all of them can be found right here in English, and that the supposedly distinctive traits of English can be found in the other languages.
English, like the inflecting languages it supposedly differs from, has an agreement marker, the third person singular -s in He walks. It also has case distinctions in the pronouns, such as he versus him. And like agglutinating languages, it has machinery that can glue many bits together into a long word, like the derivational rules and affixes that create sensationalization and Darwinianisms. Chinese is supposed to be an even more extreme example of an isolating language than English, but it, too, contains rules that create multipart words such as compounds and derivatives.