The Language Instinct: How the Mind Creates Language
What about flying out? To the baseball cognoscenti, it is not directly based on the familiar verb to fly (“to proceed through the air”) but on the noun a fly (“a ball hit on a conspicuously parabolic trajectory”). To fly out means “to make an out by hitting a fly that gets caught.” The noun a fly, of course, itself came from the verb to fly. The word-within-a-word-within-a-word structure can be seen in this bamboo-like tree:
Since the whole word, represented by its topmost label, is a verb, but the element it is made out of one level down is a noun, to fly out, like low-life, must be headless—if the noun fly were its head, fly out would have to be a noun, too, which it is not. Lacking a head and its associated data pipeline, the irregular forms of the original verb to fly, namely flew and flown, are trapped at the bottommost level and cannot bubble up to attach to the whole word. The regular -ed rule rushes in in its usual role as the last resort, and thus we say that Wade Boggs flied out. What kills the irregularity of to fly out, then, is not its specialized meaning, but its being a verb based on a word that is not a verb. By the same logic, we say They ringed the city with artillery (“formed a ring around it”), not They rang the city with artillery, and He grandstanded to the crowd (“played to the grandstand”), not He grandstood to the crowd.
This principle works every time. Remember Sally Ride, the astronaut? She received a lot of publicity because she was America’s first woman in space. But recently Mae Jemison did her one better. Not only is Jemison America’s first black woman in space, but she appeared in People magazine in 1993 in their list of the fifty most beautiful people in the world. Publicity-wise, she has out-Sally-Rided Sally Ride (not has out-Sally-Ridden Sally Ride). For many years New York State’s most infamous prison was Sing Sing. But since the riot at the Attica Correctional Facility in 1971, Attica has become even more infamous: it has out-Sing-Singed Sing Sing (not has out-Sing-Sung Sing Sing).
As for the Maple Leafs, the noun being pluralized is not leaf, the unit of foliage, but a noun based on the name Maple Leaf, Canada’s national symbol. A name is not the same thing as a noun. (For example, whereas a noun may be preceded by an article like the, a name may not be: you cannot refer to someone as the Donald, unless you are Ivana Trump, whose first language is Czech.) Therefore, the noun a Maple Leaf (referring to, say, the goalie) must be headless, because it is a noun based on a word that is not a noun. And a noun that does not get its nounhood from one of its components cannot get an irregular plural from that component either; hence it defaults to the regular form Maple Leafs. This explanation also answers a question that kept bothering David Letterman throughout one of his recent Late Night shows: why is the new major league baseball team in Miami called the Florida Marlins rather than the Florida Marlin, given that those fish are referred to in the plural as marlin? Indeed, the explanation applies to all nouns based on names:
I’m sick of dealing with all the Mickey Mouses in this administration. [not Mickey Mice]
Hollywood has been relying on movies based on comic book heroes and their sequels, like the three Supermans and the two Batmans. [not Supermen and Batmen]
Why has the second half of the twentieth century produced no Thomas Manns? [not Thomas Menn]
We’re having Julia Child and her husband over for dinner tonight. You know, the Childs are great cooks. [not the Children]
Irregular forms, then, live at the bottom of word structure trees, where roots and stems from the mental dictionary are inserted. The developmental psycholinguist Peter Gordon has capitalized on this effect in an ingenious experiment that shows how children’s minds seem to be designed with the logic of word structure built in.
Gordon focused on a seeming oddity first noticed by the linguist Paul Kiparsky: compounds can be formed out of irregular plurals but not out of regular plurals. For example, a house infested with mice can be described as mice-infested, but it sounds awkward to describe a house infested with rats as rats-infested. We say that it is rat-infested, even though by definition one rat does not make an infestation. Similarly, there has been much talk about men-bashing but no talk about gays-bashing (only gay-bashing), and there are teethmarks, but no clawsmarks. Once there was a song about a purple-people-eater, but it would be ungrammatical to sing about a purple-babies-eater. Since the licit irregular plurals and the illicit regular plurals have similar meanings, it must be the grammar of irregularity that makes the difference.
The theory of word structure explains the effect easily. Irregular plurals, because they are quirky, have to be stored in the mental dictionary as roots or stems; they cannot be generated by a rule. Because of this storage, they can be fed into the compounding rule that joins an existing stem to another existing stem to yield a new stem. But regular plurals are not stems stored in the mental dictionary; they are complex words that are assembled on the fly by inflectional rules whenever they are needed. They are put together too late in the root-to-stem-to-word assembly process to be available to the compounding rule, whose inputs can only come out of the dictionary.
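The ordering argument can be illustrated with a toy sketch in Python. It is only an illustration under loud assumptions: the tiny LEXICON set and the function names regular_plural, compound, and can_compound are hypothetical and are not drawn from Gordon's or Kiparsky's work. The sketch shows the one point made above: the compounding rule accepts only stored stems, so a stored irregular plural can enter a compound while a rule-built regular plural cannot.

```python
# Toy model of the stored-versus-assembled distinction.
# Every name and every lexicon entry here is a hypothetical illustration.
LEXICON = {"mouse", "mice", "rat", "tooth", "teeth", "claw", "mark", "eater"}


def regular_plural(word: str) -> str:
    """Inflectional rule: applies late, and its output is never stored.

    Simply appending "s" is a deliberate simplification of English spelling.
    """
    return word + "s"


def compound(first: str, second: str) -> str:
    """Compounding rule: both inputs must come straight out of the lexicon."""
    for stem in (first, second):
        if stem not in LEXICON:
            raise ValueError(f"{stem!r} is not a stored stem")
    return f"{first}-{second}"


def can_compound(first: str, second: str) -> bool:
    """Report whether the compounding rule accepts this pair of inputs."""
    try:
        compound(first, second)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(compound("mice", "eater"))                     # stored irregular plural
    print(can_compound(regular_plural("rat"), "eater"))  # "rats" is not stored
    print(regular_plural(compound("rat", "eater")))      # inflection applies last
```

In this sketch the regular plural is still free to apply to the finished compound, which is why rat-eaters is fine even though rats-eater is not.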
Gordon found that three- to five-year-old children obey this restriction fastidiously. Showing the children a puppet, he first asked them, “Here is a monster who likes to eat mud. What do you call him?” He then gave them the answer, a mud-eater, to get them started. Children like to play along, and the more gruesome the meal, the more eagerly they fill in the blank, often to the dismay of their onlooking parents. The crucial parts came next. A “monster who likes to eat mice,” the children said, was a mice-eater. But a “monster who likes to eat rats” was never called a rats-eater, only a rat-eater. (Even the children who made the error mouses in their spontaneous speech never called the puppet a mouses-eater.) The children, in other words, respected the subtle restrictions on combining plurals and compounds inherent in the word structure rules. This suggests that the rules take the same form in the unconscious mind of the child as they do in the unconscious mind of the adult.
But the most interesting discovery came when Gordon examined how children might have acquired this constraint. Perhaps, he reasoned, they learned it from their parents by listening for whether the plurals that occur inside the parents’ compounds are irregular, regular, or both, and then duplicate whatever kinds of compounds they hear. This would be impossible, he discovered. Motherese just doesn’t have any compounds containing plurals. Most compounds are like toothbrush, with singular nouns inside them; compounds like mice-infested, though grammatically possible, are seldom used. The children produced mice-eater but never rats-eater, even though they had no evidence from adult speech that this is how languages work. We have another demonstration of knowledge despite “poverty of the input,” and it suggests that another basic aspect of grammar may be innate. Just as Crain and Nakayama’s Jabba experiment showed that in syntax children automatically distinguish between word strings and phrase structures, Gordon’s mice-eater experiment shows that in morphology children automatically distinguish between roots stored in the mental dictionary and inflected words created by a rule.
A word, in a word, is complicated. But then what in the world is a word? We have just seen that “words” can be built out of parts by morphological rules. But then what makes them different from phrases or sentences? Shouldn’t we reserve the word “word” for a thing that has to be rote-memorized, the arbitrary Saussurean sign that exemplifies the first of the two principles of how language works (the other being the discrete combinatorial system)? The puzzlement comes from the fact that the everyday word “word” is not scientifically precise. It can refer to two things.
The concept of a word that I have used so far in this chapter is a linguistic object that, even if built out of parts by the rules of morphology, behaves as the indivisible, smallest unit with respect to the rules of syntax—a “syntactic atom,” in atom’s original sense of something that cannot be split. The rules of syntax can look inside a sentence or phrase and cut and paste the smaller phrases inside it. For example, the rule for producing questions can look inside the sentence This monster eats mice and move the phrase corresponding to mice to the front, yielding What did this monster eat? But the rules of syntax halt at the boundary between a phrase and a word; even if the word is built out of parts, the rules cannot look “inside” the word and fiddle with those parts. For example, the question rule cannot look inside the word mice-eater in the sentence This monster is a mice-eater and move the morpheme corresponding to mice to the front; the resulting question is virtually unintelligible: What is this monster an -eater? (Answer: mice.) Similarly, the rules of syntax can stick an adverb inside a phrase, as in This monster eats mice quickly. But they cannot stick an adverb inside a word, as in This monster is a mice-quickly-eater. For these reasons, we say that words, even if they are generated out of parts by one set of rules, are not the same thing as phrases, which are generated out of parts by a different set of rules. Thus one precise sense of our everyday term “word” refers to the units of language that are the products of morphological rules, and which are unsplittable by syntactic rules.
The second, very different sense of “word” refers to a rote-memorized chunk: a string of linguistic stuff that is arbitrarily associated with a particular meaning, one item from the long list we call the mental dictionary. The grammarians Anna Maria Di Sciullo and Edwin Williams coined the term “listeme,” the unit of a memorized list, to refer to this sense of “word” (their term is a play on “morpheme,” the unit of morphology, and “phoneme,” the unit of sound). Note that a listeme need not coincide with the first precise sense of “word,” a syntactic atom. A listeme can be a tree branch any size, as long as it cannot be produced mechanically by rules and therefore has to be memorized. Take idioms. There is no way to predict the meaning of kick the bucket, buy the farm, spill the beans, bite the bullet, screw the pooch, give up the ghost, hit the fan, or go bananas from the meanings of their components using the usual rules of heads and role-players. Kicking the bucket is not a kind of kicking, and buckets have nothing to do with it. The meanings of these phrase-sized units have to be memorized as listemes, just as if they were simple word-sized units, and so they are really “words” in this second sense. Di Sciullo and Williams, speaking as grammatical chauvinists, describe the mental dictionary (lexicon) as follows: “If conceived of as the set of listemes, the lexicon is incredibly boring by its very nature…. The lexicon is like a prison—it contains only the lawless, and the only thing that its inmates have in common is their lawlessness.”
In the rest of this chapter I turn to the second sense of “word,” the listeme. It will be a kind of prison reform: I want to show that the lexicon, though a repository of lawless listemes, is deserving of respect and appreciation. What seems to a grammarian like an act of brute force incarceration—a child hears a parent use a word and thenceforth retains that word in memory—is actually an inspiring feat.
One extraordinary feature of the lexicon is the sheer capacity for memorization that goes into building it. How many words do you think an average person knows? If you are like most writers who have offered an opinion based on the number of words they hear or read, you might guess a few hundred for the uneducated, a few thousand for the literate, and as many as 15,000 for gifted wordsmiths like Shakespeare (that is how many distinct words are found in his collected plays and sonnets).
The real answer is very different. People can recognize vastly more words than they have occasion to use in some fixed period of time or space. To estimate the size of a person’s vocabulary—in the sense of memorized listemes, not morphological products, of course, because the latter are infinite—psychologists use the following method. Start with the largest unabridged dictionary available; the smaller the dictionary, the more words a person might know but not get credit for. Funk & Wagnall’s New Standard Unabridged Dictionary, to take an example, has 450,000 entries, a healthy number, but too many to test exhaustively. (At thirty seconds a word, eight hours a day, it would take more than a year to test a single person.) Instead, draw a sample—say, the third entry from the top of the first column on every eighth left-hand page. Entries often have many meanings, such as “hard: (1) firm; (2) difficult; (3) harsh; (4) toilsome…” and so on, but counting them would require making arbitrary decisions about how to lump or split the meanings. Thus it is practical only to estimate how many words a person has learned at least one meaning for, not how many meanings a person has learned altogether. The testee is presented with each word in the sample, and asked to choose the closest synonym from a set of alternatives. After a correction for guessing, the proportion correct is multiplied by the size of the dictionary, and that is an estimate of the person’s vocabulary size.
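The sampling procedure can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the psychologists' actual scoring procedure: the text does not say which correction for guessing is used, so the code assumes the common formula that subtracts wrong answers divided by one less than the number of choices. The function names and the sample figures (200 test words, 4 alternatives, 80 correct) are hypothetical.

```python
def exhaustive_test_days(dictionary_size: int,
                         seconds_per_word: float = 30.0,
                         hours_per_day: float = 8.0) -> float:
    """Days needed to test one person on every entry in the dictionary."""
    total_hours = dictionary_size * seconds_per_word / 3600.0
    return total_hours / hours_per_day


def estimate_vocabulary(num_correct: int,
                        num_tested: int,
                        num_choices: int,
                        dictionary_size: int) -> float:
    """Scale a guessing-corrected proportion up to the whole dictionary.

    Assumed correction: (right - wrong / (choices - 1)) / tested,
    floored at zero. The source does not specify the formula.
    """
    num_wrong = num_tested - num_correct
    corrected = (num_correct - num_wrong / (num_choices - 1)) / num_tested
    return max(corrected, 0.0) * dictionary_size


if __name__ == "__main__":
    # 450,000 entries at thirty seconds each, eight hours a day.
    print(exhaustive_test_days(450_000))
    # Hypothetical testee: 80 of 200 sampled words right, 4 alternatives each.
    print(estimate_vocabulary(80, 200, 4, 450_000))
```

With the figures from the text, exhaustive testing comes to about 469 eight-hour days, which is why a sample is drawn instead.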
Actually, another correction must be applied first. Dictionaries are consumer products, not scientific instruments, and for advertising purposes their editors often inflate the number of entries. (“Authoritative. Comprehensive. Over 1.7 million words of text and 160,000 definitions. Includes a 16-page full-color atlas.”) They do it by including compounds and affixed forms whose meanings are predictable from the meanings of their roots and the rules of morphology, and thus are not true listemes. For example, my desk dictionary includes, together with sail, the derivatives sailplane, sailer, sailless, sailing-boat, and sailcloth, whose meanings I could deduce even if I had never heard them before.
The most sophisticated estimate comes from the psychologists William Nagy and Richard Anderson. They began with a list of 227,553 different words. Of these, 45,453 were simple roots and stems. Of the remaining 182,100 derivatives and compounds, they estimated that all but 42,080 could be understood in context by someone who knew their components. Thus there were a total of 44,453 + 42,080 = 88,533 listeme words. By sampling from this list and testing the sample, Nagy and Anderson estimated that an average American high school graduate knows 45,000 words—three times as many as Shakespeare managed to use! Actually, this is an underestimate, because proper names, numbers, foreign words, acronyms, and many common undecomposable compounds were excluded. There is no need to follow the rules of Scrabble in estimating vocabulary size; these forms are all listemes, and a person should be given credit for them. If they had been included, the average high school graduate would probably be credited with something like 60,000 words (a tetrabard?), and superior students, because they read more, would probably merit a figure twice as high, an octobard.
Is 60,000 words a lot or a little? It helps to think of how quickly they must have been learned. Word learning generally begins around the age of twelve months. Therefore, high school graduates, who have been at it for about seventeen years, must have been learning an average of ten new words a day continuously since their first birthdays, or about a new word every ninety waking minutes. Using similar techniques, we can estimate that an average six-year-old commands about 13,000 words (notwithstanding those dull, dull Dick and Jane reading primers, which were based on ridiculously lowball estimates). A bit of arithmetic shows that preliterate children, who are limited to ambient speech, must be lexical vacuum cleaners, inhaling a new word every two waking hours, day in, day out. Remember that we are talking about listemes, each involving an arbitrary pairing. Think about having to memorize a new batting average or treaty date or phone number every ninety minutes of your waking life since you took your first steps. The brain seems to be reserving an especially capacious storage space and an especially rapid transcribing mechanism for the mental dictionary. Indeed, naturalistic studies by the psychologist Susan Carey have shown that if you casually slip a new color word like olive into a conversation with a three-year-old, the child will probably remember something about it five weeks later.
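The bit of arithmetic can be written out in Python. The vocabulary sizes and ages are the ones given above; the sixteen waking hours per day and the function names are assumptions made for this sketch, and a different waking-hours figure would shift the minutes-per-word results accordingly.

```python
DAYS_PER_YEAR = 365


def words_per_day(vocabulary: int, years_learning: float) -> float:
    """Average number of new words learned per day."""
    return vocabulary / (years_learning * DAYS_PER_YEAR)


def waking_minutes_per_word(vocabulary: int,
                            years_learning: float,
                            waking_hours_per_day: float = 16.0) -> float:
    """Average waking minutes between one new word and the next.

    The 16-hour waking day is an assumption, not a figure from the text.
    """
    waking_minutes = waking_hours_per_day * 60.0
    return waking_minutes / words_per_day(vocabulary, years_learning)


if __name__ == "__main__":
    # High school graduate: 60,000 words over about seventeen years.
    print(words_per_day(60_000, 17))
    print(waking_minutes_per_word(60_000, 17))
    # Six-year-old: 13,000 words over the five years since the first birthday.
    print(waking_minutes_per_word(13_000, 5) / 60.0)
```

Under these assumptions the graduate averages a little under ten words a day, or roughly one every hour and a half of waking time, and the six-year-old roughly one every two waking hours, in line with the rounded figures above.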
Now think of what goes into each act of memorization. A word is the quintessential symbol. Its power comes from the fact that every member of a linguistic community uses it interchangeably in speaking and understanding. If you use a word, then as long as it is not too obscure I can take it for granted that if I later utter it to a third party, he will understand my use of it the same way I understood yours. I do not have to try the word back on you to see how you react, or test it out on every third party and see how they react, or wait for you to use it with third parties. This sounds more obvious than it is. After all, if I observe that a bear snarls before it attacks, I cannot expect to scare a mosquito by snarling at it; if I bang a pot and the bear flees, I cannot expect the bear to bang a pot to scare hunters. Even within our species, learning a word from another person is not just a case of imitating that person’s behavior. Actions are tied to particular kinds of actors and targets of the action in ways that words are not. If a girl learns to flirt by watching her older sister, she does not flirt with the sister or with their parents but only with the kind of person that she observes to be directly affected by the sister’s behavior. Words, in contrast, are a universal currency within a community. In order to learn to use a word upon merely hearing it used by others, babies must tacitly assume that a word is not merely a person’s characteristic behavior in affecting the behavior of others, but a shared bidirectional symbol, available to convert meaning to sound by any person when the person speaks, and sound to meaning by any person when the person listens, according to the same code.