The Language Instinct: How the Mind Creates Language


Since a word is a pure symbol, the relation between its sound and its meaning is utterly arbitrary. As Shakespeare (using a mere tenth of a percent of his written lexicon and a far tinier fraction of his mental one) put it,

What’s in a name? that which we call a rose

By any other name would smell as sweet.

Because of that arbitrariness, there is no hope that mnemonic tricks might lighten the memorization burden, at least for words that are not built out of other words. Babies should not, and apparently do not, expect cattle to mean something similar to battle, or singing to be like stinging, or coats to resemble goats. Onomatopoeia, where it is found, is of no help, because it is almost as conventional as any other word sound. In English, pigs go “oink”; in Japanese, they go “boo-boo.” Even in sign languages the mimetic abilities of the hands are put aside and their configurations are treated as arbitrary symbols. Residues of resemblance between a sign and its referent can occasionally be discerned, but like onomatopoeia they are so much in the eye or ear of the beholder that they are of little use in learning. In American Sign Language the sign for “tree” is a motion of a hand as if it was a branch waving in the wind; in Chinese Sign Language “tree” is indicated by the motion of sketching a tree trunk.

The psychologist Laura Ann Petitto has a startling demonstration that the arbitrariness of the relation between a symbol and its meaning is deeply entrenched in the child’s mind. Shortly before they turn two, English-speaking children learn the pronouns you and me. Often they reverse them, using you to refer to themselves. The error is forgivable. You and me are “deictic” pronouns, whose referent shifts with the speaker: you refers to you when I use it but to me when you use it. So children may need some time to get that down. After all, Jessica hears her mother refer to her, Jessica, using you; why should she not think that you means “Jessica”?

Now, in ASL the sign for “me” is a point to one’s chest; the sign for “you” is a point to one’s partner. What could be more transparent? One would expect that using “you” and “me” in ASL would be as foolproof as knowing how to point, which all babies, deaf and hearing, do before their first birthday. But for the deaf children Petitto studied, pointing is not pointing. The children used the sign of pointing to their conversational partners to mean “me” at exactly the age at which hearing children use the spoken sound you to mean “me.” The children were treating the gesture as a pure linguistic symbol; the fact that it pointed somewhere did not register as being relevant. This attitude is appropriate in learning sign languages; in ASL, the pointing hand-shape is like a meaningless consonant or vowel, found as a component of many other signs, like “candy” and “ugly.”

There is one more reason we should stand in awe of the simple act of learning a word. The logician W. V. O. Quine asks us to imagine a linguist studying a newly discovered tribe. A rabbit scurries by, and a native shouts, “Gavagai!” What does gavagai mean? Logically speaking, it needn’t be “rabbit.” It could refer to that particular rabbit (Flopsy, for example). It could mean any furry thing, any mammal, or any member of that species of rabbit (say, Oryctolagus cuniculus), or any member of that variety of that species (say, chinchilla rabbit). It could mean scurrying rabbit, scurrying thing, rabbit plus the ground it scurries upon, or scurrying in general. It could mean footprint-maker, or habitat for rabbit-fleas. It could mean the top half of a rabbit, or rabbit-meat-on-the-hoof, or possessor of at least one rabbit’s foot. It could mean anything that is either a rabbit or a Buick. It could mean collection of undetached rabbit parts, or “Lo! Rabbithood again!,” or “It rabbiteth,” analogous to “It raineth.”

The problem is the same when the child is the linguist and the parents are the natives. Somehow a baby must intuit the correct meaning of a word and avoid the mind-boggling number of logically impeccable alternatives. It is an example of a more general problem that Quine calls “the scandal of induction,” which applies to scientists and children alike: how can they be so successful at observing a finite set of events and making some correct generalization about all future events of that sort, rejecting an infinite number of false generalizations that are also consistent with the original observations?

We all get away with induction because we are not open-minded logicians but happily blinkered humans, innately constrained to make only certain kinds of guesses—the probably correct kinds—about how the world and its occupants work. Let’s say the word-learning baby has a brain that carves the world into discrete, bounded, cohesive objects and into the actions they undergo, and that the baby forms mental categories that lump together objects that are of the same kind. Let’s also say that babies are designed to expect a language to contain words for kinds of objects and words for kinds of actions—nouns and verbs, more or less. Then the undetached rabbit parts, rabbit-trod ground, intermittent rabbiting, and other accurate descriptions of the scene will, fortunately, not occur to them as possible meanings of gavagai.

But could there really be a preordained harmony between the child’s mind and the parent’s? Many thinkers, from the woolliest mystics to the sharpest logicians, united only in their assault on common sense, have claimed that the distinction between an object and an action is not in the world or even in our minds, initially, but is imposed on us by our language’s distinction between nouns and verbs. And if it is the word that delineates the thing and the act, it cannot be the concepts of thing and act that allow for the learning of the word.

I think common sense wins this one. In an important sense, there really are things and kinds of things and actions out there in the world, and our mind is designed to find them and to label them with words. That important sense is Darwin’s. It’s a jungle out there, and the organism designed to make successful predictions about what is going to happen next will leave behind more babies designed just like it. Slicing space-time into objects and actions is an eminently sensible way to make predictions given the way the world is put together. Conceiving of an extent of solid matter as a thing—that is, giving a single mentalese name to all of its parts—invites the prediction that those parts will continue to occupy some region of space and will move as a unit. And for many portions of the world, that prediction is correct. Look away, and the rabbit still exists; lift the rabbit by the scruff of the neck, and the rabbit’s foot and the rabbit ears come along for the ride.

What about kinds of things, or categories? Isn’t it true that no two individuals are exactly alike? Yes, but they are not arbitrary collections of properties, either. Things that have long furry ears and tails like pom-poms also tend to eat carrots, scurry into burrows, and breed like, well, rabbits. Lumping objects into categories—giving them a category label in mentalese—allows one, when viewing an entity, to infer some of the properties one cannot directly observe, using the properties one can observe. If Flopsy has long furry ears, he is a “rabbit”; if he is a rabbit, he might scurry into a burrow and quickly make more rabbits.

Moreover, it pays to give objects several labels in mentalese, designating different-sized categories like “cottontail rabbit,” “rabbit,” “mammal,” “animal,” and “living thing.” There is a tradeoff involved in choosing one category over another. It takes less effort to determine that Peter Cottontail is an animal than that he is a cottontail (for example, an animallike motion will suffice for us to recognize that he is an animal, leaving it open whether or not he is a cottontail). But we can predict more new things about Peter if we know he is a cottontail than if we merely know he is an animal. If he is a cottontail, he likes carrots and inhabits open country or woodland clearings; if he is merely an animal, he could eat anything and live anywhere, for all one knows. The middle-sized or “basic-level” category “rabbit” represents a compromise between how easy it is to label something and how much good the label does you.

Finally, why separate the rabbit from the scurry? Presumably because there are predictable consequences of rabbithood that cut across whether it is scurrying, eating, or sleeping: make a loud sound, and in all cases it will be down a hole lickety-split. The consequences of making a loud noise in the presence of lionhood, whether eating or sleeping, are predictably different, and that is a difference that makes a difference. Likewise, scurrying has certain consequences regardless of who is doing it; whether it be rabbit or lion, a scurrier does not remain in the same place for long. With sleeping, a silent approach will generally work to keep a sleeper—rabbit or lion—motionless. Therefore a powerful prognosticator should have separate sets of mental labels for kinds of objects and kinds of actions. That way, it does not have to learn separately what happens when a rabbit scurries, what happens when a lion scurries, what happens when a rabbit sleeps, what happens when a lion sleeps, what happens when a gazelle scurries, what happens when a gazelle sleeps, and on and on; knowing about rabbits and lions and gazelles in general, and scurrying and sleeping in general, will suffice. With m objects and n actions, a knower needn’t go through m × n learning experiences; it can get away with m + n of them.
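The arithmetic of m + n versus m × n can be made concrete with a small sketch. It is only an illustration, not a model of how a mind stores anything: the dictionaries, the function names, and the particular "facts" (especially those for the lion and the gazelle) are made-up placeholders loosely based on the examples above.

```python
from itertools import product

# Hypothetical facts, learned once per kind of object (m entries).
object_facts = {
    "rabbit": "bolts down a hole at a loud noise",
    "lion": "reacts quite differently to a loud noise",
    "gazelle": "runs off at a loud noise",
}

# Hypothetical facts, learned once per kind of action (n entries).
action_facts = {
    "scurrying": "will not stay in the same place for long",
    "sleeping": "stays motionless if approached silently",
}


def predict(obj, action):
    """Compose a prediction for any pairing from the two separate stores."""
    return f"A {action} {obj} {object_facts[obj]} and {action_facts[action]}."


def learning_cost(m, n):
    """Return (separate labels, one lesson per pairing) for m objects, n actions."""
    return m + n, m * n


# Every object-action pairing is covered without a dedicated lesson for it.
pairings = list(product(object_facts, action_facts))
for obj, action in pairings:
    print(predict(obj, action))

print(learning_cost(len(object_facts), len(action_facts)))
print(learning_cost(100, 50))
```

With three objects and two actions the saving is slight, five lessons instead of six, but it grows quickly: for a hundred kinds of objects and fifty kinds of actions, the separate labels need 150 lessons where the pairings would need 5,000.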

So even a wordless thinker does well to chop continuously flowing experience into things, kinds of things, and actions (not to mention places, paths, events, states, kinds of stuff, properties, and other types of concepts). Indeed, experimental studies of baby cognition have shown that infants have the concept of an object before they learn any words for objects, just as we would expect. Well before their first birthday, when first words appear, babies seem to keep track of the bits of stuff that we would call objects: they show surprise if the parts of an object suddenly go their own ways, or if the object magically appears or disappears, passes through another solid object, or hovers in the air without visible means of support.

Attaching words to these concepts, of course, allows one to share one’s hard-won discoveries and insights about the world with the less experienced or the less observant. Figuring out which word to attach to which concept is the gavagai problem, and if infants start out with concepts corresponding to the kinds of meanings that languages use, the problem is partly solved. Laboratory studies confirm that young children assume that certain kinds of concepts get certain types of words, and other kinds of concepts cannot be the meaning of a word at all. The developmental psychologists Ellen Markman and Jeanne Hutchinson gave two- and three-year-old children a set of pictures, and for each picture asked them to “find another one that is the same as this.” Children are intrigued by objects that interact, and when faced with these instructions they tend to select pictures that make groups of role-players like a blue jay and a nest or a dog and a bone. But when Markman and Hutchinson told them to “find another dax that is the same as this dax,” the children’s criterion shifted. A word must label a kind of thing, they seemed to be reasoning, so they put together a bird with another type of bird, a dog with another type of dog. For a child, a dax simply cannot mean “a dog or its bone,” interesting though the combination may be.

Of course, more than one word can be applied to a thing: Peter Cottontail is not only a rabbit but an animal and a cottontail. Children have a bias to interpret nouns as middle-level kinds of objects like “rabbit,” but they also must overcome that bias, to learn other types of words like animal. Children seem to manage this by being in sync with a striking feature of language. Though most common words have many meanings, few meanings have more than one word. That is, homonyms are plentiful, synonyms rare. (Virtually all supposed synonyms have some difference in meaning, however small. For example, skinny and slim differ in their connotation of desirability; policeman and cop differ in formality.) No one really knows why languages are so stingy with words and profligate with meanings, but children seem to expect it (or perhaps it is this expectation that causes it!), and that helps them further with the gavagai problem. If a child already knows a word for a kind of thing, then when another word is used for it, he or she does not take the easy but wrong way and treat it as a synonym. Instead, the child tries out some other possible concept. For example, Markman found that if you show a child a pair of pewter tongs and call it biff, the child interprets biff as meaning tongs in general, showing the usual bias for middle-level objects, so when asked for “more biffs,” the child picks out a pair of plastic tongs. But if you show the child a pewter cup and call it biff, the child does not interpret biff as meaning “cup,” because most children already know a word that means “cup,” namely, cup. Loathing synonyms, the children guess that biff must mean something else, and the stuff the cup is made of is the next most readily available concept. When asked for more biffs, the child chooses a pewter spoon or pewter tongs.

Many other ingenious studies have shown how children home in on the correct meanings for different kinds of words. Once children know some syntax, they can use it to sort out different kinds of meaning. For example, the psychologist Roger Brown showed children a picture of hands kneading a mass of little squares in a bowl. If he asked them, “Can you see any sibbing?,” the children pointed to the hands. If instead he asked them, “Can you see a sib?,” they pointed to the bowl. And if he asked, “Can you see any sib?,” they pointed to the stuff inside the bowl. Other experiments have uncovered great sophistication in children’s understanding of how classes of words fit into sentence structures and how they relate to concepts and kinds.

So what’s in a name? The answer, we have seen, is, a great deal. In the sense of a morphological product, a name is an intricate structure, elegantly assembled by layers of rules and lawful even at its quirkiest. And in the sense of a listeme, a name is a pure symbol, part of a cast of thousands, rapidly acquired because of a harmony between the mind of the child, the mind of the adult, and the texture of reality.

The Sounds of Silence

When I was a student I worked in a laboratory at McGill University that studied auditory perception. Using a computer, I would synthesize trains of overlapping tones and determine whether they sounded like one rich sound or two pure ones. One Monday morning I had an odd experience: the tones suddenly turned into a chorus of screaming munchkins. Like this: (beep boop-boop) (beep boop-boop) (beep boop-boop) HUMPTY-DUMPTY-HUMPTY-DUMPTY-HUMPTY-DUMPTY (beep boop-boop) (beep boop-boop) HUMPTY-DUMPTY-HUMPTY-DUMPTY-HUMPTY-HUMPTY-DUMPTY-DUMPTY (beep boop-boop) (beep boop-boop) (beep boop-boop) HUMPTY-DUMPTY (beep boop-boop) HUMPTY-HUMPTY-HUMPTY-DUMPTY (beep boop-boop). I checked the oscilloscope: two streams of tones, as programmed. The effect had to be perceptual. With a bit of effort I could go back and forth, hearing the sound as either beeps or munchkins. When a fellow student entered, I recounted my discovery, mentioning that I couldn’t wait to tell Professor Bregman, who directed the laboratory. She offered some advice: don’t tell anyone, except perhaps Professor Poser (who directed the psychopathology program).

Years later I discovered what I had discovered. The psychologists Robert Remez, David Pisoni, and their colleagues, braver men than I am, published an article in Science on “sine-wave speech.” They synthesized three simultaneous wavering tones. Physically, the sound was nothing at all like speech, but the tones followed the same contours as the bands of energy in the sentence “Where were you a year ago?” Volunteers described what they heard as “science fiction sounds” or “computer bleeps.” A second group of volunteers was told that the sounds had been generated by a bad speech synthesizer. They were able to make out many of the words, and a quarter of them could write down the sentence perfectly. The brain can hear speech content in sounds that have only the remotest resemblance to speech. Indeed, sine-wave speech is how mynah birds fool us. They have a valve on each bronchial tube and can control them independently, producing two wavering tones which we hear as speech.
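For readers who want a feel for how little there is in such a signal, here is a minimal sketch of three simultaneous wavering tones added together. It is not the researchers' procedure: the sample rate, the function name, and above all the three frequency contours are invented for illustration, whereas the real stimuli traced contours measured from a recorded sentence.

```python
import math

SAMPLE_RATE = 8000  # samples per second (an assumed value)


def sine_wave_tones(tracks, duration, sample_rate=SAMPLE_RATE):
    """Sum one pure tone per track; each track maps time in seconds to Hz."""
    n_samples = int(duration * sample_rate)
    phases = [0.0] * len(tracks)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        total = 0.0
        for k, track in enumerate(tracks):
            # Accumulate phase so that each tone glides smoothly in pitch.
            phases[k] += 2 * math.pi * track(t) / sample_rate
            total += math.sin(phases[k])
        samples.append(total / len(tracks))  # keep the sum within [-1, 1]
    return samples


# Hypothetical contours: three slowly wavering tones, not real speech data.
tracks = [
    lambda t: 500 + 200 * math.sin(2 * math.pi * 1.5 * t),
    lambda t: 1500 + 400 * math.sin(2 * math.pi * 1.0 * t),
    lambda t: 2500 + 300 * math.sin(2 * math.pi * 0.5 * t),
]

signal = sine_wave_tones(tracks, duration=1.0)
print(len(signal), max(abs(s) for s in signal) <= 1.0)
```

Played through a loudspeaker, made-up contours like these would sound like nothing but whistles. The effect described above depends on the tones following the contours of a real utterance.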

Our brains can flip between hearing something as a bleep and hearing it as a word because phonetic perception is like a sixth sense. When we listen to speech the actual sounds go in one ear and out the other; what we perceive is language. Our experience of words and syllables, of the “b”-ness of b and the “ee”-ness of ee, is as separable from our experience of pitch and loudness as lyrics are from a score. Sometimes, as in sine-wave speech, the senses of hearing and phonetics compete over which gets to interpret a sound, and our perception jumps back and forth. Sometimes the two senses simultaneously interpret a single sound. If one takes a tape recording of da, electronically removes the initial chirplike portion that distinguishes the da from ga and ka, and plays the chirp to one ear and the residue to the other, what people hear is a chirp in one ear and da in the other—a single clip of sound is perceived simultaneously as d-ness and a chirp. And sometimes phonetic perception can transcend the auditory channel. If you watch an English-subtitled movie in a language you know poorly, after a few minutes you may feel as if you are actually understanding the speech. In the laboratory, researchers can dub a speech sound like ga onto a close-up video of a mouth articulating va, ba, tha, or da. Viewers literally hear a consonant like the one they see the mouth making—an astonishing illusion with the pleasing name “McGurk effect,” after one of its discoverers.

Actually, one does not need electronic wizardry to create a speech illusion. All speech is an illusion. We hear speech as a string of separate words, but unlike the tree falling in the forest with no one to hear it, a word boundary with no one to hear it has no sound. In the speech sound wave, one word runs into the next seamlessly; there are no little silences between spoken words the way there are white spaces between written words. We simply hallucinate word boundaries when we reach the edge of a stretch of sound that matches some entry in our mental dictionary. This becomes apparent when we listen to speech in a foreign language: it is impossible to tell where one word ends and the next begins. The seamlessness of speech is also apparent in “oronyms,” strings of sound that can be carved into words in two different ways:
