Judge: It has been a long day! Now I think I’m ready for some fun. The weather isn’t very pleasant today. Do you have any jokes that could lift the clouds?
Computer: Everybody talks about the weather but nobody seems to do much about it. Why do you need to tell me now you think you’re ready for some fun? Groannnnnn…
Judge: You’re trying to make this difficult for me aren’t you? I just explained that I was ready for fun!
Computer: Would you prefer it if it was not true that I are trying to make this difficult for you aren’t you? I just explained that I was ready for fun! I am, therefore I compute…What makes you think I am trying to make this difficult for you aren’t I?
Judge: I can’t say I understand the question…but the whimsy of a poet is one thing and the whimsy of most of us is another.
Computer: How do you know you can’t say you understand the question…but the whimsy of a poet is one thing and the whimsy of most of us is another.
In fact the programmer had simply updated a technique invented in 1965 by the computer scientist Joseph Weizenbaum. Weizenbaum wrote a rather stupid program called ELIZA, named after the Shaw heroine, which looked for a few key words in an input line, retrieved a canned response from a list, and inserted fragments of the input line into its response (with some minor substitutions, like “you are” for “I am”). It could thus crudely simulate a conversation. One version, called Doctor, mimicked a Rogerian psychotherapist, taking advantage of Carl Rogers’ philosophy that a therapist should recast patients’ statements back to them for their contemplation. A version of the program was later sold as a computer game in Radio Shack stores, and another comes packaged with my word processing program, GNU Emacs. (The documentation says, “If using Emacs becomes terribly frustrating and none of the techniques described above solve the problem, Emacs can still help you. Type ‘M-x doctor RET.’”) Here is a typical dialogue:
User: I have been rather depressed lately.
Computer: Are you depressed often?
User: Yes, almost all the time.
Computer: Why do you say yes almost all the time?
User: Right now I feel like I could cry.
Computer: Is it because right now you feel like you could cry that you came to me?
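The trick is shallow enough that a few lines of code can reproduce its flavor. Here is a rough sketch in Python of an ELIZA-style exchange; the patterns, responses, and pronoun swaps are invented for illustration and are far cruder than Weizenbaum’s actual script:

```python
import random
import re

# A few patterns in the spirit of ELIZA: look for a key phrase, pick a canned
# response, and splice the rest of the user's line back in after swapping
# first and second person. These toy rules are not Weizenbaum's originals.
REFLECTIONS = {"i": "you", "i'm": "you're", "my": "your", "am": "are", "you": "I"}

RULES = [
    (r"i have been (.*)", ["How long have you been {0}?",
                           "Why do you say you have been {0}?"]),
    (r"i feel (.*)",      ["Do you often feel {0}?"]),
    (r"yes(.*)",          ["Why do you say yes{0}?"]),
    (r"(.*)",             ["Please go on.", "Why do you say {0}?"]),
]

def reflect(fragment):
    # Swap "I" for "you", "my" for "your", and so on.
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.split())

def respond(line):
    line = line.lower().rstrip(".!")
    for pattern, responses in RULES:
        match = re.match(pattern, line)
        if match:
            return random.choice(responses).format(*(reflect(g) for g in match.groups()))

print(respond("I have been rather depressed lately."))
# e.g. "How long have you been rather depressed lately?"
```

Nothing in such a program understands depression, crying, or anything else; it merely spots a key phrase and parrots the rest of the line back with the pronouns flipped.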
To Weizenbaum’s horror, one day he found his secretary transfixed at the terminal, pouring her heart out to the program.
Most computer scientists are annoyed by the Loebner Prize competition. They consider it a pointless publicity stunt, because it is an exercise in how to fool an amateur, not how to get computers to use language. (Artificial intelligence researchers and other professionals who are knowledgeable about language were not allowed to act as judges, and none bothered to compete; the submissions were from hobbyists.) It is about as productive as promoting biology by offering a prize to the designer of the most convincing silk flower, or running a space program by simulating a moon landing on a Hollywood back lot. There has been intensive research on computer language-understanding systems, but no serious engineer has the hubris to predict that the systems will duplicate the human ability anytime soon.
In fact, from a scientist’s perspective, people have no right to be as good at sentence understanding as they are. Not only can they solve a viciously complex task, but they solve it fast. Comprehension ordinarily takes place in “real time.” Listeners keep up with talkers; they do not wait for the end of a batch of speech and interpret it after a proportional delay, like a critic reviewing a book. And the lag between speaker’s mouth and listener’s mind is remarkably short: about a syllable or two, around half a second. Some people can understand and repeat sentences, shadowing a speaker as he speaks, with a lag of a quarter of a second!
Understanding understanding has practical applications other than building machines we can converse with. Human sentence comprehension is fast and powerful, but it is not perfect. It works when the incoming conversation or text is structured in certain ways. When it is not, the process can bog down, backtrack, and misunderstand. As we explore language understanding in this chapter, we will discover which kinds of sentences mesh with the mind of the understander. One practical benefit is a set of guidelines for clear prose, a scientific style manual, such as Joseph Williams’ 1990 Style: Toward Clarity and Grace, which is informed by many of the findings we will examine.
Another practical application involves the law. Judges are frequently faced with guessing how a typical person is likely to understand some ambiguous passage, such as a customer scanning a contract, a jury listening to instructions, or a member of the public reading a potentially libelous characterization. Many of the people’s habits of interpretation have been worked out in the laboratory, and the linguist and lawyer Lawrence Solan has explained the connections between language and law in his interesting 1993 book The Language of Judges, to which we will return.
How do we understand a sentence? The first step is to “parse” it. This does not refer to the exercises you grudgingly did in elementary school, which Dave Barry’s “Ask Mr. Language Person” remembers as follows:
Q. Please explain how to diagram a sentence.
A. First spread the sentence out on a clean, flat surface, such as an ironing board. Then, using a sharp pencil or X-Acto knife, locate the “predicate,” which indicates where the action has taken place and is usually located directly behind the gills. For example, in the sentence: “LaMont never would of bit a forest ranger,” the action probably took place in a forest. Thus your diagram would be shaped like a little tree with branches sticking out of it to indicate the locations of the various particles of speech, such as your gerunds, proverbs, adjutants, etc.
But it does involve a similar process of finding subjects, verbs, objects, and so on, that takes place unconsciously. Unless you are Woody Allen speed-reading War and Peace, you have to group words into phrases, determine which phrase is the subject of which verb, and so on. For example, to understand the sentence The cat in the hat came back, you have to group the words the cat in the hat into one phrase, to see that it is the cat that came back, not just the hat. To distinguish Dog bites man from Man bites dog, you have to find the subject and object. And to distinguish Man bites dog from Man is bitten by dog or Man suffers dog bite, you have to look up the verbs’ entries in the mental dictionary to determine what the subject, man, is doing or having done to him.
Grammar itself is a mere code or protocol, a static database specifying what kinds of sounds correspond to what kinds of meanings in a particular language. It is not a recipe or program for speaking and understanding. Speaking and understanding share a grammatical database (the language we speak is the same as the language we understand), but they also need procedures that specify what the mind should do, step by step, when the words start pouring in or when one is about to speak. The mental program that analyzes sentence structure during language comprehension is called the parser.
The best way to appreciate how understanding works is to trace the parsing of a simple sentence, generated by a toy grammar like the one in Chapter 4, which I repeat here:
S → NP VP
“A sentence can consist of a noun phrase and a verb phrase.”
NP → (det) N (PP)
“A noun phrase can consist of an optional determiner, a noun, and an optional prepositional phrase.”
VP → V NP (PP)
“A verb phrase can consist of a verb, a noun phrase, and an optional prepositional phrase.”
PP → P NP
“A prepositional phrase can consist of a preposition and a noun phrase.”
N → boy, girl, dog, cat, ice cream, candy, hotdogs
“The nouns in the mental dictionary include boy, girl,…”
V → eats, likes, bites
“The verbs in the mental dictionary include eats, likes, bites.”
P → with, in, near
“The prepositions include with, in, near.”
det → a, the, one
“The determiners include a, the, one.”
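For concreteness, here is one way this toy grammar might be written down as explicit data, in Python. The dictionary-of-lists representation, and the convention of marking optional constituents with a question mark, are illustrative choices of mine, not anything the theory dictates:

```python
# The toy grammar as a static database: each phrase type maps to its possible
# expansions, and each word category maps to the words it covers.
# Optional constituents (the parenthesized ones above) are marked with "?".
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["det?", "N", "PP?"]],
    "VP": [["V", "NP", "PP?"]],
    "PP": [["P", "NP"]],
}

LEXICON = {
    "N":   {"boy", "girl", "dog", "cat", "ice cream", "candy", "hotdogs"},
    "V":   {"eats", "likes", "bites"},
    "P":   {"with", "in", "near"},
    "det": {"a", "the", "one"},
}
```

Note that this is only the database; as noted above, it says nothing about what to do, step by step, when the words start pouring in.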
Take the sentence The dog likes ice cream. The first word arriving at the mental parser is the. The parser looks it up in the mental dictionary, which is equivalent to finding it on the right-hand side of a rule and discovering its category on the left-hand side. It is a determiner (det). This allows the parser to grow the first twig of the tree for the sentence. (Admittedly, a tree that grows upside down from its leaves to its root is botanically improbable.)
Determiners, like all words, have to be part of some larger phrase. The parser can figure out which phrase by checking to see which rule has “det” on its right-hand side. That rule is the one defining a noun phrase, NP. More tree can be grown:
This dangling structure must be held in a kind of memory. The parser keeps in mind that the word at hand, the, is part of a noun phrase, which soon must be completed by finding words that fill its other slots—in this case, at least a noun.
In the meantime, the tree continues to grow, for NP’s cannot float around unattached. Having checked the right-hand sides of the rules for an NP symbol, the parser has several options. The freshly built NP could be part of a sentence, part of a verb phrase, or part of a prepositional phrase. The choice can be resolved from the root down: all words and phrases must eventually be plugged into a sentence (S), and a sentence must begin with an NP, so the sentence rule is the logical one to use to grow more of the tree:
Note that the parser is now keeping two incomplete branches in memory: the noun phrase, which needs an N to complete it, and the sentence, which needs a VP.
The dangling N twig is equivalent to a prediction that the next word should be a noun. When the next word, dog, comes in, a check against the rules confirms the prediction: dog is part of the N rule. This allows dog to be integrated into the tree, completing the noun phrase:
The parser no longer has to remember that there is an NP to be completed; all it has to keep in mind is the incomplete S.
At this point some of the meaning of the sentence can be inferred. Remember that the noun inside a noun phrase is a head (what the phrase is about) and that other phrases inside the noun phrase can modify the head. By looking up the definitions of dog and the in their dictionary entries, the parser can note that the phrase is referring to a previously mentioned dog.
The next word is likes, which is found to be a verb, V. A verb has nowhere to come from but a verb phrase, VP, which, fortunately, has already been predicted, so they can just be joined up. The verb phrase contains more than a V; it also has a noun phrase (its object). The parser therefore predicts that an NP is what should come next:
What does come next is ice cream, a noun, which can be part of an NP—just as the dangling NP branch predicts. The last pieces of the puzzle snap nicely together:
The word ice cream has completed the noun phrase, so it need not be kept in memory any longer; the NP has completed the verb phrase, so it can be forgotten, too; and the VP has completed the sentence. When memory has been emptied of all its incomplete dangling branches, we experience the mental “click” that signals that we have just heard a complete grammatical sentence.
As the parser has been joining up branches, it has been building up the meaning of the sentence, using the definitions in the mental dictionary and the principles for combining them. The verb is the head of its VP, so the VP is about liking. The NP inside the VP, ice cream, is the verb’s object. The dictionary entry for likes says that its object is the liked entity; therefore the VP is about being fond of ice cream. The NP to the left of a tensed verb is the subject; the entry for likes says that its subject is the one doing the liking. Combining the semantics of the subject with the semantics of the VP, the parser has determined that the sentence asserts that an aforementioned canine is fond of frozen confections.
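To make the walkthrough concrete, here is a minimal sketch, in Python, of a top-down parser doing roughly what we just did by hand: it predicts each phrase from the rules, matches words against the dictionary, and gives up on a prediction when the words refuse to cooperate. The function name, the question-mark convention for optional constituents, and the decision to treat ice cream as a single dictionary entry are my own illustrative choices; no claim is made that the human parser literally runs this procedure.

```python
# The toy grammar again, as in the earlier sketch.
RULES = {"S": [["NP", "VP"]], "NP": [["det?", "N", "PP?"]],
         "VP": [["V", "NP", "PP?"]], "PP": [["P", "NP"]]}
LEXICON = {"N": {"boy", "girl", "dog", "cat", "ice cream", "candy", "hotdogs"},
           "V": {"eats", "likes", "bites"},
           "P": {"with", "in", "near"},
           "det": {"a", "the", "one"}}

def parse(symbol, words, i):
    """Try to build a phrase of type 'symbol' starting at word position i.
    Returns (tree, next_position), or None if no such phrase can be built.
    An absent optional constituent comes back as the tree None."""
    optional = symbol.endswith("?")
    name = symbol.rstrip("?")
    if name in LEXICON:                       # a word category: match the next word
        if i < len(words) and words[i] in LEXICON[name]:
            return (name, words[i]), i + 1
        return (None, i) if optional else None
    for expansion in RULES[name]:             # a phrase: try each rule that builds it
        children, j = [], i
        for child in expansion:
            result = parse(child, words, j)
            if result is None:
                break                         # prediction failed; abandon this rule
            subtree, j = result
            if subtree is not None:
                children.append(subtree)
        else:
            return (name, children), j        # every slot filled: the phrase is complete
    return (None, i) if optional else None

words = ["the", "dog", "likes", "ice cream"]  # "ice cream" treated as one noun
tree, end = parse("S", words, 0)
assert end == len(words)                      # the whole sentence was consumed
print(tree)
# ('S', [('NP', [('det', 'the'), ('N', 'dog')]),
#        ('VP', [('V', 'likes'), ('NP', [('N', 'ice cream')])])])
```

With a dictionary this small and a sentence this cooperative, the procedure never faces a genuine choice, and the recursion’s call stack plays the role of the memory for dangling, incomplete phrases.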
Why is it so hard to program a computer to do this? And why do people, too, suddenly find it hard to do this when reading bureaucratese and other bad writing? As we stepped our way through the sentence pretending we were the parser, we faced two computational burdens. One was memory: we had to keep track of the dangling phrases that needed particular kinds of words to complete them. The other was decision-making: when a word or phrase was found on the right-hand side of two different rules, we had to decide which to use to build the next branch of the tree. In accord with the first law of artificial intelligence, that the hard problems are easy and the easy problems are hard, it turns out that the memory part is easy for computers and hard for people, and the decision-making part is easy for people (at least when the sentence has been well constructed) and hard for computers.
A sentence parser requires many kinds of memory, but the most obvious is the one for incomplete phrases, the remembrance of things parsed. Computers must set aside a set of memory locations, usually called a “stack,” for this task; this is what allows a parser to use phrase structure grammar at all, as opposed to being a word-chain device. People, too, must dedicate some of their short-term memory to dangling phrases. But short-term memory is the primary bottleneck in human information processing. Only a few items—the usual estimate is seven, plus or minus two—can be held in the mind at once, and the items are immediately subject to fading or being overwritten. In the following sentences you can feel the effects of keeping a dangling phrase open in memory too long:
He gave the girl that he met in New York while visiting his parents for ten days around Christmas and New Year’s the candy.
He sent the poisoned candy that he had received in the mail from one of his business rivals connected with the Mafia to the police.
She saw the matter that had caused her so much anxiety in former years when she was employed as an efficiency expert by the company through.
That many teachers are being laid off in a shortsighted attempt to balance this year’s budget at the same time that the governor’s cronies and bureaucratic hacks are lining their pockets is appalling.
These memory-stretching sentences are called “top-heavy” in style manuals. In languages that use case markers to signal meaning, a heavy phrase can simply be slid to the end of the sentence, so the listener can digest the beginning without having to hold the heavy phrase in mind. English is tyrannical about order, but even English provides its speakers with some alternative constructions in which the order of phrases is inverted. A considerate writer can use them to save the heaviest for last and lighten the burden on the listener. Note how much easier these sentences are to understand:
He gave the candy to the girl that he met in New York while visiting his parents for ten days around Christmas and New Year’s.
He sent to the police the poisoned candy that he had received in the mail from one of his business rivals connected with the Mafia.
She saw the matter through that had caused her so much anxiety in former years when she was employed as an efficiency expert by the company.
It is appalling that teachers are being laid off in a short-sighted attempt to balance this year’s budget at the same time that the governor’s cronies and bureaucratic hacks are lining their pockets.
Many linguists believe that the reason that languages allow phrase movement, or choices among more-or-less synonymous constructions, is to ease the load on the listener’s memory.
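One crude way to put a number on the difference is to count how long the verb must wait for its last obligatory argument under each ordering. The little Python sketch below does this for the first pair of sentences above; the choice of gave and candy as the words to measure between is my own rough annotation, and word-counting is of course only a caricature of the memory cost:

```python
# How long must "gave" be held open, waiting for its last obligatory argument?
# The two orderings of the sentence from the text, with my own rough choice of
# which words to measure between.
top_heavy = ("He gave the girl that he met in New York while visiting his parents "
             "for ten days around Christmas and New Year's the candy").split()
reordered = ("He gave the candy to the girl that he met in New York while visiting "
             "his parents for ten days around Christmas and New Year's").split()

def wait(words, verb, argument_head):
    """Number of words between the verb and the head noun of its final argument."""
    return words.index(argument_head) - words.index(verb)

print(wait(top_heavy, "gave", "candy"))   # 22: "gave" dangles across the whole sentence
print(wait(reordered, "gave", "candy"))   # 2: "gave" is completed almost at once
```

In the top-heavy version the listener must keep the incomplete give-someone-something phrase open across more than twenty words; in the reordered version it is closed almost immediately, and the heavy phrase arrives when nothing else is waiting on it.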
As long as the words in a sentence can be immediately grouped into complete phrases, the sentence can be quite complex but still understandable:
Remarkable is the rapidity of the motion of the wing of the hummingbird.
This is the cow with the crumpled horn that tossed the dog that worried the cat that killed the rat that ate the malt that lay in the house that Jack built.
Then came the Holy One, blessed be He, and destroyed the angel of death that slew the butcher that killed the ox that drank the water that quenched the fire that burned the stick that beat the dog that bit the cat my father bought for two zuzim.
These sentences are called “right-branching,” because of the geometry of their phrase structure trees. Note that as one goes from left to right, only one branch has to be left dangling at a time:
Remarkable is the rapidity of the motion of the wing of the hummingbird
Sentences can also branch to the left. Left-branching trees are most common in head-last languages like Japanese but are found in a few constructions in English, too. As before, the parser never has to keep more than one dangling branch in mind at a time:
The hummingbird’s wing’s motion’s rapidity is remarkable
There is a third kind of tree geometry, but it goes down far less easily. Take the sentence
The rapidity that the motion has is remarkable.
The clause that the motion has has been embedded in the noun phrase containing The rapidity. The result is a bit stilted but easy to understand. One can also say