For millennia the gap between physical events, on the one hand, and meaning, content, ideas, reasons, and intentions, on the other, seemed to cleave the universe in two. How can something as ethereal as “inciting hatred” or “wanting to speak to Cecile” actually cause matter to move in space? But the cognitive revolution unified the world of ideas with the world of matter using a powerful new theory: that mental life can be explained in terms of information, computation, and feedback. Beliefs and memories are collections of information—like facts in a database, but residing in patterns of activity and structure in the brain. Thinking and planning are systematic transformations of these patterns, like the operation of a computer program. Wanting and trying are feedback loops, like the principle behind a thermostat: they receive information about the discrepancy between a goal and the current state of the world, and then they execute operations that tend to reduce the difference. The mind is connected to the world by the sense organs, which transduce physical energy into data structures in the brain, and by motor programs, by which the brain controls the muscles.
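The thermostat principle mentioned above can be sketched in a few lines of code. This is only an illustration of the feedback idea, not a model of any real device; the temperatures, tolerance, and cooling rate are invented for the sketch.

```python
def thermostat_step(goal_temp, current_temp, tolerance=0.5):
    """One cycle of a feedback loop: sense the discrepancy between
    a goal and the current state of the world, then choose an action
    that tends to reduce the difference."""
    error = goal_temp - current_temp
    if error > tolerance:
        return "heat on"    # too cold: act to raise the temperature
    elif error < -tolerance:
        return "heat off"   # too warm: act to let it fall
    return "hold"           # close enough to the goal: do nothing

# A toy environment: heating raises the temperature; otherwise
# the room slowly cools.
temp = 18.0
for _ in range(20):
    action = thermostat_step(goal_temp=21.0, current_temp=temp)
    temp += 0.8 if action == "heat on" else -0.2
# After repeated cycles the temperature hovers near the goal.
```

Nothing in the loop "wants" warmth; wanting, on this view, just is the running of such a discrepancy-reducing cycle, whether in a furnace controller or in brain tissue.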
This general idea may be called the computational theory of mind. It is not the same as the “computer metaphor” of the mind, the suggestion that the mind literally works like a human-made database, computer program, or thermostat. It says only that we can explain minds and human-made information processors using some of the same principles. It is just like other cases in which the natural world and human engineering overlap. A physiologist might invoke the same laws of optics to explain how the eye works and how a camera works without implying that the eye is like a camera in every detail.
The computational theory of mind does more than explain the existence of knowing, thinking, and trying without invoking a ghost in the machine (though that would be enough of a feat). It also explains how those processes can be intelligent—how rationality can emerge from a mindless physical process. If a sequence of transformations of information stored in a hunk of matter (such as brain tissue or silicon) mirrors a sequence of deductions that obey the laws of logic, probability, or cause and effect in the world, it will generate correct predictions about the world. And making correct predictions in pursuit of a goal is a pretty good definition of “intelligence.”3
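The point that mindless, mechanical symbol shuffling can yield correct predictions can be made concrete with a toy inference engine. The rules and facts below are invented for the sketch; what matters is that each mechanical step mirrors a valid if-then deduction, so every symbol the machine derives follows from its starting facts.

```python
def forward_chain(facts, rules):
    """Repeatedly apply if-then rules until no new facts emerge.
    Each rule is (premises, conclusion); the loop is purely formal
    symbol manipulation, yet every added fact is a valid deduction."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Illustrative world knowledge (invented for this example):
rules = [
    (["raining"], "streets wet"),
    (["streets wet"], "streets slippery"),
]
derived = forward_chain(["raining"], rules)
# "streets slippery" is correctly predicted from "raining" alone.
```

The hunk of matter running this loop knows nothing about rain or streets, but because its transformations track cause and effect in the world, its outputs are true of that world.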
Of course there is no new thing under the sun, and the computational theory of mind was foreshadowed by Hobbes when he described mental activity as tiny motions and wrote that “reasoning is but reckoning.” Three and a half centuries later, science has caught up to his vision. Perception, memory, imagery, reasoning, decision making, language, and motor control are being studied in the lab and successfully modeled as computational paraphernalia such as rules, strings, matrices, pointers, lists, files, trees, arrays, loops, propositions, and networks. For example, cognitive psychologists are studying the graphics system in the head and thereby explaining how people “see” the solution to a problem in a mental image. They are studying the web of concepts in long-term memory and explaining why some facts are easier to recall than others. They are studying the processor and memory used by the language system to learn why some sentences are a pleasure to read and others a difficult slog.
And if the proof is in the computing, then the sister field of artificial intelligence is confirming that ordinary matter can perform feats that were supposedly performable by mental stuff alone. In the 1950s computers were already being called “electronic brains” because they could calculate sums, organize data, and prove theorems. Soon they could correct spelling, set type, solve equations, and simulate experts on restricted topics such as picking stocks and diagnosing diseases. For decades we psychologists preserved human bragging rights by telling our classes that no computer could read text, decipher speech, or recognize faces, but these boasts are obsolete. Today software that can recognize printed letters and spoken words comes packaged with home computers. Rudimentary programs that understand or translate sentences are available in many search engines and Help programs, and they are steadily improving. Face-recognition systems have advanced to the point that civil libertarians are concerned about possible abuse when they are used with security cameras in public places.
Human chauvinists can still write off these low-level feats. Sure, they say, the input and output processing can be fobbed off onto computational modules, but you still need a human user with the capacity for judgment, reflection, and creativity. But according to the computational theory of mind, these capacities are themselves forms of information processing and can be implemented in a computational system. In 1997 an IBM computer called Deep Blue defeated the world chess champion Garry Kasparov, and unlike its predecessors, it did not just evaluate trillions of moves by brute force but was fitted with strategies that intelligently responded to patterns in the game. Newsweek called the match “The Brain’s Last Stand.” Kasparov called the outcome “the end of mankind.”
You might still object that chess is an artificial world with discrete moves and a clear winner, perfectly suited to the rule-crunching of a computer. People, on the other hand, live in a messy world offering unlimited moves and nebulous goals. Surely this requires human creativity and intuition—which is why everyone knows that computers will never compose a symphony, write a story, or paint a picture. But everyone may be wrong. Recent artificial intelligence systems have written credible short stories,4 composed convincing Mozart-like symphonies,5 drawn appealing pictures of people and landscapes,6 and conceived clever ideas for advertisements.7
None of this is to say that the brain works like a digital computer, that artificial intelligence will ever duplicate the human mind, or that computers are conscious in the sense of having first-person subjective experience. But it does suggest that reasoning, intelligence, imagination, and creativity are forms of information processing, a well-understood physical process. Cognitive science, with the help of the computational theory of mind, has exorcised at least one ghost from the machine.
A second idea: The mind cannot be a blank slate, because blank slates don’t do anything. As long as people had only the haziest concept of what a mind was or how it might work, the metaphor of a blank slate inscribed by the environment did not seem too outrageous. But as soon as one starts to think seriously about what kind of computation enables a system to see, think, speak, and plan, the problem with blank slates becomes all too obvious: they don’t do anything. The inscriptions will sit there forever unless something notices patterns in them, combines them with patterns learned at other times, uses the combinations to scribble new thoughts onto the slate, and reads the results to guide behavior toward goals. Locke recognized this problem and alluded to something called “the understanding,” which looked at the inscriptions on the white paper and carried out the recognizing, reflecting, and associating. But of course explaining how the mind understands by invoking something called “the understanding” is circular.
This argument against the Blank Slate was stated pithily by Gottfried Wilhelm Leibniz (1646-1716) in a reply to Locke. Leibniz repeated the empiricist motto “There is nothing in the intellect that was not first in the senses,” then added, “except the intellect itself.”8 Something in the mind must be innate, if it is only the mechanisms that do the learning. Something has to see a world of objects rather than a kaleidoscope of shimmering pixels. Something has to infer the content of a sentence rather than parrot back the exact wording. Something has to interpret other people’s behavior as their attempts to achieve goals rather than as trajectories of jerking arms and legs.
In the spirit of Locke, one could attribute these feats to an abstract noun—perhaps not to “the understanding” but to “learning,” “intelligence,” “plasticity,” or “adaptiveness.” But as Leibniz remarked, to do so is to “[save appearances] by fabricating faculties or occult qualities,… and fancying them to be like little demons or imps which can without ado perform whatever is wanted, as though pocket watches told the time by a certain horological faculty without needing wheels, or as though mills crushed grain by a fractive faculty without needing anything in the way of millstones.”9 Leibniz, like Hobbes (who had influenced him), was ahead of his time in recognizing that intelligence is a form of information processing and needs complex machinery to carry it out. As we now know, computers don’t understand speech or recognize text as they roll off the assembly line; someone has to install the right software first. The same is likely to be true of the far more demanding performance of the human being. Cognitive modelers have found that mundane challenges like walking around furniture, understanding a sentence, recalling a fact, or guessing someone’s intentions are formidable engineering problems that are at or beyond the frontiers of artificial intelligence. The suggestion that they can be solved by a lump of Silly Putty that is passively molded by something called “culture” just doesn’t cut the mustard.
This is not to say that cognitive scientists have put the nature-nurture debate completely behind them; they are still spread out along a continuum of opinion on how much standard equipment comes with the human mind. At one end are the philosopher Jerry Fodor, who has suggested that all concepts might be innate (even “doorknob” and “tweezers”), and the linguist Noam Chomsky, who believes that the word “learning” is misleading and we should say that children “grow” language instead.10 At the other end are the connectionists, including Rumelhart, McClelland, Jeffrey Elman, and Elizabeth Bates, who build relatively simple computer models and train the living daylights out of them.11 Fans locate the first extreme, which originated at the Massachusetts Institute of Technology, at the East Pole, the mythical place from which all directions are west. They locate the second extreme, which originated at the University of California, San Diego, at the West Pole, the mythical place from which all directions are east. (The names were suggested by Fodor during an MIT seminar at which he was fulminating against a “West Coast theorist” and someone pointed out that the theorist worked at Yale, which is, technically, on the East Coast.)12
But here is why the East Pole–West Pole debate is different from the ones that preoccupied philosophers for millennia: neither side believes in the Blank Slate. Everyone acknowledges that there can be no learning without innate circuitry to do the learning. In their West Pole manifesto Rethinking Innateness, Bates and Elman and their coauthors cheerfully concede this point: “No learning rule can be entirely devoid of theoretical content nor can the tabula ever be completely rasa.”13 They explain:
There is a widespread belief that connectionist models (and modelers) are committed to an extreme form of empiricism; and that any form of innate knowledge is to be avoided like the plague…. We obviously do not subscribe to this point of view…. There are good reasons to believe that some kinds of prior constraints [on learning models] are necessary. In fact, all connectionist models necessarily make some assumptions which must be regarded as constituting innate constraints.14
The disagreements between the two poles, though significant, are over the details: how many innate learning networks there are, and how specifically engineered they are for particular jobs. (We will explore some of these disagreements in Chapter 5.)
A third idea: An infinite range of behavior can be generated by finite combinatorial programs in the mind. Cognitive science has undermined the Blank Slate and the Ghost in the Machine in another way. People can be forgiven for scoffing at the suggestion that human behavior is “in the genes” or “a product of evolution” in the senses familiar from the animal world. Human acts are not selected from a repertoire of knee-jerk reactions like a fish attacking a red spot or a hen sitting on eggs. Instead, people may worship goddesses, auction kitsch on the Internet, play air guitar, fast to atone for past sins, build forts out of lawn chairs, and so on, seemingly without limit. A glance at National Geographic shows that even the strangest acts in our own culture do not exhaust what our species is capable of. If anything goes, one might think, then perhaps we are Silly Putty, or unconstrained agents, after all.
But that impression has been made obsolete by the computational approach to the mind, which was barely conceivable in the era in which the Blank Slate arose. The clearest example is the Chomskyan revolution in language.15 Language is the epitome of creative and variable behavior. Most utterances are brand-new combinations of words, never before uttered in the history of humankind. We are nothing like Tickle Me Elmo dolls who have a fixed list of verbal responses hard-wired in. But, Chomsky pointed out, for all its open-endedness language is not a free-for-all; it obeys rules and patterns. An English speaker can utter unprecedented strings of words such as Every day new universes come into existence, or He likes his toast with cream cheese and ketchup, or My car has been eaten by wolverines. But no one would say Car my been eaten has wolverines by or most of the other possible orderings of English words. Something in the head must be capable of generating not just any combinations of words but highly systematic ones.
That something is a kind of software, a generative grammar that can crank out new arrangements of words. A battery of rules such as “An English sentence contains a subject and a predicate,” “A predicate contains a verb, an object, and a complement,” and “The subject of eat is the eater” can explain the boundless creativity of a human talker. With a few thousand nouns that can fill the subject slot and a few thousand verbs that can fill the predicate slot, one already has several million ways to open a sentence. The possible combinations quickly multiply out to unimaginably large numbers. Indeed, the repertoire of sentences is theoretically infinite, because the rules of language use a trick called recursion. A recursive rule allows a phrase to contain an example of itself, as in She thinks that he thinks that they think that he knows and so on, ad infinitum. And if the number of sentences is infinite, the number of possible thoughts and intentions is infinite too, because virtually every sentence expresses a different thought or intention. The combinatorial grammar for language meshes with other combinatorial programs in the head for thoughts and intentions. A fixed collection of machinery in the mind can generate an infinite range of behavior by the muscles.16
Once one starts to think about mental software instead of physical behavior, the radical differences among human cultures become far smaller, and that leads to a fourth new idea: Universal mental mechanisms can underlie superficial variation across cultures. Again, we can use language as a paradigm case of the open-endedness of behavior. Humans speak some six thousand mutually unintelligible languages. Nonetheless, the grammatical programs in their minds differ far less than the actual speech coming out of their mouths. We have known for a long time that all human languages can convey the same kinds of ideas. The Bible has been translated into hundreds of non-Western languages, and during World War II the U.S. Marine Corps conveyed secret messages across the Pacific by having Navajo Indians translate them to and from their native language. The fact that any language can be used to convey any proposition, from theological parables to military directives, suggests that all languages are cut from the same cloth.
Chomsky proposed that the generative grammars of individual languages are variations on a single pattern, which he called Universal Grammar. For example, in English the verb comes before the object (drink beer) and the preposition comes before the noun phrase (from the bottle). In Japanese the object comes before the verb (beer drink) and the noun phrase comes before the preposition, or, more accurately, the postposition (the bottle from). But it is a significant discovery that both languages have verbs, objects, and pre- or postpositions to start with, as opposed to having the countless other conceivable kinds of apparatus that could power a communication system. And it is even more significant that unrelated languages build their phrases by assembling a head (such as a verb or preposition) and a complement (such as a noun phrase) and assigning a consistent order to the two. In English the head comes first; in Japanese the head comes last. But everything else about the structure of phrases in the two languages is pretty much the same. And so it goes with phrase after phrase and language after language. The common kinds of heads and complements can be ordered in 128 logically possible ways, but 95 percent of the world’s languages use one of two: either the English ordering or its mirror image the Japanese ordering.17 A simple way to capture this uniformity is to say that all languages have the same grammar except for a parameter or switch that can be flipped to either the “head-first” or “head-last” setting. The linguist Mark Baker has recently summarized about a dozen of these parameters, which succinctly capture most of the known variation among the languages of the world.18
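The head-direction switch can be pictured as a single parameter in an otherwise shared phrase-building routine. This is a deliberately simplified sketch, not a serious model of syntax; it uses the drink beer / from the bottle examples from the text.

```python
# One phrase-assembly routine, one binary parameter: the head-direction
# switch. Flipping it turns English-style order into its Japanese-style
# mirror image. (A drastic simplification, for illustration only.)
def build_phrase(head, complement, head_first=True):
    return f"{head} {complement}" if head_first else f"{complement} {head}"

# English: head-first (verb before object, preposition before noun phrase)
english_vp = build_phrase("drink", "beer", head_first=True)
english_pp = build_phrase("from", "the bottle", head_first=True)

# Japanese: head-last (object before verb, postposition after noun phrase)
japanese_vp = build_phrase("drink", "beer", head_first=False)
japanese_pp = build_phrase("from", "the bottle", head_first=False)
```

Everything except the one flipped switch is shared, which is the sense in which the two orderings are variations on a single underlying grammar.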
Distilling the variation from the universal patterns is not just a way to tidy up a set of messy data. It can also provide clues about the innate circuitry that makes learning possible. If the universal part of a rule is embodied in the neural circuitry that guides babies when they first learn language, it could explain how children learn language so easily and uniformly and without the benefit of instruction. Rather than treating the sound coming out of Mom’s mouth as just an interesting noise to mimic verbatim or to slice and dice in arbitrary ways, the baby listens for heads and complements, pays attention to how they are ordered, and builds a grammatical system consistent with that ordering.