But do monkeys really know how their groupmates are related to each other, and, more impressively, do they realize that different pairs of individuals like brothers and sisters can be related in the same way? Cheney and Seyfarth hid a loudspeaker behind a bush and played tapes of a two-year-old monkey screaming. The females in the area reacted by looking at the mother of the infant who had been recorded—showing that they not only recognized the infant by its scream but recalled who its mother was. Similar abilities have been shown in the long-tailed macaques that Verena Dasser coaxed into a laboratory adjoining a large outdoor enclosure. Three slides were projected: a mother at the center, one of her offspring on one side, and an unrelated juvenile of the same age and sex on the other. Each screen had a button under it. After the monkey had been trained to press a button under the offspring slide, it was tested on pictures of other mothers in the group, each one flanked by a picture of that mother’s offspring and a picture of another juvenile. More than ninety percent of the time the monkey picked the offspring. In another test, the monkey was shown two slides, each showing a pair of monkeys, and was trained to press a button beneath the slide showing a particular mother and her juvenile daughter. When presented with slides of new monkeys in the group, the subject monkey always picked the mother-and-offspring pair, whether the offspring was male, female, infant, juvenile, or adult. Moreover, the monkeys appeared to be relying not only on physical resemblance between a given pair of monkeys, or on the sheer number of hours they had previously spent together, as the basis for recognizing they were kin, but on something more subtle in the history of their interaction. Cheney and Seyfarth, who work hard at keeping track of who is related to whom in what way in the groups of animals they study, note that monkeys would make excellent primatologists.

  Many creative people insist that in their most inspired moments they think not in words but in mental images. Samuel Taylor Coleridge wrote that visual images of scenes and words once appeared involuntarily before him in a dreamlike state (perhaps opium-induced). He managed to copy the first forty lines onto paper, resulting in the poem we know as “Kubla Khan,” before a knock on the door shattered the images and obliterated forever what would have been the rest of the poem. Many contemporary novelists, like Joan Didion, report that their acts of creation begin not with any notion of a character or a plot but with vivid mental pictures that dictate their choice of words. The modern sculptor James Surls plans his projects lying on a couch listening to music; he manipulates the sculptures in his mind’s eye, he says, putting an arm on, taking an arm off, watching the images roll and tumble.

  Physical scientists are even more adamant that their thinking is geometrical, not verbal. Michael Faraday, the originator of our modern conception of electric and magnetic fields, had no training in mathematics but arrived at his insights by visualizing lines of force as narrow tubes curving through space. James Clerk Maxwell formalized the concepts of electromagnetic fields in a set of mathematical equations and is considered the prime example of an abstract theoretician, but he set down the equations only after mentally playing with elaborate imaginary models of sheets and fluids. Nikola Tesla’s idea for the electrical motor and generator, Friedrich Kekulé’s discovery of the benzene ring that kicked off modern organic chemistry, Ernest Lawrence’s conception of the cyclotron, James Watson and Francis Crick’s discovery of the DNA double helix—all came to them in images. The most famous self-described visual thinker is Albert Einstein, who arrived at some of his insights by imagining himself riding a beam of light and looking back at a clock, or dropping a coin while standing in a plummeting elevator. He wrote:

  The psychical entities which seem to serve as elements in thought are certain signs and more or less clear images which can be “voluntarily” reproduced and combined…. This combinatory play seems to be the essential feature in productive thought—before there is any connection with logical construction in words or other kinds of signs which can be communicated to others. The above-mentioned elements are, in my case, of visual and some muscular type. Conventional words or other signs have to be sought for laboriously only in a secondary state, when the mentioned associative play is sufficiently established and can be reproduced at will.

  Another creative scientist, the cognitive psychologist Roger Shepard, had his own moment of sudden visual inspiration, and it led to a classic laboratory demonstration of mental imagery in mere mortals. Early one morning, suspended between sleep and awakening in a state of lucid consciousness, Shepard experienced “a spontaneous kinetic image of three-dimensional structures majestically turning in space.” Within moments and before fully awakening, Shepard had a clear idea for the design of an experiment. A simple variant of his idea was later carried out with his then-student Lynn Cooper. Cooper and Shepard flashed thousands of slides, each showing a single letter of the alphabet, to their long-suffering student volunteers. Sometimes the letter was upright, but sometimes it was tilted or mirror-reversed or both. As an example, here are the sixteen versions of the letter F:

  The subjects were asked to press one button if the letter was normal (that is, like one of the letters in the top row of the diagram), another if it was a mirror image (like one of the letters in the bottom row). To do the task, the subjects had to compare the letter in the slide against some memory record of what the normal version of the letter looks like right-side up. Obviously, the right-side-up slide (0 degrees) is the quickest, because it matches the letter in memory exactly, but for the other orientations, some mental transformation to the upright is necessary first. Many subjects reported that they, like the famous sculptors and scientists, “mentally rotated” an image of the letter to the upright. By looking at the reaction times, Shepard and Cooper showed that this introspection was accurate. The upright letters were fastest, followed by the 45 degree letters, the 90 degree letters, and the 135 degree letters, with the 180 degree (upside-down) letters the slowest. In other words, the farther the subjects had to mentally rotate the letter, the longer they took. From the data, Cooper and Shepard estimated that letters revolve in the mind at a rate of 56 RPM.

  Note that if the subjects had been manipulating something resembling verbal descriptions of the letters, such as “an upright spine with one horizontal segment that extends rightwards from the top and another horizontal segment that extends rightwards from the middle,” the results would have been very different. Among all the topsy-turvy letters, the upside-down versions (180 degrees) should be fastest: one simply switches all the “top”s to “bottom”s and vice versa, and the “left”s to “right”s and vice versa, and one has a new description of the shape as it would appear right-side up, suitable for matching against memory. Sideways letters (90 degrees) should be slower, because “top” gets changed either to “right” or to “left,” depending on whether it lies clockwise (+ 90 degrees) or counterclockwise (- 90 degrees) from the upright. Diagonal letters (45 and 135 degrees) should be slowest, because every word in the description has to be replaced: “top” has to be replaced with either “top right” or “top left,” and so on. So the order of difficulty should be 0, 180, 90, 45, 135, not the majestic rotation of 0, 45, 90, 135, 180 that Cooper and Shepard saw in the data. Many other experiments have corroborated the idea that visual thinking uses not language but a mental graphics system, with operations that rotate, scan, zoom, pan, displace, and fill in patterns of contours.
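The contrast between the two predictions can be sketched in a few lines of Python. The cost functions below are illustrative stand-ins for processing effort, not measured times; only the resulting orderings matter, and they reproduce the two difficulty sequences described above.

```python
# Sketch of the two models' predicted difficulty orderings. Orientations are
# in degrees; the cost values are illustrative, not measured reaction times.

# Rotation model: effort grows with the angular distance to upright.
def rotation_cost(angle):
    a = angle % 360
    return min(a, 360 - a)   # shortest rotation back to 0 degrees

# Description-transform model: effort depends on how much of the verbal
# description must be rewritten, not on angular distance.
def description_cost(angle):
    a = angle % 360
    if a == 0:
        return 0    # matches the memory description directly
    if a == 180:
        return 1    # swap every "top"/"bottom" and "left"/"right" wholesale
    if a in (90, 270):
        return 2    # "top" becomes "left" or "right", depending on direction
    return 3        # diagonals: every term needs a compound replacement

angles = [0, 45, 90, 135, 180]
by_rotation = sorted(angles, key=rotation_cost)
by_description = sorted(angles, key=description_cost)
print(by_rotation)     # [0, 45, 90, 135, 180]: the order Cooper and Shepard observed
print(by_description)  # [0, 180, 90, 45, 135]: what verbal descriptions would predict
```

The data matched the first ordering, which is why the experiment counts as evidence for a mental graphics system rather than manipulation of verbal descriptions.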

  What sense, then, can we make of the suggestion that images, numbers, kinship relations, or logic can be represented in the brain without being couched in words? In the first half of this century, philosophers had an answer: none. Reifying thoughts as things in the head was a logical error, they said. A picture or family tree or number in the head would require a little man, a homunculus, to look at it. And what would be inside his head—even smaller pictures, with an even smaller man looking at them? But the argument was unsound. It took Alan Turing, the brilliant British mathematician and philosopher, to make the idea of a mental representation scientifically respectable. Turing described a hypothetical machine that could be said to engage in reasoning. In fact this simple device, named a Turing machine in his honor, is powerful enough to solve any problem that any computer, past, present, or future, can solve. And it clearly uses an internal symbolic representation—a kind of mentalese—without requiring a little man or any occult processes. By looking at how a Turing machine works, we can get a grasp of what it would mean for a human mind to think in mentalese as opposed to English.

  In essence, to reason is to deduce new pieces of knowledge from old ones. A simple example is the old chestnut from introductory logic: if you know that Socrates is a man and that all men are mortal, you can figure out that Socrates is mortal. But how could a hunk of matter like a brain accomplish this feat? The first key idea is a representation: a physical object whose parts and arrangement correspond piece for piece to some set of ideas or facts. For example, the pattern of ink on this page

  * * *

  Socrates isa man

  * * *

  is a representation of the idea that Socrates is a man. The shape of one group of ink marks, Socrates, is a symbol that stands for the concept of Socrates. The shape of another set of ink marks, isa, stands for the concept of being an instance of, and the shape of the third, man, stands for the concept of man. Now, it is crucial to keep one thing in mind. I have put these ink marks in the shape of English words as a courtesy to you, the reader, so that you can keep them straight as we work through the example. But all that really matters is that they have different shapes. I could have used a star of David, a smiley face, and the Mercedes Benz logo, as long as I used them consistently.

  Similarly, the fact that the Socrates ink marks are to the left of the isa ink marks on the page, and the man ink marks are to the right, stands for the idea that Socrates is a man. If I change any part of the representation, like replacing isa with isasonofa, or flipping the positions of Socrates and man, we would have a representation of a different idea. Again, the left-to-right English order is just a mnemonic device for your convenience. I could have done it right-to-left or up-and-down, as long as I used that order consistently.
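The point that only distinctness and consistency of the symbol shapes matter can be sketched in a few lines of Python, assuming the "page" is a tuple of tokens. The particular glyphs below (a star, a smiley, a circled M standing in for the logo) are arbitrary choices, which is exactly the point.

```python
# The shapes of the ink marks are arbitrary; only distinctness and a
# consistent renaming matter. The same proposition is written twice here,
# once with English-like tokens and once with arbitrary glyphs.

english = ("Socrates", "isa", "man")

# A one-to-one renaming: star of David, smiley face, and a circled M
# standing in for the Mercedes Benz logo.
rename = {"Socrates": "✡", "isa": "☺", "man": "Ⓜ"}
glyphs = tuple(rename[token] for token in english)

print(glyphs)   # ('✡', '☺', 'Ⓜ')
# A processor keyed to the new shapes would treat this triple exactly as
# one keyed to the old shapes treats the original: left slot for the
# individual, middle for the relation, right for the class.
```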

  Keeping these conventions in mind, now imagine that the page has a second set of ink marks, representing the proposition that every man is mortal:

  * * *

  Socrates isa man

  Every man ismortal

  * * *

  To get reasoning to happen, we now need a processor. A processor is not a little man (so one needn’t worry about an infinite regress of homunculi inside homunculi) but something much stupider: a gadget with a fixed number of reflexes. A processor can react to different pieces of a representation and do something in response, including altering the representation or making new ones. For example, imagine a machine that can move around on a printed page. It has a cutout in the shape of the letter sequence isa, and a light sensor that can tell when the cutout is superimposed on a set of ink marks in the exact shape of the cutout. The sensor is hooked up to a little pocket copier, which can duplicate any set of ink marks, either by printing identical ink marks somewhere else on the page or by burning them into a new cutout.

  Now imagine that this sensor-copier-creeper machine is wired up with four reflexes. First, it rolls down the page, and whenever it detects some isa ink marks, it moves to the left, and copies the ink marks it finds there onto the bottom left corner of the page. Let loose on our page, it would create the following:

  * * *

  Socrates isa man

  Every man ismortal

  Socrates

  * * *

  Its second reflex, also in response to finding an isa, is to get itself to the right of that isa and copy any ink marks it finds there into the holes of a new cutout. In our case, this forces the processor to make a cutout in the shape of man. Its third reflex is to scan down the page checking for ink marks shaped like Every, and if it finds some, seeing if the ink marks to the right align with its new cutout. In our example, it finds one: the man in the middle of the second line. Its fourth reflex, upon finding such a match, is to move to the right and copy the ink marks it finds there onto the bottom center of the page. In our example, those are the ink marks ismortal. If you are following me, you’ll see that our page now looks like this:

  * * *

  Socrates isa man

  Every man ismortal

  Socrates ismortal

  * * *

  A primitive kind of reasoning has taken place. Crucially, although the gadget and the page it sits on collectively display a kind of intelligence, there is nothing in either of them that is itself intelligent. Gadget and page are just a bunch of ink marks, cutouts, photocells, lasers, and wires. What makes the whole device smart is the exact correspondence between the logician’s rule “If X is a Y and all Y’s are Z, then X is Z” and the way the device scans, moves, and prints. Logically speaking, “X is a Y” means that what is true of Y is also true of X, and mechanically speaking, X isa Y causes what is printed next to the Y to be also printed next to the X. The machine, blindly following the laws of physics, just responds to the shape of the ink marks isa (without understanding what it means to us) and copies other ink marks in a way that ends up mimicking the operation of the logical rule. What makes it “intelligent” is that the sequence of sensing and moving and copying results in its printing a representation of a conclusion that is true if and only if the page contains representations of premises that are true. If one gives the device as much paper as it needs, Turing showed, the machine can do anything that any computer can do—and perhaps, he conjectured, anything that any physically embodied mind can do.
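The walkthrough above can be condensed into a few lines of Python, with lists of tokens standing in for the ink marks and a single function standing in for the four reflexes. The function name and the list-of-lines representation are illustrative choices, not anything from the original device; the point is only that a fixed, unintelligent procedure yields the valid conclusion.

```python
# A toy stand-in for the sensor-copier-creeper machine. The "page" is a
# list of lines; each line is a list of ink-mark tokens. The four reflexes
# are condensed into one pass.

def run_reflexes(page):
    derived = []
    for line in page:
        if "isa" in line:                   # the "isa" cutout lines up
            i = line.index("isa")
            subject = line[i - 1]           # reflex 1: copy marks left of "isa"
            cutout = line[i + 1]            # reflex 2: burn marks right of "isa" into a cutout
            for other in page:              # reflex 3: scan for "Every <cutout> ..."
                if other[0] == "Every" and other[1] == cutout:
                    predicate = other[2]    # reflex 4: copy the marks to the right of the match
                    derived.append([subject, predicate])
    return page + derived

page = [["Socrates", "isa", "man"],
        ["Every", "man", "ismortal"]]

result = run_reflexes(page)
for line in result:
    print(" ".join(line))
# The last line printed is "Socrates ismortal".
```

Note that nothing in the function "understands" Socrates or mortality; it only matches and copies token shapes, just as the machine only reacts to the shapes of the ink marks.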

  Now, this example uses ink marks on paper as its representation and a copying-creeping-sensing machine as its processor. But the representation can be in any physical medium at all, as long as the patterns are used consistently. In the brain, there might be three groups of neurons, one used to represent the individual that the proposition is about (Socrates, Aristotle, Rod Stewart, and so on), one to represent the logical relationship in the proposition (is a, is not, is like, and so on), and one to represent the class or type that the individual is being categorized as (men, dogs, chickens, and so on). Each concept would correspond to the firing of a particular neuron; for example, in the first group of neurons, the fifth neuron might fire to represent Socrates and the seventeenth might fire to represent Aristotle; in the third group, the eighth neuron might fire to represent men, the twelfth neuron might fire to represent dogs. The processor might be a network of other neurons feeding into these groups, connected together in such a way that it reproduces the firing pattern in one group of neurons in some other group (for example, if the eighth neuron is firing in group 3, the processor network would turn on the eighth neuron in some fourth group, elsewhere in the brain). Or the whole thing could be done in silicon chips. But in all three cases the principles are the same. The way the elements in the processor are wired up would cause them to sense and copy pieces of a representation, and to produce new representations, in a way that mimics the rules of reasoning. With many thousands of representations and a set of somewhat more sophisticated processors (perhaps different kinds of representations and processors for different kinds of thinking), you might have a genuinely intelligent brain or computer. 
Add an eye that can detect certain contours in the world and turn on representations that symbolize them, and muscles that can act on the world whenever certain representations symbolizing goals are turned on, and you have a behaving organism (or add a TV camera and set of levers and wheels, and you have a robot).
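The neuron-group scheme just described can be sketched with one-hot vectors: each group is a list of units with exactly one "firing." The group size and the wiring function below are illustrative assumptions; the neuron indices follow the text's example.

```python
# Sketch of the three-group neural representation described above. Each
# group is a one-hot list: exactly one "firing" neuron per active concept.

N = 20  # neurons per group (illustrative size)

def one_hot(index, size=N):
    v = [0] * size
    v[index] = 1
    return v

individuals = one_hot(5)   # the fifth neuron fires: "Socrates"
relation    = one_hot(0)   # some neuron in group 2 stands for "is a"
classes     = one_hot(8)   # the eighth neuron fires: "man"

# The "processor" is just fixed wiring that reproduces group 3's firing
# pattern in a fourth group elsewhere in the brain.
def copy_pattern(source):
    return [1 if unit else 0 for unit in source]

group4 = copy_pattern(classes)
print(group4.index(1))   # 8: the same concept, re-represented downstream
```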

  This, in a nutshell, is the theory of thinking called “the physical symbol system hypothesis” or the “computational” or “representational” theory of mind. It is as fundamental to cognitive science as the cell doctrine is to biology and plate tectonics is to geology. Cognitive psychologists and neuroscientists are trying to figure out what kinds of representations and processors the brain has. But there are ground rules that must be followed at all times: no little men inside, and no peeking. The representations that one posits in the mind have to be arrangements of symbols, and the processor has to be a device with a fixed set of reflexes, period. The combination, acting all by itself, has to produce the intelligent conclusions. The theorist is forbidden to peer inside and “read” the symbols, “make sense” of them, and poke around to nudge the device in smart directions like some deus ex machina.

  Now we are in a position to pose the Whorfian question in a precise way. Remember that a representation does not have to look like English or any other language; it just has to use symbols to represent concepts, and arrangements of symbols to represent the logical relations among them, according to some consistent scheme. But though internal representations in an English speaker’s mind don’t have to look like English, they could, in principle, look like English—or like whatever language the person happens to speak. So here is the question: Do they in fact? For example, if we know that Socrates is a man, is it because we have neural patterns that correspond one-to-one to the English words Socrates, is, a, and man, and groups of neurons in the brain that correspond to the subject of an English sentence, the verb, and the object, laid out in that order? Or do we use some other code for representing concepts and their relations in our heads, a language of thought or mentalese that is not the same as any of the world’s languages? We can answer this question by seeing whether English sentences embody the information that a processor would need to perform valid sequences of reasoning—without requiring any fully intelligent homunculus inside doing the “understanding.”