The Texas researchers applied a classical learning experiment to their simulation and compared the results to many similar experiments on actual human conditioning. In the human studies, the task involved associating an auditory tone with a puff of air applied to the eyelid, which causes the eyelid to close. If the puff of air and the tone are presented together for one hundred to two hundred trials, the subject will learn the association and close the eye upon merely hearing the tone. If the tone is then presented many times without the air puff, the subject ultimately learns to disassociate the two stimuli (to “extinguish” the response), so the learning is bidirectional. After tuning a variety of parameters, the simulation provided a reasonable match to experimental results on human and animal cerebellar conditioning. Interestingly, the researchers found that if they created simulated cerebellar lesions (by removing portions of the simulated cerebellar network), they got results similar to those obtained in experiments on rabbits that had received actual cerebellar lesions.86
Because this large region of the brain is relatively uniform and its interneuronal wiring comparatively simple, its input-output transformations are well understood compared with those of other brain regions. Although the relevant equations still require refinement, this bottom-up simulation has proved quite impressive.
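To make the conditioning dynamic concrete, here is a minimal associative-learning sketch in Python. It is not the Texas group’s cerebellar network, just a single error-driven weight; the learning rate and trial counts are assumptions chosen for illustration.

```python
# A minimal associative-learning sketch (not the Texas group's cerebellar
# network): a single tone->blink weight strengthens when the tone is paired
# with the air puff and weakens when the tone appears alone.
# The learning rate and trial counts are illustrative assumptions.

def run_trials(weight, n_trials, puff_present, rate=0.05):
    """Error-driven update of the tone->blink association over repeated trials."""
    for _ in range(n_trials):
        target = 1.0 if puff_present else 0.0   # is the puff paired with the tone?
        weight += rate * (target - weight)      # move toward the observed outcome
    return weight

w = 0.0
w = run_trials(w, 150, puff_present=True)    # ~100-200 paired trials: acquisition
print(f"after pairing:    blink strength ~ {w:.2f}")
w = run_trials(w, 150, puff_present=False)   # tone alone: extinction
print(f"after extinction: blink strength ~ {w:.2f}")
```

Over roughly 150 paired trials the association saturates near full strength, and an equal run of tone-alone trials drives it back toward zero, mirroring the acquisition and extinction curves described above.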
Another Example: Watts’s Model of the Auditory Regions
I believe that the way to create a brain-like intelligence is to build a real-time working model system, accurate in sufficient detail to express the essence of each computation that is being performed, and verify its correct operation against measurements of the real system. The model must run in real-time so that we will be forced to deal with inconvenient and complex real-world inputs that we might not otherwise think to present to it. The model must operate at sufficient resolution to be comparable to the real system, so that we build the right intuitions about what information is represented at each stage. Following Mead,87 the model development necessarily begins at the boundaries of the system (i.e., the sensors) where the real system is well-understood, and then can advance into the less-understood regions. . . . In this way, the model can contribute fundamentally to our advancing understanding of the system, rather than simply mirroring the existing understanding. In the context of such great complexity, it is possible that the only practical way to understand the real system is to build a working model, from the sensors inward, building on our newly enabled ability to visualize the complexity of the system as we advance into it. Such an approach could be called reverse-engineering of the brain. . . . Note that I am not advocating a blind copying of structures whose purpose we don’t understand, like the legendary Icarus who naively attempted to build wings out of feathers and wax. Rather, I am advocating that we respect the complexity and richness that is already well-understood at low levels, before proceeding to higher levels.
—LLOYD WATTS88
A major example of neuromorphic modeling of a region of the brain is the comprehensive replica of a significant portion of the human auditory-processing system developed by Lloyd Watts and his colleagues.89 It is based on neurobiological studies of specific neuron types as well as on information regarding interneuronal connections. The model, which has many of the same properties as human hearing and can locate and identify sounds, has five parallel paths for processing auditory information and includes the actual intermediate representations of this information at each stage of neural processing. Watts has implemented his model as real-time computer software which, though a work in progress, illustrates the feasibility of converting neurobiological models and brain connection data into working simulations. The software is not based on reproducing each individual neuron and connection, as is the cerebellum model described above, but rather on the transformations performed by each region.
Watts’s software is capable of matching the intricacies that have been revealed in subtle experiments on human hearing and auditory discrimination. Watts has used his model as a preprocessor (front end) in speech-recognition systems and has demonstrated its ability to pick out one speaker from background sounds (the “cocktail party effect”). This is an impressive feat, one that humans are capable of but that until now had not been feasible in automated speech-recognition systems.90
Like human hearing, Watts’s cochlea model is endowed with spectral sensitivity (we hear better at certain frequencies), temporal responses (we are sensitive to the timing of sounds, which creates the sensation of their spatial locations), masking, nonlinear frequency-dependent amplitude compression (which allows for greater dynamic range—the ability to hear both loud and quiet sounds), gain control (amplification), and other subtle features. The results it obtains are directly verifiable by biological and psychophysical data.
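As a rough illustration of two of these properties, the sketch below band-pass filters one frequency channel and then compresses its amplitude with a power law. It is not Watts’s implementation; the filter order, channel edges, and compression exponent are assumptions made only for illustration.

```python
# A rough illustration (not Watts's implementation) of two cochlear properties
# listed above: frequency-dependent band-pass filtering and compressive,
# level-dependent amplitude scaling. All parameters are illustrative assumptions.

import numpy as np
from scipy.signal import butter, lfilter

def cochlear_channel(signal, fs, low_hz, high_hz, exponent=0.3):
    """Band-pass one frequency channel, then compress its amplitude."""
    b, a = butter(2, [low_hz / (fs / 2), high_hz / (fs / 2)], btype="band")
    band = lfilter(b, a, signal)                      # spectral sensitivity
    return np.sign(band) * np.abs(band) ** exponent   # nonlinear compression

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
quiet = cochlear_channel(0.01 * np.sin(2 * np.pi * 1000 * t), fs, 800, 1200)
loud = cochlear_channel(1.0 * np.sin(2 * np.pi * 1000 * t), fs, 800, 1200)
print(np.max(np.abs(quiet)), np.max(np.abs(loud)))    # far less than 100x apart
```

With a compression exponent of 0.3, an input one hundred times quieter emerges only about four times smaller, which is the dynamic-range benefit described above.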
The next segment of the model is the cochlear nucleus, which Yale University professor of neuroscience and neurobiology Gordon M. Shepherd91 has described as “one of the best understood regions of the brain.”92 Watts’s simulation of the cochlear nucleus is based on work by E. Young that describes in detail “the essential cell types responsible for detecting spectral energy, broadband transients, fine tuning in spectral channels, enhancing sensitivity to temporal envelope in spectral channels, and spectral edges and notches, all while adjusting gain for optimum sensitivity within the limited dynamic range of the spiking neural code.”93
The Watts model captures many other details, such as the interaural time difference (ITD) computed by the medial superior olive cells.94 It also represents the interaural level difference (ILD) computed by the lateral superior olive cells and normalizations and adjustments made by the inferior colliculus cells.95
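The two binaural cues can be illustrated with a toy calculation: the interaural time difference as the lag that best aligns the two ear signals, and the interaural level difference as an energy ratio in decibels. This is not Watts’s model; the sample rate and test signals are assumptions.

```python
# A toy calculation (not Watts's model) of the two binaural cues mentioned:
# ITD as the lag that best aligns the two ear signals, ILD as an energy ratio.
# The sample rate and test signals are assumptions chosen for illustration.

import numpy as np

def itd_seconds(left, right, fs):
    """Lag (in seconds) at which the left signal best matches the right signal."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    return lag / fs

def ild_db(left, right):
    """Interaural level difference in decibels."""
    return 10 * np.log10(np.sum(left**2) / np.sum(right**2))

fs = 44100
t = np.arange(0, 0.01, 1 / fs)
source = np.sin(2 * np.pi * 500 * t)
left = np.roll(source, 20)     # reaches the left ear about 0.45 ms later
right = 0.7 * source           # and arrives quieter at the right ear
print(itd_seconds(left, right, fs), ild_db(left, right))
```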
Reverse engineering the human brain: Five parallel auditory pathways.96
The Visual System
We’ve made enough progress in understanding the coding of visual information that experimental retina implants have been developed and surgically installed in patients.97 However, because of the relative complexity of the visual system, our understanding of the processing of visual information lags behind our knowledge of the auditory regions. We have preliminary models of the transformations performed by two visual areas (called V1 and MT), although not at the individual neuron level. There are thirty-six other visual areas, and we will need to be able to scan these deeper regions at very high resolution or place precise sensors to ascertain their functions.
A pioneer in understanding visual processing is MIT’s Tomaso Poggio, who has distinguished its two tasks as identification and categorization.98 The former is relatively easy to understand, according to Poggio, and we have already designed experimental and commercial systems that are reasonably successful in identifying faces.99 These are used as part of security systems to control entry of personnel and in bank machines. Categorization—the ability to differentiate, for example, between a person and a car or between a dog and a cat—is a more complex matter, although recently progress has been made.100
Early (in terms of evolution) layers of the visual system are largely a feedforward (lacking feedback) system in which increasingly sophisticated features are detected. Poggio and Maximilian Riesenhuber write that “single neurons in the macaque posterior inferotemporal cortex may be tuned to . . . a dictionary of thousands of complex shapes.” Evidence that this early stage of recognition relies on feedforward processing includes MEG studies showing that the human visual system takes about 150 milliseconds to detect an object. This matches the latency of feature-detection cells in the inferotemporal cortex, so there does not appear to be time for feedback to play a role in these early decisions.
Recent experiments have used a hierarchical approach in which features detected at one layer are passed on for analysis by later layers of the system.101 Studies on macaque monkeys show that neurons in the inferotemporal cortex respond to complex features of objects on which the animals have been trained. While most of the neurons respond only to a particular view of the object, some are able to respond regardless of perspective. Other research on the visual system of the macaque monkey includes studies on many specific types of cells, connectivity patterns, and high-level descriptions of information flow.102
Extensive literature supports the use of what I call “hypothesis and test” in more complex pattern-recognition tasks. The cortex makes a guess about what it is seeing and then determines whether the features of what is actually in the field of view match its hypothesis.103 We are often more focused on the hypothesis than the actual test, which explains why people often see and hear what they expect to perceive rather than what is actually there. “Hypothesis and test” is also a useful strategy in our computer-based pattern-recognition systems.
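A toy sketch can make the strategy concrete: propose the most likely interpretation first, then test its expected features against what is actually observed, falling back to other hypotheses only when the test fails. The feature sets and matching threshold below are invented purely for illustration.

```python
# A toy "hypothesis and test" recognizer: check the expected hypothesis first,
# then alternatives. The categories, feature sets, and threshold are invented.

expected = {"cat": {"whiskers", "fur", "pointed ears"},
            "car": {"wheels", "windshield", "metal body"}}

def recognize(observed_features, prior_guess="cat", match_threshold=0.6):
    hypotheses = [prior_guess] + [h for h in expected if h != prior_guess]
    for label in hypotheses:                          # test the expectation first
        overlap = len(expected[label] & observed_features) / len(expected[label])
        if overlap >= match_threshold:
            return label
    return "unknown"

print(recognize({"fur", "whiskers", "tail"}))         # hypothesis confirmed: cat
print(recognize({"wheels", "windshield"}))            # prior fails, test next: car
```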
Although we have the illusion of receiving high-resolution images from our eyes, what the optic nerve actually sends to the brain is just outlines and clues about points of interest in our visual field. We then essentially hallucinate the world from cortical memories that interpret a series of extremely low-resolution movies that arrive in parallel channels. In a 2001 study published in Nature, Frank S. Werblin, professor of molecular and cell biology at the University of California at Berkeley, and doctoral student Botond Roska, M.D., showed that the optic nerve carries ten to twelve output channels, each of which carries only minimal information about a given scene.104 One group of what are called ganglion cells sends information only about edges (changes in contrast). Another group detects only large areas of uniform color, whereas a third group is sensitive only to the backgrounds behind figures of interest.
Seven of the dozen separate movies that the eye extracts from a scene and sends to the brain.
“Even though we think we see the world so fully, what we are receiving is really just hints, edges in space and time,” says Werblin. “These 12 pictures of the world constitute all the information we will ever have about what’s out there, and from these 12 pictures, which are so sparse, we reconstruct the richness of the visual world. I’m curious how nature selected these 12 simple movies and how it can be that they are sufficient to provide us with all the information we seem to need.” Such findings promise to be a major advance in developing an artificial system that could replace the eye, retina, and early optic-nerve processing.
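Two of the channels described above can be approximated crudely in a few lines of code: an edge channel that responds only where contrast changes and a uniform-area channel that responds to evenly lit regions. This is a sketch, not Werblin and Roska’s analysis; the test image and filter sizes are assumptions.

```python
# A crude approximation (not Werblin and Roska's analysis) of two ganglion-cell
# "movies": an edge channel that responds only where contrast changes, and a
# uniform-area channel that responds to evenly lit regions.
# The test image and filter sizes are illustrative assumptions.

import numpy as np
from scipy.ndimage import laplace, uniform_filter

image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0                  # a bright square on a dark background

edge_channel = np.abs(laplace(image))      # nonzero only along the square's border
area_channel = uniform_filter(image, 9)    # strong inside the large uniform square

print(edge_channel[16, 32], edge_channel[32, 32])   # edge vs. interior
print(area_channel[32, 32], area_channel[0, 0])     # interior vs. dark background
```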
In chapter 3, I mentioned the work of robotics pioneer Hans Moravec, who has been reverse engineering the image processing done by the retina and early visual-processing regions in the brain. For more than thirty years Moravec has been constructing systems to emulate the ability of our visual system to build representations of the world. It has only been recently that sufficient processing power has been available in microprocessors to replicate this human-level feature detection, and Moravec is applying his computer simulations to a new generation of robots that can navigate unplanned, complex environments with human-level vision.105
Carver Mead has been pioneering the use of special neural chips that utilize transistors in their native analog mode, which can provide very efficient emulation of the analog nature of neural processing. Mead has demonstrated a chip that performs the functions of the retina and early transformations in the optic nerve using this approach.106
A special type of visual recognition is detecting motion, one of the focus areas of the Max Planck Institute of Biology in Tübingen, Germany. The basic research model is simple: compare the signal at one receptor with a time-delayed signal at the adjacent receptor.107 This model works for certain speeds but leads to the surprising result that above a certain speed, increases in the velocity of an observed object will decrease the response of this motion detector. Experimental results on animals (based on behavior and analysis of neuronal outputs) and humans (based on reported perceptions) have closely matched the model.
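The delay-and-compare scheme is easy to sketch. The toy correlator below delays one receptor’s signal and multiplies it by its neighbor’s current signal; the receptor spacing, grating wavelength, and internal delay are illustrative assumptions rather than the Max Planck group’s parameters.

```python
# A toy delay-and-compare motion correlator in the spirit of the model described
# above; spacing, wavelength, and delay values are illustrative assumptions.

import numpy as np

def correlator_response(speed, spacing=0.01, wavelength=0.1, delay=0.02, fs=1000.0):
    """Multiply one receptor's delayed signal by its neighbor's current signal."""
    t = np.arange(0, 2, 1 / fs)
    freq = speed / wavelength                        # temporal frequency at each receptor
    left_delayed = np.sin(2 * np.pi * freq * (t - delay))
    right = np.sin(2 * np.pi * freq * t - 2 * np.pi * spacing / wavelength)
    return np.mean(left_delayed * right)

# Response rises up to the speed matching the built-in delay, then falls off.
for speed in (0.2, 0.5, 1.0, 2.0):
    print(speed, round(correlator_response(speed), 3))
```

The response peaks at the speed for which the stimulus takes exactly the built-in delay to travel from one receptor to the next, and it falls off as the object moves faster, matching the counterintuitive behavior noted above.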
Other Works in Progress: An Artificial Hippocampus and an Artificial Olivocerebellar Region
The hippocampus is vital for learning new information and long-term storage of memories. Ted Berger and his colleagues at the University of Southern California mapped the signal patterns of this region by stimulating slices of rat hippocampus with electrical signals millions of times to determine which input produced a corresponding output.108 They then developed a real-time mathematical model of the transformations performed by layers of the hippocampus and programmed the model onto a chip.109 Their plan is to test the chip in animals by first disabling the corresponding hippocampus region, noting the resulting memory failure, and then determining whether that mental function can be restored by installing their hippocampal chip in place of the disabled region.
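The general strategy (treat the region as a black box, probe it with many inputs, record the outputs, and fit a function that replays the transformation in real time) can be sketched in a few lines. The “tissue response” below is an invented stand-in, not Berger’s model.

```python
# A minimal sketch of the general strategy (not Berger's actual model): probe
# a black-box region with many inputs, record its outputs, and fit a function
# that replays the transformation. The "tissue response" is an invented stand-in.

import numpy as np

rng = np.random.default_rng(0)
stimulus = rng.uniform(0.0, 1.0, 5000)          # thousands of probe amplitudes
response = np.tanh(2.5 * stimulus - 1.0)        # stand-in for the measured output

coeffs = np.polyfit(stimulus, response, deg=5)  # fit a surrogate transformation
surrogate = np.poly1d(coeffs)                   # cheap enough to run in real time

probe = 0.7
print(surrogate(probe), np.tanh(2.5 * probe - 1.0))   # surrogate vs. "real" output
```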
Ultimately, this approach could be used to replace the hippocampus in patients affected by strokes, epilepsy, or Alzheimer’s disease. The chip would be located on a patient’s skull, rather than inside the brain, and would communicate with the brain via two arrays of electrodes, placed on either side of the damaged hippocampal section. One would record the electrical activity coming from the rest of the brain, while the other would send the necessary instructions back to the brain.
Another brain region being modeled and simulated is the olivocerebellar region, which is responsible for balance and coordinating the movement of limbs. The goal of the international research group involved in this effort is to apply their artificial olivocerebellar circuit to military robots as well as to robots that could assist the disabled.110 One of their reasons for selecting this particular brain region was that “it’s present in all vertebrates—it’s very much the same from the most simple to the most complex brains,” explains Rodolfo Llinas, one of the researchers and a neuroscientist at New York University Medical School. “The assumption is that it is conserved [in evolution] because it embodies a very intelligent solution. As the system is involved in motor coordination—and we want to have a machine that has sophisticated motor control—then the choice [of the circuit to mimic] was easy.”
One of the unique aspects of their simulator is that it uses analog circuits. Similar to Mead’s pioneering work on analog emulation of brain regions, the researchers found substantially greater performance with far fewer components by using transistors in their native analog mode.
One of the team’s researchers, Ferdinando Mussa-Ivaldi, a neuroscientist at Northwestern University, commented on the applications of an artificial olivocerebellar circuit for the disabled: “Think of a paralyzed patient. It is possible to imagine that many ordinary tasks—such as getting a glass of water, dressing, undressing, transferring to a wheelchair—could be carried out by robotic assistants, thus providing the patient with more independence.”
Understanding Higher-Level Functions: Imitation, Prediction, and Emotion
Operations of thought are like cavalry charges in a battle—they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.
—ALFRED NORTH WHITEHEAD
But the big feature of human-level intelligence is not what it does when it works but what it does when it’s stuck.
—MARVIN MINSKY
If love is the answer, could you please rephrase the question?
—LILY TOMLIN
Because it sits at the top of the neural hierarchy, the cerebral cortex is the part of the brain that is least well understood. This region, which consists of six thin layers in the outermost areas of the cerebral hemispheres, contains billions of neurons. According to Thomas M. Bartol Jr. of the Computational Neurobiology Laboratory of the Salk Institute for Biological Studies, “A single cubic millimeter of cerebral cortex may contain on the order of 5 billion . . . synapses of different shapes and sizes.” The cortex is responsible for perception, planning, decision making, and most of what we regard as conscious thinking.
Our ability to use language, another unique attribute of our species, appears to be located in this region. An intriguing hint about the origin of language and a key evolutionary change that enabled the formation of this distinguishing skill is the observation that only a few primates, including humans and monkeys, are able to use an (actual) mirror to master skills. Theorists Giacomo Rizzolatti and Michael Arbib hypothesized that language emerged from manual gestures (which monkeys—and, of course, humans—are capable of). Performing manual gestures requires the ability to mentally correlate the performance and observation of one’s own hand movements.111 Their “mirror system hypothesis” is that the key to the evolution of language is a property called “parity,” which is the understanding that the gesture (or utterance) has the same meaning for the party making the gesture as for the party receiving it; that is, the understanding that what you see in a mirror is the same (although reversed left-to-right) as what is seen by someone else watching you. Other animals are unable to understand the image in a mirror in this fashion, and it is believed that they are missing this key ability to deploy parity.
A closely related concept is that the ability to imitate the movements (or, in the case of human babies, vocal sounds) of others is critical to developing language.112 Imitation requires the ability to break down an observed presentation into parts, each of which can then be mastered through recursive and iterative refinement.