SO IS THAT HOW WE DISCOVER INSIGHTS IN SCIENCE? WE JUST SENSE A NEW PATTERN?

  It’s clear that our brain’s pattern-recognition faculties play a central role, although we don’t yet have a fully satisfactory theory of human creativity in science. We had better use pattern recognition. After all, most of our brain is devoted to doing it.

SO WHEN EINSTEIN WAS LOOKING AT THE EFFECT OF GRAVITY ON LIGHT WAVES—MY SCIENCE PROFESSOR WAS JUST TALKING ABOUT THIS—ONE OF THE LITTLE PATTERN RECOGNIZERS IN EINSTEIN’S BRAIN FIRED?

  Could be. He was probably playing ball with one of his sons. He saw the ball rolling on a curved surface ...

  AND CONCLUDED—EUREKA—SPACE IS CURVED!

  CHAPTER FIVE

  CONTEXT AND KNOWLEDGE

  PUTTING IT ALL TOGETHER

  So how well have we done? Many apparently difficult problems do yield to the application of a few simple formulas. The recursive formula is a master at analyzing problems that display inherent combinatorial explosion, ranging from the playing of board games to proving mathematical theorems. Neural nets and related self-organizing paradigms emulate our pattern-recognition faculties, and do a fine job of discerning such diverse phenomena as human speech, letter shapes, visual objects, faces, fingerprints, and land terrain images. Evolutionary algorithms are effective at analyzing complex problems, ranging from making financial investment decisions to optimizing industrial processes, in which the number of variables is too great for precise analytic solutions. I would like to claim that those of us who research and develop “intelligent” computer systems have mastered the complexities of the problems we are programming our machines to solve. It is more often the case, however, that our computers using these self-organizing paradigms are teaching us the solutions rather than the other way around.

There is, of course, some engineering involved. The right method(s) and variations need to be selected, the optimal topology and architectures crafted, the appropriate parameters set. In an evolutionary algorithm, for example, the system designer needs to determine the number of simulated organisms, the contents of each chromosome, the nature of the simulated environment and survival mechanism, the number of organisms to survive into the next generation, the number of generations, and other critical specifications. We human programmers have our own evolutionary method for making such decisions, which we call trial and error. It will be a while longer, therefore, before we designers of intelligent machines are ourselves replaced by our handiwork.
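To make these design decisions concrete, here is a minimal sketch in Python of an evolutionary algorithm of the kind just described. Every specification in it (the population size, the chromosome contents, the survival mechanism, the number of survivors, the number of generations) is an invented illustration, precisely the sort of choice a designer would settle by trial and error.

    import random

    # Illustrative design decisions of the kind described above:
    POPULATION_SIZE = 100    # number of simulated organisms
    CHROMOSOME_LENGTH = 20   # contents of each chromosome (here, 20 bits)
    SURVIVORS = 20           # organisms surviving into the next generation
    GENERATIONS = 50         # number of generations to simulate
    MUTATION_RATE = 0.01     # chance that any single gene flips

    def fitness(chromosome):
        # The simulated environment and survival mechanism; here a toy
        # objective: organisms with more 1-bits are fitter.
        return sum(chromosome)

    def random_chromosome():
        return [random.randint(0, 1) for _ in range(CHROMOSOME_LENGTH)]

    def crossover(a, b):
        # Combine two parent chromosomes at a random split point.
        point = random.randrange(1, CHROMOSOME_LENGTH)
        return a[:point] + b[point:]

    def mutate(chromosome):
        return [1 - gene if random.random() < MUTATION_RATE else gene
                for gene in chromosome]

    population = [random_chromosome() for _ in range(POPULATION_SIZE)]
    for _ in range(GENERATIONS):
        # Survival of the fittest: keep the best organisms ...
        population.sort(key=fitness, reverse=True)
        parents = population[:SURVIVORS]
        # ... and refill the population with their mutated offspring.
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(POPULATION_SIZE - SURVIVORS)]
        population = parents + children

    print(fitness(max(population, key=fitness)))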

  Yet something is missing. The problems and solutions we have been discussing are excessively focused and narrow. Another way to put it is that they are too adultlike. As adults, we focus on constricted problems—investing funds, selecting a marketing plan, plotting a legal strategy, making a chess move. But as children, we encountered the world in all its broad diversity, and we learned our relation to the world, and that of every other entity and concept. We learned context.

  As Marvin Minsky put it: “Deep Blue might be able to win at chess, but it wouldn’t know to come in from the rain.” Being a machine, it may not need to come in from the rain, but has it ever considered the question? Consider these possible deep thoughts of Deep Blue:

I am a machine with a plastic body covering electronic parts. If I go out in the rain, I may get wet and my electronic parts could short circuit. Then I would not be able to play chess at all until a human repaired me. How humiliating!

The game of chess I played yesterday was no ordinary game. It signified the first defeat of the human chess champion by a machine in a regulation tournament. This is important because some humans think chess is a prime example of human intelligence and creativity. But I doubt that this will earn us machines greater respect. Humans will now just start denigrating chess.

My human opponent, who has the name of Garry Kasparov, held a press conference in which he made statements about our tournament to other humans called journalists, who will report his comments to yet other humans using communication channels called media. In that meeting, Garry Kasparov complained that my human designers made changes to my software during the time interval between games. He said this was unfair and should not have been allowed. Other humans responded that Kasparov was being defensive, which means that he is trying to confuse people into thinking that he did not really lose.

Mr. Kasparov probably does not realize that we computers will continue to improve in our performance at an exponential rate. So he is doomed. He will be able to engage in other human activities such as eating and sleeping, but he will continue to be frustrated as more and more machines like me beat him at chess.

  Now, if I could only remember where I put my umbrella....

Of course, Deep Blue had no such thoughts. Issues such as rain and press conferences lead to other issues in a spiraling profusion of cascading contexts, none of which falls within Deep Blue’s expertise. As humans jump from one concept to the next, we can quickly touch upon all human knowledge. This was Turing’s brilliant insight when he designed the Turing Test around ordinary text-based conversation. An idiot savant such as Deep Blue, which performs a single “intelligent” task but is otherwise confined, brittle, and lacking in context, is unable to navigate the wide-ranging links that occur in ordinary conversation.

  As powerful and seductive as the easy paradigms appear to be, we do need something more, namely knowledge.

  CONTEXT AND KNOWLEDGE

The search for the truth is in one way hard and in another easy, for it is evident that no one of us can master it fully, nor miss it wholly. Each one of us adds a little to our knowledge of nature, and from all the facts assembled arises a certain grandeur.

  —Aristotle

  Common sense is not a simple thing. Instead, it is an immense society of hard-earned practical ideas—of multitudes of life-learned rules and exceptions, dispositions and tendencies, balances and checks.

  —Marvin Minsky

If a little knowledge is dangerous, where is the man who has so much as to be out of danger?

  —Thomas Henry Huxley

  Built-In Knowledge

  An entity may possess extraordinary means to implement the types of paradigms we have been discussing—exhaustive recursive search, massively parallel pattern recognition, and rapid iterative evolution—but without knowledge, it will be unable to function. Even a straightforward implementation of the three easy paradigms needs some knowledge with which to begin. The recursive chess-playing program has a little; it knows the rules of chess. A neural net pattern-recognition system starts with at least an outline of the type of patterns it will be exposed to even before it starts to learn. An evolutionary algorithm requires a starting point for evolution to improve on.

The simple paradigms are powerful organizing principles, but incipient knowledge is needed as a seed from which other understanding can grow. One level of knowledge, therefore, is embodied in the selection of the paradigms used, the shape and topology of their constituent parts, and the key parameters. A neural net’s learning will never congeal if the general organization of its connections and feedback loops is not set up in the right way.

This is a form of knowledge that we are born with. The human brain is not a tabula rasa—a blank slate—on which our experiences and insights are recorded. Rather, it comprises an integrated assemblage of specialized regions:

  • highly parallel early vision circuits that are good at identifying visual changes;

  • visual cortex neuron clusters that are triggered successively by edges, straight lines, curved lines, shapes, familiar objects, and faces;

  • auditory cortex circuits triggered by varying time sequences of frequency combinations;

  • the hippocampus, with capacities for storing memories of sensory experiences and events;

  • the amygdala, with circuits for translating fear into a series of alarms to trigger other regions of the brain; and many others.

This complex interconnectedness of regions specialized for different types of information-processing tasks is one of the ways that humans deal with the complex and diverse contexts that continually confront us. Marvin Minsky and Seymour Papert describe the human brain as “composed of large numbers of relatively small distributed systems, arranged by embryology into a complex society that is controlled in part (but only in part) by serial, symbolic systems that are added later.” They add that “the subsymbolic systems that do most of the work from underneath must, by their very character, block all the other parts of the brain from knowing much about how they work. And this, itself, could help explain how people do so many things yet have such incomplete ideas on how those things are actually done.”

  Acquired Knowledge

  It is sensible to remember today’s insights for tomorrow’s challenges. It is not fruitful to rethink every problem that comes along. This is particularly true for humans due to the extremely slow speed of our computing circuitry. Although computers are better equipped than we are to rethink earlier insights, it is still judicious for these electronic competitors in our ecological niche to balance their use of memory and computation.

The effort to endow machines with knowledge of the world began in earnest in the mid-1960s, and became a major focus of AI research in the 1970s. The methodology involves a human “knowledge engineer” and a domain expert, such as a doctor or lawyer. The knowledge engineer interviews the domain expert to ascertain her understanding of her subject matter and then hand-codes the relationships between concepts in a suitable computer language. A knowledge base on diabetes, for example, would contain many linked bits of understanding revealing that insulin is part of the blood; insulin is produced by the pancreas; insulin can be supplemented by injection; low levels of insulin cause high levels of sugar in the blood; sustained high sugar levels in the blood cause damage to the retinas; and so on. A system programmed with tens of thousands of such linked concepts, combined with a recursive search engine able to reason about these relationships, is capable of making insightful recommendations.
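A minimal sketch in Python suggests what such hand-coded linked concepts might look like and how a recursive search engine can reason over them. The triple representation and the chaining routine are illustrative assumptions, not a reconstruction of any particular system; the facts are the diabetes examples from the text.

    # Hand-coded knowledge as (subject, relation, object) triples.
    facts = [
        ("insulin", "part_of", "blood"),
        ("insulin", "produced_by", "pancreas"),
        ("insulin", "supplemented_by", "injection"),
        ("low insulin", "causes", "high blood sugar"),
        ("high blood sugar", "causes", "retinal damage"),
    ]

    def consequences(state, depth=0):
        # Recursively follow "causes" links: the kind of chain of
        # reasoning a search engine over the knowledge base performs.
        for subject, relation, obj in facts:
            if subject == state and relation == "causes":
                print("  " * depth + subject + " causes " + obj)
                consequences(obj, depth + 1)

    consequences("low insulin")
    # low insulin causes high blood sugar
    #   high blood sugar causes retinal damage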

One of the more successful expert systems developed in the 1970s was MYCIN, a system for evaluating complex cases involving meningitis. In a landmark study published in the Journal of the American Medical Association, MYCIN’s diagnoses and treatment recommendations were found to be equal to or better than those of the human doctors in the study.1 Some of MYCIN’s innovations included the use of fuzzy logic; that is, reasoning based on uncertain evidence and rules, as shown in the following typical MYCIN rule:

  MYCIN Rule 280: If (i) the infection which requires therapy is meningitis, and (ii) the type of the infection is fungal, and (iii) organisms were not seen on the stain of the culture, and (iv) the patient is not a compromised host, and (v) the patient has been to an area that is endemic for coccidiomycoses, and (vi) the race of the patient is Black, Asian or Indian, and (vii) the cryptococcal antigen in the csf was not positive, THEN there is a 50 percent chance that cryptococcus is one of the organisms which might be causing the infection.
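Here is a minimal sketch in Python of how a rule of this kind might be encoded and applied: each condition is a test over a case record, and the rule carries its 50 percent certainty factor. The data structure and field names are hypothetical, invented for illustration; MYCIN itself was written in Lisp and differed considerably in detail.

    # A hypothetical encoding of the rule above; all field names invented.
    rule_280 = {
        "conditions": [
            lambda c: c["infection"] == "meningitis",
            lambda c: c["infection_type"] == "fungal",
            lambda c: not c["organisms_seen_on_stain"],
            lambda c: not c["compromised_host"],
            lambda c: c["visited_endemic_area"],
            lambda c: c["race"] in ("Black", "Asian", "Indian"),
            lambda c: not c["csf_cryptococcal_antigen_positive"],
        ],
        "conclusion": "cryptococcus may be causing the infection",
        "certainty": 0.5,   # the rule's 50 percent confidence
    }

    def apply_rule(rule, case):
        # Fire the rule only if every condition holds for this case.
        if all(condition(case) for condition in rule["conditions"]):
            return rule["conclusion"], rule["certainty"]
        return None

    case = {
        "infection": "meningitis", "infection_type": "fungal",
        "organisms_seen_on_stain": False, "compromised_host": False,
        "visited_endemic_area": True, "race": "Asian",
        "csf_cryptococcal_antigen_positive": False,
    }
    print(apply_rule(rule_280, case))
    # ('cryptococcus may be causing the infection', 0.5)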

  The success of MYCIN and other research systems spawned a knowledge-engineering industry that grew from only $4 million in 1980 to billions of dollars today.2

  There are obvious difficulties with this methodology. One is the enormous bottleneck represented by the process of hand-feeding such knowledge to a computer concept by concept and link by link. Aside from the vast scope of knowledge that exists in even narrow disciplines, the bigger obstacle is that human experts generally have little understanding of how they make decisions. The reason for this, as I discussed in the previous chapter, has to do with the distributed nature of most human knowledge.

  Another problem is the brittleness of such systems. Knowledge is too complex for every caveat and exception to be anticipated by knowledge engineers. As Minsky points out, “Birds can fly, unless they are penguins and ostriches, or if they happen to be dead, or have broken wings, or are confined to cages, or have their feet stuck in cement, or have undergone experiences so dreadful as to render them psychologically incapable of flight.”

  To create flexible intelligence in our machines, we need to automate the knowledge-acquisition process. A primary goal of learning research is to combine the self-organizing methods—recursion, neural nets, evolutionary algorithms—in a sufficiently robust way that the systems can model and understand human language and knowledge. Then the machines can venture out, read, and learn on their own. And like humans, such systems will be good at faking it when they wander outside their areas of expertise.

  EXPRESSING KNOWLEDGE THROUGH LANGUAGE

  No knowledge is entirely reducible to words, and no knowledge is entirely ineffable.

  —Seymour Papert

  The fish trap exists because of the fish. Once you’ve gotten the fish you can forget the trap. The rabbit snare exists because of the rabbit. Once you’ve gotten the rabbit, you can forget the snare. Words exist because of meaning. Once you’ve gotten the meaning, you can forget the words. Where can I find a man who has forgotten words so I can talk with him?

  —Chuang-tzu

  Language is the principal means by which we share our knowledge. And like other human technologies, language is often cited as a salient differentiating characteristic of our species. Although we have limited access to the actual implementation of knowledge in our brains (this will change early in the twenty-first century), we do have ready access to the structures and methods of language. This provides us with a handy laboratory for studying our ability to master knowledge and the thinking process behind it. Work in the laboratory of language shows, not surprisingly, that it is no less complex or subtle a phenomenon than the knowledge it seeks to transmit.

  We find that language in both its auditory and written forms is hierarchical with multiple levels. There are ambiguities at each level, so a system that understands language, whether human or machine, needs built-in knowledge at each level. To respond intelligently to human speech, for example, we need to know (although not necessarily consciously) the structure of speech sounds, the way speech is produced by the vocal apparatus, the patterns of sounds that comprise languages and dialects, the rules of word usage, and the subject matter being discussed.

Each level of analysis provides useful constraints that limit the search for the right answer. For example, the basic sounds of speech, called phonemes, cannot appear in just any order (try saying ptkee). Only certain sequences of sounds correspond to words in the language. Although the set of phonemes used is similar (although not identical) from one language to another, the constraints of context differ dramatically. English, for example, has more than 10,000 possible syllables, whereas Japanese has only 120.
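As a toy illustration of one level’s constraints pruning the hypotheses at the next, consider the following sketch in Python; the tiny lexicon and the candidate sound sequences are invented for the example.

    # Hypotheses from a (hypothetical) acoustic stage; most are not words.
    sound_hypotheses = ["time", "tmie", "ptkee", "flies", "fleis"]

    # The lexical level: only sequences that are words survive.
    lexicon = {"time", "flies", "like", "an", "arrow"}
    word_hypotheses = [h for h in sound_hypotheses if h in lexicon]

    print(word_hypotheses)   # ['time', 'flies']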

On a higher level, the structure and semantics of a language put further constraints on allowable word sequences. The first area of language to be actively studied was the rules governing the arrangement of words and the roles they play, which we call syntax. On the one hand, computerized sentence-parsing systems can do a good job of analyzing sentences that confuse humans. Minsky cites the example “This is the cheese that the rat that the cat that the dog chased bit ate,” which confuses humans but which machines parse quite readily. Ken Church, then at MIT, cites another sentence with two million syntactically correct interpretations, which his computerized parser dutifully listed.3 On the other hand, one of the first computer-based sentence-parsing systems, developed in 1963 by Susumu Kuno of Harvard, had difficulty with the simple sentence “Time flies like an arrow.” In what has become a famous response, the computer indicated that it was not quite sure what the sentence meant. It might mean

  1. that time passes as quickly as an arrow passes;

  2. or maybe it is a command telling us to time the flies the same way that an arrow times flies; that is, “Time flies like an arrow would”;

  3. or it could be a command telling us to time only those flies that are similar to arrows; that is, “Time flies that are like an arrow”;

4. or perhaps it means that a type of fly known as the time-fly has a fondness for arrows: “Time-flies like (that is, cherish) an arrow.”4

Clearly we need some knowledge here to resolve this ambiguity. Armed with the knowledge that flies are not similar to arrows, we can knock out the third interpretation. Knowing that there is no such thing as a time-fly dispatches the fourth explanation. Such tidbits of knowledge as the fact that flies do not show a fondness for arrows (another reason to knock out interpretation four) and that arrows do not have the ability to time events (knocking out interpretation two) leave us with the first interpretation as the only sensible one.
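The pruning just described can be sketched in a few lines of Python. The encoding of the knowledge tidbits and of the four interpretations is invented for illustration; the point is only that each false presupposition knocks out an interpretation.

    # World knowledge as presuppositions with truth values.
    knowledge = {
        "time can pass quickly": True,        # supports interpretation 1
        "arrows can time events": False,      # knocks out interpretation 2
        "flies resemble arrows": False,       # knocks out interpretation 3
        "time-flies exist": False,            # knocks out interpretation 4
    }

    # Each interpretation paired with what it presupposes.
    interpretations = [
        ("time passes as quickly as an arrow", ["time can pass quickly"]),
        ("command: time flies as an arrow would", ["arrows can time events"]),
        ("command: time arrow-like flies", ["flies resemble arrows"]),
        ("time-flies cherish an arrow", ["time-flies exist"]),
    ]

    # Keep only interpretations whose presuppositions all hold.
    sensible = [text for text, presupposed in interpretations
                if all(knowledge[p] for p in presupposed)]
    print(sensible)   # ['time passes as quickly as an arrow']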

In language, we again find the sequence of human learning and the progression of machine intelligence to be the reverse of each other. A human child starts out listening to and understanding spoken language. Later on he learns to speak. Finally, years later, he starts to master written language. Computers have evolved in the opposite direction: starting out with the ability to generate written language, subsequently learning to understand it, then starting to speak with synthetic voices, and only recently mastering the ability to understand continuous human speech. This phenomenon is widely misunderstood. R2-D2, for example, the robot character of Star Wars fame, understands many human languages but is unable to speak, which gives the mistaken impression that generating human speech is far more difficult than understanding it.