The Age of Spiritual Machines: When Computers Exceed Human Intelligence
5 Turing’s Robinson was not a programmable computer. It didn’t have to be—it had only one job to do. The first programmable computer was developed by the Germans. Konrad Zuse, a German civil engineer and tinkerer, was motivated to ease what he later called those “awful calculations required of civil engineers.” Like Babbage’s, his first device, the Z-1, was entirely mechanical—built from an erector set in his parents’ living room. The Z-2 used electromechanical relays and was capable of solving complex simultaneous equations. It was his third version—the Z-3—that is the most historic. It stands as the world’s first programmable computer. As one would retroactively predict from the Law of Accelerating Returns as applied to computation, Zuse’s Z-3 was rather slow—a multiplication took more than three seconds.
While Zuse received some incidental support from the German government and his machines played a minor military role, there was little, if any, awareness of computation and its military significance by the German leadership. This explains their apparent confidence in the security of their Enigma code. Instead the German military gave immensely high priority to several other advanced technologies, such as rocketry and atomic weapons.
It would be Zuse’s fate that no one would pay much attention to him or his inventions; even the Allies ignored him after the end of the war. Credit for the world’s first programmable computer is often given to Howard Aiken, despite the fact that his Mark I was not operational until nearly three years after the Z-3. When Zuse’s funding was withdrawn in the middle of the war by the Third Reich, a German officer explained to him that “the German aircraft is the best in the world. I cannot see what we could possibly calculate to improve on.”
Zuse’s claim to having built the world’s first operational fully programmable digital computer is supported by the patent application he filed. See, for instance, K. Zuse, “Verfahren zur Selbst Atigen Durchfurung von Rechnungen mit Hilfe von Rechenmaschinen,” German Patent Application Z23624, April 11, 1936. Translated extracts, titled “Methods for Automatic Execution of Calculations with the Aid of Computers,” appear in Brian Randell, ed., The Origins of Digital Computers, pp. 159-166.
6 “Computing Machinery and Intelligence,” Mind 59 (1950): 433-460, reprinted in E. Feigenbaum and J. Feldman, eds., Computers and Thought (New York: McGraw-Hill, 1963).
7 See A. Newell, J. C. Shaw, and H. A. Simon, “Programming the Logic Theory Machine,” Proceedings of the Western joint Computer Conference, 1957, pp. 230-240.
8 Russell and Whitehead’s Principia Mathematica (see reference at the end of this end-note), first published in 1910-1913, was a seminal work that reformulated mathematics based on Russell’s new conception of set theory. Russell’s breakthrough in set theory set the stage for Turing’s subsequent development of computational theory based on the Turing machine (see note below). Following is my version of “Russell’s paradox,” which stimulated Russell’s discovery:
Before ending up in “the Other Place,” our friend the gambler had lived a rough life. He was short of temper and not fond of losing. In our story, he is also a bit of a logician. This time he has picked the wrong man to dispatch. If only he had known that the fellow was the judge’s nephew.
Known anyway as a hanging judge, the magistrate is furious and wishes to mete out the most severe sentence he can think of. So he tells the gambler that not only is he sentenced to die but the sentence is to be carried out in a unique way. “First off, we’re gonna dispense with you quickly, just like you done with the victim. This punishment must be carried out no later than Saturday. Furthermore, 1 don’t want you preparing yourself for the judgment day. On the morning of your execution, you won’t know for certain that the day is at hand. When we come for you, it’ll be a surprise.”
To which the gambler replies, “Well, that’s great, judge, I am greatly relieved.”
To which the judge exclaims, “I don’t understand, how can you be relieved ? I have condemned you to be executed. I have ordered that the sentence be carried out soon, but you’ll be unable to prepare yourself because on the morning that we carry it out, you won’t know for certain that you’ll be dying that day.”
“Well, Your Honor,” the gambler points out, “in order for your sentence to be carried out, I cannot be executed on Saturday.”
“Why is that?” asks the judge.
“Because since the sentence must be carried out by Saturday, if we actually get to Saturday, I will know for certain that I am to be executed on that day, and thus it would not be a surprise.”
“I suppose you are right,” replies the judge. “You cannot be executed on Saturday. But I still don’t see why you’re relieved.”
“Well, if we have definitely ruled out Saturday, then I can’t be executed on Friday either.”
“Why is that?” asks the judge, being a little slow.
“We have agreed that I can’t be executed on Saturday. Therefore Friday is the last day I can be executed. But if Friday rolls around, I will definitely know that I am to be executed on that day and therefore it would not be a surprise. So I can’t be executed on Friday.”
“I see,” says the judge.
“Thus the last day I can be executed would be Thursday. But if Thursday rolls around, I would know I had to be executed on that day, and thus it would not be a surprise. So Thursday is out. By the same reasoning, we can eliminate Wednesday, Tuesday, Monday, and today.”
The judge scratches his head as the confident gambler is led back to his prison cell.
There is an epilogue to the story On Thursday, the gambler is taken to be executed. And he is very surprised. So the judge’s orders are successfully carried out.
This is my version of what has become known as “Russell’s paradox” after Bertrand Russell, perhaps the last person to secure major achievements in both mathematics and philosophy. If we analyze this story, we see that the conditions that the judge has set up result in a conclusion that none of the days comply, because, as the prisoner so adroitly points out, each one of them in turn would not be a surprise. But the conclusion itself changes the situation, and now surprise is possible again. This brings us back to the original situation in which the prisoner could (in theory) demonstrate that each day in turn would be impossible, and so on, ad infinitum. The judge applies “Alexander’s solution” in which King Alexander slashed the hopelessly tied Gordian knot.
A simpler example, and the one that Russell actually struggled with, is the following question about sets. A set is a mathematical construct that, as its name implies, is a collection of things. A set may include chairs, books, authors, gamblers, numbers, other sets, themselves, whatever. Now consider set A, which is defined to contain all sets that are not members of themselves. Does set A contain itself?
As we consider this famous problem, we realize there are only two possible answers: Yes and No. We can, therefore, try them all (this is not the case for most problems in mathematics). So let’s consider Yes. If the answer is Yes, then set A does contain itself. But if set A contains itself, then according to its defining condition, set A would not belong to set A, and thus it does not belong to itself. Since the answer of Yes led to a contradiction, it must be wrong.
So let’s try No. If the answer is No, then set A does not contain itself. But again according to the defining condition, if set A does not belong to itself, then it would belong to set A, another contradiction. As with the story about the prisoner, we have incompatible propositions that imply one another. Yes implies No, which yields Yes, and so on.
This may not seem like a big deal, but to Russell it threatened the foundation of mathematics. Mathematics is based on the concept of sets, and the issue of inclusion (i.e., what belongs to a set) is fundamental to the idea. The definition of set A appears to be a reasonable one. The question of whether set A belongs to itself also appears reasonable. Yet we have difficulty coming up with a reasonable answer to this reasonable question. Mathematics was in big trouble.
Russell pondered t
his dilemma for more than a decade, nearly exhausting himself and wrecking at least one marriage. But he came up with an answer. To do so, he invented the equivalent of a theoretical computer (although not by name). Russell’s “computer” is a logic machine and it implements one logical transformation at a time, each one requiring a quantum of time—so things don’t happen all at once. Our question about set A is examined in an orderly fashion. Russell turns on his theoretical computer (which, lacking a real computer, ran only in his head) and the logical operations are “executed” in turn. So at one point, our answer is Yes, but the program keeps running, and a few quantums of time later the answer becomes No. The program runs in an infinite loop, constantly alternating between Yes and No.
But the answer is never Yes and No at the same time!
Impressed? Well Russell was very pleased. Eliminating the possibility of the answer being Yes and No at the same time was enough to save mathematics. With the help of his friend and former tutor Alfred North Whitehead, Russell recast all of mathematics in terms of his new theory of sets and logic, which they published in their Principia Mathematica in 1910-1913. It is worth pointing out that the concept of a computer, theoretical or otherwise, was not widely understood at the time. The nineteenth-century efforts of Charles Babbage, which are discussed in chapter 4, were largely unknown at the time. It is not clear if Russell was aware of Babbage’s efforts. Russell’s highly influential and revolutionary work invented a logical theory of computation and recast mathematics as one of its branches. Mathematics was now part of computation.
Russell and Whitehead did not explicitly talk about computers but cast their ideas in the mathematical terminology of set theory. It was left to Alan Turing to create the first theoretical computer in 1936, in his Turing machine (see note 16 below).
Alfred N. Whitehead and Bertrand Russell, Principia Mathematica, 3 vols., second edition (Cambridge: Cambridge University Press, 1925-1927). (The first edition was published in 1910, 1912, and 1913.)
Russell’s paradox was first introduced in Bertrand Russell, Principles of Mathematics (Reprint, New York: W W. Norton & Company, 1996), 2nd ed., 79-81. Russell’s paradox is a subtle variant of the Liar Paradox. See E. W Beth, Foundations of Mathematics (Amsterdam: North Holland, 1959), p. 485.
9 “Heuristic Problem Solving: The Next Advance in Operations Research,” Journal of the Operations Research Society of America 6, no. 1 (1958), reprinted in Herbert Simon, Models of Bounded Rationality, vol. 1, Economic Analysis and Public Policy (Cambridge, MA: MIT Press, 1982).
10 “A Mean Chess-Playing Computer Tears at the Meaning of Thought,” New York Times, February 19, 1996, contains the reactions of Gary Kasparov and a number of noted thinkers concerning the ramifications of Deep Blue beating the world chess champion.
11 Daniel Bobrow, “Natural Language Input for a Computer Problem Solving System,” in Marvin Minsky, Semantic Information Processing, pp. 146-226.
12 Thomas Evans, “A Program for the Solution of Geometric-Analogy Intelligence Test Questions,” in Marvin Minsky, ed., Semantic Information Processing (Cambridge, MA: MIT Press, 1968), pp. 271-353.
13 Robert Lindsay, Bruce Buchanan, Edward Feigenbaum, and Joshua Lederberg describe DENDRAL in Applications of Artificial Intelligence for Chemical Inference: The DENDRAL Project (New York: McGraw-Hill, 1980). For a brief and clear explanation of the essential mechanisms behind DENDRAL, see Patrick Winston, Artificial Intelligence (1984), pp. 163-164, 195-197.
14 For many years SHRDLU was cited as a prominent accomplishment of artificial intelligence. Winograd describes his research in his thesis Understanding Natural Language (New York: Academic Press, 1972). A brief version appears as “A Procedural Model of Thought and Language,” in Roger Schank and Kenneth Colby, eds., Computer Models of Thought and Language (San Francisco: W H. Freeman, 1973).
15 Haneef A. Fatmi and R. W Young, “A Definition of Intelligence,” Nature 228 (1970): 97.
16 Alan Turing showed that the essential basis of computation could be modeled with a very simple theoretical machine. He created the first theoretical computer in 1936 (first introduced in Alan M. Turing, “On Computable Numbers with an Application to the Entscheinungs Problem,” Proc. London Math. Soc. 42 [1936]: 230-265) in an eponymous conception called the Turing machine. As with a number of Turing’s breakthroughs, he would have both the first and last word. The Turing machine represented the founding of modern computational theory. It has also persisted as our primary theoretical model of a computer because of its combination of simplicity and power.
The Turing machine is one example of the simplicity of the foundations of intelligence. A Turing machine consists of two primary (theoretical) units: a tape drive and a computation unit. The tape drive has a tape of infinite length on which it can write, and (subsequently) read, a series of two symbols: zero and one. The computation unit contains a program consisting of a sequence of commands, drawing from only seven possible operations:
• Read the tape
• Move the tape left one symbol
• Move the tape right one symbol
• Write 0 on the tape
• Write 1 on the tape
• Jump to another command
• Halt
Turing was able to show that this extremely simple machine can compute anything that any machine can compute, no matter how complex. If a problem cannot be solved by a Turing machine, then it cannot be solved by any machine. Occasionally there are challenges to this position, but in large measure it has stood the test of time.
In the same paper, Turing reports another unexpected discovery, that of unsolvable problems. These are problems that are well defined with unique answers that can be shown to exist, but that we can also prove can never be computed by a Turing machine—that is to say by any machine, yet another reversal of what had been a nineteenth-century confidence that problems that could be defined would ultimately be solved. Turing showed that there are as many unsolvable problems as solvable ones.
Turing and Alonzo Church, his former professor, went on to assert what has become known as the Church-Turing thesis: If a problem that can be presented to a Turing machine is not solvable by one, then it is also not solvable by human thought. “Strong” interpretations of the Church-Turing thesis propose an essential equivalence between what a human can think or know and what is computable by a machine. The Church-Turing thesis can be viewed as a restatement in mathematical terms of one of Wittgenstein’s primary theses in his Tractatus. The basic idea is that the human brain is subject to natural law, and thus its information-processing ability cannot exceed that of a machine. We are thus left with the perplexing situation of being able to define a problem, to prove that a unique answer exists, and yet know that the answer can never be known.
Perhaps the most interesting unsolvable problem is called the Busy Beaver, which may be stated as follows: Each Turing machine has a certain number of commands in its program. Given a positive integer n, we construct all of the Turing machines that have n states (i.e., n commands). Next we eliminate those n-state Turing machines that get into an infinite loop (i.e., never halt). Finally, we select the machine (one that halts) that writes the largest number of 1s on its tape. The number of 1s that this Turing machine writes is called busy beaver of n.
Tibor Rado, a mathematician and admirer of Turing, showed that there is no algorithm (that is, no Turing machine) that can compute the busy beaver function for all n’s. The crux of the problem is sorting out those n-state Turing machines that get into infinite loops. If we program a Turing machine to generate and simulate every possible n-state Turing machine, this simulator itself goes into an infinite loop when it attempts to simulate one of the n-state Turing machines that gets into an infinite loop. Busy beaver can be computed for some ns, and interestingly it is also an unsolvable problem to separate those ns for which we can determine busy beaver of n from those for which we cannot.
Busy beaver is an “intelligent function.” More precis
ely stated, it is a function that requires increasing intelligence to compute for increasing arguments. As we increase n, the complexity of the processes needed to compute busy beaver of n increases.
With n = 6, we are dealing with addition and busy beaver of 6 equals 35. In other words, addition is the most complex operation that a Turing machine with only 6 steps in its program is capable of performing. At 7, busy beaver learns to multiply and busy beaver of 7 equals 22,961. At 8, busy beaver can exponentiate, and the number of 1s that our eighth busy beaver writes on its tape is approximately 1043. Note that this is even faster growth than Moore’s Law. By the time we get to 10 we need an exotic notation in which we have a stack of exponents (10 to the 10 to the 10, etc.), the height of which is determined by another stack of exponents, the height of which is determined by another stack of exponents, and so on. For the twelfth busy beaver we need an even more exotic notation. Human intelligence (in terms of the complexity of mathematical operations that we can understand) is surpassed well before the busy beaver gets to 100. The computers of the twenty-first century will do a bit better.
The busy beaver problem is one example of a large class of noncomputable functions, as one can see from Tibor Rado, “On Noncomputable Functions,” Bell System Technical journal 41, no. 3 (1962): 877-884.
17 Raymond Kurzweil, The Age of Intelligent Machines (Cambridge, MA: MIT Press, 1990), pp. 132-133.
18 H. J. Berliner, “Backgammon Computer Program Beats World Champion,” Artificial Intelligence 14, no. 1 (1980). Also see Hans Berliner, “Computer Backgammon,” Scientific American, June 1980.
19 To download Ray Kurzweil’s Cybernetic Poet (RKCP), go to: . RKCP is further discussed in the section The Creative Machine in chapter 8, “1999.”