An enthusiastic proponent of an information-based theory of physics was Edward Fredkin, who in the early 1980s proposed a “new theory of physics” founded on the idea that the universe is ultimately composed of software. We should not think of reality as consisting of particles and forces, according to Fredkin, but rather as bits of data modified according to computation rules.

  Fredkin was quoted by Robert Wright in the 1980s as saying,

  There are three great philosophical questions. What is life? What is consciousness and thinking and memory and all that? And how does the universe work? . . . [The] “informational viewpoint” encompasses all three. . . . What I’m saying is that at the most basic level of complexity an information process runs what we think of as physics. At the much higher level of complexity, life, DNA—you know, the biochemical functions—are controlled by a digital information process. Then, at another level, our thought processes are basically information processing. . . . I find the supporting evidence for my beliefs in ten thousand different places. . . . And to me it’s just totally overwhelming. It’s like there’s an animal I want to find. I’ve found his footprints. I’ve found his droppings. I’ve found the half-chewed food. I find pieces of his fur, and so on. In every case it fits one kind of animal, and it’s not like any animal anyone’s ever seen. People say, Where is this animal? I say, Well, he was here, he’s about this big, this that, and the other. And I know a thousand things about him. I don’t have him in hand, but I know he’s there. . . . What I see is so compelling that it can’t be a creature of my imagination.62

  In commenting on Fredkin’s theory of digital physics, Wright writes,

  Fredkin . . . is talking about an interesting characteristic of some computer programs, including many cellular automata: there is no shortcut to finding out what they will lead to. This, indeed, is a basic difference between the “analytical” approach associated with traditional mathematics, including differential equations, and the “computational” approach associated with algorithms. You can predict a future state of a system susceptible to the analytic approach without figuring out what states it will occupy between now and then, but in the case of many cellular automata, you must go through all the intermediate states to find out what the end will be like: there is no way to know the future except to watch it unfold. . . . Fredkin explains: “There is no way to know the answer to some question any faster than what’s going on.” . . . Fredkin believes that the universe is very literally a computer and that it is being used by someone, or something, to solve a problem. It sounds like a good-news/bad-news joke: the good news is that our lives have purpose; the bad news is that their purpose is to help some remote hacker estimate pi to nine jillion decimal places.63

  Fredkin went on to show that although energy is needed for information storage and retrieval, we can arbitrarily reduce the energy required to perform any particular example of information processing, and that this operation has no lower limit.64 That implies that information rather than matter and energy may be regarded as the more fundamental reality.65 I will return to Fredkin’s insight regarding the extreme lower limit of energy required for computation and communication in chapter 3, since it pertains to the ultimate power of intelligence in the universe.

  Wolfram builds his theory primarily on a single, unified insight. The discovery that has so excited Wolfram is a simple rule he calls cellular automata rule 110 and its behavior. (There are some other interesting automata rules, but rule 110 makes the point well enough.) Most of Wolfram’s analyses deal with the simplest possible cellular automata, specifically those that involve just a one-dimensional line of cells, two possible colors (black and white), and rules based only on the two immediately adjacent cells. For each transformation, the color of a cell depends only on its own previous color and that of the cell on the left and the cell on the right. Thus, there are eight possible input situations (that is, 2³ combinations: three cells, each of two colors). Each rule maps each of these eight input situations to an output (black or white). So there are 2⁸ (256) possible rules for such a one-dimensional, two-color, adjacent-cell automaton. Roughly half of the 256 possible rules map onto the other half because of left-right symmetry. We can map half of them again because of black-white equivalence, so we are left with roughly 64 rule types. (The halving is only approximate: some rules are their own mirror image or color inverse, so the exact number of distinct types is 88.) Wolfram illustrates the action of these automata with two-dimensional patterns in which each line (along the y-axis) represents a subsequent generation of applying the rule to each cell in that line.
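
  The rule counting above can be checked with a short sketch. It assumes the standard numbering convention for these rules, in which bit n of the rule number gives the output for the three-cell neighborhood whose colors spell n in binary (left, center, right). The function names are illustrative only and do not come from Wolfram's book. Note that the count of distinct rule types comes out to 88 rather than 64, because rules that are their own mirror image or color inverse cannot be paired off with a different rule.

```python
def mirror(rule):
    """Return the rule obtained by swapping the left and right neighbors."""
    out = 0
    for n in range(8):
        left, center, right = (n >> 2) & 1, (n >> 1) & 1, n & 1
        mirrored = (right << 2) | (center << 1) | left
        if (rule >> mirrored) & 1:
            out |= 1 << n
    return out


def complement(rule):
    """Return the rule obtained by swapping black and white everywhere."""
    out = 0
    for n in range(8):
        # Inverting all three input cells turns neighborhood n into 7 - n;
        # the output color is inverted as well.
        if not (rule >> (7 - n)) & 1:
            out |= 1 << n
    return out


def canonical(rule):
    """Pick the smallest rule number among a rule and its equivalents."""
    return min(rule, mirror(rule), complement(rule), complement(mirror(rule)))


input_situations = 2 ** 3              # three cells, two colors each
possible_rules = 2 ** input_situations  # one output bit per input situation
rule_types = {canonical(rule) for rule in range(possible_rules)}

print(input_situations, possible_rules, len(rule_types))
```

  Under this convention, rule 110 is grouped with rules 124, 137, and 193, which produce the same patterns reflected or with the colors exchanged.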

  Most of the rules are degenerate, meaning they create repetitive patterns of no interest, such as cells of a single color, or a checkerboard pattern. Wolfram calls these rules class 1 automata. Some rules produce arbitrarily spaced streaks that remain stable, and Wolfram classifies these as belonging to class 2. Class 3 rules are a bit more interesting, in that recognizable features (such as triangles) appear in the resulting pattern in an essentially random order.

  However, it was class 4 automata that gave rise to the “aha” experience that resulted in Wolfram’s devoting a decade to the topic. The class 4 automata, of which rule 110 is the quintessential example, produce surprisingly complex patterns that do not repeat themselves. We see in them artifacts such as lines at various angles, aggregations of triangles, and other interesting configurations. The resulting pattern, however, is neither regular nor completely random; it appears to have some order but is never predictable.

  Why is this important or interesting? Keep in mind that we began with the simplest possible starting point: a single black cell. The process involves repetitive application of a very simple rule.66 From such a repetitive and deterministic process, one would expect repetitive and predictable behavior. There are two surprising results here. One is that the results produce apparent randomness. However, the results are more interesting than pure randomness, which itself would become boring very quickly. There are discernible and interesting features in the designs produced, so the pattern has some order and apparent intelligence. Wolfram includes a number of examples of these images, many of which are rather lovely to look at.
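
  The process described here is small enough to sketch in a few lines. The following is a minimal illustration, not Wolfram's own code: it assumes the standard rule-numbering convention, treats cells beyond the edges of the line as white, and places the single black cell at the right edge because the rule 110 pattern grows toward the left. The names `step`, `run`, and `render` are hypothetical.

```python
def step(cells, rule):
    """Apply a one-dimensional, two-color, adjacent-cell rule once."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left = cells[i - 1] if i > 0 else 0        # off-edge cells are white
        center = cells[i]
        right = cells[i + 1] if i < n - 1 else 0
        index = (left << 2) | (center << 1) | right  # which of 8 input situations
        new_cells.append((rule >> index) & 1)        # look up the output bit
    return new_cells


def run(rule, width, generations):
    """Start from a single black cell and return every generation as a row."""
    cells = [0] * width
    cells[-1] = 1  # single black cell at the right edge
    rows = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


def render(rows):
    """Draw black cells as '#' and white cells as '.', one generation per line."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)


if __name__ == "__main__":
    print(render(run(110, 64, 40)))
```

  Printing a few dozen generations is enough to see the irregular triangles and streaks the text describes. Changing the rule number in the call to `run` shows how differently other rules behave from the same starting point.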

  Wolfram makes the following point repeatedly: “Whenever a phenomenon is encountered that seems complex it is taken almost for granted that the phenomenon must be the result of some underlying mechanism that is itself complex. But my discovery that simple programs can produce great complexity makes it clear that this is not in fact correct.”67

  I do find the behavior of rule 110 rather delightful. Furthermore, the idea that a completely deterministic process can produce results that are completely unpredictable is of great importance, as it provides an explanation for how the world can be inherently unpredictable while still based on fully deterministic rules.68 However, I am not entirely surprised by the idea that simple mechanisms can produce results more complicated than their starting conditions. We’ve seen this phenomenon in fractals, chaos and complexity theory, and self-organizing systems (such as neural nets and Markov models), which start with simple networks but organize themselves to produce apparently intelligent behavior.

  At a different level, we see it in the human brain itself, which starts with only about thirty to one hundred million bytes of specification in the compressed genome yet ends up with a complexity that is about a billion times greater.69

  It is also not surprising that a deterministic process can produce apparently random results. We have had random-number generators (for example, the “randomize” function in Wolfram’s program Mathematica) that use deterministic processes to produce sequences that pass statistical tests for randomness. These programs date back to the earliest days of computer software, such as the first versions of Fortran. However, Wolfram does provide a thorough theoretical foundation for this observation.
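
  As a rough illustration of a deterministic process that yields random-looking output, here is a sketch of a linear congruential generator, one of the oldest families of pseudo-random number generators. It is not the generator used by Mathematica or early Fortran; the multiplier and increment are one commonly published parameter set, chosen here purely as an assumption for the example.

```python
class LinearCongruentialGenerator:
    """Deterministic generator: state -> (a * state + c) mod m."""

    MODULUS = 2 ** 32
    MULTIPLIER = 1664525      # illustrative, commonly published constants
    INCREMENT = 1013904223

    def __init__(self, seed):
        self.state = seed % self.MODULUS

    def next_float(self):
        """Advance the state one step and return a value in [0, 1)."""
        self.state = (self.MULTIPLIER * self.state + self.INCREMENT) % self.MODULUS
        return self.state / self.MODULUS


generator = LinearCongruentialGenerator(seed=42)
samples = [generator.next_float() for _ in range(10000)]
sample_mean = sum(samples) / len(samples)
print(samples[:3], sample_mean)
```

  The same seed always reproduces the same sequence, which is the sense in which the process is fully deterministic, yet simple summary statistics such as the sample mean look like those of a random source. Generators this simple fail more demanding statistical tests, so this is a sketch of the idea rather than a recommendation.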

  Wolfram goes on to describe how simple computational mechanisms can exist in nature at different levels, and he shows that these simple and deterministic mechanisms can produce all of the complexity that we see and experience. He provides myriad examples, such as the pleasing designs of pigmentation on animals, the shape and markings of shells, and patterns of turbulence (such as the behavior of smoke in the air). He makes the point that computation is essentially simple and ubiquitous. The repetitive application of simple computational transformations, according to Wolfram, is the true source of complexity in the world.

  My own view is that this is only partly correct. I agree with Wolfram that computation is all around us, and that some of the patterns we see are created by the equivalent of cellular automata. But a key question to ask is this: Just how complex are the results of class 4 automata?

  Wolfram effectively sidesteps the issue of degrees of complexity. I agree that a degenerate pattern such as a checkerboard has no complexity. Wolfram also acknowledges that mere randomness does not represent complexity either, because pure randomness also becomes predictable in its pure lack of predictability. It is true that the interesting features of class 4 automata are neither repeating nor purely random, so I would agree that they are more complex than the results produced by other classes of automata.

  However, there is nonetheless a distinct limit to the complexity produced by class 4 automata. The many images of such automata in Wolfram’s book all have a similar look to them, and although they are nonrepeating, they are interesting (and intelligent) only to a degree. Moreover, they do not continue to evolve into anything more complex, nor do they develop new types of features. One could run these automata for trillions or even trillions of trillions of iterations and the image would remain at the same limited level of complexity. They do not evolve into, say, insects or humans or Chopin preludes or anything else that we might consider of a higher order of complexity than the streaks and intermingling triangles displayed in these images.

  Complexity is a continuum. Here I define “order” as “information that fits a purpose.”70 A completely predictable process has zero order. A high level of information alone does not necessarily imply a high level of order either. A phone book has a lot of information, but the level of order of that information is quite low. A random sequence is essentially pure information (since it is not predictable) but has no order. The output of class 4 automata does possess a certain level of order, and it does survive like other persisting patterns. But the pattern represented by a human being has a far higher level of order, and of complexity.
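
  The distinction between predictability and information can be made concrete with a crude proxy: how far a general-purpose compressor can shrink a sequence. This is only a sketch under that assumption. Compressed size is a rough stand-in for information content, and it says nothing about “order” in the sense of fitting a purpose, which is exactly the point that a random sequence scores high on information while having no order.

```python
import random
import zlib

LENGTH = 10000

# A completely predictable sequence: the same two bytes repeated.
predictable = b"ab" * (LENGTH // 2)

# An unpredictable sequence: bytes drawn from a seeded pseudo-random source.
rng = random.Random(0)
unpredictable = bytes(rng.randrange(256) for _ in range(LENGTH))

predictable_size = len(zlib.compress(predictable))
unpredictable_size = len(zlib.compress(unpredictable))

print("predictable:  ", LENGTH, "->", predictable_size, "bytes")
print("unpredictable:", LENGTH, "->", unpredictable_size, "bytes")
```

  The repetitive sequence collapses to a tiny fraction of its length, while the random one barely shrinks at all. Neither result measures whether the sequence is useful for anything.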

  Human beings fulfill a highly demanding purpose: they survive in a challenging ecological niche. Human beings represent an extremely intricate and elaborate hierarchy of other patterns. Wolfram regards any patterns that combine some recognizable features and unpredictable elements to be effectively equivalent to one another. But he does not show how a class 4 automaton can ever increase its complexity, let alone become a pattern as complex as a human being.

  There is a missing link here, one that would account for how one gets from the interesting but ultimately routine patterns of a cellular automaton to the complexity of persisting structures that demonstrate higher levels of intelligence. For example, these class 4 patterns are not capable of solving interesting problems, and no amount of iteration moves them closer to doing so. Wolfram would counter that a rule 110 automaton could be used as a “universal computer.”71 However, by itself, a universal computer is not capable of solving intelligent problems without what I would call “software.” It is the complexity of the software that runs on a universal computer that is precisely the issue.

  One might point out that class 4 patterns result from the simplest possible cellular automata (one-dimensional, two-color, two-neighbor rules). What happens if we increase the dimensionality—for example, go to multiple colors or even generalize these discrete cellular automata to continuous functions? Wolfram addresses all of this quite thoroughly. The results produced from more complex automata are essentially the same as those of the very simple ones. We get the same sorts of interesting but ultimately quite limited patterns. Wolfram makes the intriguing point that we do not need to use more complex rules to get complexity in the end result. But I would make the converse point that we are unable to increase the complexity of the end result through either more complex rules or further iteration. So cellular automata get us only so far.

  Can We Evolve Artificial Intelligence from Simple Rules?

  So how do we get from these interesting but limited patterns to those of insects or humans or Chopin preludes? One concept we need to take into consideration is conflict, that is, evolution. If we add another simple concept, an evolutionary algorithm, to that of Wolfram’s simple cellular automata, we start to get far more exciting and more intelligent results. Wolfram would say that the class 4 automata and an evolutionary algorithm are “computationally equivalent.” But that is true only on what I consider the “hardware” level. On the software level, the patterns produced are clearly different and of a different order of complexity and usefulness.

  An evolutionary algorithm can start with randomly generated potential solutions to a problem, which are encoded in a digital genetic code. We then have the solutions compete with one another in a simulated evolutionary battle. The better solutions survive and procreate in a simulated sexual reproduction in which offspring solutions are created, drawing their genetic code (encoded solutions) from two parents. We can also introduce a rate of genetic mutation. Various high-level parameters of this process, such as the rate of mutation, the rate of offspring, and so on, are appropriately called “God parameters,” and it is the job of the engineer designing the evolutionary algorithm to set them to reasonably optimal values. The process is run for many thousands of generations of simulated evolution, and at the end of the process one is likely to find solutions that are of a distinctly higher order than the starting ones.
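
  The steps above can be sketched as a small program. This is a toy under stated assumptions, not a design from the text: the “problem” is simply to maximize the number of 1 bits in a fixed-length genome, the better half of each generation survives, and the single best solution is copied unchanged into the next generation (a choice known as elitism). The function name `evolve` and all parameter values are hypothetical stand-ins for the “God parameters” an engineer would tune.

```python
import random


def evolve(genome_length=40, population_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Run a toy genetic algorithm and return the best genome and its history."""
    rng = random.Random(seed)
    fitness = sum  # toy objective: count the 1 bits in the genome

    # Start with randomly generated candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    best_per_generation = []

    for _ in range(generations):
        # Competition: rank solutions and keep the better half.
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        survivors = population[:population_size // 2]

        # Elitism: carry the current best forward unchanged.
        children = [survivors[0][:]]
        while len(children) < population_size:
            # Reproduction: each child draws its code from two parents.
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, genome_length)
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children

    population.sort(key=fitness, reverse=True)
    best_per_generation.append(fitness(population[0]))
    return population[0], best_per_generation


if __name__ == "__main__":
    best, history = evolve()
    print("best fitness at start:", history[0])
    print("best fitness at end:  ", history[-1])
```

  Because the best solution is always preserved, the best fitness never decreases from one generation to the next. On a toy objective like this one the curve also flattens quickly, which is a small-scale version of the performance ceiling conventional genetic algorithms run into.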

  The results of these evolutionary (sometimes called genetic) algorithms can be elegant, beautiful, and intelligent solutions to complex problems. They have been used, for example, to create artistic designs and designs for artificial life-forms, as well as to execute a wide range of practical assignments such as designing jet engines. Genetic algorithms are one approach to “narrow” artificial intelligence—that is, creating systems that can perform particular functions that used to require the application of human intelligence.

  But something is still missing. Although genetic algorithms are a useful tool in solving specific problems, they have never achieved anything resembling “strong AI,” that is, aptitude resembling the broad, deep, and subtle features of human intelligence, particularly its powers of pattern recognition and command of language. Is the problem that we are not running the evolutionary algorithms long enough? After all, humans evolved through a process that took billions of years. Perhaps we cannot re-create that process with just a few days or weeks of computer simulation. Simply running them longer won’t work, however, because conventional genetic algorithms reach an asymptote in their level of performance.

  A third level (beyond the ability of cellular processes to produce apparent randomness and genetic algorithms to produce focused intelligent solutions) is to perform evolution on multiple levels. Conventional genetic algorithms allow evolution only within the confines of a narrow problem and a single means of evolution. The genetic code itself needs to evolve; the rules of evolution need to evolve. Nature did not stay with a single chromosome, for example. There have been many levels of indirection incorporated in the natural evolutionary process. And we require a complex environment in which the evolution takes place.

  To build strong AI we will have the opportunity to short-circuit this process, however, by reverse engineering the human brain, a project well under way, thereby benefiting from the evolutionary process that has already taken place. We will be applying evolutionary algorithms within these solutions just as the human brain does. For example, the fetal wiring is initially random within constraints specified in the genome in at least some regions. Recent research shows that areas having to do with learning undergo more change, whereas structures having to do with sensory processing experience less change after birth.72

  Wolfram makes the valid point that certain (indeed, most) computational processes are not predictable. In other words, we cannot predict future states without running the entire process. I agree with him that we can know the answer in advance only if somehow we can simulate a process at a faster speed. Given that the universe runs at the fastest speed it can run, there is usually no way to short-circuit the process. However, we have the benefits of the billions of years of evolution that have already taken place, which are responsible for the greatly increased order of complexity in the natural world. We can now benefit from it by using our evolved tools to reverse engineer the products of biological evolution (most importantly, the human brain).

  Yes, it is true that some phenomena in nature that may appear complex at some level are merely the result of simple underlying computational mechanisms that are essentially cellular automata at work. The interesting pattern of triangles on a “tent olive” shell (cited extensively by Wolfram) or the intricate and varied patterns of a snowflake are good examples. I don’t think this is a new observation, in that we’ve always regarded the design of snowflakes to derive from a simple molecular computation-like building process. However, Wolfram does provide us with a compelling theoretical foundation for expressing these processes and their resulting patterns. But there is more to biology than class 4 patterns.

  Another important thesis by Wolfram lies in his thorough treatment of computation as a simple and ubiquitous phenomenon. Of course, we’ve known for more than a century that computation is inherently simple: we can build any possible level of complexity from a foundation of the simplest possible manipulations of information.