Full House
3
Different Parsings, Different Images of Trends
Fallacies in the Reading and Identification of Trends
The more important the subject and the closer it cuts to the bone of our hopes and needs, the more we are likely to err in establishing a framework for analysis. We are story-telling creatures, products of history ourselves. We are fascinated by trends, in part because they tell stories by the basic device of imparting directionality to time, in part because they so often supply a moral dimension to a sequence of events: a cause to bewail as something goes to pot, or to highlight as a rare beacon of hope.
But our strong desire to identify trends often leads us to detect a directionality that doesn’t exist, or to infer causes that cannot be sustained. The subject of trends has inspired and illustrated some of the classic fallacies in human reasoning. Most prominently, since people seem to be so bad at thinking about probability and so prone to read pattern into sequences of events, we often commit the fallacy of spotting a "sure" trend and speculating about causes, when we observe no more than a random string of happenings.
In the classic case, most people have little sense of how often an apparent pattern will emerge in purely random data. Take the standard illustration of coin flipping: we compute the probability of sequences by multiplying the chances of individual events. Since the probability for heads is always 1/2, the chance of flipping five heads in a row is 1/2 × 1/2 × 1/2 × 1/2 × 1/2, or one in thirty-two: rare to be sure, but something that will happen every once in a while for no reason but randomness. Many people, however, particularly if they are betting on tails, will read five heads in a row as prima facie evidence of cheating. People have been shot and killed for less, in life as well as in Western movies.
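The multiplication rule above can be checked with a minimal Python sketch. The function names `run_probability` and `simulated_rate`, the seed, and the trial count are illustrative assumptions, not anything from the text; the simulation only suggests how often five heads turn up by chance.

```python
import random
from fractions import Fraction


def run_probability(p, n):
    """Chance of n successes in a row, assuming independent trials."""
    return Fraction(p) ** n


def simulated_rate(run_length, trials, seed=0):
    """Share of simulated sequences of fair tosses that come up all heads."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    hits = 0
    for _ in range(trials):
        if all(rng.random() < 0.5 for _ in range(run_length)):
            hits += 1
    return hits / trials


five_heads = run_probability(Fraction(1, 2), 5)
print(five_heads)                   # 1/32
print(simulated_rate(5, 100_000))   # near 0.03125; exact value depends on the seed
```

With enough simulated sequences the observed share settles near one in thirty-two, with no cheating involved.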
In my favorite, more subtle example of the same error, T. Gilovich, R. Vallone, and A. Tversky debunked a phenomenon that every basketball fan and player absolutely "knows" to be true: "hot hands," or streaks of successive baskets, magic minutes of "getting into the groove" or "finding the range," when every shot hits. The phenomenon sounds so obvious: when you’re hot you’re hot, and when you’re not you’re not. But "hot hands" does not exist. My colleagues studied every basket made by the Philadelphia 76ers for more than a season. They made two debunking discoveries: first, the probability of making a second basket did not rise following a successful shot; second, and more important, the number of "runs," or successful baskets in sequence, did not exceed the predictions of a standard random, or coin-tossing, model. Remember that, on average, you will flip five heads in a row once in every thirty-two sequences of five tosses. We can, by analogy, compute expected runs for any basketball player. Suppose that Mr. Swish, a particularly good shooter, succeeds in 60 percent of his field-goal attempts. He should then notch six baskets in a row once every 20 sequences or so (0.6 × 0.6 × 0.6 × 0.6 × 0.6 × 0.6, for 0.047, or 4.7 percent). If Swish’s actual play includes sequences of six at about this rate, then we have no evidence for hot hands, but only for Swish playing in his characteristic manner for each shot independently. Gilovich, Vallone, and Tversky found no sequences beyond the range of random expectations.
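The arithmetic for Mr. Swish can be reproduced in a few lines of Python. This is only a sketch of the independence calculation described above, not the method of the basketball study; the helper name `streak_probability` is hypothetical.

```python
def streak_probability(success_rate, streak_length):
    """Chance of making every shot in a sequence, assuming independent shots."""
    return success_rate ** streak_length


p_six = streak_probability(0.6, 6)
sequences_per_streak = 1 / p_six

print(f"{p_six:.3f}")                 # 0.047
print(f"{sequences_per_streak:.1f}")  # 21.4
```

The result, roughly one streak in every twenty-one sequences of six shots, matches the "once every 20 sequences or so" of the text.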
My colleague Ed Purcell, a Nobel Prize winner in physics but just a keen baseball fan in this context, then did a similar study of baseball streaks and slumps, and we published the results together (Gould, 1988). Purcell found that among all runs, the subject of so much mythology about heroes (and goats), only one record stands beyond reasonable probability, and should not have happened at all—Joe DiMaggio’s fifty-six-game hitting streak in 1941—thus validating the feeling of many fans that DiMaggio’s splendid run is the greatest achievement in modern sports (and exonerating all the poor schlumps whose runs of failure lie entirely within the expectations of their characteristic probabilities!).
As one final example, probably more intellectual energy has been invested in discovering (and exploiting) trends in the stock market than in any other subject—for the obvious reason that stakes are so high, as measured in the currency of our culture. The fact that no one has ever come close to finding a consistent way to beat the system—despite intense efforts by some of the smartest people in the world—probably indicates that such causal trends do not exist, and that the sequences are effectively random.
In the second most prominent fallacy about trends, people correctly identify a genuine directionality, but then fall into the error of assuming that something else moving in the same direction at the same time must be acting as the cause. This error, the conflation of correlation with causality, arises for the obvious reason (once you think about it) that, at any moment, oodles of things must be moving in the same direction (Halley’s comet is receding from earth and my cat is getting more ornery)—and the vast majority of these correlated sequences cannot be causally related. In the classic illustration, a famous statistician once showed a precise correlation between arrests for public drunkenness and the number of Baptist preachers in nineteenth-century America. The correlation is real and intense, but we may assume that the two increases are causally unrelated, and that both arise as consequences of a single different factor: a marked general increase in the American population.
The error detailed in this book has not often been named or identified, but may be just as prominent in our fallacious thinking about trends. I shall focus on two central examples from two dramatically different cultural realms: "Why does no one hit 0.400 anymore in baseball?" and "How does progress characterize the history of life?" These are classic trends, in the sense that each encapsulates the essence and history of an important institution, and both have moral implications—one, in baseball, apparently trying to tell us that something about modern life causes excellence, or old-fashioned virtue, to degenerate; the other, for life, providing our necessary solace and excuse for continuing to view ourselves as lords of all.
I shall not use the juxtaposition of these examples to present pap and nonsense about how life imitates baseball, or vice versa. But I will show that the same error has led us to view both trends the wrong way round. Straighten out the fallacy, and you will see that the disappearance of 0.400 hitting illustrates the increasing excellence of play in baseball (however paradoxical such a claim may sound at first)—while life, on the other hand, shows no general thrust to improvement, but just adds an occasional exemplar of complexity in the only region of available anatomical space, while maintaining, for more than 3 billion years, an unvarying bacterial mode. Baseball has improved, but life has always been, and will probably always remain until the sun explodes, in the Age of Bacteria.
The common error lies in failing to recognize that apparent trends can be generated as by-products, or side consequences, of expansions and contractions in the amount of variation within a system, and not by anything directly moving anywhere. Average values may, in fact, stay constant within the system (as average batting percentages have done in major-league baseball, and as the bacterial mode has remained for life)—while our (mis)perception of a trend may represent only our myopic focus on rare objects at one extreme in a system’s variation (as this periphery expands or contracts). And the reasons for expansion or contraction of a periphery may be very different from causes for a change in average values. Thus, if we mistake the growth or shrinkage of an edge for movement of an entire mass, we may devise a backwards explanation. I will show that the disappearance of 0.400 hitting marks the shrinkage of such an edge caused by increasing excellence in play, not the extinction of a cherished entity (which would surely signify degeneration of something, and a loss of excellence).
Let me illustrate this unfamiliar concept with a simple (and silly) example to show how, in two cases, an apparent trend may arise only by expansion or contraction of variation. In both cases we tend to misinterpret a phenomenon because we maintain such strong preferences for viewing trends as entities moving somewhere.
The one hundred inhabitants of a mythical land subsist on an identical diet and all weigh one hundred pounds. In my first case, an argument about nutrition develops, with some folks pushing a new (and particularly calorific) brand of cake, and others advocating increased abstemiousness. Most members of the population don’t give a damn and stay where they are, but ten folks eat copious amounts of cake and now average 150 pounds, while ten others run and starve to reach an average weight of fifty pounds. The mean of the population hasn’t altered at all, remaining right at its old value of one hundred pounds, but variation in weight has expanded markedly (and symmetrically in both directions).
Cake-makers, pushing the aesthetic beauty of the new and fuller look, might celebrate a trend to greater weight by focusing on the small subset of people under their influence, and ignoring the others—just as the running-and-dieting moralists might exalt twigginess and praise a supposed trend in this direction by isolating their own small subset. But no general trend has occurred at all, at least in the usual sense. The average of the population has not altered by a single pound, and most people (80 percent in this case) have not varied their weight by an ounce. The only change has been a symmetrical expansion of variation on both sides of a constant mean weight. (You may recognize this increased spread as significant, of course, but we usually don’t describe such nondirectional changes as "trends.")
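The first case can be put into numbers with a short Python sketch. As a simplifying assumption, each of the ten cake-eaters is placed at exactly 150 pounds and each of the ten runner-dieters at exactly fifty, although the text gives only the averages for the two groups; the variable names are illustrative.

```python
from statistics import mean, pstdev

# One hundred inhabitants, all at 100 pounds before the argument begins.
before = [100] * 100

# Afterward: 80 unchanged, 10 cake-eaters at 150, 10 runner-dieters at 50
# (individual weights are an assumption; only group averages are given).
after = [100] * 80 + [150] * 10 + [50] * 10

unchanged = sum(1 for b, a in zip(before, after) if b == a)

print(mean(before), mean(after))   # 100 100
print(pstdev(before))              # 0.0
print(round(pstdev(after), 2))     # 22.36
print(unchanged)                   # 80
```

The mean stays put while the standard deviation grows from zero to about twenty-two pounds, which is the symmetrical expansion of variation described above.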
You may choose to regard this example as both silly and transparent. Few of us would have any trouble identifying the actual changes, and we would laugh the shills of both cake-makers and runner-dieters out of town, if they tried to pass off the changes in their small subset as a general trend. But bear with me, for I shall show that many phenomena often perceived as trends, and either celebrated or lamented with gusto and acres of printer’s ink—the disappearance of 0.400 hitting among them—also represent symmetrical changes of variation around constant mean values, and therefore display the same fallacy, though better hidden.
My second case features a totalitarian society ruled by the runner-dieters. They have been pushing their line for so long that everyone has succumbed to social pressure and weighs fifty pounds. A more liberal regime takes over and permits free discussion about ideal weights. Fine, but for one catch imposed by physiology rather than politics: fifty pounds is the lower limit for sustaining life, and no one can get any thinner. Therefore, although citizens are now free to alter their weight, only one direction of change is possible. The great majority of inhabitants remain content with the old ways and elect to maintain themselves at fifty pounds. Fifteen percent of the population revels in its newfound freedom and begins to gain weight with abandon. Six months later, these fifteen individuals average seventy-five pounds; after a year, one hundred pounds; and after two years, 150 pounds.
The statistical spin doctors for the fat fifteen now step in. They argue that their clients’ point of view is sweeping through the whole society, as clearly indicated by the steady increase of mean weight for the entire population. And who can deny their evidence? They even present a fancy graph (shown here as Figure 3). Before the liberation, average weight stood at fifty pounds; after six months the mean rises to 53.8 pounds (the average for eighty-five remaining at fifty pounds, and fifteen rising to seventy-five pounds); after a year to 57.5 pounds; and after two years to sixty-five pounds (an increase of 30 percent from the original fifty)—a steady, unreversed, and substantial rise.
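The spin doctors' figures can be recomputed with a brief Python sketch, assuming a population of one hundred (eighty-five stalwarts and fifteen gainers) with every gainer at exactly the stated average. The function name `population_mean` and its parameters are illustrative.

```python
def population_mean(gainer_weight, gainers=15, stalwarts=85, floor_weight=50):
    """Mean weight when the stalwarts stay at the floor and the gainers rise."""
    total = gainers * gainer_weight + stalwarts * floor_weight
    return total / (gainers + stalwarts)


for label, weight in [("six months", 75), ("one year", 100), ("two years", 150)]:
    print(label, population_mean(weight))
# six months 53.75
# one year 57.5
# two years 65.0

increase = (population_mean(150) - 50) / 50
print(f"{increase:.0%}")  # 30%
```

The six-month value of 53.75 is the figure rounded to 53.8 in the text. Every number in the rising series comes from only fifteen people changing weight.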
Again, you may view this example as silly (and purposely chosen to illustrate the obvious nature of the point, once you understand the whole system and its variation). Few people would be fooled, so long as they grasped the totality of the story, and knew that most members of the population had not changed their weight, and that the steady increase in mean values arises as an artifact produced by amalgamating two entirely different subpopulations—a majority of stalwarts with a minority of revolutionaries. But suppose you didn’t appreciate the whole tale, and only listened to the statistical spin doctors for the fat fifteen. Suppose, in addition, that you tended to imbue mean values (as I fear most of us do) with a reality transcending actual individuals and the variation among them. You might then be persuaded from Figure 3 that a general trend has swept through the population, thrusting it as a whole toward greater average weights.
FIGURE 3 Average weight of my hypothetical population plotted against time to show how a false impression of an overall trend may be generated.
We are more likely to be fooled by the second case, where limits to variation on one side of the average permit change in only one direction. The rise of mean values isn’t "false" in this second case, but the supposed trend is surely misleading in the sense of Mark Twain’s or Disraeli’s famous line (the quote has been attributed to both) about three kinds of falsification—"lies, damned lies, and statistics." I will present the technicalities later, but let me quickly state why such false impressions can emerge from correct data in this case—as so often exploited by economic pundits and political spin doctors. As in the cliché about skinning cats, there is more than one way to represent an "average." The most common method, technically called the mean, instructs us to add up all the values and divide by the number of cases. If ten kids have ten dollars among them, the mean wealth per kid is one dollar. But means can be grossly misleading—and never more so than in the type of example purposely chosen above: when variation can expand markedly in one direction and little or not at all in the other. For means will then drift toward the open end and give an impression (often quite false) that the whole population has moved in that direction.
After all, one kid may have a ten-dollar bill, and the other nine nothing. One dollar per kid would still be the mean value, but would such a figure accurately characterize the population? Similarly, to be serious about real cases, spin doctors for politicians in power often use mean incomes to paint dishonestly bright pictures. Suppose that, under a super-Reaganomic system with tax breaks only for the rich, a few millionaires add immense wealth while a vast mass of people at the poverty line either gain nothing or become poorer. The mean income may rise because one tycoon’s increase from, say, $6 million to $600 million per year may balance several million paupers. If one man gains $594 million and one hundred million people lose five dollars each (for a total of $500 million), mean income for the whole population will still rise—but no one would dare say (honestly) that the average person was making more money.
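The tycoon arithmetic can be verified with a small Python sketch. It assumes, for illustration only, that the whole population consists of the one tycoon plus the one hundred million people who each lose five dollars; the variable names are hypothetical.

```python
tycoon_gain = 594_000_000   # dollars gained by one person
losers = 100_000_000        # people who each lose money
loss_each = 5               # dollars lost per person

total_loss = losers * loss_each          # 500_000_000
net_change = tycoon_gain - total_loss    # 94_000_000

population = losers + 1                  # assumption: no one else exists
mean_change = net_change / population
share_worse_off = losers / population

print(net_change)               # 94000000
print(f"{mean_change:.2f}")     # 0.94
print(share_worse_off > 0.99)   # True
```

Mean income rises by about ninety-four cents per person even though virtually everyone in this toy population is poorer.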
Statisticians have developed other measures of average, or "central tendency," to deal with such cases. One alternative, called the mode, is defined as the most common value in the population. No mathematical rule can tell us which measure of central tendency will be most appropriate for any particular problem. Proper decisions rest upon knowledge of all factors in a given case, and upon basic honesty.
Would anyone dispute a claim that modes, rather than means, provide a better understanding of all the examples presented above? The modal amount of money for the ten kids is zip. The modal income for our population remains constant (or falls slightly), while the mean rises because one tycoon makes an immense killing. The modal weight for the population of my second silly example remains at fifty pounds. The fifteen gainers increase steadily (and the mean of the whole population therefore rises), but who would deny that stability of the majority best characterizes the population as a whole? (At the very least, allow me that you cannot represent the population by the rising mean values of Figure 3 if, for whatever personal reason, you choose to focus on the gainers, and that you must identify the stability of the majority as a major phenomenon.) I belabor this point because my second focal example, progress in the history of life, emerges as a delusion on precisely the same grounds. A few creatures have evolved greater complexity in the only direction open to variation. The mode has remained rock-solid on bacteria throughout the history of life, and bacteria, by any reasonable criterion, were in the beginning, are now, and ever shall be the most successful organisms on earth.
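The contrast between mean and mode in these examples can be shown with Python's standard `statistics` module. The lists below are illustrative reconstructions: one kid holding the ten-dollar bill, and a population of one hundred in which each of the fifteen gainers is assumed to weigh exactly 150 pounds after two years.

```python
from statistics import mean, mode

# Ten kids, ten dollars among them, all held by one kid.
kids = [10] + [0] * 9
print(mean(kids), mode(kids))         # 1 0

# Second case after two years: 85 stalwarts at 50 pounds, 15 gainers at 150.
weights = [50] * 85 + [150] * 15
print(mean(weights), mode(weights))   # 65 50
```

In both lists the mean is pulled toward the open end of the distribution, while the mode reports the value that most members actually hold.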
Variation as Universal Reality
I have tried to show how an apparent trend in a whole system, traditionally read as a "thing" (the population’s average, for example) moving somewhere, can represent a false reading based only on expansion or contraction of variation within the system. We make such errors either because we focus myopically upon the small set of changing extreme values and falsely read their alteration as a trend in the whole system (my first case, to be illustrated by 0.400 hitting in baseball), or because variation can expand or contract in only one direction, and we falsely characterize the system by a changing mean value, while a stable mode suggests a radically different interpretation (my second case, to be illustrated by the chimera of progress as the primary thrust of life’s history).
I am not saying that all trends fall victim to this error (genuine "things" do move somewhere sometimes), or that this "fallacy of reified variation"2 exceeds in importance the two more commonly recognized errors of confusing trends with random sequences, or conflating correlation with causality. But the variational fallacy has caused us to read some of our most important, and most intensely discussed, cultural trends in an ass-backwards manner. I am also intrigued by this fallacy because our general misunderstanding or undervaluation of variation raises a much deeper issue about the basic perception of physical reality.
We often portray taxonomy as the dullest of all fields, as expressed in a variety of deprecatory metaphors: hanging garments on nature’s coat-rack; placing items into pigeonholes; or (in an image properly resented by philatelists) sticking stamps into the album of reality. All these images clip the wings of taxonomy and reduce the science of classification to the dullest task of keeping things neat and tidy. But these portrayals also reflect a cardinal fallacy: the assumption of a fully objective nature "out there" and visible in the same way to any unprejudiced observer (the same image that I criticized in the first section of this chapter as "Huxley’s chessboard"). If such a vision could be sustained, I suppose that taxonomy would become the most boring of all sciences, for nature would then present a set of obvious pigeonholes, and taxonomists would search for occupants and shove them in—an enterprise requiring diligence, perhaps, but not much creativity or imagination.