FIGURE 5–10. Populations of cities (a power-law distribution), plotted on linear and log scales
Source: Graph adapted from Newman, 2005, p. 324.
Which brings us back to wars. Since wars fall into a power-law distribution, some of the mathematical properties of these distributions may help us understand the nature of wars and the mechanisms that give rise to them. For starters, power-law distributions with the exponent we see for wars do not even have a finite mean. There is no such thing as a “typical war.” We should not expect, even on average, that a war will proceed until the casualties pile up to an expected level and then will naturally wind itself down.
Also, power-law distributions are scale-free. As you slide up or down the line in the log-log graph, it always looks the same, namely, like a line. The mathematical implication is that as you magnify or shrink the units you are looking at, the distribution looks the same. Suppose that computer files of 2 kilobytes are a quarter as common as files of 1 kilobyte. Then if we stand back and look at files in higher ranges, we find the same thing: files of 2 megabytes are a quarter as common as files of 1 megabyte, and files of 2 terabytes are a quarter as common as files of 1 terabyte. In the case of wars, you can think of it this way. What are the odds of going from a small war, say, with 1,000 deaths, to a medium-size war, with 10,000 deaths? It’s the same as the odds of going from a medium-size war of 10,000 deaths to a large war of 100,000 deaths, or from a large war of 100,000 deaths to a historically large war of 1 million deaths, or from a historic war to a world war.
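For readers who like to see the machinery, the scale-free property can be checked in a few lines of Python. This is a minimal sketch assuming a power law with exponent 2, the value implied by the file-size example (doubling the size quarters the frequency); the exponent is illustrative, not fitted to data.

```python
# A power law p(x) proportional to x**(-alpha), with alpha = 2 so that
# doubling x quarters the frequency, as in the file-size example. The
# exponent is illustrative, not fitted to data.

alpha = 2.0

def relative_frequency(x):
    """Unnormalized power-law density p(x) ~ x**(-alpha)."""
    return x ** -alpha

# 1 kilobyte, 1 megabyte, 1 terabyte (in bytes):
for x in (1e3, 1e6, 1e12):
    ratio = relative_frequency(2 * x) / relative_frequency(x)
    print(f"{2 * x:.0e}-byte files vs {x:.0e}-byte files: ratio = {ratio:.3f}")
# The ratio is 0.250 at every scale: slide up or down the line and the
# distribution looks the same.
```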
Finally, power-law distributions have “thick tails,” meaning that they have a nonnegligible number of extreme values. You will never meet a 20-foot man, or see a car driving down the freeway at 500 miles per hour. But you could conceivably come across a city of 14 million, or a book that was on the bestseller list for 10 years, or a moon crater big enough to see from the earth with the naked eye—or a war that killed 55 million people.
The thick tail of a power-law distribution, which declines gradually rather than precipitously as you rocket up the magnitude scale, means that extreme values are extremely unlikely but not astronomically unlikely. It’s an important difference. The chances of meeting a 20-foot-tall man are astronomically unlikely; you can bet your life it will never happen. But the chances that a city will grow to 20 million, or that a book will stay on the bestseller list for 20 years, are merely extremely unlikely—it probably won’t happen, but you could well imagine it happening. I hardly need to point out the implications for war. It is extremely unlikely that the world will see a war that will kill 100 million people, and less likely still that it will have one that will kill a billion. But in an age of nuclear weapons, our terrified imaginations and the mathematics of power-law distributions agree: it is not astronomically unlikely.
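The contrast between the two kinds of tails can be put in numbers. The sketch below compares a normal (thin-tailed) distribution with a power law whose survival probability falls off as 1/x; that exponent and the ten-standard-deviation cutoff are both illustrative choices, and a 20-foot man would lie far beyond even ten sigma.

```python
import math

# A thin tail versus a thick tail, in numbers. The power law's survival
# probability is set to fall off as 1/x (an illustrative exponent), and
# the 10-standard-deviation cutoff is likewise arbitrary.

def normal_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def power_law_tail(x):
    """P(X > x) for a power law with P(X > x) = 1/x, x >= 1."""
    return 1.0 / x

print(f"normal, 10 sigma out: {normal_tail(10):.1e}")    # ~ 8e-24: astronomical
print(f"power law at x = 10:  {power_law_tail(10):.1e}") # 1e-01: merely unlikely
```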
So far I’ve been discussing the causes of war as Platonic abstractions, as if armies were sent into war by equations. What we really need to understand is why wars distribute themselves as power laws; that is, what combination of psychology and politics and technology could generate this pattern. At present we can’t be sure of the answer. Too many kinds of mechanisms can give rise to power-law distributions, and the data on wars are not precise enough to tell us which is at work.
Still, the scale-free nature of the distribution of deadly quarrels gives us an insight into the drivers of war.60 Intuitively, it suggests that size doesn’t matter. The same psychological or game-theoretic dynamics that govern whether quarreling coalitions will threaten, back down, bluff, engage, escalate, fight on, or surrender apply whether the coalitions are street gangs, militias, or armies of great powers. Presumably this is because humans are social animals who aggregate into coalitions, which amalgamate into larger coalitions, and so on. Yet at any scale these coalitions may be sent into battle by a single clique or individual, be it a gang leader, capo, warlord, king, or emperor.
How can the intuition that size doesn’t matter be implemented in models of armed conflict that actually generate power-law distributions?61 The simplest is to assume that the coalitions themselves are power-law-distributed in size, that they fight each other in proportion to their numbers, and that they suffer losses in proportion to their sizes. We know that some human aggregations, namely municipalities, are power-law-distributed, and we know the reason. One of the commonest generators of a power-law distribution is preferential attachment: the bigger something is, the more new members it attracts. Preferential attachment is also known as accumulated advantage, the-rich-get-richer, and the Matthew Effect, after the passage in Matthew 25:29 that Billie Holiday summarized as “Them that’s got shall get, them that’s not shall lose.” Web sites that are popular attract more visitors, making them even more popular; bestselling books are put on bestseller lists, which lure more people into buying them; and cities with lots of people offer more professional and cultural opportunities so more people flock to them. (How are you going to keep them down on the farm after they’ve seen Paree?)
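Preferential attachment is easy to simulate. In the minimal sketch below, each newcomer joins an existing city with probability proportional to its population, or founds a new city with a small probability; the parameter values are illustrative, not calibrated to real demography.

```python
import random

# A minimal preferential-attachment simulation: each newcomer joins an
# existing city with probability proportional to its population, or
# founds a new city with probability p_new. Parameters are illustrative.

random.seed(42)
p_new = 0.01
cities = [1]        # city populations; start with a single settlement
residents = [0]     # one entry per resident, recording their city's index

for _ in range(200_000):
    if random.random() < p_new:
        cities.append(1)                 # found a new city
        residents.append(len(cities) - 1)
    else:
        # Picking a random existing resident weights cities by size:
        # the rich get richer.
        idx = random.choice(residents)
        cities[idx] += 1
        residents.append(idx)

cities.sort(reverse=True)
print("five largest cities:", cities[:5])
print("median city size:   ", cities[len(cities) // 2])
# A handful of giant cities tower over a median hamlet: the heavy-tailed
# signature of accumulated advantage.
```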
Richardson considered this simple explanation but found that the numbers didn’t add up.62 If deadly quarrels reflected city sizes, then for every tenfold reduction in the size of a quarrel, there should be ten times as many of them, but in fact there are fewer than four times as many. Also, in recent centuries wars have been fought by states, not cities, and states follow a log-normal distribution (a warped bell curve) rather than a power law.
Another kind of mechanism has been suggested by the science of complex systems, which looks for laws that govern structures that are organized into similar patterns despite being made of different stuff. Many complexity theorists are intrigued by systems that display a pattern called self-organized criticality. You can think of “criticality” as the straw that broke the camel’s back: a small input causes a sudden large output. “Self-organized” criticality would be a camel whose back healed right back to the exact strength at which straws of various sizes could break it again. A good example is a trickle of sand falling onto a sandpile, which periodically causes landslides of different sizes; the landslides are distributed according to a power law. An avalanche of sand stops at a point where the slope is just shallow enough to be stable, but the new sand trickling onto it steepens the slope and sets off a new avalanche. Earthquakes and forest fires are other examples. A fire burns a forest, which allows trees to grow back at random, forming clusters that can grow into each other and fuel another fire. Several political scientists have developed computer simulations that model wars on an analogy to forest fires.63 In these models, countries conquer their neighbors and create larger countries in the same way that patches of trees grow into each other and create larger patches. Just as a cigarette tossed in a forest can set off either a brushfire or a conflagration, a destabilizing event in the simulation of states can set off either a skirmish or a world war.
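The sandpile itself can be simulated in a few dozen lines. Here is a minimal version of the Bak-Tang-Wiesenfeld model, the canonical toy of self-organized criticality; the grid size and the number of grains are arbitrary.

```python
import random

# A minimal Bak-Tang-Wiesenfeld sandpile: grains trickle onto a grid,
# any site holding four or more grains topples one grain to each
# neighbor (border grains fall off the edge), and avalanche sizes come
# out power-law distributed. Grid size and grain count are arbitrary.

random.seed(0)
N = 20
grid = [[0] * N for _ in range(N)]
avalanche_sizes = []

for _ in range(50_000):
    r, c = random.randrange(N), random.randrange(N)
    grid[r][c] += 1                      # one grain of sand trickles in
    toppled = 0
    stack = [(r, c)] if grid[r][c] >= 4 else []
    while stack:
        i, j = stack.pop()
        if grid[i][j] < 4:               # already relaxed by an earlier pop
            continue
        grid[i][j] -= 4
        toppled += 1
        if grid[i][j] >= 4:              # still unstable after toppling
            stack.append((i, j))
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < N and 0 <= nj < N:
                grid[ni][nj] += 1
                if grid[ni][nj] >= 4:
                    stack.append((ni, nj))
    if toppled:
        avalanche_sizes.append(toppled)

avalanche_sizes.sort()
print(f"{len(avalanche_sizes)} avalanches, "
      f"median size {avalanche_sizes[len(avalanche_sizes) // 2]}, "
      f"largest {avalanche_sizes[-1]}")
# Countless small landslides punctuated by rare enormous ones.
```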
In these simulations, the destructiveness of a war depends mainly on the territorial size of the combatants and their alliances. But in the real world, variations in destructiveness also depend on the resolve of the two parties to keep a war going, with each hoping that the other will collapse first. Some of the bloodiest conflicts in modern history, such as the American Civil War, World War I, the Vietnam War, and the Iran-Iraq War, were wars of attrition, where both sides kept shoveling men and matériel into the maw of the war machine hoping that the other side would exhaust itself first.
John Maynard Smith, the biologist who first applied game theory to evolution, modeled this kind of standoff as a War of Attrition game.64 Each of two contestants competes for a valuable resource by trying to outlast the other, steadily accumulating costs as he waits. In the original scenario, they might be heavily armored animals competing for a territory who stare at each other until one of them leaves; the costs are the time and energy the animals waste in the standoff, which they could otherwise use in catching food or pursuing mates. A game of attrition is mathematically equivalent to an auction in which the highest bidder wins the prize and both sides have to pay the loser’s low bid. And of course it can be analogized to a war in which the expenditure is reckoned in the lives of soldiers.
The War of Attrition is one of those paradoxical scenarios in game theory (like the Prisoner’s Dilemma, the Tragedy of the Commons, and the Dollar Auction) in which a set of rational actors pursuing their interests end up worse off than if they had put their heads together and come to a collective and binding agreement. One might think that in an attrition game each side should do what bidders on eBay are advised to do: decide how much the contested resource is worth and bid only up to that limit. The problem is that this strategy can be gamed by another bidder. All he has to do is bid one more dollar (or wait just a bit longer, or commit another surge of soldiers), and he wins. He gets the prize for close to the amount you think it is worth, while you have to forfeit that amount too, without getting anything in return. You would be crazy to let that happen, so you are tempted to use the strategy “Always outbid him by a dollar,” which he is tempted to adopt as well. You can see where this leads. Thanks to the perverse logic of an attrition game, in which the loser pays too, the bidders may keep bidding after the point at which the expenditure exceeds the value of the prize. They can no longer win, but each side hopes not to lose as much. The technical term for this outcome in game theory is “a ruinous situation.” It is also called a “Pyrrhic victory”; the military analogy is profound.
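The runaway logic can be watched in miniature. The toy simulation below follows a Dollar Auction-style contest in which the trailing bidder myopically compares quitting (forfeiting his bid) with raising one more dollar in the hope that the leader then quits; the $100 prize and the cap on raises are arbitrary.

```python
# A toy dollar-auction escalation with myopic bidders: the trailing
# bidder compares quitting (forfeit his bid) with raising one dollar in
# the hope the leader then quits. All numbers are illustrative.

prize = 100
low, high = 0, 1          # trailing bid and leading bid
raises = 0

while raises < 400:       # artificial cap; the logic alone never stops
    # Quit: lose `low`. Raise and (hopefully) win: net prize - (high + 1).
    if prize - (high + 1) > -low:
        low, high = high, high + 1    # trailing bidder takes the lead
        raises += 1
    else:
        break

print(f"bidding reached ${high} for a ${prize} prize after {raises} raises")
# Both bidders sail past the prize's value: the "ruinous situation".
```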
One strategy that can evolve in a War of Attrition game (where the expenditure, recall, is in time) is for each player to wait a random amount of time, with an average wait time that is equivalent in value to what the resource is worth to them. In the long run, each player gets good value for his expenditure, but because the waiting times are random, neither is able to predict the surrender time of the other and reliably outlast him. In other words, they follow the rule: At every instant throw a pair of dice, and if they come up (say) 4, concede; if not, throw them again. This is, of course, like a Poisson process, and by now you know that it leads to an exponential distribution of wait times (since a longer and longer wait depends on a less and less probable run of tosses). Since the contest ends when the first side throws in the towel, the contest durations will also be exponentially distributed. Returning to our model where the expenditures are in soldiers rather than seconds, if real wars of attrition were like the “War of Attrition” modeled in game theory, and if all else were equal, then wars of attrition would fall into an exponential distribution of magnitudes.
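That chain of reasoning, from random concession times to an exponential distribution of contest lengths, can be checked by simulation. The sketch below assumes each side concedes at every instant with a small fixed probability; the probability itself, roughly the chance of a particular total on a pair of dice, is illustrative.

```python
import random
from collections import Counter

# Each side concedes at every instant with a small fixed probability p,
# so the contest lasts only while neither concedes. The value of p is
# illustrative.

random.seed(1)
p = 1 / 12
durations = []

for _ in range(100_000):
    t = 1
    while random.random() < (1 - p) ** 2:   # neither side concedes
        t += 1
    durations.append(t)

counts = Counter(durations)
for t in range(1, 7):
    print(f"contests lasting {t} instants: {counts[t]}")
# Each extra instant cuts the count by the same factor, (1 - p)**2,
# about 0.84: the signature of an exponential distribution.
```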
Of course, real wars fall into a power-law distribution, which has a thicker tail than an exponential (in this case, a greater number of severe wars). But an exponential can be transformed into a power law if the values are modulated by a second exponential process pushing in the opposite direction. And attrition games have a twist that might do just that. If one side in an attrition game were to leak its intention to concede in the next instant by, say, twitching or blanching or showing some other sign of nervousness, its opponent could capitalize on the “tell” by waiting just a bit longer, and it would win the prize every time. As Richard Dawkins has put it, in a species that often takes part in wars of attrition, one expects the evolution of a poker face.
Now, one also might have guessed that organisms would capitalize on the opposite kind of signal, a sign of continuing resolve rather than impending surrender. If a contestant could adopt some defiant posture that means “I’ll stand my ground; I won’t back down,” that would make it rational for his opposite number to give up and cut its losses rather than escalate to mutual ruin. But there’s a reason we call it “posturing.” Any coward can cross his arms and glower, but the other side can simply call his bluff. Only if a signal is costly—if the defiant party holds his hand over a candle, or cuts his arm with a knife—can he show that he means business. (Of course, paying a self-imposed cost would be worthwhile only if the prize is especially valuable to him, or if he had reason to believe that he could prevail over his opponent if the contest escalated.)
In the case of a war of attrition, one can imagine a leader who has a changing willingness to suffer a cost over time, increasing as the conflict proceeds and his resolve toughens. His motto would be: “We fight on so that our boys shall not have died in vain.” This mindset, known as loss aversion, the sunk-cost fallacy, and throwing good money after bad, is patently irrational, but it is surprisingly pervasive in human decision-making.65 People stay in an abusive marriage because of the years they have already put into it, or sit through a bad movie because they have already paid for the ticket, or try to reverse a gambling loss by doubling their next bet, or pour money into a boondoggle because they’ve already poured so much money into it. Though psychologists don’t fully understand why people are suckers for sunk costs, a common explanation is that it signals a public commitment. The person is announcing: “When I make a decision, I’m not so weak, stupid, or indecisive that I can be easily talked out of it.” In a contest of resolve like an attrition game, loss aversion could serve as a costly and hence credible signal that the contestant is not about to concede, preempting his opponent’s strategy of outlasting him just one more round.
I already mentioned some evidence from Richardson’s dataset which suggests that combatants do fight longer when a war is more lethal: small wars show a higher probability of coming to an end with each succeeding year than do large wars.66 The magnitude numbers in the Correlates of War Dataset also show signs of escalating commitment: wars that are longer in duration are not just costlier in fatalities; they are costlier than one would expect from their durations alone.67 If we pop back from the statistics of war to the conduct of actual wars, we can see the mechanism at work. Many of the bloodiest wars in history owe their destructiveness to leaders on one or both sides pursuing a blatantly irrational loss-aversion strategy. Hitler fought the last months of World War II with a maniacal fury well past the point when defeat was all but certain, as did Japan. Lyndon Johnson’s repeated escalations of the Vietnam War inspired a protest song that has served as a summary of people’s understanding of that destructive war: “We were waist-deep in the Big Muddy; The big fool said to push on.”
The systems biologist Jean-Baptiste Michel has pointed out to me how escalating commitments in a war of attrition could produce a power-law distribution. All we need to assume is that leaders keep escalating as a constant proportion of their past commitment—the size of each surge is, say, 10 percent of the number of soldiers that have fought so far. A constant proportional increase would be consistent with the well-known discovery in psychology called Weber’s Law: for an increase in intensity to be noticeable, it must be a constant proportion of the existing intensity. (If a room is illuminated by ten lightbulbs, you’ll notice a brightening when an eleventh is switched on, but if it is illuminated by a hundred lightbulbs, you won’t notice the hundred and first; someone would have to switch on another ten bulbs before you noticed the brightening.) Richardson observed that people perceive lost lives in the same way: “Contrast for example the many days of newspaper-sympathy over the loss of the British submarine Thetis in time of peace with the terse announcement of similar losses during the war. This contrast may be regarded as an example of the Weber-Fechner doctrine that an increment is judged relative to the previous amount.”68 The psychologist Paul Slovic has recently reviewed several experiments that support this observation.69 The quotation falsely attributed to Stalin, “One death is a tragedy; a million deaths is a statistic,” gets the numbers wrong but captures a real fact about human psychology.
If escalations are proportional to past commitments (and a constant proportion of soldiers sent to the battlefield are killed in battle), then losses will increase exponentially as a war drags on, like compound interest. And if wars are attrition games, their durations will also be distributed exponentially. Recall the mathematical law that a variable will fall into a power-law distribution if it is an exponential function of a second variable that is distributed exponentially.70 My own guess is that the combination of escalation and attrition is the best explanation for the power-law distribution of war magnitudes.
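The arithmetic behind that guess can be sketched directly: draw durations from an exponential distribution, let losses grow exponentially with duration, and the magnitudes fall into a power law. The growth and decay rates below are illustrative; the resulting tail exponent is their ratio.

```python
import math
import random

# Durations drawn from an exponential distribution (the attrition game)
# are fed through exponential escalation (proportional surges). The
# rates lam and g are illustrative; the tail exponent of the resulting
# power law is lam / g.

random.seed(2)
lam, g = 2.0, 1.0
magnitudes = [math.exp(g * random.expovariate(lam)) for _ in range(100_000)]

def tail_probability(m):
    """Empirical P(magnitude > m)."""
    return sum(x > m for x in magnitudes) / len(magnitudes)

for m in (2, 4, 8, 16):
    print(f"P(M > {m:2d}) = {tail_probability(m):.4f} "
          f"(theory: {m ** -(lam / g):.4f})")
# Each doubling of m divides the tail probability by the same factor,
# 2 ** (lam / g) = 4, at every scale: a power law.
```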
Though we may not know exactly why wars fall into a power-law distribution, the nature of that distribution—scale-free, thick-tailed—suggests that it involves a set of underlying processes in which size doesn’t matter. Armed coalitions can always get a bit larger, wars can always last a bit longer, and losses can always get a bit heavier, with the same likelihood regardless of how large, long, or heavy they were to start with.
The next obvious question about the statistics of deadly quarrels is: What destroys more lives, the large number of small wars or the few big ones? A power-law distribution itself doesn’t give the answer. One can imagine a dataset in which the aggregate damage from the wars of each size adds up to the same number of deaths: one war with ten million deaths, ten wars with a million deaths, a hundred wars with a hundred thousand deaths, all the way down to ten million murders with one death apiece. But in fact, distributions with exponents greater than one (which is what we get for wars) will have the numbers skewed toward the tail. A power-law distribution with an exponent in this range is sometimes said to follow the 80:20 rule, also known as the Pareto Principle, in which, say, the richest 20 percent of the population controls 80 percent of the wealth. The ratio may not be 80:20 exactly, but many power-law distributions have this kind of lopsidedness. For example, the 20 percent most popular Web sites get around two-thirds of the hits.71
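For a Pareto distribution, the share going to the tail can be computed exactly. The sketch below assumes a survival function falling off as x to the power −a with a greater than 1 (so the total is finite); under that assumption the top fraction q of the population accounts for q^((a−1)/a) of the aggregate, and the exponents chosen are illustrative.

```python
# For a Pareto distribution with survival function P(X > x) ~ x**(-a)
# and a > 1 (so the total is finite), the top fraction q of the
# population accounts for q**((a - 1) / a) of the aggregate. The
# exponents below are illustrative.

def top_share(q, a):
    """Fraction of the aggregate held by the top fraction q."""
    return q ** ((a - 1) / a)

for a in (1.16, 1.35, 2.0):
    print(f"tail exponent {a}: top 20% holds {top_share(0.2, a):.0%}")
# a near 1.16 reproduces the classic 80:20 split; a near 1.35 yields
# the two-thirds-of-hits figure for the most popular websites.
```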