To understand The Bell Curve, we need to begin with a definition of “intelligence.” Predictably, Murray and Herrnstein chose a narrow definition of intelligence—one that brings us back to nineteenth-century biometrics and eugenics. Galton and his disciples, we might recall, were obsessed with the measurement of intelligence. Between 1890 and 1910, dozens of tests were devised in Europe and America that purported to measure intelligence in some unbiased and quantitative manner. In 1904, Charles Spearman, a British statistician, noted an important feature of these tests: people who did well in one test generally tended to do well in another test. Spearman hypothesized that this positive correlation existed because all the tests were obliquely measuring some mysterious common factor. This factor, Spearman proposed, was not knowledge itself, but the capacity to acquire and manipulate abstract knowledge. Spearman called it “general intelligence.” He labeled it g.
By the early twentieth century, g had caught the imagination of the public. First, it captivated early eugenicists. In 1916, the Stanford psychologist Lewis Terman, an avid supporter of the American eugenics movement, created a standardized test to rapidly and quantitatively assess general intelligence, hoping to use the test to select more intelligent humans for eugenic breeding. Recognizing that this measurement varied with age during childhood development, Terman advocated a new metric to quantify age-specific intelligence: the ratio of a subject’s “mental age” to his or her physical age, multiplied by 100. If a subject’s mental age was the same as his or her physical age, the “intelligence quotient,” or IQ, was exactly 100. If a subject lagged in mental age compared to physical age, the IQ fell below 100; if the subject was mentally more advanced, the IQ rose above 100.
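In its classic ratio form, the arithmetic is simple; the worked numbers below are illustrative, not drawn from any particular test:

```latex
% Ratio IQ, as used in early age-normed scoring (illustrative numbers only)
\mathrm{IQ} = \frac{\text{mental age}}{\text{chronological age}} \times 100,
\qquad
\frac{12}{10} \times 100 = 120,
\qquad
\frac{8}{10} \times 100 = 80.
```

A ten-year-old who performs like a typical twelve-year-old scores 120; one who performs like a typical eight-year-old scores 80.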
A numerical measure of intelligence was also particularly suited to the demands of the First and Second World Wars, during which recruits had to be assigned to wartime tasks requiring diverse skills based on rapid, quantitative assessments. When veterans returned to civilian life after the wars, they found their lives dominated by intelligence testing. By the early 1940s, such tests had become accepted as an inherent part of American culture. IQ tests were used to rank job applicants, place children in school, and recruit agents for the Secret Service. In the 1950s, Americans commonly listed their IQs on their résumés, submitted the results of a test for a job application, or even chose their spouses based on the test. IQ scores were pinned on the babies who were on display in Better Babies contests (although how IQ was measured in a two-year-old remained mysterious).
These rhetorical and historical shifts in the concept of intelligence are worth noting, for we will return to them in a few paragraphs. The factor g originated as a statistical correlation between tests given under particular circumstances to particular individuals. It morphed into the notion of “general intelligence” because of a hypothesis concerning the nature of human knowledge acquisition. And it was codified into “IQ” to serve the particular exigencies of war. In a cultural sense, the definition of g was an exquisitely self-reinforcing phenomenon: those who possessed it, anointed as “intelligent” and given the arbitration of the quality, had every incentive in the world to propagate its definition. Richard Dawkins, the evolutionary biologist, once defined a meme as a cultural unit that spreads virally through societies by mutating, replicating, and being selected. We might imagine g as such a self-propagating unit. We might even call it a “selfish g.”
It takes counterculture to counter culture—and it was only inevitable, perhaps, that the sweeping political movements that gripped America in the 1960s and 1970s would shake the notions of general intelligence and IQ by their very roots. As the civil rights movement and feminism highlighted chronic political and social inequalities in America, it became evident that biological and psychological features were not just inborn but likely to be deeply influenced by context and environment. The dogma of a single form of intelligence was also challenged by scientific evidence. Psychologists such as Louis Thurstone (in the fifties) and Howard Gardner (in the late seventies) argued that “general intelligence” was a rather clumsy way to lump together many vastly more context-specific and subtle forms of intelligence, such as visuospatial, mathematical, or verbal intelligence. A geneticist, revisiting this data, might have concluded that g—the measurement of a hypothetical quality invented to serve a particular context—was a trait barely worth linking to genes, but this hardly dissuaded Murray and Herrnstein. Drawing heavily from an earlier article by the psychologist Arthur Jensen, Murray and Herrnstein set out to prove that g was heritable, that it varied between ethnic groups, and—most crucially—that the racial disparity was due to inborn genetic differences between whites and African-Americans.
Is g heritable? In a certain sense, yes. In the 1950s, a series of reports suggested a strong genetic component. Of these, twin studies were the most definitive. When identical twins who had been reared together—i.e., with shared genes and shared environments—were tested in the early fifties, psychologists had found a striking degree of concordance in their IQs, with a correlation value of 0.86.III In the late eighties, when identical twins who were separated at birth and reared separately were tested, the correlation fell to 0.74—still a striking number.
But the heritability of a trait, no matter how strong, may be the result of multiple genes, each exerting a relatively minor effect. If that were the case, identical twins, who share every variant and every combination of variants, would show strong correlations in g, but parents and children, who share only half their variants and few of their particular combinations, would be far less concordant. IQ followed this pattern. The correlation between parents and children living together, for instance, fell to 0.42. With parents and children living apart, the correlation collapsed to 0.22. Whatever the IQ test was measuring, it was a heritable factor, but one also influenced by many genes and possibly strongly modified by environment—part nature and part nurture.
The most logical conclusion from these facts is that while some combination of genes and environments can strongly influence g, this combination will rarely be passed, intact, from parents to their children. Mendel’s laws virtually guarantee that the particular permutation of genes will scatter apart in every generation. And environmental interactions are so difficult to capture and predict that they cannot be reproduced over time. Intelligence, in short, is heritable (i.e., influenced by genes), but not easily inheritable (i.e., moved down intact from one generation to the next).
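A minimal simulation can make this distinction concrete. The sketch below assumes a purely additive trait shaped by many variants of small effect plus independent environmental noise; every parameter is invented for illustration, and the printed numbers are meant only to show the qualitative pattern (identical twins far more concordant than parents and children), not to reproduce the published correlations.

```python
import numpy as np

# A minimal sketch, not a model of any real study: a purely additive trait
# shaped by many variants of small effect, plus independent environmental
# noise. Every parameter here is invented for illustration.
rng = np.random.default_rng(0)
n_families, n_loci, noise_sd = 50_000, 100, 4.0

def random_genomes(n):
    # two alleles (0 or 1) per locus per individual
    return rng.integers(0, 2, size=(n, n_loci, 2))

def trait(genomes):
    # trait value = count of "1" alleles + fresh environmental noise
    return genomes.sum(axis=(1, 2)) + rng.normal(0.0, noise_sd, size=len(genomes))

def gamete(genomes):
    # Mendelian segregation: at each locus, transmit one of the two alleles at random
    picks = rng.integers(0, 2, size=genomes.shape[:2])
    return np.take_along_axis(genomes, picks[..., None], axis=2)[..., 0]

mothers, fathers = random_genomes(n_families), random_genomes(n_families)
children = np.stack([gamete(mothers), gamete(fathers)], axis=2)

# Identical twins reared apart: identical genomes, independent environments
twin_a, twin_b = trait(children), trait(children)
print("identical-twin correlation:", round(np.corrcoef(twin_a, twin_b)[0, 1], 2))
print("parent-child correlation:  ", round(np.corrcoef(trait(mothers), trait(children))[0, 1], 2))
```

With these made-up settings, the twin correlation comes out roughly twice the parent-child correlation, which is the shape of the pattern described above: the genes are transmitted, but the particular combination is not.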
Had Murray and Herrnstein reached these conclusions, they would have published an accurate, if rather uncontroversial, book on the inheritance of intelligence. But the molten centerpiece of The Bell Curve is not the heritability of IQ but its racial distribution. Murray and Herrnstein began by reviewing 156 independent studies that had compared IQs between races. Taken together, these studies had found an average IQ of 100 for whites (by definition, the average IQ of the index population has to be 100) and 85 for African-Americans—a 15-point difference. Murray and Herrnstein tried, somewhat valiantly, to rule out the possibility that the tests were biased against African-Americans. They limited their analysis to tests administered after 1960, and only to tests given outside the South, hoping to curtail endemic biases, but the 15-point difference persisted.
Could the difference in black-white IQ scores be a result of socioeconomic status? That impoverished children, regardless of race, perform worse on IQ tests had been known for decades. Indeed, of all the hypotheses concerning the difference in racial IQs, this was, by far, the most plausible: that the greater part of the black-white difference might be the consequence of the overrepresentation of poor African-American children. In the 1990s, the psychologist Eric Turkheimer strongly validated this theory by demonstrating that genes play a rather minor role in determining IQ in severely impoverished circumstances. If you superpose poverty, hunger, and illness on a child, then these variables dominate the influence on IQ. Genes that control IQ become significant only if you remove these limitations.
It is easy to demonstrate an analogous effect in a lab: If you raise two plant strains—one tall and one short—in undernourished circumstances, then both plants grow short regardless of intrinsic genetic drive. In contrast, when nutrients are no longer limiting, the tall plant grows to its full height. Whether genes or environment—nature or nurture—dominates in influence depends on context. When environments are constraining, they exert a disproportionate influence. When the constraints are removed, genes become ascendant.IV
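The plant experiment can be caricatured in a few lines of simulation. The sketch below relies on made-up assumptions (a normally distributed genetic “height potential,” a hard nutrient ceiling, a little non-genetic scatter); it shows only the qualitative point that the share of variation attributable to genes collapses when the environment is the binding constraint.

```python
import numpy as np

# A toy caricature of the plant experiment above (all parameters invented):
# each plant has a genetic "height potential," but realized height is capped
# by whatever the available nutrients can support.
rng = np.random.default_rng(1)
n_plants = 100_000
genetic_potential = rng.normal(100.0, 15.0, n_plants)  # arbitrary units

def realized_height(nutrient_ceiling):
    noise = rng.normal(0.0, 5.0, n_plants)  # non-genetic scatter
    return np.minimum(genetic_potential, nutrient_ceiling) + noise

def variance_share_from_genes(heights):
    # squared correlation with genetic potential: a crude stand-in for
    # "how much the genes matter" in this toy setting
    return np.corrcoef(genetic_potential, heights)[0, 1] ** 2

print("rich soil:", round(variance_share_from_genes(realized_height(1000.0)), 2))  # genes dominate
print("poor soil:", round(variance_share_from_genes(realized_height(70.0)), 2))    # environment dominates
```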
The effects of poverty and deprivation offered a perfectly reasonable cause for the overall black-white difference in IQ, but Murray and Herrnstein dug deeper. Even after correcting for socioeconomic status, they found, the black-white score difference could not be fully eliminated. If you plot IQ against socioeconomic status for whites and African-Americans, IQ increases in both groups, as expected: wealthier children score better than their poorer counterparts in both populations. Yet the difference in IQ scores across races persists. Indeed, paradoxically, the difference increases with socioeconomic status: far from narrowing, the gap between wealthy whites and wealthy African-Americans widens at the top brackets of income.
Quarts and quarts of ink have been spilled in books, magazines, scientific journals, and newspapers analyzing, cross-examining, and debunking these results. In a blistering article written for the New Yorker, for instance, the evolutionary biologist Stephen Jay Gould argued that the effect was far too mild, and the variation within tests was far too great, to draw any statistical conclusions about the difference. The Harvard sociologist Orlando Patterson, in the slyly titled “For Whom the Bell Curves,” reminded readers that the frayed legacies of slavery, racism, and bigotry had deepened the cultural rifts between whites and African-Americans so dramatically that biological attributes across races could not be compared in a meaningful way. Indeed, the social psychologist Claude Steele demonstrated that when black students are asked to take an IQ test under the pretext that they are being tested to try out a new electronic pen, or a new way of scoring, they perform well. When they are told that they are being tested for “intelligence,” however, their scores collapse. The real variable being measured, then, is not intelligence but an aptitude for test taking, or self-esteem, or simply ego or anxiety. In a society where black men and women experience routine, pervasive, and insidious discrimination, such a propensity could become fully self-reinforcing: black children do worse at tests because they’ve been told that they are worse at tests, which makes them perform badly on tests and furthers the idea that they are less intelligent—ad infinitum.
But the final fatal flaw in The Bell Curve is something far simpler, a fact buried so inconspicuously in a single throwaway paragraph in an eight-hundred-page book that it virtually disappears. If you take African-Americans and whites with identical IQ scores, say 105, and measure their performance in various subtests for intelligence, black children often score better in certain sets (tests of short-term memory and recall, for instance), while whites often score better in others (tests of visuospatial and perceptual changes). In other words, the way an IQ test is configured profoundly affects the way different racial groups, and their gene variants, perform on it: alter the weights and balances within the same test, and you alter the measure of intelligence.
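To see how much the weighting can matter, consider a deliberately artificial example; the groups, subtests, and scores below are invented, not taken from any of the studies discussed here. Two groups with identical composite scores can be pushed apart in either direction simply by changing how the subtests are weighted.

```python
# A deliberately artificial example: the groups, subtests, and scores are
# invented, not taken from any study discussed in the text.
recall = {"group_A": 110, "group_B": 100}        # short-term recall subtest
visuospatial = {"group_A": 100, "group_B": 110}  # visuospatial subtest

def composite(weight_recall):
    """Weighted composite score per group; the two weights sum to 1."""
    return {g: weight_recall * recall[g] + (1 - weight_recall) * visuospatial[g]
            for g in recall}

print(composite(0.5))  # equal weights: both groups score 105
print(composite(0.8))  # weight recall heavily: group_A pulls ahead (108 vs 102)
print(composite(0.2))  # weight visuospatial heavily: group_B pulls ahead (108 vs 102)
```

The composite is identical at equal weights; tilt the weights toward one subtest and one group appears ahead, tilt them the other way and the ordering reverses.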
The strongest evidence for such a bias comes from a largely forgotten study performed by Sandra Scarr and Richard Weinberg in 1976. Scarr studied transracial adoptees—black children adopted by white parents—and found that these children had an average IQ of 106, at least as high as that of white children. By analyzing carefully performed controls, Scarr concluded that it was not “intelligence” that had been enhanced, but performance on particular subtests of intelligence.
We cannot shrug this proposition away by suggesting that the current construction of the IQ test must be correct since it predicts performance in the real world. Of course it does—because the concept of IQ is powerfully self-reinforcing: it measures a quality imbued with enormous meaning and value whose job it is to propagate itself. The circle of its logic is perfectly closed and impenetrable. Yet the actual configuration of the test is relatively arbitrary. You do not render the word intelligence meaningless by shifting the balance in a test—from visuospatial perception to short-term recall, say—but you do shift the black-white IQ score discrepancy. And therein lies the rub. The tricky thing about the notion of g is that it pretends to be a biological quality that is measurable and heritable, while it is actually strongly determined by cultural priorities. It is—to simplify it somewhat—the most dangerous of all things: a meme masquerading as a gene.
If the history of medical genetics teaches us one lesson, it is to be wary of precisely such slips between biology and culture. Humans, we now know, are largely similar in genetic terms—but with enough variation within us to represent true diversity. Or, perhaps more accurately, we are culturally or biologically inclined to magnify variations, even if they are minor in the larger scheme of the genome. Tests that are explicitly designed to capture variances in abilities will likely capture variances in abilities—and these variations may well track along racial lines. But to call the score in such a test “intelligence,” especially when the score is uniquely sensitive to the configuration of the test, is to insult the very quality it sets out to measure.
Genes cannot tell us how to categorize or comprehend human diversity; environments can, cultures can, geographies can, histories can. Our language sputters in its attempt to capture this slip. When a genetic variation is statistically the most common, we call it normal—a word that implies not just superior statistical representation but qualitative or even moral superiority (Merriam-Webster’s dictionary has no less than eight definitions of the word, including “occurring naturally” and “mentally and physically healthy”). When the variation is rare, it is termed a mutant—a word that implies not just statistical uncommonness, but qualitative inferiority, or even moral repugnance.
And so it goes, interposing linguistic discrimination on genetic variation, mixing biology and desire. When a gene variant reduces an organism’s fitness in a particular environment—a hairless man in Antarctica—we call the phenomenon genetic illness. When the same variant increases fitness in a different environment, we call the organism genetically enhanced. The synthesis of evolutionary biology and genetics reminds us that these judgments are meaningless: enhancement or illness are words that measure the fitness of a particular genotype to a particular environment; if you alter the environment, the words can even reverse their meanings. “When nobody read,” the psychologist Alison Gopnik writes, “dyslexia wasn’t a problem. When most people had to hunt, a minor genetic variation in your ability to focus attention was hardly a problem, and may even have been an advantage [enabling a hunter to maintain his focus on multiple and simultaneous targets, for instance]. When most people have to make it through high school, the same variation can become a life-altering disease.”
The desire to categorize humans along racial lines, and the impulse to superpose attributes such as intelligence (or criminality, creativity, or violence) on those lines, illustrate a general theme concerning genetics and categorization. Like the English novel, or the face, say, the human genome can be lumped or split in a million different ways. But whether to split or lump, to categorize or synthesize, is a choice. When a distinct, heritable biological feature, such as a genetic illness (e.g., sickle-cell anemia), is the ascendant concern, then examining the genome to identify the locus of that feature makes absolute sense. The narrower the definition of the heritable feature or the trait, the more likely we will find a genetic locus for that trait, and the more likely that the trait will segregate within some human subpopulation (Ashkenazi Jews in the case of Tay-Sachs disease, or Afro-Caribbeans for sickle-cell anemia). There’s a reason that marathon running, for instance, is becoming a genetic sport: runners from Kenya and Ethiopia, a narrow eastern wedge of one continent, dominate the race not just because of talent and training, but also because the marathon is a narrowly defined test for a certain form of extreme fortitude. Genes that enable this fortitude (e.g., particular combinations of gene variants that produce distinct forms of anatomy, physiology, and metabolism) will be naturally selected.
Conversely, the more we widen the definition of a feature or trait (say, intelligence, or temperament), the less likely it is that the trait will correlate with single genes—and, by extension, with races, tribes, or subpopulations. Intelligence and temperament are not marathon races: there are no fixed criteria for success, no start or finish lines—and running sideways or backward might secure victory.
The narrowness, or breadth, of the definition of a feature is, in fact, a question of identity—i.e., how we define, categorize, and understand humans (ourselves) in a cultural, social, and political sense. The crucial missing element in our blurred conversation on the definition of race, then, is a conversation on the definition of identity.
* * *