Bunyan’s father was a brazier or tinker, but a tinker of recognized position in the village; and the mother was not of the squalid poor, but of people who were “decent and worthy in their ways.” This would be sufficient evidence for a rating between 90 and 100. But the record goes further, and we read that notwithstanding their “meanness and inconsiderableness,” Bunyan’s parents put their boy to school to learn “both to read and write,” which probably indicates that he showed something more than the promise of a future tinker (p. 90).

  Michael Faraday squeaked by at 105, overcoming the demerit of parental standing with snippets about his reliability as an errand boy and his questioning nature. His elevated A2 IQ of 150 only records increasing information about his more notable young manhood. In one case, however, Cox couldn’t bear to record the unpleasant result that her methods dictated. Shakespeare, of humble origin and unknown childhood, would have scored below 100. So Cox simply left him out, even though she included several others with equally inadequate childhood records.

  Among other curiosities of scoring that reflect Cox and Terman’s social prejudices, several precocious youngsters (Clive, Liebig, and Swift, in particular) were downgraded for their rebelliousness in school, particularly for their unwillingness to study classics. An animus against the performing arts is evident in the rating of composers, who (as a group) rank just above soldiers at the bottom of the final list. Consider the following understatement about Mozart (p. 129): “A child who learns to play the piano at 3, who receives and benefits by musical instruction at that age, and who studies and executes the most difficult counterpoint at age 14, is probably above the average level of his social group.”

  In the end, I suspect that Cox recognized the shaky basis of her work, but persisted bravely nonetheless. Correlations between rank in eminence (length of Cattell’s entry) and awarded IQ were disappointing to say the least—a mere 0.25 for eminence vs. A2 IQ, with no figure recorded at all for eminence vs. A1 IQ (it is a lower 0.20 by my calculation). Instead, Cox makes much of the fact that her ten most eminent subjects average 4—yes only 4—A1 IQ points above her ten least eminent.

  Cox calculated her strongest correlation (0.77) between A2 IQ and “index of reliability,” a measure of available information about her subjects. I can imagine no better demonstration that Cox’s IQ’s are artifacts of differential amounts of data, not measures of innate ability or even, for that matter, of simple talent. Cox recognized this and, in a final effort, tried to “correct” her scores for missing information by adjusting poorly documented subjects upward toward the group means of 135 for A1 IQ and 145 for A2 IQ. These adjustments boosted average IQ’s substantially, but led to other embarrassments. For uncorrected scores, the most eminent fifty averaged 142 for A1 IQ, while the least eminent fifty scored comfortably lower at 133. With corrections, the first fifty scored 160, the last fifty, 165. Ultimately, only Goethe and Voltaire scored near the top both in IQ and eminence. One might paraphrase Voltaire’s famous quip about God and conclude that even though adequate information on the IQ of history’s eminent men does not exist, it was probably inevitable that the American hereditarians would try to invent it.

  Terman on group differences

  Terman’s empirical work measured what statisticians call the “within-group variance” of IQ—that is, the differences in scores within single populations (all children in a school, for example). At best, he was able to show that children testing well or poorly at a young age generally maintain their ordering with respect to other children as the population grows up. Terman ascribed most of these differences to variation in biological endowment, without much evidence beyond an assertion that all right-minded people recognize the domination of nurture by nature. This brand of hereditarianism might offend our present sensibilities with its elitism and its accompanying proposals for institutional care and forced abstinence from breeding, but it does not, by itself, entail the more contentious claim for innate differences between groups.

  Terman made this invalid extrapolation, as virtually all hereditarians did and still do. He then compounded his error by confusing the genesis of true pathologies with causes for variation in normal behavior. We know, for example, that the mental retardation associated with Down’s syndrome has its origin in a specific genetic defect (an extra chromosome). But we cannot therefore attribute the low IQ of many apparently normal children to an innate biology. We might as well claim that all overweight people can’t help it because some very obese individuals can trace their condition to hormonal imbalances. Terman’s data on the stability of ordering in IQ within groups of growing children relied largely upon the persistently low IQ of biologically afflicted individuals, despite Terman’s attempt to bring all scores under the umbrella of a normal curve (1916, pp. 65–67), and thus to suggest that all variation has a common root in the possession of more or less of a single substance. In short, it is invalid to extrapolate from variation within a group to differences between groups. It is doubly invalid to use the innate biology of pathological individuals as a basis for ascribing normal variation within a group to inborn causes.
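  The logic of this objection can be put in a toy calculation. The sketch below is purely illustrative and uses invented numbers, not Terman's data: two hypothetical groups receive the very same set of "inborn" contributions, and differ only in a uniform environmental offset (the names `inborn`, `ENV_A`, and `ENV_B` are assumptions made for the example). Within each group every bit of variation is inborn, yet the gap between the group means owes nothing to biology.

```python
from statistics import mean, pvariance

# Hypothetical "inborn" contributions, identical in both groups by construction.
inborn = [-10, -5, 0, 5, 10]

# Hypothetical environmental offsets, shared by every member of a group.
ENV_A = 100
ENV_B = 85

group_a = [ENV_A + g for g in inborn]
group_b = [ENV_B + g for g in inborn]


def inborn_share_within(scores, inborn_values):
    """Fraction of a group's score variance accounted for by the inborn term."""
    return pvariance(inborn_values) / pvariance(scores)


within_a = inborn_share_within(group_a, inborn)  # all within-group variation is inborn
within_b = inborn_share_within(group_b, inborn)
gap = mean(group_a) - mean(group_b)              # difference between group means
inborn_gap = mean(inborn) - mean(inborn)         # inborn difference between groups

print(within_a, within_b, gap, inborn_gap)
```

  Under these assumptions the inborn share of variation within each group is complete, while the difference between the groups is wholly environmental. Real populations are never so tidy, but the example shows why the first figure says nothing about the second.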

  At least the IQ hereditarians did not follow their craniological forebears in harsh judgments about women. Girls did not score below boys in IQ, and Terman proclaimed their limited access to professions both unjust and wasteful of intellectual talent (1916, p. 72; 1919, p. 288). He noted, assuming that IQ should earn its monetary reward, that women scoring between 100 and 120 generally earned, as teachers or “high-grade stenographers,” what men with an IQ of 85 received as motormen, firemen, or policemen (1919, p. 278).

  But Terman took the hereditarian line on race and class and proclaimed its validation as a primary aim of his work. In ending his chapter on the uses of IQ (1916, pp. 19–20), Terman posed three questions:

  Is the place of the so-called lower classes in the social and industrial scale the result of their inferior native endowment, or is their apparent inferiority merely a result of their inferior home and school training? Is genius more common among children of the educated classes than among the children of the ignorant and poor? Are the inferior races really inferior, or are they merely unfortunate in their lack of opportunity to learn?

  Despite a poor correlation of 0.4 between social status and IQ, Terman (1917) advanced five major reasons for claiming that “environment is much less important than is original endowment in determining the nature of the traits in question” (p. 91). The first three, based on additional correlations, add no evidence for innate causes. Terman calculated: 1) a correlation of 0.55 between social status and teachers’ assessments of intelligence; 2) 0.47 between social status and school work; and 3) a lower, but unstated,* correlation between “age-grade progress” and social status. Since all five properties—IQ, social status, teacher’s assessment, school work, and age-grade progress—may be redundant measures of the same complex and unknown causes, the correlation between any additional pair adds little to the basic result of 0.4 between IQ and social status. If the 0.4 correlation offers no evidence for innate causes, then the additional correlations do not either.
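  The redundancy argument can also be sketched numerically. The simulation below is an illustration under stated assumptions, not a reconstruction of Terman's data: a single unobserved factor (`common`) plus independent noise generates all five measures, and the noise level (`NOISE_SD`) is an arbitrary choice that happens to put the correlations in the rough neighborhood of those Terman reports. Nothing in the code specifies whether the common factor is hereditary or environmental, and the correlations come out the same either way.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
N = 5000                 # hypothetical number of children
NOISE_SD = 1.2           # assumed noise level, chosen only for illustration

names = ["IQ", "social status", "teacher assessment",
         "school work", "age-grade progress"]

# One unobserved cause of unspecified nature (heredity, environment, or both).
common = [rng.gauss(0, 1) for _ in range(N)]

# Each measure is the common cause plus its own independent noise.
measures = {name: [c + rng.gauss(0, NOISE_SD) for c in common]
            for name in names}

correlations = {(a, b): pearson(measures[a], measures[b])
                for a, b in combinations(names, 2)}

for pair, r in correlations.items():
    print(pair, round(r, 2))
```

  All ten pairs correlate to a similar moderate degree, because each merely reflects the same hidden factor. Piling up further pairs adds no information about what that factor is.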

  The fourth argument, recognized as weak by Terman himself (1916, p. 98), confuses probable pathology with normal variation, and is therefore irrelevant, as discussed above: feeble-minded children are occasionally born to rich or to intellectually successful parents.

  The fifth argument reveals the strength of Terman’s hereditarian convictions and his remarkable insensitivity to the influence of environment. Terman measured the IQ of twenty children in a California orphanage. Only three were “fully normal,” while seventeen ranged from 75 to 95. The low scores cannot be attributed to life without parents, Terman argues, because (p. 99):

  The orphanage in question is a reasonably good one and affords an environment which is about as stimulating to normal mental development as average home life among the middle classes. The children live in the orphanage and attend an excellent public school in a California village.

  Low scores must reflect the biology of children committed to such institutions:

  Some of the tests which have been made in such institutions indicate that mental subnormality of both high and moderate grades is extremely frequent among children who are placed in these homes. Most, though admittedly not all of these, are children of inferior social classes (p. 99).

  Terman offers no direct evidence about the lives of his twenty children beyond the fact of their institutional placement. He is not even certain that they all came from “inferior social classes.” Surely, the most parsimonious assumption would relate low IQ scores to the one incontestable and common fact about the children—their life in the orphanage itself.

  Terman moved easily from individuals, to social classes, to races. Distressed by the frequency of IQ scores between 70 and 80, he lamented (1916, pp. 91–92):

  Among laboring men and servant girls there are thousands like them.… The tests have told the truth. These boys are ineducable beyond the merest rudiments of training. No amount of school instruction will ever make them intelligent voters or capable citizens.… They represent the level of intelligence which is very, very common among Spanish-Indian and Mexican families of the Southwest and also among negroes. Their dullness seems to be racial, or at least inherent in the family stocks from which they came. The fact that one meets this type with such extraordinary frequency among Indians, Mexicans, and negroes suggests quite forcibly that the whole question of racial differences in mental traits will have to be taken up anew and by experimental methods. The writer predicts that when this is done there will be discovered enormously significant racial differences in general intelligence, differences which cannot be wiped out by any scheme of mental culture. Children of this group should be segregated in special classes and be given instruction which is concrete and practical. They cannot master abstractions, but they can often be made efficient workers, able to look out for themselves. There is no possibility at present of convincing society that they should not be allowed to reproduce, although from a eugenic point of view they constitute a grave problem because of their unusually prolific breeding.

  Terman sensed that his arguments for innateness were weak. Yet what did it matter? Do we need to prove what common sense proclaims so clearly?

  After all, does not common observation teach us that, in the main, native qualities of intellect and character, rather than chance, determine the social class to which a family belongs? From what is already known about heredity, should we not naturally expect to find the children of well-to-do, cultured, and successful parents better endowed than the children who have been reared in slums and poverty? An affirmative answer to the above question is suggested by nearly all the available scientific evidence (1917, p. 99).

  Whose common sense?

  Terman recants

  Terman’s book on the Stanford-Binet revision of 1937 was so different from the original volume of 1916 that common authorship seems at first improbable. But then times had changed and intellectual fashions of jingoism and eugenics had been swamped in the morass of a Great Depression. In 1916 Terman had fixed adult mental age at sixteen because he couldn’t get a random sample of older schoolboys for testing. In 1937 he could extend his scale to age eighteen; for “the task was facilitated by the extremely unfavorable employment situation at the time the tests were made, which operated to reduce considerably the school elimination normally occurring after fourteen” (1937, p. 30).

  Terman did not explicitly abjure his previous conclusions, but a veil of silence descended upon them. Not a word beyond a few statements of caution do we hear about heredity. All potential reasons for differences between groups are framed in environmental terms. Terman presents his old curves for average differences in IQ between social classes, but he warns us that mean differences are too small to provide any predictive information for individuals. We also do not know how to partition the average differences between genetic and environmental influences:

  It is hardly necessary to stress the fact that these figures refer to mean values only, and that in view of the variability of the IQ within each group the respective distributions greatly overlap one another. Nor should it be necessary to point out that such data do not, in themselves, offer any conclusive evidence of the relative contributions of genetic and environmental factors in determining the mean differences observed.

  A few pages later, Terman discusses the differences between rural and urban children, noting the lower country scores and the curious finding that rural IQ drops with age after entrance to school, while IQ for urban children of semiskilled and unskilled workers rises. He expresses no firm opinion, but note that the only hypotheses he wishes to test are now environmental:

  It would require extensive research, carefully planned for the purpose, to determine whether the lowered IQ of rural children can be ascribed to the relatively poorer educational facilities in rural communities, and whether the gain for children from the lower economic strata can be attributed to an assumed enrichment of intellectual environment that school attendance bestows.

  Autres temps, autres moeurs.

  R. M. Yerkes and the Army Mental Tests: IQ comes of age

  Psychology’s great leap forward

  Robert M. Yerkes, about to turn forty, was a frustrated man in 1915. He had been on the faculty of Harvard University since 1902. He was a superb organizer, and an eloquent promoter of his profession. Yet psychology still wallowed in its reputation as a “soft” science, if a science at all. Some colleges did not acknowledge its existence; others ranked it among the humanities and placed psychologists in departments of philosophy. Yerkes wished, above all, to establish his profession by proving that it could be as rigorous a science as physics. Yerkes and most of his contemporaries equated rigor and science with numbers and quantification. The most promising source of copious and objective numbers, Yerkes believed, lay in the embryonic field of mental testing. Psychology would come of age, and gain acceptance as a true science worthy of financial and institutional support, if it could bring the question of human potential under the umbrella of science:

  Most of us are wholly convinced that the future of mankind depends in no small measure upon the development of the various biological and social sciences.… We must … strive increasingly for the improvement of our methods of mental measurement, for there is no longer ground for doubt concerning the practical as well as the theoretical importance of studies of human behavior. We must learn to measure skillfully every form and aspect of behavior which has psychological and sociological significance (Yerkes, 1917a, p. 111).

  But mental testing suffered from inadequate support and its own internal contradictions. It was, first of all, practiced extensively by poorly trained amateurs whose manifestly absurd results were giving the enterprise a bad name. In 1915, at the annual meeting of the American Psychological Association in Chicago, a critic reported that the mayor of Chicago himself had tested as a moron on one version of the Binet scales. Yerkes joined with critics in discussions at the meeting and proclaimed: “We are building up a science, but we have not yet devised a mechanism which anyone can operate” (quoted in Chase, 1977, p. 24).

  Second, available scales gave markedly different results even when properly applied. As discussed on p. 166, half the individuals who tested in the low, but normal range on the Stanford-Binet, were morons on Goddard’s version of the Binet scale. Finally, support had been too inadequate, and coordination too sporadic, to build up a pool of data sufficiently copious and uniform to compel belief (Yerkes, 1917b).

  Wars always generate their retinue of camp followers with ulterior motives. Many are simply scoundrels and profiteers, but a few are spurred by higher ideals. As mobilization for World War I approached, Yerkes got one of those “big ideas” that propel the history of science: could psychologists possibly persuade the army to test all its recruits? If so, the philosopher’s stone of psychology might be constructed: the copious, useful, and uniform body of numbers that would fuel a transition from dubious art to respected science. Yerkes proselytized within his profession and within government circles, and he won his point. As Colonel Yerkes, he presided over the administration of mental tests to 1.75 million recruits during World War I. Afterward, he proclaimed that mental testing “helped to win the war.” “At the same time,” he added, “it has incidentally established itself among the other sciences and demonstrated its right to serious consideration in human engineering” (quoted in Kevles, 1968, p. 581).

  Yerkes brought together all the major hereditarians of American psychometrics to write the army mental tests. From May to July 1917 he worked with Terman, Goddard, and other colleagues at Goddard’s Training School in Vineland, New Jersey.

  Their scheme included three types of tests. Literate recruits would be given a written examination, called the Army Alpha. Illiterates and men who had failed Alpha would be given a pictorial test, called the Army Beta. Failures in Beta would be recalled for an individual examination, usually some version of the Binet scales. Army psychologists would then grade each man from A to E (with plusses and minuses) and offer suggestions for proper military placement. Yerkes suggested that recruits with a score of C− should be marked as “low average intelligence—ordinary private.” Men of grade D are “rarely suited for tasks requiring special skill, forethought, resourcefulness or sustained alertness.” D and E men could not be expected “to read and understand written directions.”