* Following strictures of the argument outlined above, I do not treat all theories of craniometry (I omit phrenology, for example, because it did not reify intelligence as a single entity but sought multiple organs within the brain). Likewise, I exclude many important and often quantified styles of determinism that did not seek to measure intelligence as a property of the brain—for example, most of eugenics.

  * Also too precious to exclude is my favorite modern invocation of biological determinism as an excuse for dubious behavior. Bill Lee, baseball’s self-styled philosopher, justifying the beanball (New York Times, 24 July 1976): “I read a book in college called ‘Territorial Imperative.’ A fellow always has to protect his master’s home much stronger than anything down the street. My territory is down and away from the hitters. If they’re going out there and getting the ball, I’ll have to come in close.”

  * I have been struck by the frequency of such aesthetic claims as a basis of racial preference. Although J. F. Blumenbach, the founder of anthropology, had stated that toads must view other toads as paragons of beauty, many astute intellectuals never doubted the equation of whiteness with perfection. Franklin at least had the decency to include the original inhabitants in his future America; but, a century later, Oliver Wendell Holmes rejoiced in the elimination of Indians on aesthetic grounds: “… and so the red-crayon sketch is rubbed out, and the canvas is ready for a picture of manhood a little more like God’s own image” (in Gossett, 1965, p. 243).

  * Darwin wrote, for example, in the Voyage of the Beagle: “Near Rio de Janeiro I lived opposite to an old lady, who kept screws to crush the fingers of her female slaves. I have stayed in a house where a young household mulatto, daily and hourly, was reviled, beaten, and persecuted enough to break the spirit of the lowest animal. I have seen a little boy, six or seven years old, struck thrice with a horse-whip (before I could interfere) on his naked head, for having handed me a glass of water not quite clean.… And these deeds are done and palliated by men, who profess to love their neighbors as themselves, who believe in God, and pray that his Will be done on earth! It makes one’s blood boil, yet heart tremble, to think that we Englishmen and our American descendants, with their boastful cry of liberty, have been and are so guilty.”

  * This “inductive” argument from human cultures is far from dead as a defense of racism. In his Study of History (1934 edition), Arnold Toynbee wrote: “When we classify mankind by color, the only one of the primary races, given by this classification, which has not made a creative contribution to any of our twenty-one civilizations is the Black Race” (in Newby, 1969, p. 217).

  * Modern evolutionary theory does invoke a barrier to interfertility as the primary criterion for status as a species. In the standard definition: “Species are actually or potentially interbreeding populations sharing a common gene pool, and reproductively isolated from all other groups.” Reproductive isolation, however, does not mean that individual hybrids never arise, but only that the two species maintain their integrity in natural contact. Hybrids may be sterile (mules). Fertile hybrids may even arise quite frequently, but if natural selection acts preferentially against them (as a result of inferiority in structural design, rejection as mates by full members of either species, etc.) they will not increase in frequency and the two species will not amalgamate. Often fertile hybrids can be produced in the laboratory by imposing situations not encountered in nature (forced breeding between species that normally mature at different times of the year, for example). Such examples do not refute a status as separate species because the two groups do not amalgamate in the wild (maturation at different times of the year may be an efficient means of reproductive isolation).

  * An excellent history of the entire “American school” can be found in W. Stanton’s The Leopard’s Spots.

  * E. D. Cope, America’s leading paleontologist and evolutionary biologist, reiterated the same theme even more forcefully in 1890 (p. 2054): “The highest race of man cannot afford to lose or even to compromise the advantages it has acquired by hundreds of centuries of toil and hardship, by mingling its blood with the lowest.… We cannot cloud or extinguish the fine nervous susceptibility, and the mental force, which cultivation develops in the constitution of the Indo-European, by the fleshly instincts, and dark mind of the African. Not only is the mind stagnated, and the life of mere living introduced in its stead, but the possibility of resurrection is rendered doubtful or impossible.”

  * Not all detractors of blacks were so generous. E. D. Cope, who feared that miscegenation would block the path to heaven (see preceding footnote), advocated the return of all blacks to Africa (1890, p. 2053): “Have we not burdens enough to carry in the European peasantry which we are called on every year to receive and assimilate? Is our own race on a plane sufficiently high, to render it safe for us to carry eight millions of dead material in the very center of our vital organism?”

  * This account omits many statistical details of my analysis. The complete tale appears in Gould, 1978. Some passages in pp. 88–101 are taken from this article.

  * In his final catalogue of 1849, Morton guessed at sex (and age within five years!) for all crania. In this later work, he specifies 77, 87, and 88 as male, and the remaining 77 as female. This allocation was pure guesswork; my alternate version is equally plausible. In the Crania Aegyptiaca itself, Morton was more cautious and only identified sex for specimens with mummified remains.

  * To demonstrate again how large differences based on stature can be, I report these additional data, recovered from Morton’s tabulations, but never calculated or recognized by him: 1) For Inca Peruvians, fifty-three male skulls average 77.5; sixty-one female skulls, 72.1. 2) For Germans, nine male skulls average 92.2; eight females, 84.3.

  † My original report (Gould, 1978) incorrectly listed the modern Caucasian mean as 85.3. The reason for this error is embarrassing, but instructive, for it illustrates, at my expense, the cardinal principle of this book: the social embeddedness of science and the frequent grafting of expectation upon supposed objectivity. Line 7 in Table 2.3 lists the range of Semitic skulls as 84 to 98 cubic inches for Morton’s sample of 3. However, my original paper cited a mean of 80—an obvious impossibility if the smallest skull measures 84. I was working from a Xerox of Morton’s original chart, and his correct value of 89 is smudged to look like an 80 on my copy. Nonetheless, the range of 84 to 98 is clearly indicated right alongside, and I never saw the inconsistency—presumably because a low value of 80 fit my hopes for a depressed Caucasian mean. The 80 therefore “felt” right and I never checked it. I am grateful to Dr. Irving Klotz of Northwestern University for pointing out this error to me.

  * Broca did not confine his arguments on the relative worth of brain parts to the distinction between front and back. Virtually any measured difference between peoples could be given a value in terms of prior conviction about relative worth. Broca once claimed, for example (1861, p. 187), that blacks probably had larger cranial nerves than whites, hence a larger nonintellectual portion of the brain.

  * Ten years later, America’s leading evolutionary biologist, E. D. Cope, dreaded the result if “a spirit of revolt become general among women.” “Should the nation have an attack of this kind,” he wrote (1890, p. 2071), “like a disease, it would leave its traces in many after-generations.” He detected the beginnings of such anarchy in pressures exerted by women “to prevent men from drinking wine and smoking tobacco in moderation,” and in the carriage of misguided men who supported female suffrage: “Some of these men are effeminate and long-haired.”

  * I calculate, where $y$ is brain size in grams, $x_1$ age in years, and $x_2$ body height in cm: $y = 764.5 - 2.55x_1 + 3.47x_2$

  † For his largest sample of males, and using the favored power function for bivariate analysis of brain allometry, I calculate, where $y$ is brain weight in grams and $x$ is body height in cm: $y = 121.6x^{0.47}$
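  The two fitted equations in these notes can be evaluated directly. The following is only an illustrative sketch: the coefficients come from the notes above, but the function names and the sample ages and heights are hypothetical inputs chosen for demonstration, not data from the original samples.

```python
def brain_weight_linear(age_years, height_cm):
    """Linear fit from the note above: grams from age (years) and height (cm)."""
    return 764.5 - 2.55 * age_years + 3.47 * height_cm


def brain_weight_power(height_cm):
    """Power-function fit from the note above: grams from height (cm)."""
    return 121.6 * height_cm ** 0.47


# Hypothetical inputs, chosen only to show how the equations behave.
for height in (160, 170, 180):
    linear = brain_weight_linear(40, height)  # age fixed at 40 for illustration
    power = brain_weight_power(height)
    print(f"height {height} cm: linear {linear:.1f} g, power {power:.1f} g")
```

  Because the exponent 0.47 is less than 1, the power function predicts that brain weight rises less than proportionally with height.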

  * Readers interested in the justification provided for recapitulation by Haeckel and his colleagues, and in the reasons for its later downfall, may consult my dull, but highly detailed treatise, Ontogeny and Phylogeny, Harvard University Press, 1977.

  * In his Annotated Dracula, Leonard Wolf (1975, p. 300) notes that Jonathan Harker’s initial description of Count Dracula is based directly upon Cesare Lombroso’s account of the born criminal. Wolf presents the following contrasts:

  HARKER WRITES: “His [the Count’s] face was … aquiline, with high bridge of the thin nose and peculiarly arched nostrils.…”

  LOMBROSO: “[The criminal’s] nose on the contrary … is often aquiline like the beak of a bird of prey.”

  HARKER: “His eyebrows were very massive, almost meeting over the nose.…”

  LOMBROSO: “The eyebrows are bushy and tend to meet across the nose.”

  HARKER: “… his ears were pale and at the tops extremely pointed.…”

  LOMBROSO: “with a protuberance on the upper part of the posterior margin … a relic of the pointed ear.…”

  * In Dracula, Professor Van Helsing, in his inimitable broken English, extolled the argument from recapitulation by branding the Count as a persistent child (and therefore both a primitive and a criminal as well):

  Ah! there I have hope that our man-brains that have been of man so long and that have not lost the grace of God, will come higher than his child-brain that lie in his tomb for centuries, that grow not yet to our stature, and that do only work selfish and therefore small.… He is clever and cunning and resourceful; but he be not of man-stature as to brain. He be of child-brain in much. Now this criminal of ours is predestinate to crime also; he too have child-brain, and it is of the child to do what he have done. The little bird, the little fish, the little animal learn not by principle but empirically; and when he learn to do, then there is to him the ground to start from to do more.

  * Other standard craniometrical arguments were often pressed into service by criminal anthropology. For example, as early as 1843, Voisin invoked the classical argument of front and back (see pp. 129–135) to place criminals among the animals. He studied five hundred young offenders and reported deficiencies in the forward and upper parts of their brain—the supposed seat of morality and rationality. He wrote (1843, pp. 100–101):

  Their brains are at a minimum of development in their anterior and superior parts, in the two parts that make us what we are and that place us above the animals and make us men. They [criminal brains] are placed by their nature … entirely outside the human species.

  * Division is more appropriate because it is the relative, not the absolute, magnitude of disparity between mental and chronological age that matters. A two-year disparity between mental age two and chronological age four may denote a far severer degree of deficiency than a two-year disparity between mental age fourteen and chronological age sixteen. Binet’s method of subtraction would give the same result in both cases, while Stern’s IQ measures 50 for the first case and 88 for the second. (Stern multiplied the actual quotient by 100 to eliminate the decimal point.)
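  The contrast between subtraction and division can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and rounding the quotient to the nearest whole number is an assumption made to match the figures quoted in the note.

```python
def binet_difference(mental_age, chronological_age):
    """Binet's subtraction: years by which mental age lags chronological age."""
    return chronological_age - mental_age


def stern_iq(mental_age, chronological_age):
    """Stern's quotient: mental age divided by chronological age, times 100."""
    return 100 * mental_age / chronological_age


def round_half_up(value):
    """Round a positive number to the nearest integer, halves upward (assumed)."""
    return int(value + 0.5)


# The two cases from the note: (mental age, chronological age).
for mental, chrono in [(2, 4), (14, 16)]:
    gap = binet_difference(mental, chrono)
    iq = round_half_up(stern_iq(mental, chrono))
    print(f"mental {mental}, chronological {chrono}: gap {gap} years, IQ {iq}")
```

  Both cases show the same two-year gap, while the quotient separates them sharply.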

  * The link of morality to intelligence was a favorite eugenical theme. Thorndike (1940, pp. 264–265), refuting a popular impression that all monarchs are reprobates, cited a correlation coefficient of 0.56 for the estimated intelligence vs. the estimated morality of 269 male members of European royal families!

  * Do not read into this statement more than Goddard intended. He had not abandoned his belief in the heritability of moronity itself. Moron parents will have moron children, but they can be made useful through education. Moron parents, however, do not preferentially beget defectives of lower grade—idiots and imbeciles.

  * Terman (1919) provided a lengthy list of the attributes of general intelligence captured by the Stanford-Binet tests: memory, language comprehension, size of vocabulary, orientation in space and time, eye-hand coordination, knowledge of familiar things, judgment, likeness and differences, arithmetical reasoning, resourcefulness and ingenuity in difficult practical situations, ability to detect absurdities, speed and richness of association of ideas, power to combine the dissected parts of a form board or a group of ideas into a unitary whole, capacity to generalize from particulars, and ability to deduce a rule from connected facts.

  * This, in itself, is not finagling, but a valid statistical procedure for establishing uniformity of average score and variance across age levels.

  * Jensen writes: “The average estimated IQ of three hundred historical persons … on whom sufficient childhood evidence was available for a reliable estimate was IQ 155.… Thus the majority of these eminent men would most likely have been recognized as intellectually gifted in childhood had they been given IQ tests” (Jensen, 1979, p. 113).

  * It is annoyingly characteristic of Terman’s work that he cites correlations when they are high and favorable, but does not give the actual figures when they are low but still favorable to his hypothesis. This ploy abounds in Cox’s study of posthumous genius and in Terman’s analysis of IQ among professions, both discussed previously.

  * Yerkes continued to complain throughout his career that military psychology had not achieved its due respect, despite its accomplishments in World War I. During World War II the aging Yerkes was still grousing and arguing that the Nazis were upstaging America in their proper use and encouragement of mental testing for military personnel. “Germany has a long lead in the development of military psychology.… The Nazis have achieved something that is entirely without parallel in military history.… What has happened in Germany is the logical sequel to the psychological and personnel services in our own Army during 1917–1918” (Yerkes, 1941, p. 209).

  * I doubt that Yerkes wrote all parts of the massive 1921 monograph himself. But he is listed as the only author of this official report, and I shall continue to attribute its statements to him, both as shorthand and for want of other information.

  * Note how choice of language can serve as an indication of bias. This 2.5 year difference in mental ages (13.74–11.29) only represents “somewhat better” performance. The smaller (but presumably hereditary) difference of 2 years between Nordic-Teutonic and Latin-Slav groups had been described as “considerable.”

  * In all other parts of the book, he claims that his aim is to measure and interpret innate differences in intelligence.

  * Pearson’s r is not an appropriate measure for all kinds of correlation, for it assesses only what statisticians call the intensity of linear relationship between two measures—the tendency for all points to fall on a single straight line. Other relationships of strict dependence will not achieve a value of 1.0 for r. If, for example, each increase of 2 units in one variable were matched by an increase in 2² units in the other variable, r would be less than 1.0, even though the two variables might be perfectly “correlated” in the vernacular sense. Their plot would be a parabola, not a straight line, and Pearson’s r measures the intensity of linear relationship.
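  The point can be checked numerically. The sketch below is illustrative only: the data points and the function name pearson_r are assumptions, not taken from the text. It computes r for a strictly linear dependence and for a strictly quadratic one, where one variable is the square of the other.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


xs = list(range(0, 11))            # made-up values 0, 1, ..., 10
linear = [3 * x + 1 for x in xs]   # points on a single straight line
parabola = [x ** 2 for x in xs]    # perfectly dependent, but not linear

r_linear = pearson_r(xs, linear)
r_parabola = pearson_r(xs, parabola)
print(f"linear: r = {r_linear:.4f}; parabola: r = {r_parabola:.4f}")
```

  The quadratic case is fully determined by x, yet its r falls short of 1.0 because the points do not lie on one straight line.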

  * (Footnote for aficionados—others may safely skip.) Here, I am technically discussing a procedure called “principal components analysis,” not quite the same thing as factor analysis. In principal components analysis, we preserve all information in the original measures and fit new axes to them by the same criterion used in factor analysis in principal components orientation—that is, the first axis explains more data than any other axis could and subsequent axes lie at right angles to all other axes and encompass steadily decreasing amounts of information. In true factor analysis, we decide beforehand (by various procedures) not to include all information on our factor axes. But the two techniques—true factor analysis in principal components orientation and principal components analysis—play the same conceptual role and differ only in mode of calculation. In both, the first axis (Spearman’s g for intelligence tests) is a “best fit” dimension that resolves more information in a set of vectors than any other axis could.

  During the past decade or so, semantic confusion has spread in statistical circles through a tendency to restrict the term “factor analysis” only to the rotations of axes usually performed after the calculation of principal components, and to extend the term “principal components analysis” both to true principal components analysis (all information retained) and to factor analysis done in principal components orientation (reduced dimensionality and loss of information). This shift in definition is completely out of keeping with the history of the subject and terms. Spearman, Burt, and hosts of other psychometricians worked for decades in this area before Thurstone and others invented axial rotations. They performed all their calculations in the principal components orientation, and they called themselves “factor analysts.” I continue, therefore, to use the term “factor analysis” in its original sense to include any orientation of axes—principal components or rotated, orthogonal or oblique.

  I will also use a common, if somewhat sloppy, shorthand in discussing what factor axes do. Technically, factor axes resolve variance in original measures. I will, as is often done, speak of them as “explaining” or “resolving” information—as they do in the vernacular (though not in the technical) sense of information. That is, when the vector of an original variable projects strongly on a set of factor axes, little of its variance lies unresolved in higher dimensions outside the system of factor axes.
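  For readers who want to see the principal components criterion at work, here is a minimal sketch for the simplest case of two measures. It is principal components analysis proper (all variance retained), not a full factor analysis. The test scores are made up for illustration, and all names are hypothetical.

```python
import math

# Hypothetical scores on two positively correlated tests (made-up data).
test_a = [2.0, 4.0, 5.0, 7.0, 9.0, 10.0]
test_b = [1.0, 3.0, 6.0, 6.0, 8.0, 11.0]


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Entries of the symmetric 2x2 covariance matrix [[a, b], [b, d]].
a = covariance(test_a, test_a)
b = covariance(test_a, test_b)
d = covariance(test_b, test_b)

# Closed-form eigenvalues: the variance resolved by each principal axis.
half_trace = (a + d) / 2
spread = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
first_variance = half_trace + spread
second_variance = half_trace - spread

# Direction of the first axis; the second lies at right angles to it.
angle = 0.5 * math.atan2(2 * b, a - d)
first_axis = (math.cos(angle), math.sin(angle))
second_axis = (-math.sin(angle), math.cos(angle))

share_first = first_variance / (first_variance + second_variance)
print(f"first axis {first_axis} resolves {share_first:.1%} of total variance")
```

  The two axes together retain all the original variance; the first simply takes the largest share that any single axis could.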

  * Spearman took a special interest in problems of correlation and invented a measure that probably ranks second in use to Pearson’s r as a measure of association between two variables—the so-called Spearman’s rank-correlation coefficient.

  * The g calculated by the tetrad formula is conceptually equivalent and mathematically almost equivalent to the first principal component described on pp. 275–278 and used in modern factor analysis.

  * The terms “saturation” and “loading” refer to the correlation between a test and a factor axis. If a test “loads” strongly on a factor then most of its information is explained by the factor.