The Mismeasure of Man
6.11 Thurstone’s oblique simple structure axes for the same four mental tests depicted in Figs. 6.6 and 6.7. Factor axes are no longer perpendicular to each other. In this example, the factor axes coincide with the peripheral vectors of the cluster.
Thurstone wrestled with what he called this “second-order” g. I confess that I do not understand why he wrestled so hard, unless the many years of working with orthogonal solutions had set his mind and rendered the concept too unfamiliar to accept at first. If anyone understood the geometrical representation of vectors, it was Thurstone. This representation guarantees that oblique axes will be positively correlated, and that a second-order general factor must therefore exist. Second-order g is merely a fancier way of acknowledging what the raw correlation coefficients show—that nearly all correlation coefficients between mental tests are positive.
In any case, Thurstone finally bowed to inevitability and admitted the existence of a second-order general factor. He once even described it in almost Spearmanian terms (1946, p. 110):
There seems to exist a large number of special abilities that can be identified as primary abilities by the factorial methods, and underlying these special abilities there seems to exist some central energizing factor which promotes the activity of all these special abilities.
It might appear as if all the sound and fury of Thurstone’s debate with the British factorists ended in a kind of stately compromise, more favorable to Burt and Spearman, and placing poor Thurstone in the unenviable position of struggling to save face. If the correlation of oblique axes yields a second-order g, then weren’t Spearman and Burt right all along in their fundamental insistence upon a general factor? Thurstone may have shown that group factors were more important than any British factorist had ever admitted, but hadn’t the primacy of g reasserted itself?
Arthur Jensen (1979) presents such an interpretation, but it badly misrepresents the history of this debate. Second-order g did not unite the disparate schools of Thurstone and the British factorists; it did not even produce a substantial compromise on either side. After all, the quotes I cited from Thurstone on the futility of ranking by IQ and the necessity of constructing profiles based on primary mental abilities for each individual were written after he had admitted the second-order general factor. The two schools were not united and Spearman’s g was not vindicated for three basic reasons:
1. For Spearman and Burt, g cannot merely exist; it must dominate. The hierarchical view—with a controlling innate g and subsidiary trainable group factors—was fundamental for the British school. How else could unilinear ranking be supported? How else could the 11+ examination be defended? For this examination supposedly measured a controlling mental force that defined a child’s general potential and shaped his entire intellectual future.
Thurstone admitted a second-order g, but he regarded it as secondary in importance to what he continued to call “primary” mental abilities. Quite apart from any psychological speculation, the basic mathematics certainly supports Thurstone’s view. Second-order g (the correlation of oblique simple structure axes) rarely accounts for more than a small percentage of the total information in a matrix of tests. On the other hand, Spearman’s g (the first principal component) often encompasses more than half the information. The entire psychological apparatus, and all the practical schemes, of the British school depended upon the preeminence of g, not its mere presence. When Thurstone revised The Vectors of Mind in 1947, after admitting a second-order general factor, he continued to contrast himself with the British factorists by arguing that his scheme treated group factors as primary and the second-order general factor as residual, while they extolled g and considered group factors as secondary.
2. The central reason for claiming that Thurstone’s alternate view disproves the necessary reality of Spearman’s g retains its full force. Thurstone derived his contrasting interpretation from the same data simply by placing factor axes in different locations. One could no longer move directly from the mathematics of factor axes to a psychological meaning.
In the absence of corroborative evidence from biology for one scheme or the other, how can one decide? Ultimately, however much a scientist hates to admit it, the decision becomes a matter of taste, or of prior preference based on personal or cultural biases. Spearman and Burt, as privileged citizens of class-conscious Britain, defended g and its linear ranking. Thurstone preferred individual profiles and numerous primary abilities. In an unintentionally amusing aside, Thurstone once mused over the technical differences between Burt and himself, and decided that Burt’s propensity for algebraic rather than geometrical representation of factors arose from his deficiency in the spatial PMA:
The configurational interpretations are evidently distasteful to Burt, for he does not have a single diagram in his text. Perhaps this is indicative of individual differences in imagery types which lead to differences in methods and interpretation among scientists (1947, p. ix).
3. Burt and Spearman based their psychological interpretation of factors on a belief that g was dominant and real—an innate, general intelligence, marking a person’s essential nature. Thurstone’s analysis permitted them, at best, a weak second-order g. But suppose they had prevailed and established the inevitability of a dominant g? Their argument still would have failed for a reason so basic that it passed everybody by. The problem resided in a logical error committed by all the great factorists I have discussed—the desire to reify factors as entities. In a curious way, the entire history that I have traced didn’t matter. If Burt and Thurstone had never lived, if an entire profession had been permanently satisfied with Spearman’s two-factor theory and had been singing the praises of its dominant g for three-quarters of a century since he proposed it, the flaw would be as glaring still.
The fact of pervasive positive correlation between mental tests must be among the most unsurprising major discoveries in the history of science. For positive correlation is the prediction of almost every contradictory theory about its potential cause, including both extreme views: pure hereditarianism (which Spearman and Burt came close to promulgating) and pure environmentalism (which no major thinker has ever been foolish enough to propose). In the first, people do jointly well or poorly on all sorts of tests because they are born either smart or stupid. In the second, they do jointly well or poorly because they either ate, read, learned, and lived in an enriched or a deprived fashion as children. Since both theories predict pervasive positive correlation, the fact of correlation itself can confirm neither. Since g is merely one elaborate way of expressing the correlations, its putative existence also says nothing about causes.
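The second reason above, that the same data can be described by factor axes placed in different locations, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the four pairs of loadings are hypothetical, not taken from any real battery of tests, and the rotation is a rigid (orthogonal) one of 45 degrees rather than Thurstone's oblique solution. The sketch shows only that turning the axes leaves the reproduced correlations untouched while changing how much of the information falls on the first axis.

```python
import math

# Hypothetical loadings of four mental tests on two perpendicular axes, in a
# principal-components-like orientation: every test loads 0.8 on axis 1 (the
# "g" position), and axis 2 separates two clusters of tests by sign.
LOADINGS = [(0.8, 0.4), (0.8, 0.3), (0.8, -0.3), (0.8, -0.4)]


def rotate(loadings, degrees):
    """Re-express each test on a pair of axes turned rigidly by `degrees`."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def reproduced(loadings):
    """Sums of products of loadings for every pair of tests: the correlations
    (off the diagonal) and communalities (on it) that the two axes resolve."""
    return [[sum(x * y for x, y in zip(p, q)) for q in loadings] for p in loadings]


def max_difference(m1, m2):
    """Largest absolute difference between corresponding matrix entries."""
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def first_axis_share(loadings):
    """Fraction of the resolved information that lies on the first axis."""
    total = sum(a * a + b * b for a, b in loadings)
    return sum(a * a for a, _ in loadings) / total


# Turn the axes 45 degrees toward the two clusters.
ROTATED = rotate(LOADINGS, -45)

print("largest change in reproduced correlations:",
      max_difference(reproduced(LOADINGS), reproduced(ROTATED)))
print("share on first axis before rotation:", round(first_axis_share(LOADINGS), 3))
print("share on first axis after rotation:", round(first_axis_share(ROTATED), 3))
```

With these hypothetical numbers the first axis resolves about 84 percent of the information before rotation and exactly half afterward, yet the reproduced correlations do not change. The arithmetic cannot say which placement is the psychologically meaningful one.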
Thurstone on the uses of factor analysis
Thurstone sometimes advanced grandiose claims for the explanatory scope of his work. But he also possessed a streak of modesty that one never detects in Burt or Spearman. In reflective moments, he recognized that the choice of factor analysis as a method records the primitive state of knowledge in a field. Factor analysis is a brutally empirical technique, used when a discipline has no firmly established principles, but only a mass of crude data, and a hope that patterns of correlation might provide suggestions for further and more fruitful lines of inquiry. Thurstone wrote (1935, p. xi):
No one would think of investigating the fundamental laws of classical mechanics by correlational methods or by factor methods, because the laws of classical mechanics are already well known. If nothing were known about the law of falling bodies, it would be sensible to analyze, factorially, a great many attributes of objects that are dropped or thrown from an elevated point. It would then be discovered that one factor is heavily loaded with the time of fall and with the distance fallen but that this factor has a zero loading in the weight of the object. The usefulness of the factor methods will be at the borderline of science.
Nothing had changed when he revised The Vectors of Mind (1947, p. 56):
The exploratory nature of factor analysis is often not understood. Factor analysis has its principal usefulness at the borderline of science.… Factor analysis is useful, especially in those domains where basic and fruitful concepts are essentially lacking and where crucial experiments have been difficult to conceive. The new methods have a humble role. They enable us to make only the crudest first map of a new domain.
Note the common phrase—useful “at the borderline of science.” According to Thurstone, the decision to use factor analysis as a primary method implies a deep ignorance of principles and causes. That the three greatest factorists in psychology never got beyond these methods—despite all their lip service to neurology, endocrinology, and other potential ways of discovering an innate biology—proves how right Thurstone was. The tragedy of this tale is that the British hereditarians promoted an innatist interpretation of dominant g nonetheless, and thereby blunted the hopes of millions.
Epilogue: Arthur Jensen and the resurrection of Spearman’s g
When I researched this chapter in 1979, I knew that the ghost of Spearman’s g still haunted modern theories of intelligence. But I thought that its image was veiled, and its influence largely unrecognized. I hoped that a historical analysis of conceptual errors in its formulation and use might expose the hidden fallacies in some contemporary views of intelligence and IQ. I never expected to find a modern defense of IQ from an explicitly Spearmanian perspective.
But then America’s best-known hereditarian, Arthur Jensen (1979), revealed himself as an unreconstructed Spearmanian, and centered an eight-hundred-page defense of IQ on the reality of g. More recently, Richard Herrnstein and Charles Murray also base their equally long Bell Curve (1994) on the same fallacy. I shall analyze Jensen’s error here and The Bell Curve’s version in the first two essays at the end of the book. History often cycles its errors.
Jensen performs most of his factor analyses in Spearman and Burt’s preferred principal components orientation (though he is also willing to accept g in the form of Thurstone’s correlation between oblique simple structure axes). Throughout the book, he names and reifies factors by the usual invalid appeal to mathematical pattern alone. We have g’s for general intelligence as well as g’s for general athletic ability (with subsidiary group factors for hand and arm strength, hand-eye coordination, and body balance).
Jensen explicitly defines intelligence as “the g factor of an indefinitely large and varied battery of mental tests” (p. 249). “We identify intelligence with g,” he states. “To the extent that a test orders individuals on g, it can be said to be a test of intelligence” (p. 224). IQ is our most effective test of intelligence because it projects so strongly upon the first principal component (g) in factor analyses of mental tests. Jensen reports (p. 219) that Full Scale IQ of the Wechsler adult scale correlates about 0.9 with g, while the 1937 Stanford-Binet projects about 0.8 upon a g that remains “highly stable over successive age levels” (while the few small group factors are not always present and tend to be unstable in any case).
Jensen proclaims the “ubiquity” of g, extending its scope into realms that might even have embarrassed Spearman himself. Jensen would not only rank people; he believes that all God’s creatures can be ordered on a g scale from amoebae at the bottom (p. 175) to extraterrestrial intelligences at the top (p. 248). I have not encountered such an explicit chain of being since last I read Kant’s speculations about higher beings on Jupiter that bridge the gap between man and God.
Jensen has combined two of the oldest cultural prejudices of Western thought: the ladder of progress as a model for organizing life, and the reification of some abstract quality as a criterion for ranking. Jensen chooses “intelligence” and actually claims that the performance of invertebrates, fishes, and turtles on simple behavioral tests represents, in diminished form, the same essence that humans possess in greater abundance—namely g, reified as a measurable object. Evolution then becomes a march up the ladder to realms of more and more g.
As a paleontologist, I am astounded. Evolution forms a copiously branching bush, not a unilinear progressive sequence. Jensen speaks of “different levels of the phyletic scale—that is, earthworms, crabs, fishes, turtles, pigeons, rats, and monkeys.” Doesn’t he realize that modern earthworms and crabs are descendants of lineages that have evolved separately from vertebrates for more than 500 million years? They are not our ancestors; they are not even “lower” or less complicated than humans in any meaningful sense. They represent good solutions for their own way of life; they must not be judged by the hubristic notion that one peculiar primate forms a standard for all of life. As for vertebrates, “the turtle” is not, as Jensen claims, “phylogenetically higher than the fish.” Turtles evolved much earlier than most modern fishes, and they exist as hundreds of species, while modern bony fishes include almost twenty thousand distinct kinds. What then is “the fish” and “the turtle”? Does Jensen really think that pigeon-rat-monkey-human represents an evolutionary sequence among warm-blooded vertebrates?
Jensen’s caricature of evolution exposes his preference for unilinear ranking by implied worth. With such a perspective, g becomes almost irresistible, and Jensen uses it as a universal criterion of rank:
The common features of experimental tests developed by comparative psychologists that most clearly distinguish, say, chickens from dogs, dogs from monkeys, and monkeys from chimpanzees suggests that they are roughly scalable along a g dimension … g can be viewed as an interspecies concept with a broad biological base culminating in the primates (p. 251).
Not satisfied with awarding g a real status as guardian of earthly ranks, Jensen would extend it throughout the universe, arguing that all conceivable intelligence must be measured by it:
The ubiquity of the concept of intelligence is clearly seen in discussions of the most culturally different beings one could well imagine—extraterrestrial life in the universe.… Can one easily imagine “intelligent” beings for whom there is no g, or whose g is qualitatively rather than quantitatively different from g as we know it (p. 248).
Jensen discusses Thurstone’s work, but dismisses it as a criticism because Thurstone eventually admitted a second-order g. But Jensen has not recognized that if g is only a numerically weak, second-order effect, then it cannot support a claim that intelligence is a unitary, dominant entity of mental functioning. I think that Jensen senses his difficulty, because on one chart (p. 220) he calculates both classical g as a first principal component and then rotates all the factors (including g) to obtain a set of simple structure axes. Thus, he records the same thing twice for each test—g as a first principal component and the same information dispersed among simple structure axes—giving some tests a total information of more than 100 percent. Since big g’s appear in the same chart with large loadings on simple-structure axes, one might be falsely led to infer that g remains large even in simple-structure solutions.
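The arithmetic of recording the same information twice can be sketched for a single test. The numbers are hypothetical and do not reconstruct Jensen's chart: one test is assumed to load 0.8 on the first principal component and 0.4 on a second perpendicular axis, and both axes are then turned rigidly by 45 degrees as a stand-in for a simple structure rotation.

```python
import math

# Hypothetical loadings of one test on two unrotated, perpendicular axes:
# 0.8 on the first principal component (g) and 0.4 on a second axis.
G_LOADING = 0.8
SECOND_LOADING = 0.4


def rotate_pair(a, b, degrees):
    """Re-express one test's two loadings on axes turned rigidly by `degrees`."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return (a * c + b * s, -a * s + b * c)


def information(loadings):
    """Sum of squared loadings: the share of a test's variance resolved."""
    return sum(value * value for value in loadings)


unrotated_total = information((G_LOADING, SECOND_LOADING))
rotated_total = information(rotate_pair(G_LOADING, SECOND_LOADING, -45))

# Listing g beside the rotated axes counts g's share a second time, because
# the rotated axes already contain it.
double_counted = G_LOADING ** 2 + rotated_total

print("resolved by the unrotated axes:", round(unrotated_total, 2))
print("resolved by the rotated axes:", round(rotated_total, 2))
print("g plus the rotated axes:", round(double_counted, 2))
```

Under these assumptions either set of axes resolves 80 percent of the test's variance, but adding the squared g loading to the rotated loadings gives 1.44, an impossible 144 percent.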
Jensen is contemptuous of Thurstone’s orthogonal simple structure, dismissing it as “flatly wrong” (p. 675) and as “scientifically an egregious error” (p. 258). Since he acknowledges that simple structure is mathematically equivalent to principal components, why the uncompromising rejection? It is wrong, Jensen argues, “not mathematically, but psychologically and scientifically” (p. 675) because “it artificially hides or submerges the large general factor” (p. 258) by rotating it away. Jensen has fallen into a vicious circle. He assumes a priori that g exists and that simple structure is wrong because it disperses g. But Thurstone developed the concept of simple structure largely to claim that g is a mathematical artifact. Thurstone wished to disperse g and succeeded; it is no disproof of his position to reiterate that he did so.
Jensen also uses g more specifically to buttress his claim that the average difference in IQ between whites and blacks records an innate deficiency of intelligence among blacks. He cites the quotation on p. 271 as “Spearman’s interesting hypothesis” that blacks score most poorly with respect to whites on tests strongly correlated with g:
This hypothesis is important to the study of test bias, because, if true, it means that the white-black difference in test scores is not mainly attributable to idiosyncratic cultural peculiarities in this or that test, but to a general factor that all the ability tests measure in common. A mean difference between populations that is related to one or more small group factors would seem to be explained more easily in terms of cultural differences than if the mean group difference is most closely related to a broad general factor common to a wide variety of tests (p. 535).
Here we see a reincarnation of the oldest argument in the Spearmanian tradition—the contrast between an innate dominant g and trainable group factors. But g, as I have shown, is neither clearly a thing, nor necessarily innate if a thing. Even if data existed to confirm Spearman’s “interesting hypothesis,” the results could not support Jensen’s notion of ineluctable, innate difference.
I am grateful to Jensen for one thing: he has demonstrated by example that a reified Spearman’s g is still the only promising justification for hereditarian theories of mean differences in IQ among human groups. The Bell Curve of Herrnstein and Murray (1994) has reinforced this poverty, indeed bankruptcy, of justification for the theory of unitary, rankable, innate, and effectively immutable intelligence—for these authors also ground their entire edifice on the fallacy of Spearman’s g. The conceptual errors of reification have plagued g from the start, and Thurstone’s critique remains as valid today as it was in the 1930s. Spearman’s g is not an ineluctable entity; it represents one mathematical solution among many equivalent alternatives. The chimerical nature of g is the rotten core of Jensen’s work, The Bell Curve, and of the entire hereditarian school.