Spearman’s g is particularly subject to ambiguity in interpretation, if only because the two most contradictory causal hypotheses are both fully consistent with it: 1) that it reflects an inherited level of mental acuity (some people do well on most tests because they are born smarter); or 2) that it records environmental advantages and deficits (some people do well on most tests because they are well schooled, grew up with enough to eat, books in the home, and loving parents). If the simple existence of g can be theoretically interpreted in either a purely hereditarian or purely environmentalist way, then its mere presence—even its reasonable strength—cannot justly lead to any reification at all. The temptation to reify is powerful. The idea that we have detected something “underlying” the externalities of a large set of correlation coefficients, something perhaps more real than the superficial measurements themselves, can be intoxicating. It is Plato’s essence, the abstract, eternal reality underlying superficial appearances. But it is a temptation that we must resist, for it reflects an ancient prejudice of thought, not a truth of nature.
Rotation and the nonnecessity of principal components
Another, more technical, argument clearly demonstrates why principal components cannot be automatically reified as causal entities. If principal components represented the only way to simplify a correlation matrix, then some special status for them might be legitimately sought. But they represent only one method among many for inserting axes into a multidimensional space. Principal components have a definite geometric arrangement, specified by the criterion used to construct them—that the first principal component shall resolve a maximal amount of information in a set of vectors and that subsequent components shall all be mutually perpendicular. But there is nothing sacrosanct about this criterion; vectors may be resolved into any set of axes placed within their space. Principal components provide insight in some cases, but other criteria are often more useful.
Consider the following situation, in which another scheme for placing axes might be preferred. In Figure 6.6 I show correlations between four mental tests, two of verbal and two of arithmetical aptitude. Two “clusters” are evident, even though all tests are positively correlated. Suppose that we wish to identify these clusters by factor analysis. If we use principal components, we may not recognize them at all. The first principal component (Spearman’s g) goes right up the middle, between the two clusters. It lies close to no vector and resolves an approximately equal amount of each, thereby masking the existence of verbal and arithmetic clusters. Is this component an entity? Does a “general intelligence” exist? Or is g, in this case, merely a meaningless average based on the invalid amalgamation of two types of information?
We may pick up verbal and arithmetic clusters on the second principal component (called a “bipolar factor” because some projections upon it will be positive and others negative when vectors lie on both sides of the first principal component). In this case, verbal tests project on the negative side of the second component, and arithmetic tests on the positive side. But we may fail to detect these clusters altogether if the first principal component dominates all vectors. For projections on the second component will then be small, and the pattern can easily be lost (see Fig. 6.6).
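Both features—a first principal component running up the middle and a bipolar second component—are easy to reproduce numerically. The following is a minimal sketch in Python, using an invented correlation matrix in the spirit of Fig. 6.6 (two verbal and two arithmetic tests, all positively correlated, with stronger correlations within each cluster); the numbers are assumptions chosen only for illustration, not Spearman's data.

```python
import numpy as np

# Hypothetical correlation matrix for four mental tests: rows/columns are
# verbal 1, verbal 2, arithmetic 1, arithmetic 2. All correlations are
# positive, but each pair within a cluster correlates more strongly.
R = np.array([
    [1.00, 0.75, 0.40, 0.40],
    [0.75, 1.00, 0.40, 0.40],
    [0.40, 0.40, 1.00, 0.75],
    [0.40, 0.40, 0.75, 1.00],
])

# Principal components are the eigenvectors of R, ordered by eigenvalue
# (the amount of information each axis resolves).
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings: the correlation of each test with each component.
loadings = eigvecs * np.sqrt(eigvals)
if loadings[:, 0].sum() < 0:          # eigenvector signs are arbitrary
    loadings[:, 0] *= -1

print("First component (g):", np.round(loadings[:, 0], 2))   # ~0.80 on every test
print("Second component:   ", np.round(loadings[:, 1], 2))   # bipolar: one sign per cluster
```

With this toy matrix the first component loads about 0.80 on every test—it identifies neither cluster—while the second loads about ±0.49, with opposite signs for the verbal and arithmetic clusters (which cluster takes which sign is arbitrary).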
During the 1930s factorists developed methods to treat this dilemma and to recognize clusters of vectors that principal components often obscured. They did this by rotating factor axes from the principal components orientation to new positions. The rotations, established by several criteria, had as their common aim the positioning of axes near clusters. In Figure 6.7, for example, we use the criterion: place axes near vectors occupying extreme or outlying positions in the total set. If we now resolve all vectors into these rotated axes, we detect the clusters easily; for arithmetic tests project high on rotated axis 1 and low on rotated axis 2, while verbal tests project high on 2 and low on 1. Moreover, g has disappeared. We no longer find a “general factor” of intelligence, nothing that can be reified as a single number expressing overall ability. Yet we have lost no information. The two rotated axes resolve as much information in the four vectors as did the two principal components. They simply distribute the same information differently upon the resolving axes. How can we argue that g has any claim to reified status as an entity if it represents but one of numerous possible ways to position axes within a set of vectors?
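The effect of rotation can be shown with the same toy numbers. The sketch below takes the (rounded) loadings from the previous example and applies a simple 45-degree orthogonal rotation—a stand-in for the cluster-seeking criteria described above, not any particular historical method—and then checks that no information has been lost.

```python
import numpy as np

# Rounded loadings of the four tests (rows: verbal 1, verbal 2, arithmetic 1,
# arithmetic 2) on the first two principal components from the previous sketch.
L = np.array([
    [0.80,  0.49],
    [0.80,  0.49],
    [0.80, -0.49],
    [0.80, -0.49],
])

# Orthogonal rotation of the two axes by 45 degrees (the angle that, in this
# symmetric toy case, places each rotated axis near one cluster of tests).
c = s = np.sqrt(0.5)
T = np.array([[c,  s],
              [-s, c]])
L_rot = L @ T

print("Rotated loadings:\n", np.round(L_rot, 2))
# Arithmetic tests now load ~0.91 on rotated axis 1 and ~0.22 on axis 2;
# verbal tests show the reverse. No axis loads highly on all four tests.

# Nothing is lost: each test's communality (its summed squared loadings)
# is the same before and after rotation.
print("Communality before:", np.round((L ** 2).sum(axis=1), 2))
print("Communality after: ", np.round((L_rot ** 2).sum(axis=1), 2))
```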
In short, factor analysis simplifies large sets of data by reducing dimensionality and trading some loss of information for the recognition of ordered structure in fewer dimensions. As a tool for simplification, it has proved its great value in many disciplines. But many factorists have gone beyond simplification, and tried to define factors as causal entities. This error of reification has plagued the technique since its inception. It was “present at the creation” since Spearman invented factor analysis to study the correlation matrix of mental tests and then reified his principal component as g, or innate, general intelligence. Factor analysis may help us to understand causes by directing us to information beyond the mathematics of correlation. But factors, by themselves, are neither things nor causes; they are mathematical abstractions. Since the same set of vectors (see Figs. 6.6, 6.7) can be partitioned into g and a small residual axis, or into two axes of equal strength that identify verbal and arithmetical clusters and dispense with g entirely, we cannot claim that Spearman’s “general intelligence” is an ineluctable entity necessarily underlying and causing the correlations among mental tests. Even if we choose to defend g as a nonaccidental result, neither its strength nor its geometric position can specify what it means in causal terms—if only because its features are equally consistent with extreme hereditarian and extreme environmentalist views of intelligence.
6.6 A principal components analysis of four mental tests. All correlations are high and the first principal component, Spearman’s g, expresses the overall correlation. But the group factors for verbal and mathematical aptitude are not well resolved in this style of analysis.
6.7 Rotated factor axes for the same four mental tests depicted in Fig. 6.6. Axes are now placed near vectors lying at the periphery of the cluster. The group factors for verbal and mathematical aptitude are now well identified (see high projections on the axes indicated by dots), but g has disappeared.
Charles Spearman and general intelligence
The two-factor theory
Correlation coefficients are now about as ubiquitous and unsurprising as cockroaches in New York City. Even the cheapest pocket calculators produce correlation coefficients with the press of a button. However indispensable, they are taken for granted as automatic accouterments of any statistical analysis that deals with more than one measure. In such a context, we easily forget that they were once hailed as a breakthrough in research, as a new and exciting tool for discovering underlying structure in tables of raw measures. We can sense this excitement in reading early papers of the great American biologist and statistician Raymond Pearl (see Pearl, 1905 and 1906, and Pearl and Fuller, 1905). Pearl completed his doctorate at the turn of the century and then proceeded, like a happy boy with a gleaming new toy, to correlate everything in sight, from the lengths of earthworms vs. the number of their body segments (where he found no correlation and assumed that increasing length reflects larger, rather than more, segments), to size of the human head vs. intelligence (where he found a very small correlation, but attributed it to the indirect effect of better nutrition).
Charles Spearman, an eminent psychologist and fine statistician as well,* began to study correlations between mental tests during these heady times. If two mental tests are given to a large number of people, Spearman noted, the correlation coefficient between them is nearly always positive. Spearman pondered this result and wondered what higher generality it implied. The positive correlations clearly indicated that each test did not measure an independent attribute of mental functioning. Some simpler structure lay behind the pervasive positive correlations; but what structure? Spearman imagined two alternatives. First, the positive correlations might reduce to a small set of independent attributes—the “faculties” of the phrenologists and other schools of early psychology. Perhaps the mind had separate “compartments” for arithmetic, verbal, and spatial aptitudes, for example. Spearman called such theories of intelligence “oligarchic.” Second, the positive correlations might reduce to a single, underlying general factor—a notion that Spearman called “monarchic.” In either case, Spearman recognized that the underlying factors—be they few (oligarchic) or single (monarchic)—would not encompass all information in a matrix of positive correlation coefficients for a large number of mental tests. A “residual variance” would remain—information peculiar to each test and not related to any other. In other words, each test would have its “anarchic” component. Spearman called the residual variance of each test its s, or specific information. Thus, Spearman reasoned, a study of underlying structure might lead to a “two-factor theory” in which each test contained some specific information (its s) and also reflected the operation of a single, underlying factor, which Spearman called g, or general intelligence. Or each test might include its specific information and also record one or several among a set of independent, underlying faculties—a many-factor theory. If the simplest two-factor theory held, then all common attributes of intelligence would reduce to a single underlying entity—a true “general intelligence” that might be measured for each person and might afford an unambiguous criterion for ranking in terms of mental worth.
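In modern notation, Spearman's two-factor theory says that each test score is a weighted dose of one general factor plus a specific factor unique to that test. The sketch below simulates scores under such a model—the loadings, sample size, and assumed normality are all invented for illustration—and shows that every pairwise correlation then comes out positive, the pattern Spearman observed in real test batteries.

```python
import numpy as np

rng = np.random.default_rng(0)
n_people = 5000

# Invented g-loadings for four tests: how strongly each test reflects g.
g_loadings = np.array([0.9, 0.7, 0.6, 0.4])

g = rng.standard_normal(n_people)                        # general factor
s = rng.standard_normal((n_people, len(g_loadings)))     # specific factors

# Two-factor model: a shared g component plus an independent specific
# component, scaled so that each simulated test has unit variance.
scores = g[:, None] * g_loadings + s * np.sqrt(1 - g_loadings**2)

# All pairwise correlations are positive, even though the tests share
# nothing but g; each expected correlation is the product of two g-loadings.
print(np.round(np.corrcoef(scores, rowvar=False), 2))
```

The last comment is the key to what follows: if this model holds, the correlation between any two tests is simply the product of their g-loadings, and that is the fact Spearman's tetrad criterion exploits.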
Charles Spearman developed factor analysis—still the most important technique in modern multivariate statistics—as a procedure for deciding between the two- vs. the many-factor theory by determining whether the common variance in a matrix of correlation coefficients could be reduced to a single “general” factor, or only to several independent “group” factors. He found but a single “intelligence,” opted for the two-factor theory, and, in 1904, published a paper that later won this assessment from a man who opposed its major result: “No single event in the history of mental testing has proved to be of such momentous importance as Spearman’s proposal of his famous two-factor theory” (Guilford, 1936, p. 155). Elated, and with characteristic immodesty, Spearman gave his 1904 paper a heroic title: “General Intelligence Objectively Measured and Determined.” Ten years later (1914, p. 237), he exulted: “The future of research into the inheritance of ability must center on the theory of ‘two factors.’ This alone seems capable of reducing the bewildering chaos of facts to a perspicuous orderliness. By its means, the problems are rendered clear; in many respects, their answers are already foreshadowed; and everywhere, they are rendered susceptible of eventual decisive solution.”
The method of tetrad differences
In his original work, Spearman did not use the method of principal components described on pp. 275–278. Instead, he developed a simpler, though tedious, procedure better suited for a precomputer age when all calculations had to be performed by hand.* He computed the entire matrix of correlation coefficients between all pairs of tests, took all possible groupings of four measures and computed for each a number that he called the “tetrad difference.” Consider the following example as an attempt to define the tetrad difference and to explain how Spearman used it to test whether the common variance of his matrix could be reduced to a single general factor, or only to several group factors.
Suppose that we wish to compute the tetrad difference for four measures taken on a series of mice ranging in age from babies to adults—leg length, leg width, tail length, and tail width. We compute all correlation coefficients between pairs of variables and find, unsurprisingly, that all are positive—as mice grow, their parts get larger. But we would like to know whether the common variance in the positive correlations all reflects a single general factor—growth itself—or whether two separate components of growth must be identified—in this case, a leg factor and a tail factor, or a length factor and a width factor. Spearman gives the following formula for the tetrad difference
$$r_{13} \times r_{24} - r_{23} \times r_{14}$$
where r is the correlation coefficient and the two subscripts represent the two measures being correlated (in this case, 1 is leg length, 2 is leg width, 3 is tail length and 4 is tail width—so that r13 is the correlation coefficient between the first and the third measure, or between leg length and tail length). In our example, the tetrad difference is
(leg length and tail length) × (leg width and tail width) −
(leg width and tail length) × (leg length and tail width)
Spearman argued that tetrad differences of zero imply the existence of a single general factor while either positive or negative values indicate that group factors must be recognized. Suppose, for example, that group factors for general body length and general body width govern the growth of mice. In this case, we would get a high positive value for the tetrad difference because the correlation coefficients of a length with another length or a width with another width would tend to be higher than correlation coefficients of a width with a length. (Note that the left-hand side of the tetrad equation includes only lengths with lengths or widths with widths, while the right-hand side includes only lengths with widths.) But if only a single, general growth factor regulates the size of mice, then lengths with widths should show as high a correlation as lengths with lengths or widths with widths—and the tetrad difference should be zero. Fig. 6.8 shows a hypothetical correlation matrix for the four measures that yields a tetrad difference of zero (values taken from Spearman’s example in another context, 1927, p. 74). Fig. 6.8 also shows a different hypothetical matrix yielding a positive tetrad difference and a conclusion (if other tetrads show the same pattern) that group factors for length and width must be recognized.
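The arithmetic is easy to verify. The sketch below computes Spearman's tetrad difference for two invented 4 × 4 correlation matrices—illustrative numbers of my own, not those of Fig. 6.8: one generated by a single general growth factor, which gives a tetrad of essentially zero, and one in which lengths correlate more with lengths and widths with widths, which gives a positive tetrad.

```python
import numpy as np

def tetrad(R):
    """Spearman's tetrad difference r13*r24 - r23*r14 for a 4 x 4
    correlation matrix R (measures 1..4 stored at indices 0..3)."""
    return R[0, 2] * R[1, 3] - R[1, 2] * R[0, 3]

# Measures in order: leg length, leg width, tail length, tail width.

# Case 1: a single general growth factor. Each correlation is the product
# of the two measures' g-loadings (invented values 0.9, 0.8, 0.5, 0.4).
a = np.array([0.9, 0.8, 0.5, 0.4])
R_single = np.outer(a, a)
np.fill_diagonal(R_single, 1.0)
print(tetrad(R_single))   # ~0 (up to rounding): one general factor suffices

# Case 2: lengths correlate more strongly with lengths, and widths with
# widths, as if separate length and width group factors were at work.
R_group = np.array([
    [1.00, 0.30, 0.70, 0.25],
    [0.30, 1.00, 0.25, 0.70],
    [0.70, 0.25, 1.00, 0.30],
    [0.25, 0.70, 0.30, 1.00],
])
print(tetrad(R_group))    # about 0.43, positive: group factors are needed
```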
The top matrix of Fig. 6.8 illustrates another important point that reverberates throughout the history of factor analysis in psychology. Note that, although the tetrad difference is zero, the correlation coefficients need not be (and almost invariably are not) equal. In this case, leg width with leg length gives a correlation of 0.80, while tail width with tail length yields only 0.18. These differences reflect varying “saturations” with g, the single general factor when the tetrad differences are zero. Leg measures have higher saturations than tail measures—that is, they are closer to g, or reflect it better (in modern terms, they lie closer to the first principal component in geometric representations like Fig. 6.6). Tail measures do not load strongly on g.* They contain little common variance and must be explained primarily by their s—the information unique to each measure. Moving now to mental tests: if g represents general intelligence, then mental tests most saturated with g are the best surrogates for general intelligence, while tests with low g-loadings (and high s values) cannot serve as good measures of general mental worth. Strength of g-loading becomes the criterion for determining whether or not a particular mental test (IQ, for example) is a good measure of general intelligence.
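When a single general factor really does account for the common variance, the saturations themselves can be recovered from the correlations: each correlation is then the product of the two measures' g-loadings, so the square of a loading equals, for any triad of measures, r12 × r13 / r23. A minimal sketch, reusing the invented single-factor matrix from the previous example:

```python
import numpy as np

# Single-factor matrix from the previous sketch: r_ij = a_i * a_j, with
# invented g-loadings of 0.9 and 0.8 for the leg measures and 0.5 and 0.4
# for the tail measures.
a = np.array([0.9, 0.8, 0.5, 0.4])
R = np.outer(a, a)
np.fill_diagonal(R, 1.0)

# Recover each measure's saturation with g from a triad of correlations:
# a1**2 = (r12 * r13) / r23, and similarly for the other measures.
leg_length_loading = np.sqrt(R[0, 1] * R[0, 2] / R[1, 2])
tail_length_loading = np.sqrt(R[2, 0] * R[2, 3] / R[0, 3])
print(round(float(leg_length_loading), 2),
      round(float(tail_length_loading), 2))   # 0.9 vs 0.5: legs reflect g better
```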
6.8 Tetrad differences of zero (above) and a positive value (below) from hypothetical correlation matrices for four measurements: LL = leg length, LW = leg width, TL = tail length, and TW = tail width. The positive tetrad difference indicates the existence of group factors for lengths and widths.
Spearman’s tetrad procedure is very laborious when the correlation matrix includes a large number of tests. Each tetrad difference must be calculated separately. If the common variance reflects but a single general factor, then the tetrads should equal zero. But, as in any statistical procedure, not all cases meet the expected value (half heads and half tails is the expectation in coin flipping, but you will flip six heads in a row about once in sixty-four series of six flips). Some calculated tetrad differences will be positive or negative even when a single g exists and the expected value is zero. Thus, Spearman computed all tetrad differences and looked for normal frequency distributions with a mean tetrad difference of zero as his test for the existence of g.
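The whole-battery version of the procedure can also be sketched in a few lines: simulate a battery of tests from a pure one-factor model (every number below is invented), compute a tetrad difference for each quadruple of tests from the sample correlation matrix, and confirm that the differences scatter around a mean of zero, with the scatter due to sampling error alone.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n_people, n_tests = 500, 8

# Simulate test scores from a pure one-factor model with invented g-loadings.
loadings = rng.uniform(0.4, 0.9, n_tests)
g = rng.standard_normal(n_people)
specifics = rng.standard_normal((n_people, n_tests))
scores = g[:, None] * loadings + specifics * np.sqrt(1 - loadings**2)

# Sample correlation matrix, then one tetrad difference per quadruple of tests.
R = np.corrcoef(scores, rowvar=False)
tetrads = [R[i, k] * R[j, l] - R[j, k] * R[i, l]
           for i, j, k, l in combinations(range(n_tests), 4)]

# Individual tetrads wander off zero through sampling error alone, but the
# distribution centers near zero -- Spearman's criterion for a single g.
print(round(float(np.mean(tetrads)), 3), round(float(np.std(tetrads)), 3))
```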
Spearman’s g and the great instauration of psychology
Charles Spearman computed all his tetrads, found a distribution close enough to normal with a mean close enough to zero, and proclaimed that the common variance in mental tests recorded but a single underlying factor—Spearman’s g, or general intelligence. Spearman did not hide his pleasure, for he felt that he had discovered the elusive entity that would make psychology a true science. He had found the innate essence of intelligence, the reality underlying all the superficial and inadequate measures devised to search for it. Spearman’s g would be the philosopher’s stone of psychology, its hard, quantifiable “thing”—a fundamental particle that would pave the way for an exact science as firm and as basic as physics.
In his 1904 paper, Spearman proclaimed the ubiquity of g in all processes deemed intellectual: “All branches of intellectual activity have in common one fundamental function … whereas the remaining or specific elements seem in every case to be wholly different from that in all the others.… This g, far from being confined to some small set of abilities whose intercorrelations have actually been measured and drawn up in some particular table, may enter into all abilities whatsoever.”
The conventional school subjects, insofar as they reflect aptitude rather than the simple acquisition of information, merely peer through a dark glass at the single essence inside: “All examination in the different sensory, school, and other specific faculties may be considered as so many independently obtained estimates of the one great common Intellective Function” (1904, p. 273). Thus Spearman tried to resolve a traditional dilemma of conventional education for the British elite: why should training in the classics make a better soldier or a statesman? “Instead of continuing ineffectively to protest that high marks in Greek syntax are no test as to the capacity of men to command troops or to administer provinces, we shall at last actually determine the precise accuracy of the various means of measuring General Intelligence” (1904, p. 277). In place of fruitless argument, one has simply to determine the g-loading of Latin grammar and military acuity. If both lie close to g, then skill in conjugation may be a good estimate of future ability to command.