The Mismeasure of Man
I do not think that the army ever made much use of the tests. One can well imagine how professional officers felt about smart-assed young psychologists who arrived without invitation, often assumed an officer’s rank without undergoing basic training, commandeered a building to give the tests (if they could), saw each recruit for an hour in a large group, and then proceeded to usurp an officer’s traditional role in judging the worthiness of men for various military tasks. Yerkes’s corps encountered hostility in some camps; in others, they suffered a penalty in many ways more painful: they were treated politely, given appropriate facilities, and then ignored.* Some army officials became suspicious of Yerkes’s intent and launched three independent investigations of the testing program. One concluded that it should be controlled so that “no theorist may … ride it as a hobby for the purpose of obtaining data for research work and the future benefit of the human race” (quoted in Kevles, 1968, p. 577).
Still, the tests did have a strong impact in some areas, particularly in screening men for officer training. At the start of the war, the army and national guard maintained nine thousand officers. By the end, two hundred thousand officers presided, and two-thirds of them had started their careers in training camps where the tests were applied. In some camps, no man scoring below C could be considered for officer training.
But the major impact of Yerkes’s tests did not fall upon the army. Yerkes may not have brought the army its victory, but he certainly won his battle. He now had uniform data on 1.75 million men, and he had devised, in the Alpha and Beta exams, the first mass-produced written tests of intelligence. Inquiries flooded in from schools and businesses. In his massive monograph (Yerkes, 1921) on Psychological Examining in the United States Army, Yerkes buried a statement of great social significance in an aside on page 96. He spoke of “the steady stream of requests from commercial concerns, educational institutions, and individuals for the use of army methods of psychological examining or for the adaptation of such methods to special needs.” Binet’s purpose could now be circumvented because a technology had been developed for testing all pupils. Tests could now rank and stream everybody; the era of mass testing had begun.
Results of the army tests
The primary impact of the tests arose not from the army’s lackadaisical use of scores for individuals, but from general propaganda that accompanied Yerkes’s report of the summary statistics (Yerkes, 1921, pp. 553–875). E. G. Boring, later a famous psychologist himself but then Yerkes’s lieutenant (and the army’s captain), selected one hundred sixty thousand cases from the files and produced data that reverberated through the 1920s with a hard hereditarian ring. The task was a formidable one. The sample, which Boring culled himself with the aid of only one assistant, was very large; moreover, the scales of three different tests (Alpha, Beta, and individual) had to be converted to a common standard so that racial and national averages could be constructed from samples of men who had taken the tests in different proportions (few blacks took Alpha, for example).
From Boring’s ocean of numbers, three “facts” rose to the top and continued to influence social policy in America long after their source in the tests had been forgotten.
1. The average mental age of white American adults stood just above the edge of moronity at a shocking and meager thirteen. Terman had previously set the standard at sixteen. The new figure became a rallying point for eugenicists who predicted doom and lamented our declining intelligence, caused by the unconstrained breeding of the poor and feeble-minded, the spread of Negro blood through miscegenation, and the swamping of an intelligent native stock by the immigrating dregs of southern and eastern Europe. Yerkes* wrote:
It is customary to say that the mental age of the average adult is about 16 years. This figure is based, however, upon examinations of only 62 persons; 32 of them high-school pupils from 16–20 years of age, and 30 of them “business men of moderate success and of very limited educational advantages.” The group is too small to give very reliable results and is furthermore probably not typical.… It appears that the intelligence of the principal sample of the white draft, when transmuted from Alpha and Beta exams into terms of mental age, is about 13 years (13.08) (1921, p. 785).
Yet, even as he wrote, Yerkes began to sense the logical absurdity of such a statement. An average is what it is; it cannot lie three years below what it should be. So Yerkes thought again and added:
We can hardly say, however, with assurance that these recruits are three years mental age below the average. Indeed, it might be argued on extrinsic grounds that the draft itself is more representative of the average intelligence of the country than is a group of high-school students and business men (1921, p. 785).
If 13.08 is the white average, and everyone from mental age 8 through 12 is a moron, then we are a nation of nearly half-morons. Yerkes concluded (1921, p. 791): “It would be totally impossible to exclude all morons as that term is at present defined, for there are under 13 years 37 percent of whites and 89 percent of negroes.”
2. European immigrants can be graded by their country of origin. The average man of many nations is a moron. The darker peoples of southern Europe and the Slavs of eastern Europe are less intelligent than the fair peoples of western and northern Europe. Nordic supremacy is not a jingoistic prejudice. The average Russian has a mental age of 11.34; the Italian, 11.01; the Pole, 10.74. The Polish joke attained the same legitimacy as the moron joke—indeed, they described the same animal.
3. The Negro lies at the bottom of the scale with an average mental age of 10.41. Some camps tried to carry the analysis a bit further, and in obvious racist directions. At Camp Lee, blacks were divided into three groups based upon intensity of color; the lighter groups scored higher (p. 531). Yerkes reported that the opinions of officers matched his numbers (p. 742):
All officers without exception agree that the negro lacks initiative, displays little or no leadership, and cannot accept responsibility. Some point out that these defects are greater in the southern negro. All officers seem further to agree that the negro is a cheerful, willing soldier, naturally subservient. These qualities make for immediate obedience, although not necessarily for good discipline, since petty thieving and venereal disease are commoner than with white troops.
Along the way, Yerkes and company tested several other social prejudices. Some fared poorly, particularly the popular eugenical notion that most offenders are feeble-minded. Among conscientious objectors for political reasons, 59 percent received a grade of A. Even outright disloyals scored above the average (p. 803). But other results buoyed their prejudices. As camp followers themselves, Yerkes’s corps decided to test a more traditional category of colleagues: the local prostitutes. They found that 53 percent (44 percent of whites and 68 percent of blacks) ranked at age ten or below on the Goddard version of the Binet scales. (They acknowledge that the Goddard scales ranked people well below their scores on other versions of the Binet tests.) Yerkes concluded (p. 808):
The results of Army examining of prostitutes corroborate the conclusion, attained by civilian examinations of prostitutes in various parts of the country, that from 30 to 60 percent of prostitutes are deficient and are for the most part high-grade morons; and that 15 to 25 percent of all prostitutes are so low-grade mentally that it is wise (as well as possible under the existing laws in most states) permanently to segregate them in institutions for the feeble-minded.
One must be thankful for small bits of humor to lighten the reading of an eight-hundred-page statistical monograph. The thought of army personnel rounding up the local prostitutes and sitting them down to take the Binet tests amused me no end, and must have bemused the ladies even more.
As pure numbers, these data carried no inherent social message. They might have been used to promote equality of opportunity and to underscore the disadvantages imposed upon so many Americans. Yerkes might have argued that an average mental age of thirteen reflected the fact that relatively few recruits had the opportunity to finish or even to attend high school. He might have attributed the low average of some national groups to the fact that most recruits from these countries were recent immigrants who did not speak English and were unfamiliar with American culture. He might have recognized the link between low Negro scores and the history of slavery and racism.
But scarcely a word do we read through eight hundred pages of any role for environmental influence. The tests had been written by a committee that included all the leading American hereditarians discussed in this chapter. They had been constructed to measure innate intelligence, and they did so by definition. The circularity of argument could not be broken. All the major findings received hereditarian interpretations, often by near miracles of special pleading to argue past a patent environmental influence. A circular issued from the School of Military Psychology at Camp Greenleaf proclaimed (do pardon its questionable grammar): “These tests do not measure occupational fitness nor educational attainment; they measure intellectual ability. This latter has been shown to be important in estimating military value” (p. 424). And the boss himself argued (Yerkes, quoted in Chase, 1977, p. 249):
Examinations Alpha and Beta are so constructed and administered as to minimize the handicap of men who because of foreign birth or lack of education are little skilled in the use of English. These group examinations were originally intended, and are now definitely known, to measure native intellectual ability. They are to some extent influenced by educational acquirement, but in the main the soldier’s inborn intelligence and not the accidents of environment determines his mental rating or grade in the army.
A critique of the Army Mental Tests
THE CONTENT OF THE TESTS
The Alpha test included eight parts, the Beta seven; each took less than an hour and could be given to large groups. Most of the Alpha parts presented items that have become familiar to generations of test-takers ever since: analogies, filling in the next number in a sequence, unscrambling sentences, and so forth. This similarity is no accident; the Army Alpha was the granddaddy, literally as well as figuratively, of all written mental tests. One of Yerkes’s disciples, C. C. Brigham, later became secretary of the College Entrance Examination Board and developed the Scholastic Aptitude Test on army models. If people get a peculiar feeling of déjà vu in perusing Yerkes’s monograph, I suggest that they think back to their own College Boards, with all their attendant anxiety.
These familiar parts are not especially subject to charges of cultural bias, at least no more so than their modern descendants. In a general way, of course, they test literacy, and literacy records education more than inherited intelligence. Moreover, a schoolmaster’s claim that he tests children of the same age and school experience, and therefore may be recording some internal biology, didn’t apply to the army recruits—for they varied greatly in access to education and recorded different amounts of schooling in their scores. A few of the items are amusing in the light of Yerkes’s assertion that the tests “measure native intellectual ability.” Consider the Alpha analogy: “Washington is to Adams as first is to.…”
But one part of each test is simply ludicrous in the light of Yerkes’s analysis. How could Yerkes and company attribute the low scores of recent immigrants to innate stupidity when their multiple-choice test consisted entirely of questions like:
Crisco is a: patent medicine, disinfectant, toothpaste, food product
The number of a Kaffir’s legs is: 2, 4, 6, 8
Christy Mathewson is famous as a: writer, artist, baseball player, comedian
I got the last one, but my intelligent brother, who, to my distress, grew up in New York utterly oblivious to the heroics of three great baseball teams then resident, did not.
Yerkes might have responded that recent immigrants generally took Beta rather than Alpha, but Beta contains a pictorial version of the same theme. In this complete-a-picture test, early items might be defended as sufficiently universal: adding a mouth to a face or an ear to a rabbit. But later items required a rivet in a pocket knife, a filament in a light bulb, a horn on a phonograph, a net on a tennis court, and a ball in a bowler’s hand (marked wrong, Yerkes explained, if an examinee drew the ball in the alley, for you can tell from the bowler’s posture that he has not yet released the ball). Franz Boas, an early critic, told the tale of a Sicilian recruit who added a crucifix where it always appeared in his native land to a house without a chimney. He was marked wrong.
The tests were strictly timed, for the next fifty were waiting by the door. Recruits were not expected to finish each part; this was explained to the Alpha men, but not to Beta people. Yerkes wondered why so many recruits scored flat zero on so many of the parts (the most telling proof of the tests’ worthlessness—see pp. 244–247). How many of us, if nervous, uncomfortable, and crowded (and even if not), would have understood enough to write anything at all in the ten seconds allotted for completing the following commands, each given but once in Alpha, Part 1?
Attention! Look at 4. When I say “go” make a figure 1 in the space which is in the circle but not in the triangle or square, and also make a figure 2 in the space which is in the triangle and circle, but not in the square. Go.
Attention! Look at 6. When I say “go” put in the second circle the right answer to the question: “How many months has a year?” In the third circle do nothing, but in the fourth circle put any number that is a wrong answer to the question that you have just answered correctly. Go.
INADEQUATE CONDITIONS
Yerkes’s protocol was rigorous and trying enough. His examiners had to process men rapidly and grade the exams immediately, so that failures could be recalled for a different test. When faced with the added burden of thinly veiled hostility from the brass at several camps, Yerkes’s testers were rarely able to carry out more than a caricature of their own stated procedure. They continually compromised, backtracked, and altered in the face of necessity. Procedures varied so much from camp to camp that results could scarcely be collated and compared. The whole effort, through no fault of Yerkes’s beyond impracticality and overambition, became something of a shambles, if not a disgrace. The details are all in Yerkes’s monograph, but hardly anyone ever read it. The summary statistics became an important social weapon for racists and eugenicists; their rotten core lay exposed in the monograph, but who looks within when the surface shines with such a congenial message?
The army mandated that special buildings be supplied or even constructed for Yerkes’s examinations, but a different reality prevailed (1921, p. 61). The examiners had to take what they could get, often rooms in cramped barracks with no furnishings at all, and inadequate acoustics, illumination, and lines of sight. The chief tester at one camp complained (p. 106): “Part of this inaccuracy I believe to be due to the fact that the room in which the examination is held is filled too full of men. As a result, the men who are sitting in the rear of the room are unable to hear clearly and thoroughly enough to understand the instructions.”
Tensions rose between Yerkes’s testers and regular officers. The chief tester of Camp Custer complained (p. 111): “The ignorance of the subject on the part of the average officer is equalled only by his indifference to it.” Yerkes urged restraint and accommodation (p. 155):
The examiner should strive especially to take the military point of view. Unwarranted claims concerning the accuracy of the results should be avoided. In general, straightforward commonsense statements will be found more convincing than technical descriptions, statistical exhibits, or academic arguments.
As friction and doubt mounted, the secretary of war polled commanding officers of all camps to ask their opinion of Yerkes’s tests. He received one hundred replies, nearly all negative. They were, Yerkes admitted (p. 43), “with a few exceptions, unfavorable to psychological work, and have led to the conclusion on the part of various officers of the General Staff that this work has little, if any, value to the army and should be discontinued.” Yerkes fought back and won a standoff (but not all the promotions, commissions, and hirings he had been promised); his work proceeded under a cloud of suspicion.
Minor frustrations never abated. Camp Jackson ran out of forms and had to improvise on blank paper (p. 78). But a major and persistent difficulty dogged the entire enterprise and finally, as I shall demonstrate, deprived the summary statistics of any meaning. Recruits had to be allocated to their appropriate test. Men illiterate in English, either by lack of schooling or foreign birth, should have taken examination Beta, either by direct assignment, or indirectly upon failing Alpha. Yerkes’s corps tried heroically to fulfill this procedure. In at least three camps, they marked identification tags or even painted letters directly on the bodies of men who failed—a ready identification guide for further assessment (p. 73, p. 76): “A list of D men was sent within six hours after the group examination to the clerk at the mustering office. As the men appeared, this clerk marked on the body of each D man a letter P” (indicating that the psychiatrist should examine them further).
But standards for the division between Alpha and Beta varied substantially from camp to camp. A survey across camps revealed that the minimum score on an early version of Alpha varied from 20 to 100 for assignment to further testing (p. 476). Yerkes admitted (p. 354):
This lack of a uniform process of segregation is certainly unfortunate. On account of the variable facilities for examining and the variable quality of the groups examined however, it appeared entirely impossible to establish a standard uniform for all camps.
C. C. Brigham, Yerkes’s most zealous votary, even complained (1921):
The method of selecting men for Beta varied from camp to camp, and sometimes from week to week in the same camp. There was no established criterion of literacy, and no uniform method of selecting illiterates.
The problem cut far deeper than simple inconsistency among camps. The persistent logistical difficulties imposed a systematic bias that substantially lowered the mean scores of blacks and immigrants. For two major reasons, many men took only Alpha and scored either zero or next to nothing, not because they were innately dumb, but because they were illiterate and should have taken Beta by Yerkes’s own protocol. First, recruits and draftees had, on average, spent fewer years in school than Yerkes had anticipated. Lines for Beta began to lengthen and the entire operation threatened to clog at this bottleneck. At many camps, unqualified men were sent in droves to Alpha by artificial lowering of standards. Schooling to the third grade sufficed for Alpha in one camp; in another, anyone who said he could read, at whatever level, took Alpha. The chief tester at Camp Dix reported (p. 72): “To avoid excessively large Beta groups, standards for admission to examination Alpha were set low.”