The Mismeasure of Man
Second, and more important, the press of time and the hostility of regular officers often precluded a Beta retest for men who had incorrectly taken Alpha. Yerkes admitted (p. 472): “It was never successfully shown, however, that the continued recalls … were so essential that repeated interference with company maneuvers should be permitted.” As the pace became more frantic, the problem worsened. The chief tester at Camp Dix complained (pp. 72–73): “In June it was found impossible to recall a thousand men listed for individual examination. In July Alpha failures among negroes were not recalled.” The stated protocol scarcely applied to blacks who, as usual, were treated with less concern and more contempt by everyone. Failure on Beta, for example, should have led to an individual examination. Half the black recruits scored D− on Beta, but only one-fifth of these were recalled and four-fifths received no further examination (p. 708). Yet we know that scores for blacks improved substantially when the protocol was followed. At one camp (p. 736), only 14.1 percent of men who had scored D− on Alpha failed to gain a higher grade on Beta.
The effects of this systematic bias are evident in one of Boring’s experiments with the summary statistics. He culled 4,893 cases of men who had taken both Alpha and Beta. Converting their scores to the common scale, he calculated an average mental age of 10.775 for Alpha, and a Beta mean of 12.158 (p. 655). He used only the Beta scores in his summaries; Yerkes’s procedure worked. But what of the myriads who should have taken Beta, but only received Alpha and scored abysmally as a result—primarily poorly educated blacks and immigrants with an imperfect command of English—the very groups whose low scores caused such a hereditarian stir later on?
DUBIOUS AND PERVERSE PROCEEDINGS: A PERSONAL TESTIMONY
Academicians often forget how poorly or incompletely the written record, their primary source, may represent experience. Some things have to be seen, touched, and tasted. What was it like to be an illiterate black or foreign recruit, anxious and befuddled at the novel experience of taking an examination, never told why, or what would be made of the results: expulsion, the front lines? In 1968 (quoted in Kevles), an examiner recalled his administration of Beta: “It was touching to see the intense effort… put into answering the questions, often by men who never before had held a pencil in their hands.” Yerkes had overlooked, or consciously bypassed, something of importance. The Beta examination contained only pictures, numbers, and symbols. But it still required pencil work and, on three of its seven parts, a knowledge of numbers and how to write them.
Yerkes’s monograph is so thorough that his procedure for giving the two examinations can be reconstructed down to the choreography of motion for all examiners and orderlies. He provides facsimiles in full size for the examinations themselves, and for all explanatory material used by examiners. The standardized words and gestures of examiners are reproduced in full. Since I wanted to know in as complete a way as possible what it felt like to give and take the test, I administered examination Beta (for illiterates) to a group of fifty-three Harvard undergraduates in my course on biology as a social weapon. I tried to follow Yerkes’s protocol scrupulously in all its details. I feel that I reconstructed the original situation accurately, with one important exception: my students knew what they were doing, didn’t have to provide their names on the form, and had nothing at stake. (One friend later suggested that I should have required names—and posted results—as just a small contribution to simulating the anxiety of the original.)
I knew before I started that internal contradictions and a priori prejudice thoroughly invalidated the hereditarian conclusions that Yerkes had drawn from the results. Boring himself called these conclusions “preposterous” late in his career (in a 1962 interview, quoted in Kevles, 1968). But I had not understood how the Draconian conditions of testing made such a thorough mockery of the claim that recruits could have been in a frame of mind to record anything about their innate abilities. In short, most of the men must have ended up either utterly confused or scared shitless.
The recruits were ushered into a room and seated before an examiner and demonstrator standing atop a platform, and several orderlies at floor level. Examiners were instructed to administer the test “in a genial manner” since “the subjects who take this examination sometimes sulk and refuse to work” (p. 163). Recruits were told nothing about the examination or its purposes. The examiner simply said: “Here are some papers. You must not open them or turn them over until you are told to.” The men then filled in their names, age, and education (with help for those too illiterate to do so). After these perfunctory preliminaries, the examiner plunged right in:
Attention. Watch this man (pointing to demonstrator). He (pointing to demonstrator again) is going to do here (tapping blackboard with pointer) what you (pointing to different members of the group) are to do on your papers (here examiner points to several papers that lie before men in the group, picks up one, holds it next to the blackboard, returns the paper, points to demonstrator and the blackboard in succession, then to the men and their papers). Ask no questions. Wait till I say “Go ahead!” (p. 163).
By comparison, Alpha men were virtually inundated with information (p. 157), for the Alpha examiner said:
Attention! The purpose of this examination is to see how well you can remember, think, and carry out what you are told to do. We are not looking for crazy people. The aim is to help find out what you are best fitted to do in the Army. The grade you make in this examination will be put on your qualification card and will also go to your company commander. Some of the things you are told to do will be very easy. Some you may find hard. You are not expected to make a perfect grade, but do the very best you can.… Listen closely. Ask no questions.
The extreme limits imposed upon the Beta examiner’s vocabulary did not only reflect Yerkes’s poor opinion of what Beta recruits might understand by virtue of their stupidity. Many Beta examinees were recent immigrants who did not speak English, and instruction had to be as pictorial and gestural as possible. Yerkes advised (p. 163): “One camp has had great success with a ‘window seller’ as demonstrator. Actors should also be considered for the work.” One particularly important bit of information was not transmitted: examinees were not told that it was virtually impossible to finish at least three of the tests, and that they were not expected to do so.
Atop the platform, the demonstrator stood in front of a blackboard roll covered by a curtain; the examiner stood at his side. Before each of the seven tests, the curtain was raised to expose a sample problem (all reproduced in Figure 5.4), and examiner and demonstrator engaged in a bit of pantomime to illustrate proper procedure. The examiner then issued an order to work, and the demonstrator closed the curtain and advanced the roll to the next sample. The first test, maze running, received the following demonstration:
Demonstrator traces path through first maze with crayon, slowly and hesitatingly. Examiner then traces second maze and motions to demonstrator to go ahead. Demonstrator makes one mistake by going into the blind alley at upper left-hand corner of maze. Examiner apparently does not notice what demonstrator is doing until he crosses line at end of alley; then examiner shakes his head vigorously, says “No-no,” takes demonstrator’s hand and traces back to the place where he may start right again. Demonstrator traces rest of maze so as to indicate an attempt at haste, hesitating only at ambiguous points. Examiner says “Good.” Then holding up blank, “Look here,” and draws an imaginary line across the page from left to right for every maze on the page. Then, “All right. Go ahead. Do it (pointing to men and then to books). Hurry up.”
This paragraph may be naïvely amusing (some of my students thought so). The next statement, by comparison, is a bit diabolical.
The idea of working fast must be impressed on the men during the maze test. Examiner and orderlies walk around the room, motioning to men who are not working, and saying, “Do it, do it, hurry up, quick.” At the end of 2 minutes examiner says, “Stop! Turn over the page to test 2.”
The examiner demonstrated test 2, cube counting, with three-dimensional models (my son had some left over from his baby days). Note that recruits who could not write numbers would receive scores of zero even if they counted all the cubes correctly. Test 3, the X-O series, will be recognized by nearly everyone today as the pictorial version of “what is the next number in the sequence.” Test 4, digit symbols, required the translation of nine digits into corresponding symbols. It looks easy enough, but the test itself included ninety items and could hardly be finished by anybody in the two minutes allotted. A man who couldn’t write numbers was faced with two sets of unfamiliar symbols and suffered a severe additional disadvantage. Test 5, number checking, asked men to compare numerical sequences, up to eleven digits in length, in two parallel columns. If items on the same line were identical in the two columns, recruits were instructed (by gestures) to write an X next to the item. Fifty sequences occupied three minutes, and few recruits could finish. Again, an inability to write or recognize numbers would make the task virtually impossible.
Test 6, pictorial completion, is Beta’s visual analogue of Alpha’s multiple-choice examination for testing innate intelligence by asking recruits about commercial products, famous sporting or film stars, or the primary industries of various cities and states. Its instructions are worth repeating:
“This is test 6 here. Look. A lot of pictures.” After everyone has found the place, “Now watch.” Examiner points to hand and says to demonstrator, “Fix it.” Demonstrator does nothing, but looks puzzled. Examiner points to the picture of the hand, and then to the place where the finger is missing and says to demonstrator, “Fix it; fix it.” Demonstrator then draws in finger. Examiner says, “That’s right.” Examiner then points to fish and place for eye and says, “Fix it.” After demonstrator has drawn missing eye, examiner points to each of the four remaining drawings and says, “Fix them all.” Demonstrator works samples out slowly and with apparent effort. When the samples are finished examiner says, “All right. Go ahead. Hurry up!” During the course of this test the orderlies walk around the room and locate individuals who are doing nothing, point to their pages and say, “Fix it. Fix them,” trying to set everyone working. At the end of 3 minutes examiner says, “Stop! But don’t turn over the page.”
5.4 The blackboard demonstrations for all seven parts of the Beta test. From Yerkes, 1921.
The examination itself is also worth reprinting (Fig. 5.5). Best of luck with pig tails, crab legs, bowling balls, tennis nets, and the Jack’s missing diamond, not to mention the phonograph horn (a real stumper for my students). Yerkes provided the following instructions for grading:
Rules for Individual Items
Item 4.—Any spoon at any angle in right hand receives credit. Left hand, or unattached spoon, no credit.
Item 5.—Chimney must be in right place. No credit for smoke.
Item 6.—Another ear on same side as first receives no credit.
Item 8.—Plain square, cross, etc., in proper location for stamp, receives credit.
Item 10.—Missing part is the rivet. Line of “ear” may be omitted.
Item 13.—Missing part is leg.
Item 15.—Ball should be drawn in hand of man. If represented in hand of woman, or in motion, no credit.
Item 16.—Single line indicating net receives credit.
Item 18.—Any representation intended for horn, pointing in any direction, receives credit.
Item 19.—Hand and powder puff must be put on proper side.
Item 20.—Diamond is the missing part. Failure to complete hilt on sword is not an error.
The seventh and last test, geometrical construction, required that a square be broken into component pieces. Its ten parts were allotted two and a half minutes.
I believe that the conditions of testing, and the basic character of the examination, make it ludicrous to believe that Beta measured any internal state deserving the label intelligence. Despite the plea for geniality, the examination was conducted in an almost frantic rush. Most parts could not be finished in the time allotted, but recruits were not forewarned. My students compiled the following record of completions on the seven parts (see p. 242). For two of the tests, digit symbols and number checking (4 and 5), most students simply couldn’t write fast enough to complete the ninety and fifty items, even though the protocol was clear to all. The third test with a majority of incompletes, cube counting (number 2), was too difficult for the number of items included and the time allotted.
In summary, many recruits could not see or hear the examiner; some had never taken a test before or even held a pencil. Many did not understand the instructions and were completely befuddled. Those who did comprehend could complete only a small part of most tests in the allotted time. Meanwhile, if anxiety and confusion had not already reached levels sufficiently high to invalidate the results, the orderlies continually marched about, pointing to individual recruits and ordering them to hurry in voices loud enough, as specifically mandated, to convey the message generally. Add to this the blatant cultural biases of test 6, and the more subtle biases directed against those who could not write numbers or who had little experience in writing anything at all, and what do you have but a shambles.
5.5 Part six of examination Beta for testing innate intelligence.
TEST    FINISHED    NOT FINISHED
1       44          9
2       21          32
3       45          8
4       12          41
5       18          35
6       49          4
7       40          13
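The counts in the table can be converted into completion rates with a few lines of Python. This is only a minimal sketch of the arithmetic, not part of Yerkes's analysis or of the classroom exercise itself; the names `completions` and `completion_rate` are illustrative assumptions, and the numbers are simply those tabulated for the fifty-three students.

```python
# (finished, not finished) for each of the seven Beta tests, copied from the table.
completions = {
    1: (44, 9),
    2: (21, 32),
    3: (45, 8),
    4: (12, 41),
    5: (18, 35),
    6: (49, 4),
    7: (40, 13),
}


def completion_rate(finished, not_finished):
    """Fraction of students who finished a test."""
    return finished / (finished + not_finished)


rates = {test: completion_rate(*counts) for test, counts in completions.items()}

# Tests that fewer than half of the students managed to finish.
majority_incomplete = sorted(test for test, rate in rates.items() if rate < 0.5)

for test, rate in rates.items():
    print(f"Test {test}: {rate:.0%} finished")
print("Majority incomplete:", majority_incomplete)
```

On these counts, tests 2, 4, and 5 are the three that a majority of students failed to finish, in agreement with the discussion in the text.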
The proof of inadequacy lies in the summary statistics, though Yerkes and Boring chose to interpret them differently. The monograph presents frequency distributions for scores on each part separately. Since Yerkes believed that innate intelligence was normally distributed (the “standard” pattern with a single mode at some middle score and symmetrically decreasing frequencies away from the mode in both directions), he expected that scores for each test would be normally distributed as well. But only two of the tests, maze running and picture completion (1 and 6), yielded a distribution even close to normal. (These are also the tests that my own students found easiest and completed in highest proportion.) All the other tests yielded a bimodal distribution, with one peak at a middle value and another squarely at the minimum value of zero (Fig. 5.6).
The common-sense interpretation of this bimodality holds that recruits had two different responses to the tests. Some understood what they were supposed to do, and performed in varied ways. Others, for whatever reasons, could not fathom the instructions and scored zero. With high levels of imposed anxiety, poor conditions for seeing and hearing, and general inexperience with testing for most recruits, it would be fatuous to interpret the zero scores as evidence of innate stupidity below the intelligence of men who made some points—though Yerkes wormed out of the difficulty this way (see pp. 244–247). (My own students compiled lowest rates of completion for the tests that yield the largest secondary modes at zero in Yerkes’s sample—tests 4 and 5. As the only exception to this pattern, most of my students completed test 3, which produced a strong zero mode in the army sample. But 3 is the visual analog of “what is the next number in this series,” a test that all my students have taken more times than they care to remember.)
5.6 Frequency distributions for four of the Beta tests. Note the prominent mode at zero for tests 4, 5, and 7.
Statisticians are trained to be suspicious of distributions with multiple modes. Such distributions usually indicate inhomogeneity in the system, or, in plainer language, different causes for the different modes. All familiar proverbs about the inadvisability of mixing apples and oranges apply. The multiple modes should have guided Yerkes to a suspicion that his tests were not measuring a single entity called intelligence. Instead, his statisticians found a way to redistribute zero scores in a manner favorable to hereditarian assumptions (see next section).
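The check for multiple modes can be illustrated with a small sketch in Python. The helper `local_modes` is hypothetical (nothing like it appears in Yerkes's monograph), and the frequency list is invented purely for illustration rather than drawn from the army data. It reports every score whose frequency is higher than that of its neighbors, so a distribution with a spike at zero and a second peak in the middle shows up as having two modes.

```python
def local_modes(frequencies):
    """Return the indices (scores) whose frequency exceeds both neighbors.

    Positions beyond either end of the list are treated as frequency zero,
    so a peak at the minimum score can be detected.
    """
    modes = []
    last = len(frequencies) - 1
    for i, freq in enumerate(frequencies):
        left = frequencies[i - 1] if i > 0 else 0
        right = frequencies[i + 1] if i < last else 0
        if freq > left and freq > right:
            modes.append(i)
    return modes


# Invented frequencies for scores 0 through 8: a spike at zero plus a middle hump.
illustrative = [30, 6, 9, 14, 20, 15, 9, 5, 2]
print(local_modes(illustrative))
```

A histogram with a single hump would return one index; two or more indices are the signal that should prompt a search for separate causes. This simple neighbor comparison ignores ties and sampling noise, so it is a teaching device rather than a formal test of multimodality.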
Oh yes, was anyone wondering how my students fared? They did very well of course. Anything else would have been shocking, since all the tests are greatly simplified precursors of examinations they have been taking all their lives. Of fifty-three students, thirty-one scored A and sixteen B. Still, more than 10 percent (six of fifty-three) scored at the intellectual borderline of C; by the standards of some camps, they would have been fit only for the duties of a buck private.
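The proportions quoted for the students' grades follow directly from the counts. A brief sketch of the arithmetic in Python, with variable names chosen only for illustration:

```python
# Letter grades earned by the fifty-three students, as reported in the text.
grade_counts = {"A": 31, "B": 16, "C": 6}

total = sum(grade_counts.values())
shares = {grade: count / total for grade, count in grade_counts.items()}

for grade, share in shares.items():
    print(f"{grade}: {share:.1%}")
```

The three counts sum to fifty-three, and the share of C grades (six of fifty-three) comes out just above 11 percent.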
FINAGLING THE SUMMARY STATISTICS: THE PROBLEM OF ZERO VALUES
If the Beta test faltered on the artifact of a secondary mode for zero scores, the Alpha test became an unmitigated disaster for the same reason, vastly intensified. The zero modes were pronounced in Beta, but they never reached the height of the primary mode at a middle value. But six of eight Alpha tests yielded their highest mode at zero. (Only one had a normal distribution with a middle mode, while the other yielded a zero mode lower than the middle mode.) The zero mode often soared above all other values. In one test, nearly 40 percent of all scores were zero (Fig. 5.7a). In another, zero was the only common value, with a flat distribution of other scores (at about one-fifth the level of zero values) until an even decline began at high scores (Fig. 5.7b).