Junior High School: Classroom, by Harris & Ewing. Source: Library of Congress

Pass/Fail

An American History of Testing

04.08.16

In this episode of BackStory, we explore the history of testing in America. The hosts go back to the eighteenth century and look at how elite colleges replaced social status with merit and behavior as a way to grade students. We uncover the links between President James Garfield’s 1881 assassination and the civil service test, and look at how officials created the first, “white,” affirmative action program by waiving the test for WWII veterans. The hosts explore the long and troubled history of how Americans have used tests to both exclude and include people from the citizenry.

In the segment “There’s Nothing Standard About It,” we misstated the name of a French psychologist. His name was Alfred Binet, not Albert.



PETER: This is BackStory. I’m Peter Onuf.

Millions of American high school students are sitting down to take the newly revamped SAT.

FEMALE SPEAKER: The new exam will have an optional essay. More time will be allowed. And there will be 16 fewer questions.

PETER: Many have criticized the test for its cultural biases in favor of high income students– not exactly a new problem. Standardized tests a century ago asked students to draw what was missing from a house.

ALAN STOSKOPF: The testers, of course, would think, oh, missing chimney. The Italian immigrant kids will put a crucifix.

PETER: Today on BackStory, a history of testing in America, from IQ and personality tests to fighting government corruption with a Civil Service Exam.

MARK SUMMERS: You don’t want people that are going to be like the New York City politicians that were known as the Paint Eaters, because they take everything they can. It’s even said they eat the paint off the walls.

PETER: A history of testing, today on BackStory.

Major funding for BackStory is provided by the [? Shia ?] [? Khan ?] Foundation, the National Endowment for the Humanities, the Joseph and Robert Cornell Memorial Foundation, and the Arthur Vining Davis Foundations.

From the Virginia Foundation for the Humanities, this is BackStory with the American Backstory hosts.

BRIAN: Welcome to the show. I’m Brian Balogh, and I’m here with Ed Ayers.

ED: Hey, Brian.

BRIAN: And Peter Onuf’s with us.

PETER: Hey, Brian.

BRIAN: We’re going to start today in the Athens of America. Or at least, that’s how Boston thought of itself back in 1845. The city schools were considered the best in the country.

PETER: A guy named Horace Mann thought this fine reputation was not deserved. Mann was superintendent of Massachusetts Schools, and the leading educational reformer of his day. Historian William Reese says that after touring Europe, Mann was convinced Boston schools were falling behind.

WILLIAM REESE: So Mann devised an idea to give a written test.

PETER: Now, a written test sounds like no big deal, right? But in the 1840s, it was a novel idea. Revolutionary, in fact.

WILLIAM REESE: It was a world of oral recitations in which teachers in almost virtually every classroom would call on students one by one, listen to the answers, and students would sit down, and the next student would stand up.

PETER: Mann thought that these oral examinations were little more than well-rehearsed performances. Teachers would give students the questions in advance so the kids could figure out the answers. There was no way to tell if students were actually learning anything beyond rote memorization. But Mann believed written tests were different. They could really measure learning.

WILLIAM REESE: And he, in fact, said that written exams were like a daguerreotype of a child’s mind– that we could take a picture of what they know by what they write down.

PETER: Mann decided to try this method on grammar school students across the city. In the summer of 1845, the reformer and his allies on the Boston School Committee launched a surprise attack.

WILLIAM REESE: Members from the School Committee were assigned to hop on their horses, literally, and ride from school to school– there are about 19 grammar schools– and, one day after another, give one hour timed tests with printed questions in a variety of subjects that were the main subjects taught in these grammar schools.

MALE SPEAKER: Do the waters of Lake Erie run into Lake Ontario or the waters of Ontario run into Erie?

MALE SPEAKER: A man has a square piece of ground which contains 1/4 of 1 acre and 1/4, on which are trees which will make wood enough to form a pile around on the inside of the land 3 feet high and 4 feet wide. How many cords of wood are there?

MALE SPEAKER: How much is 1/2 of 1/3 of 9 hours and 18 minutes?

MALE SPEAKER: Name the rivers, gulfs, oceans, seas, and straits through which a vessel must pass in going from Pittsburgh in Pennsylvania to Vienna in Austria.

WILLIAM REESE: The end of the hour, the School Committee members picked up the exam results and left the school, ran off to the next school.

BRIAN: It was, in effect, America’s first standardized test, assessing 530 bewildered Boston area students. Now, Mann’s city-wide pop quiz did have some weaknesses. Because the tests were staggered over a few days, kids, of course, found a way to cheat.

WILLIAM REESE: What happened was, of course, children were attending neighborhood schools, and Boston was a very compact city, and so some of the questions started to leak out. Beyond kids on playgrounds rushing over to a neighborhood and leaking out some questions, when the examiner showed up at one particular school, all of the kids seemed to be writing identical answers to the questions. So that always makes the teacher suspicious.

BRIAN: Then there was the grading. It was a complete nightmare.

WILLIAM REESE: The Committee members did not have the benefit of machine readable tests. So what you had were a series of questions across a variety of subjects, and it yielded about 31,000 answers that needed to be graded. So you had six people poring over all the answers. They hauled in some mathematician friends who they never identified. I suspect they were Harvard profs. And they got high school students to help look over their shoulder. They went through them two or three times to make sure they got it right.

BRIAN: As to the results.

WILLIAM REESE: Well, the average score was 30%. So the majority of kids flunked the test.

BRIAN: By a lot. And remember, these were students in the country’s top school system. But Mann wasn’t surprised. In fact, he felt vindicated. He believed Boston Schoolmasters were a bunch of hacks. Now he had the proof he needed.

WILLIAM REESE: Over the next few years, similar exams were given, but in the coming decades, because tests became so popular not only in Boston, but in so many places, it became like a force you couldn’t stop. And so if you jump to the 1880s and 1890s, tests had become ubiquitous, not only in Boston, but they spread throughout the country.

ED: These days, we seem to take exams for just about everything. So today on the show, we’re looking at how Americans have used tests to sort and measure each other on everything, from IQ and academic achievement to personality type. We’ll hear how the Civil Service Exam was introduced in the 1880s to combat out-of-control political patronage. And we’ll discuss the role of eugenics in the development of intelligence tests.

PETER: But first, let’s return to my conversation with William Reese. He says to know why Mann’s approach was so revolutionary and infuriating to the people of Boston, you have to understand what it replaced. As we mentioned earlier, most Boston schools held oral exams. These often took the form of annual performances that students and teachers basically rehearsed.

WILLIAM REESE: It led to comical scenes sometimes, where a student would say, that’s not my question. That’s Johnny’s.

PETER: Nice.

WILLIAM REESE: Everyone in the audience would smile at the end of the exhibition. Everyone would give a rousing applause because it was a sign that learning and education was happening in every little hamlet in America, including theirs. And so there wasn’t a way to compare, except through impressions, how well everyone was doing, certainly not in any statistical or numerical fashion.

PETER: Right. So this is a reformer– Mann’s a reformer who’s imposing a new regime on local schools in Boston. Why the fad for statistics? Was this a new way of thinking about society?

WILLIAM REESE: Yeah. Mann spent time after the 1830s really immersing himself in ideas related to statistics and quantitative analysis, to use a modern phrase. Statistics were transforming how many people thought about the world. And when the reports were written after the exam in 1845, the response was quite unlike anything you’d have ever seen. That is, chart after chart comparing all of the schools with huge commentaries about specific teachers and why the schools didn’t do so well in one neighborhood. Why they did better because someone was a better teacher in another neighborhood. So this is what was really new, is that now you had information, as the committees would later say, in black and white. Hard facts.

And so this notion that numbers brought a precision that words could never do was central to the way in which people started to think about schools.

PETER: Right. These are these forward-looking modernizers. This is the future. It’s not the world that people were accustomed to, though.

WILLIAM REESE: That’s right. And it challenged ways of thinking about learning, that you could break learning down into its precise components and then examine people uniformly– the notion that we call a standardized test, that everyone of roughly the same age who read the same books would take an exam. It would be timed. It would be graded impartially. There was no funny business about favoritism. And so local people– you know, ordinary people– thought of schools one way, and now the reformers bring a level of expertise to bear on education, that is, at times, quite unsettling.

PETER: Bill, who were the winners and losers in the history you’re telling? What are the outcomes? We know that tests are going to be ubiquitous throughout American history, but looking at it as an historian, what do you see on the ground?

WILLIAM REESE: You see a lot of people who think– and this was already evident in the late 19th century– that what happens to art? What happens to music? What happens to the subjects that give a full, liberal education?

PETER: And it can’t be quantified.

WILLIAM REESE: And it’s very difficult to quantify it. Mann’s aim was to use tests to expose rote teaching and memorization, which he thought was overemphasized far and above children understanding the subject matter at school. So what’s happened, ironically– and Mann would be, I think, appalled by this– is that, in fact, we know that teachers who are measured by how well students do on standardized tests, as a means of self-survival, often do engage in the very kind of teaching practices that tests reward, which is divide up the subject matter, memorize it, and hope you choose the right answer on an exam. So it’s one of the ironies of testing. They were meant to expose bad teaching, but many people would argue that lifeless instruction is reinforced by an exaggerated emphasis on standardized exams.

PETER: Bill, I’ve got one last question for you, and this is a historian’s prerogative. You know Horace Mann better than most people. How would he look at the testing regime that dominates American education today?

WILLIAM REESE: I think he would be startled at the reduction of public education to testing. I think he would think it’s a vital and important part of our public school system. He certainly would believe in accountability and assessment, but I think he would try to strike a better balance because, as I said, what I found most interesting about Mann was not only that he was a pioneer in testing, but he was also quite a sensitive person when it came to understanding that it’s very important for children to understand the material that they study as well as be able to memorize and regurgitate the material which, as we all know, many tests emphasize. So I think he’d be quite saddened by the imbalance, but I don’t think he would say that the tests should disappear. He would just want to put it in a certain balance.

PETER: All right. So he’s not going to apologize to us, huh?

WILLIAM REESE: I don’t think so. He’s a very strong-willed individual with a strong point of view.

PETER: William Reese is an education historian at the University of Wisconsin-Madison. He’s the author of Testing Wars in the Public Schools: A Forgotten History.

As students across the country sit down to take a revamped SAT, we’re exploring the history of testing in America.

ED: So Peter, I’ve got a question for you, man.

PETER: Yeah?

ED: Testing– we’re so used to it now, but I can’t even really imagine how society would function back in your time when they didn’t have tests.

BRIAN: Yeah, and what was the Blue Book industry like?

PETER: Well, you knew because it was in your face, Ed, Brian. And that is status was clearly marked by the way you dressed, the way you acted, the power you wielded through succession– that is, you got through inheritance. There really weren’t a lot of questions about where you stood. Think of the whole notion of a social order– a hierarchical social order. You need tests and ranking where everybody is supposedly equal, and then you’re trying to pull out of the great mass of people who have distinctive skills. I would say, however, that the tests that people faced in my period were much more rigorous than the ones that–

ED: Yeah–

BRIAN: Oh, come on.

ED: –that’s what people always say about the past, right?

BRIAN: He didn’t even have the Scantron then.

PETER: Oh, well, there are some very, very important tests that certain people face at certain crucial times. One would be on the dueling ground, where you have to show your courage and you have to defend your honor.

ED: Peter, that’s true. I’ve seen the Hamilton play, so I know about dueling, but not everybody dueled. That seemed to leave out a lot of people– people who were Christian, for example, women. Did they have tests?

PETER: Well yeah, Ed, especially if you were a really serious Christian– a Puritan– there was a very rigorous test you had to pass in order to become a church member. A lot of people will spend their whole lives in these churches and never become members.

Picture this. You’ve been listening to this preaching for all these years. Your family’s been talking to you. There’s Bible reading at home. And now your moment has come. You stand up there in front of the congregation, and it’s a very, very– the ultimate test. Everything’s at stake.

ED: Do they vote? How do you know if you passed?

PETER: Well, it’s pretty clear. Yes, they do vote, but it would be a consensus thing. It’s the whole person who’s on the line. Just as we say to kids now who are taking exams, no, no, this is just a test, and you’ll have a thousand more before you’re finished. And it doesn’t really test the real you, and we love you just the way you are. And this is, does God love you? Has He designated you to be among the saints who will be saved.

ED: And God doesn’t grade on a curve, either, so–

PETER: No. No, and eventually, though– this is so hard for everybody, such an agonizing ordeal, that it doesn’t take place in public by the end of the 17th century.

ED: So Peter, duelist, OK. Puritans, OK. But there’s a lot of people in America who were neither. It strikes me that there’s kind of another test that people have to have, with just trying to find a mate.

PETER: Oh, Ed, you’re exactly right. And you’re pointing to courtship as another public spectacle, you might say. Again, we think of it as a very private affair. It’s not something that everybody should know about, but in a small village in the 17th and 18th century, that’s a cross-generational transaction. You, as a suitor, and we’ll say it’s you, Ed, suing for the hand of Abby. You’re going to have to prove to everybody else, not just to her, that you’re going to be a good provider. And forming families may be the most crucial test that everybody faces. A test, if you will, of your suitability.

BRIAN: Peter, that’s so helpful to me. And what just jumps out at me is the public nature of all of these. All of these are asking, are you suited to be a member of our community?

PETER: To be brought into the fold.

BRIAN: Do you have the honor, in the case of the duels. Do you have the religious conviction, in the case of the conversion. And are you well-suited to be a proper couple in our community going forward?

PETER: Whereas, I think the modern model is much more, we’re going to sort out. We’re going to exclude large numbers as we sort through the masses of people who are– when they come to the testing table they’re equal. And some of them will have happy outcomes as a result. Their lives will be determined by that test, but it’s more of sorting–

BRIAN: The sorting for–

PETER: –and excluding–

BRIAN: –a whole array of specialized communities.

PETER: That’s right.

BRIAN: What college are you right for. What specialty are you right for if you’re going into our corporation. It’s a multiplicity of communities that you might or might not fit into.

PETER: And I think you’ve said exactly the right thing, Brian, because we’re talking in these earlier tests of the whole person about one community. Whereas in this modern world of proliferating communities, the sorting process is really crucial because you’re not born to it anymore.

BRIAN: You know, it’s interesting. Today the big fad in college admission is holistic admission, in which we set aside the SAT and the GPA to look at people’s challenges that they’ve overcome and what they’ve done outside the classroom. And so now we’re saying, you know? I think maybe we went a little too far in taking people apart in these different components, as if– acting as if your intelligence is something detachable from everything else that you are.

BRIAN: These days, it’s hard to make it through school without sitting through a slew of standardized tests. It starts in elementary school with the STAR or NAEP tests. Middle schoolers take the PSAT. For high school students, it’s the high stakes SAT, or the ACT, along with Advanced Placement exams. How well students score on these tests can determine where they go to college, and whether they’ll get scholarships.

ED: Education historian Alan Stoskopf says these types of tests became standard in the early 20th century. This was an era of unprecedented immigration and urbanization. Public school officials had to figure out how to educate millions of children, many of whom didn’t speak English. A prominent psychologist named Henry Goddard thought he could help. He devised a standardized test that could measure and rank students based on intelligence. His test was widely adopted throughout the US. The problem, Stoskopf says, was Goddard’s ideas about intelligence were heavily influenced by eugenics.

ALAN STOSKOPF: And eugenics, in its essence, claimed to be a civic biology, a scientific movement that could efficiently and quickly identify those who are superior, and those who are inferior to make society better functioning. And what the eugenicists did, especially in the United States, was to call eugenics both a scientific movement, and a public health movement.

ED: So what does testing have to do with all this? How does this create a context in which this cult of testing begins?

ALAN STOSKOPF: Well, Henry Goddard, who was a eugenicist and the director of the Center for Feeble-minded Boys and Girls in Vineland, New Jersey, had heard about the pioneering work going on in Paris, France by Alfred Binet. And Binet was a really interesting guy. He was hired by the Ministry of Education in Paris in the early 1900s to try and understand why some students couldn’t achieve to grade level, and some students could. Fair enough. I mean, those words could be applicable today. And he said, perhaps I can create an instrument that would be an aid to the teacher to allow her or him to identify those students who are struggling and those students who are doing well. He saw it as a form of mental orthotics. So he devised the first, what we would call, standardized tests.

And Goddard came over to visit him, and to look at the test. And Goddard saw something very, very different in the test. Binet would say, you know, these tests are not a theory of limits. They’re an index of possibilities. But Goddard saw in that test a very efficient instrument that could rank and sort, cheaply and efficiently, thousands, if not millions, of people into an appropriate social niche, because he believed that intelligence was fixed. It was finite. It was static.

BRIAN: So what kind of categories would they put people in after they took some of these tests?

ALAN STOSKOPF: Right. I mean, the big thing was feeble-mindedness. And they’d break that down. You know the joke, oh you’re a moron? Moron is a eugenic term. It was invented by Henry Goddard at that time. When he brought the test over and translated it into English and adapted it for an American context, he saw it as a powerful instrument that would immediately identify, in the case of immigration, who belonged in the country or who didn’t; in the case of schooling, who might belong into the special class, the average class, or the advanced class. And they used it. The first mass testing was at Ellis Island, then it was at the United States Army. Goddard and a team of psychologists tested this on 1.5 million army recruits.

And by the way, there’s this humorous side to this, is that the Army realized that, oh my god, nearly 1/3 or more of the recruits were feeble-minded. This can’t be possible. I mean, we need these men in the service. But rather than question the efficacy of the questions and the whole methodology used, they continued to proceed on. And superintendents across the United States and public schools get wind of what’s going on there, and then they say, could we use this not just for special needs, but just to really figure out where all our students belong? And then Pandora’s box opens, because it’s after World War I that this takes off like a rocket ship.

ED: Tell me a little bit about what these tests would have looked like.

ALAN STOSKOPF: The early tests, oftentimes, had 20 questions on them. Oftentimes they were visual images and the tester would say, fix what is wrong with the image. You might see a table with three legs on it, and the idea would be to draw a correct table, or to write in what’s wrong.

ED: I seem to remember these things were so culturally loaded. One of the what’s wrong with this picture was a tennis court without a net on it.

ALAN STOSKOPF: Yeah, yeah.

ED: So people who’d never seen a game of tennis weren’t sure what this should be.

ALAN STOSKOPF: Or there’s a building with a chimney on it, or a missing chimney, and what’s wrong with it? Well, the testers of course would think, oh, missing chimney. But a high percentage of the Italian immigrant kids would put a crucifix because cooking was done outside of the cottage, not in the house itself. So the chimney wouldn’t be there. Made sense to them, but of course they’d be marked incorrectly. And of course, it’s loaded in that way. But at the time it was very unexamined. The famous line by Lewis Terman was, “the tests have told the truth.”

ED: Now, didn’t they notice the fact that you would have 1/3 of the recruits for the United States Army in World War I be ranked as feeble-minded? But that didn’t cause them to recalibrate their faith in these tests?

ALAN STOSKOPF: Not at that point. It didn’t, actually. The result of those army tests would have a life beyond the original purpose. Those results would be used in testimony in the United States Congress to argue for the restriction of inferior types. Carl Brigham, the founder of the SAT, was a eugenicist at the time, was a psychologist at Princeton University, basically took all the tests and the results of the tests and said, see beyond a reasonable doubt. Can’t you see that the Hungarians, the Poles, the southern Europeans, African-Americans– look at where they rank. This should be a wake-up call that America needs not to dilute its racial purity. Eugenics, as a kind of belief system, was very, very well respected. We have very short memories. Later it would be discredited, and dissident voices would emerge, but into the 1900s, 1910s, 1920s, not really until the 1930s do you really get a broad-based reaction against that.

ED: So how could it pivot so quickly from being so powerful in 1920 to the 1930s being discredited?

ALAN STOSKOPF: I mean, what really accelerated it was Nazi Germany. Hitler and the Nazis were referring to the pioneering work done in the United States around sterilization, in immigration restriction, and the use of the tests. And there was a sense of, oh my god. We haven’t taken it this far.

ED: So that’s heartening. And yet, the testing, if anything, has gained momentum in the years since. How would we explain that relationship?

ALAN STOSKOPF: Certainly, a person of goodwill can believe that intelligence tests or many other kinds of standardized tests, really serve a purpose. And I would probably agree that they do in certain kinds of instances. However, there still is, with us today, the strong cultural belief that a single test and a single score that’s administered in a timed environment– it trumps a teacher’s evaluation. It trumps maybe a year of observation working with a human being, that there’s something about the rigor and the precision of those questions that tell all. I think that’s a very powerful belief, and that’s still with us today. Maybe it’s an American thing. Let’s get to the bottom line. Let’s really find out what you’re worth. Put your money on the table, so to speak, and who are you?

ED: Alan Stoskopf is an education historian at UMass Boston. He’s the author of Race and Membership in American History: The Eugenics Movement.

Hearing that story, we can’t help but be a little bit anxious about the eugenics origins of these tests in which so much depends. But I’d like to point out that they worked perfectly well. The SAT predicts, with extreme clarity, family income. So what do we do with that knowledge?

PETER: That’s a good point, but I think the reform impulse is to say, what’s wrong with the test? And how can we make it better? Find all the things that do map onto social or cultural characteristics and trying to make it the perfect test. And I think the big problem is the dream of a perfect test that would be fair. I think it’s our version of the Field of Dreams. This is the level playing field, and we can be a true meritocracy–

BRIAN: And that embrace of merit. Somewhere in my century, in the middle of the 20th century, they will really– we’re going to rise above all these petty prejudices and craft the perfect meritocratic instrument.

ED: So if we know that this is the origins of these tests, why do we stick to them?

PETER: Because I think, Ed, we want to get it right. And to get it right is not to abandon the whole idea of merit and improving yourself on a level playing field, and so we keep nudging it in the direction of an imagined perfection. And I don’t know how we can do without it.

BRIAN: And I think it’s less about the test, Peter, and getting it right than wanting to know where we fit. Where exactly do we fit into a society that at least claims to be classless?

Earlier in the show, we discussed how written tests replaced oral recitation in American schools. We’re going to turn to another kind of written test that was just as revolutionary in its time– the Civil Service Exam. Today, we assume that government workers have met the qualifications that go along with their jobs. A park ranger, say, can read a map. And employees in the Department of Justice know a little something about the law.

PETER: But historian Mark Summers reminds us that, for much of American history, that wasn’t necessarily true. Until the late 19th century, getting a job in the federal government was based almost exclusively on political patronage, not merit.

MARK SUMMERS: The real thing is what party do you belong to, how much service have you given to them. Party matters more than anything else.

PETER: I– I– I’m a little confused here. Are my qualifications entirely irrelevant?

MARK SUMMERS: No, not really. It depends. If your qualifications include not having been in jail, then those qualifications are relevant.

PETER: Summers reminds us that this type of quid pro quo was known as the spoils system, as in, to the victor go the spoils.

MARK SUMMERS: Don’t think of it as corruption. Think of it as democracy. I think that’s a much nicer way of describing it. After all, think about it. Do you really want your enemies to hold all the offices and implement the policy? Don’t you think, if the people say, we want a Republican government, they want one from the top to the bottom? The people have spoken.

PETER: That spoils system really got started under President Andrew Jackson in the 1820s, but Summers says that by 1881 it had reached a crisis point.

MARK SUMMERS: Republicans have been in power for 20 years. A Republican administration is going out. A new Republican administration under James Garfield is coming in. In the first months of this administration, the only thing that’s going to happen is an absolute political brawl between two Republican factions about who gets the spoils. It’s ugly. It’s bitter. It deadlocks the Congress.

And after about three months of this, just when the president has won the battle and has got his way on this, is when he goes to a railroad station, and a spoilsman for the opposing faction that has just lost out, shoots and fatally wounds him. The result is going to be an enormous trauma. If before there was a disorganized cry for reform, now there is almost a universal shout in both parties. This must not happen again. The spoils system is killing the republic. It’s certainly killing the presidents.

PETER: Something is seriously sick in the republic, yeah.

MARK SUMMERS: Oh, it’s no question.

PETER: So this is a perfect storm, you might say– the ideal conditions for rethinking the spoils system.

MARK SUMMERS: It is a perfect storm, yeah. The civil service bill that Gentleman George Pendleton had offered had been offered regularly for about six years. Nothing had happened. Nothing was likely to happen. But now you’ve got motivation for it.

PETER: Mark, Garfield gets killed in 1881, and 18 months later, the Pendleton Act gets passed. What does it do?

MARK SUMMERS: First of all, it shuts off the flow of forced contributions and shakedowns of government employees– the assessments, those are known as. Second, it sets up a bunch of jobs in the government that you can’t be fired for your politics for, and you have to take an exam and be appointed on the basis of merit. Competitive examinations.

PETER: So Mark, tell us a little bit about this exam. It’s one exam fits all?

MARK SUMMERS: Oh no, no, no. Every different department has a different merit exam. And it’s the kind of thing where, in point of fact, you can pass the exam, but if you don’t get the highest score– and there’s a lot of other people competing– the job isn’t going to go to you.

PETER: Help me understand why we need a test. Aren’t there other ways to clean the stables and give us good government? Do you need to give a test?

MARK SUMMERS: There is no other way than a test that will work. If you do it based on good character references, my goodness. I could forge good character references enough to fill a library out there. If you do it based on the good word of a congressman, well, it depends. Do you generally think of congressmen as honest? The only way to do it is a test that’s essentially taken anonymously. It’s graded not on the basis of the name at the top, but on how well you score, and how well you do. Without that, you’re going to go right back to the nudge, nudge, wink, wink way of appointing people. There’s no other alternative out there.

PETER: So Mark, the Pendleton Act of 1883 at least begins to change everything. It doesn’t change everything right away, but it points toward good government.

MARK SUMMERS: Yes. I have to offer some cautions. One of the kind of spoonfuls of sugar that makes this medicine go down is the Pendleton Act doesn’t cover every civil servant. It doesn’t come anywhere close. There’s about 120,000 people working for the government in 1883, and the Pendleton Act covers 14,000. So there’s plenty of goodies to give out, but what’s happening is the number of offices that are covered by the civil service system are increasing. And one reason is kind of a kicker. In the middle of the act it says that any president can, by an executive decree, increase the number of offices that are protected by the Civil Service Act.

Now, what this means is every time a president goes out of office, he’s going to go, ooh, look at that. Everybody I’ve hired and put in the Indian Bureau is going to be fired if I go out, unless, by an executive order, I say the Indian Bureau is covered. And so every president makes the Act bigger, and makes it cover more offices.

PETER: And it does, as you’ve suggested, create a hierarchy of meritocracy, as you rightly put it.

MARK SUMMERS: And if you want a government for the people, you’re going to need a meritocracy because it’s pretty clear from actual practice that a government so-called of the people, especially if they’re the politicians, is not going to serve you. I mean, this is not a good way to run a system out there. You don’t want people that are going to be like the New York City politicians that were known as the Paint Eaters because when they get into the office, they take everything they can. It’s even said they eat the paint off the walls. This is not a way to create a government that works for the people.

PETER: Well, Mark, nobody believes we live in a world that’s totally governed by merit. It still matters who you know and, well, even who you vote for. That still applies in government, doesn’t it on some level? I’m thinking about ambassadorships, for instance.

MARK SUMMERS: Oh, sure. There’s not a question about it. Ambassadors are rewarded for party loyalty, and they’re rewarded for having given enormous amounts of money. Very standard, very common even today. In the 1950s, this resulted– at least, I’m told– in our ambassador in France, speaking German but not French, and our ambassador in Germany, being able to speak French, but not German. And any time they give a merit exam for cabinet members, it’s going to be a blue moon and the cow’s going to be jumping over it.

PETER: Well, it’s the American way. And thanks for showing the way, Mark. It’s been great talking with you.

MARK SUMMERS: Oh, been a pleasure. Thank you.

PETER: Mark Summers is a historian at the University of Kentucky and author of Party Games: Getting, Keeping, and Using Power in Gilded Age Politics.

ED: We’re going to turn now to a test that about two million people take each year, but it’s not given in school. It’s the Myers-Briggs Type Indicator, and it promises to reveal which of 16 possible personality types best describes you.

DAVID PITTENGER: Well, like most personality tests, it involves a series of questions.

ED: This is David Pittenger, graduate dean of Marshall University, and he has written about the Myers-Briggs test. He says it’s a long list of questions with multiple choice answers. For example–

DAVID PITTENGER: At a party, I like to be, and then some options might be, the life of the party, or to engage in a conversation with a friend. So fairly standard in terms of how psychologists go about assessing personality.

BRIAN: The setup is simple, but the results are a little complicated. I took an online version of the test and discovered that I’m an ENFJ.

ED: Don’t be so hard on yourself.

PETER: I never would have believed that. That’s ridiculous.

BRIAN: According to the Myers-Briggs, I’m more extroverted than introverted. I rely on intuition more than sensing. I favor feelings over thinking. Well, that’s kind of obvious, isn’t it? And I have a slight preference for judgment over perception.

PETER: I’ve never perceived that about you, man.

ED: That sounds exactly like you, Brian.

BRIAN: Yes.

ED: Now, this Myers-Briggs test was born in an American living room in 1917, and invented by a mother and daughter team. The mother was Katharine Cook Briggs, a schoolteacher and amateur psychologist. She used what she knew about child development to craft a list of personality types.

DAVID PITTENGER: She and her daughter, Isabel Briggs Myers, then begin to spend pretty much their professional/personal career developing the test. It was an attempt to craft questions that would divine personality.

ED: During World War II, Isabel adapted the test to help women find jobs on the home front. Since many women hadn’t worked outside the home before, she thought that a personality test could help steer them to the right jobs.

BRIAN: Today, the Myers-Briggs Type Indicator is a multi-million dollar industry, consisting of test-givers, test-takers, and workshops analyzing results. More than 10,000 companies, 2,500 colleges and universities, and 200 government agencies use the test in hiring, firing, and promotions. But Pittenger says it’s built on a questionable premise. He claims it’s not clear that personality is composed of types, as Myers and Briggs thought.

DAVID PITTENGER: Types are essentially dichotomous. You either are or aren’t. So human sexuality, you either are a man or a woman. A trait, however, is a continuous scale. And so we might say, if we’re using sex as a type, gender identity might be a trait. People can vary on masculinity and femininity. And the consensus among psychologists is that personality is largely trait-based, so that you are an extrovert. Well, that means that there are some people who are extraordinarily extroverted, that would, by comparison, make you look like a wallflower introvert.

And then at the other extreme is somebody who is an introvert, who is happiest when they are by themselves, contemplating their own thoughts. Most of us are in the middle. Perhaps an analogy might be height. Most of us are close to an average height. There are some who are a little bit taller and a little bit shorter. And yes, there are extremes, like a basketball player or a jockey. So that’s the first issue that most psychologists have been concerned by.

BRIAN: OK, that’s one problem. We’re really all a mix of extrovert, introvert. What’s another problem with the test?

DAVID PITTENGER: I think a second concern is that what we call the psychometrics, or the measurement of psychological factors, aren’t as strong as we would want them to be–

BRIAN: You’ll have to explain that to me. I already told you I’m not much of a thinker according to this test.

DAVID PITTENGER: So one of the things that we look at is reliability, which is a fancy word for consistency. If you take an intelligence test, for example, there is a high probability that if you take the test again, you’ll get a similar number, plus or minus 2 or 3 points. It’s considered a high reliability type of instrument. Personality is much less. The relationship between testing and retesting changes. And that’s not too terribly surprising, given the sort of ephemeral nature of personality.

But the way the Myers-Briggs treats things is it cuts nature at its joints. It says, you either are an extrovert, or you are an introvert. And here’s where the problem comes up. Most people, when they take the Myers-Briggs Type Indicator, when they take it again– and this is called test-retest reliability. When they take it again, there is a fairly high probability– 30% to 50%– that at least one of those letter items will change. So it’s possible that you could take the test again, and your feeling will turn to thinking, or your introversion will convert to extroversion.

BRIAN: And that’s pretty dramatic when they’re ostensibly opposites.

DAVID PITTENGER: Yes. And the problem is, if personality really were a type, then what we would expect to see is two normal distributions, one representing extroverts, and another distribution representing introverts with relatively little overlap. But we don’t see that. A lot of people then retest and end up crossing that magical line from one personality type to another. So that becomes very problematic.

BRIAN: But tell me something. I mean, how come all these companies keep using this test? Could 10,000 companies be wrong, not to mention those 2,500 colleges and universities? They must see some value here.

DAVID PITTENGER: I think it’s because it’s easy. It’s easy to understand. It’s easy to use. It’s promoted as being a great panacea to a lot of perceived problems in their institution, and so they just naturally see some level of a success, which is misattributed, and stick with the instrument.

Let me give you an analogy. A company says that they can predict who’s going to win the men’s basketball championship. Out of the 64 teams, they say this is the team that will win. And so you pay them $5 and they tell you what the first bracket is going to look like. And lo and behold, it ends up being right. And so you say, wow. These guys really got it. I think the same thing’s happening with a corporation. They give the Myers-Briggs Type Indicator. Voila, the people that are working for them are doing just fine. The problem is that they could have picked people at random for the position, and those people would have done just as well. And so it’s sort of this–

BRIAN: And has that been demonstrated? Has somebody shown that?

DAVID PITTENGER: Yes. The predictability is pretty low. You’re really not going to get a lot of evidence saying, boy, you really need to hire introverts for your accounting staff. And if you want a good salesperson, get the extroverts who are bouncing off the wall. Those data just aren’t there. It doesn’t provide the sort of prediction that’s available, especially when you look at the individual identification– the 16 types.

BRIAN: I want to thank you so much for joining us today.

DAVID PITTENGER: Well, it’s a pleasure talking with you.

BRIAN: David Pittenger is graduate dean of Marshall University.

ED: So listening to all this history of testing is a little bit weird, guys, because, as you know, we’re academics. We live in a world of tests. We got here by passing tests. We give out tests.

BRIAN: Yeah, speak for yourself.

ED: So do you think that we still bear the marks of anxiety over the tests that we’ve had in the past?

BRIAN: I definitely do. I get extremely nervous every time I administer a test.

ED: Whoa. What’s with that?

BRIAN: I don’t know. You tell me.

PETER: It’s payback time, Brian. It’s payback time.

BRIAN: It’s sympath– I experience all of the feelings that I had. As a student, I felt like my entire being, my entire existence, was riding on doing well on this test. And the question of who is Knut Hamsun in my Strindberg and Ibsen class made me feel like, not only my future was riding on this, but my entire past, because what idiot wouldn’t know who Knut Hamsun was? Well, you’re looking at that idiot. And so what I put down was Knut Hamsun played third base for the Chicago Cubs. And I learned very quickly that my comparative lit teacher did not have a sense of humor.

PETER: Or was not a sports fan.

BRIAN: Neither. So Peter, did you have a test story?

PETER: Well, I hated tests, so I never give them. And I give midterm exams, but I always skip the final, against college rules. I just didn’t administer them.

ED: I believe I was the dean when you were doing that, Peter.

PETER: I think that’s right. Because I just thought it was garbage. And it was three hours of garbage.

ED: Which then had to be graded, I happened to notice.

PETER: Then you have– that is true. But I always gave a long paper that I could thoughtfully comment on. I do think if you’re privileged to teach the great students that we had that we don’t need to continue this relentless sorting business. Let’s pose a higher level of tests, and that is think through this stuff and tell me something I didn’t know.

ED: Wow, that’s idealistic. I mean, the test I remember is a test I never actually had, but the test I dreamed the night before I gave my first lecture. And I was back on the campus of my graduate school and I ran into some friends, and they said, hey, how’d you do on the test, Ed? And I said, what test? And they said, the test you have to have before you can be a professor. And I said, oh no! What test? And they said, yeah, the math part is a killer. I really literally woke up and then I realized, wait a minute. I never have to take another test in my life. I have a job.

BRIAN: So is this how we ended up as scholars? We all hate taking tests?

ED: Then all you have is to be tested every time you write a book or an article or give a talk. So we’re tested all the time.

[MUSIC – SAM COOKE, “WONDERFUL WORLD”]

PETER: That’s going to do it for us today. Head over to our website to let us know what you thought of the show. You’ll find us at backstoryradio.org. While you’re there, send us your questions about our upcoming episodes on the history of Judaism in America, and on gambling. Or send email to backstory@virginia.edu. We’re also on Facebook, Tumblr, and Twitter @BackStoryRadio. Whatever you do, don’t be a stranger.

ED: BackStory is produced by Andrew Parsons, Brigid McCarthy, Nina Earnest, Kelly Jones, and Emily Gadek. Jamal Millner is our engineer. Diana Williams is our digital editor, with help from Briana Azar. Melissa Gismondi helps with research.

BRIAN: BackStory is produced at the Virginia Foundation for the Humanities. Major support is provided by the [? Shia ?] [? Khan ?] Foundation, the National Endowment for the Humanities, the Joseph and Robert Cornell Memorial Foundation, and the Arthur Vining Davis Foundations.

Additional funding is provided by the Tomato Fund, cultivating fresh ideas in the arts, the humanities, and the environment, and by History Channel. History, made every day.

FEMALE SPEAKER: Special thanks this week to our 19th century test-takers, [? Shep ?] Davis, Omar [? Nazir, ?] and [? Sopher ?] Fine.

Brian Balogh is Professor of History at the University of Virginia. Peter Onuf is Professor of History Emeritus at UVA, and senior research fellow at Monticello. Ed Ayers is Professor of the Humanities and President Emeritus at the University of Richmond.

BackStory was created by Andrew Wyndham for the Virginia Foundation for the Humanities.

ED: Hey, Peter. Congratulations on your new book with Annette Gordon-Reed, Most Blessed of the Patriarchs, about Thomas Jefferson.

BRIAN: Peter, can’t wait to read it. Congratulations.

PETER: Thanks, guys.

MALE SPEAKER: BackStory is distributed by PRX, the Public Radio Exchange.

Segments

Thanks, Mann

In the 1840s, while he was superintendent of Massachusetts schools, Horace Mann devised some of the first written tests in the United States that he thought would really measure learning and prove his hunch that Boston schools weren’t the best in the country.

Boston; Horace Mann; education; measurement


Jesus Doesn’t Grade On A Curve

Early American Christians had to pass a rigorous test if they wanted to become members of the Puritan church. Peter, Ed and Brian consider what this and other types of pre-19th century tests measured.

Puritans; measurement; religious conversion; theology


There’s Nothing Standard About It

Psychologist Henry Goddard thought he could devise a test that would accurately rank students based on intelligence, but his ideas and methods were heavily influenced by the eugenics movement.

education; eugenics; measurement


To The Victors Go The Government Jobs

Until the late 19th century, party affiliation mattered more than qualifications when it came to landing a job in the federal government. The hosts talk to historian Mark Summers about the origins of the civil service exam.

civil service; measurement; patronage


What’s Your Name, What’s Your Personality Type

The mother-daughter team of Katharine Cook Briggs and Isabel Briggs Myers invented the Myers-Briggs Type Indicator test to help women find jobs during World War II. Now a multi-million dollar industry, is it possible these tests can’t actually measure personality at all?

character; measurement; personality
