Friday, March 16, 2012

Insider Information on How Standardized Tests are Scored

The Loneliness of the Long-Distance Test Scorer

Entire article linked above. Favorite parts below.

 I recently spent four months working for two test-scoring companies, scoring tens of thousands of papers, while routinely clocking up to seventy hours a week. This was my third straight year doing this job. While the reality of life as a test scorer has recently been chronicled by Todd Farley in his book Making the Grades: My Misadventures in the Standardized Testing Industry, a scathing insider’s account of his fourteen years in the industry, I want to tell my story to affirm that Farley’s indictment is rooted in experiences common throughout the test-scoring world.1

 Test scoring is a huge business, dominated by a few multinational corporations, which arrange the work in order to extract maximum profit. I was shocked when I found out that Pearson, the first company I worked for, also owned the Financial Times, The Economist, Penguin Books, and leading textbook publisher Prentice Hall. The CEO of Pearson, Marjorie Scardino, ranked seventeenth on the Forbes list of the one hundred most powerful women in the world in 2007.

 Test-scoring companies make their money by hiring a temporary workforce each spring, people willing to work for low wages (generally $11 to $13 an hour), no benefits, and no hope of long-term employment—not exactly the most attractive conditions for trained and licensed educators. When I began working in test scoring three years ago, my first “team leader” was qualified to supervise, not because of his credentials in the field of education, but because he had been a low-level manager at a local Target.

Remarkably, for a company entrusted with assessing students’ educational performance, messages from Pearson contain a disturbing number of misspellings, incorrect dates, typos, and missing information. Pearson’s online video orientation, for example, warns scorers that they may face “civil lawshits” from sexual harassment. Error-free communications are rare. I was considering whether this was a fair assessment, when I received a message from Pearson with the subject “Pearson Fall 2010.” The link in the e-mail took me to a survey to find out my availability—for the spring of 2011.

 I imagine that most students think their papers are being graded as if they are the most important thing in the world. Yet every day, each scorer is expected to read hundreds of papers. So for all the months of preparation and the dozens of hours of class time spent writing practice essays, a student’s writing probably will be processed and scored in about a minute. Scoring is particularly rushed when scorers are paid by piece-rate, as is the case when you are scoring from home, where a growing part of the industry’s work is done.

At 30 to 70 cents per paper, depending on the test, the incentive, especially for a home worker, is to score as quickly as possible in order to earn any money: at 30 cents per paper, you have to score forty papers an hour to make $12 an hour, and test scoring requires a lot of mental breaks. So every night, while scoring from home, I would surf the Internet and cut and paste loads of articles—reports on Indian Maoists, scientific speculation on whether animals can be gay, critiques of standardized testing—into what typically came to be an eighty-page, single-spaced Word document. Then I would print it out and read it the next day while I was working at the scoring center. This was the only way to avoid going insane. I still managed to score at the average rate for the room and perform according to “quality” standards.

While scoring from home, I routinely carry on three or four intense conversations on Gchat. This is the reality of test scoring. Unfortunately, after scoring tests for at least five states over the past three years, the only truly standardized elements I have found are a mystifying training process, supervisors who are often more confused than the scorers themselves, and a pervasive inability of these tests to foster creativity and competent writing.

 Scorers often emerge from training more confused than when they started. Usually, within a day or two, when the scores we are giving are inevitably too low (as we attempt to follow the standards laid out in training), we are told to start giving higher scores, or, in the enigmatic language of scoring directors, to “learn to see more papers as a 4.” For some mysterious reason, unbeknownst to test scorers, the scores we are giving are supposed to closely match those given in previous years. So if 40 percent of papers received 3s the previous year (on a scale of 1 to 6), then a similar percentage should receive 3s this year.

Lest you think this is an isolated experience, Farley cites similar stories from his fourteen-year test-scoring career in his book, reporting instances where project managers announced that scoring would have to be changed because “our numbers don’t match up with what the psychometricians [the stats people] predicted.” Farley reports the disbelief of one employee that the stats people “know what the scores will be without reading the essays.”2

I also question how these scores can possibly measure whether students or schools are improving. Are we just trying to match the scores from last year, or are we part of an elaborate game of “juking the stats,” as it’s called on HBO’s The Wire, when agents alter statistics to please superiors? For these companies, the ultimate goal is to present acceptable numbers to the state education departments as quickly as possible, beating their deadlines (there are, we are told, $1 million fines if they miss a deadline), and proving their reliability so they will continue to get more contracts.

I remember reading, for twenty-three straight days, the responses of thousands of middle-schoolers to the question, “What is a goal of yours in life?” A plurality devoted several paragraphs to explain that their life’s goal was to talk less in class, listen to their teacher, and stop fooling around so much. It’s asking too much to hope for great literature on a standardized test. But, given that this is the process through which so many students are learning to write and to think, one would hope for more. These rote responses, in themselves, are a testament to the failure of our education system, its failure to actually connect with kids’ lives, to help them develop their humanity and their critical thinking skills, to do more than discipline them and prepare them to be obedient workers—or troops. While we test scorers might be prone to blame these children for the monotony of their thoughts, it’s not their fault that their imaginations and inspirations are being sucked out of them.

 As a friend of mine was saying his goodbyes to the coworkers in his room at the end of this year’s scoring season, his seventy-year-old supervisor, a veteran test-scoring warrior, uttered the words I imagine many test scorers hope to hear: “I hope I never see you here again.” This is a measure of the cynicism with which many test scorers approach the industry, recognizing that it is fundamentally a game, which too many people are forced to play—but “hey, it beats working at McDonald’s or Subway!”

 If scoring is any indication, everyone should be worried about the logic of putting more of our education system in the hands of these for-profit companies, which would love to grow even deeper roots for the commodification of students’ minds.

