As pressure grows to raise scores, how do districts spot cheating?


What’s not to love about a classroom whose year-to-year test score gains are greater than those of 95 percent of all the other classrooms in its district?

Enough classrooms with that kind of gain would mean banners hanging outside the school, increased job security for teachers and administrators, and glowing press conferences for central office staff.

But what if a close look at every student’s responses on the test turned up highly suspicious patterns?

Across the country, there have long been anecdotal reports of cheating on standardized tests, as well as occasional scandals. These have included allegations of principals who go over the response sheets with "magic erasers" and of teachers peeking at upcoming test questions and using the information for last-minute test prep.

As the stakes of standardized testing have been ratcheted up, largely through the accountability targets and sanctions of the No Child Left Behind law, the incentive for teachers and administrators to manipulate test results has grown with them. Ask a roomful of educators, and you’ll hear stories about the variety of ways that test security rules are bent or broken.

Districts have several means of detecting and preventing such cheating. In Chicago, for example, the school district has started using a statistical model to examine test responses for signs of cheating. Using this model, it has estimated that substantial test score manipulation occurred in 1.5 to 4.5 percent of Chicago classrooms. The analysis has led to investigations of at least 29 classrooms and the resignations of several teachers.
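
The article does not spell out how the Chicago model works, but screens of this kind typically combine two individually weak signals: unusually large score gains and unusually uniform answer patterns within a classroom. The Python sketch below is purely illustrative; the `Classroom` fields, the thresholds, and the flagging rule are assumptions made for demonstration, not the actual Chicago or District methodology.

```python
# Hypothetical sketch of a classroom-level cheating screen. All field names,
# thresholds, and the flagging rule are illustrative assumptions -- not the
# actual statistical model used in Chicago or anywhere else.

from collections import Counter
from dataclasses import dataclass


@dataclass
class Classroom:
    name: str
    gain_percentile: float     # year-to-year score gain vs. district (0-100)
    answer_strings: list[str]  # one response string per student, e.g. "ABDCA..."


def most_common_run(answers: list[str], length: int = 8) -> float:
    """Fraction of students sharing the single most common block of
    `length` consecutive answers, maximized over block positions."""
    if not answers:
        return 0.0
    n_items = min(len(a) for a in answers)
    best = 0.0
    for start in range(n_items - length + 1):
        blocks = Counter(a[start:start + length] for a in answers)
        share = blocks.most_common(1)[0][1] / len(answers)
        best = max(best, share)
    return best


def flag_for_review(room: Classroom,
                    gain_cutoff: float = 95.0,
                    similarity_cutoff: float = 0.5) -> bool:
    """Flag a classroom only when an extreme gain coincides with unusually
    uniform answer blocks -- either signal alone is weak evidence."""
    return (room.gain_percentile >= gain_cutoff
            and most_common_run(room.answer_strings) >= similarity_cutoff)


rooms = [
    Classroom("Room 101", 97.0, ["ABDCABDCAB", "ABDCABDCAB", "ABDCABDCAC"]),
    Classroom("Room 102", 60.0, ["BCDAABDCCA", "ABDCBBACAB", "CBDAABDCAB"]),
]
for room in rooms:
    print(room.name, "-> review" if flag_for_review(room) else "-> ok")
```

In practice, a flag like this would only trigger a human investigation and possible retesting, as it did in Chicago; it is a screening tool, not proof of cheating.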

In Philadelphia and throughout Pennsylvania, the highest-stakes assessment is the PSSA (the School District of Philadelphia also administers the TerraNova). The District’s primary means of preventing and detecting cheating on standardized tests is a random internal auditing process: it sends 65 to 70 trained monitors on unannounced visits to classrooms to enforce the stringent regulations governing how the tests are administered.

Those administering the tests may give students encouragement and general instruction and may make allowable accommodations for disabled students and students with Limited English Proficiency. They may not, however, edit any student response, assist any student in a way that would help the student answer an item, leave any student unattended during the test, or leave anything hanging on a classroom wall that might help a student answer an item.

Jack Hoerner, an educational assessment specialist on test security for the Pennsylvania Department of Education (PDE), told the Notebook that there have been "five or six" incidents of test score manipulation in Pennsylvania in the last year, two of which could be considered "severe breaches of test security." According to Hoerner, none of these incidents occurred in Philadelphia.

Over the last two years, according to the School District’s recently departed Chief Accountability Officer Joe Jacovino, the District has conducted "twelve or thirteen" investigations into instances of suspected or possible cheating. Approximately six investigations each year resulted from the random, District-led audits of testing procedures.

The only serious cheating incident the District has identified in the last two years involved the TerraNova exam, and it was caught not through an audit of testing procedures but through a test score analysis.

In 2003, CTB/McGraw-Hill, the company that produces and scores the TerraNova exam, conducted an analysis of score reports that revealed a pattern of suspicious classroom responses. After an investigation and retesting of the classroom, the cheating was confirmed. The teacher involved was "suspended without pay for a period of time," according to Jacovino.

Jacovino said he was not aware of whether Data Recognition Corporation (DRC), which produces and scores the PSSA exam for the state, conducted a formal analysis of student and classroom PSSA responses. While Hoerner indicated that DRC does indeed conduct such analyses, neither DRC nor state officials would provide details.

Jacovino noted that "the state has never approached [the District] as a result of [any such] analysis."

The estimate that 1.5 to 4.5 percent of classrooms in Chicago likely cheated might be "high for us," Jacovino speculated. Despite the apparent conflict of interest in having the District rely so heavily on its own review of an assessment that carries such high stakes, he expressed faith in the District’s auditing process.

"Certainly, you’d like some more independent analysis," he said, "but … our vested interest was to ensure that procedures were being followed and to nip anything that might be troubling in the bud."