After years of test-score inflation, the State Education Department is fighting to retain its credibility now that scores have increased slightly this year.
Amid growing scrutiny, state officials took the unusual step on Monday of posting a memo to their website that explains why they lowered the number of correct answers needed to pass some of this year’s state reading and math tests. Officials characterized the adjustments as routine and necessary to maintain a consistent level of difficulty over time.
“We’ve been doing [it] for decades and never talked about it in our press releases,” said Deputy State Education Commissioner Ken Wagner in an interview, explaining why the shifts were not disclosed when test scores were released.
But many are still skeptical, in large part because the state has allowed big fluctuations in test difficulty in the past—and those changes were often politically advantageous.
This year, test developers adjusted the number of points that students needed to earn in order to reach a level 3 or 4, which denotes academic proficiency on the English and math tests, on 10 of the 12 exams given in 2014. The raw scores were lowered on six tests because they were determined to be slightly more difficult than the tests given in 2013, when the state introduced new tests. Raw scores were raised on four tests because they were found to be too easy, and stayed the same for the remaining two.
Officials said the changes were made to ensure that a student would, in theory, get the same score this year as he or she would have on previous versions of the exam.
Consistency hasn’t been a hallmark of New York state tests, though. The new tests aligned to the Common Core standards sent scores plummeting last year for the second time since 2010. Before that, a three-year boom under then-Commissioner Richard Mills saw city proficiency rates exceed 80 percent in math and near 70 percent in reading. By 2009, the tests were so easy that students could guess on the multiple-choice section and still hit the proficiency bar.
“The inflation seen from 2006 to 2009 damaged the credibility of the testing system quite a bit,” Teachers College Professor Aaron Pallas said.
There are a variety of explanations for the inflation during the Mills period. One reason, Pallas said, was that the tests covered a narrow spectrum of content that allowed schools to more easily prepare students; others have implied it had something more to do with Mills’ personal desire for score increases.
Wagner blamed faulty data from practice tests that the state relied on to design its tests. Students didn’t take the tests seriously, Wagner said, but the results were still used to determine the level of difficulty for the real thing.
“Kids were getting a lot of questions wrong, which made the questions look harder than they really were,” Wagner said of the practice tests.
Now, scores are creeping back up. In 2014, city math scores improved nearly five percentage points, while English scores rose two points, according to the state.
Critics who remember the testing bubble from last decade have not been so ready to believe the increases are real. Class Size Matters’ Leonie Haimson wrote on her blog that recent history “should teach us to be open to the possibility” that the scores are being manipulated.
Wagner dismissed the criticism as unwarranted and coming from people unwilling to fully understand the topic.
“Some people appreciate when something is complicated, and are willing to listen,” Wagner said, “and other people aren’t.”
For his part, Pallas—a critic of many of the state’s testing policies—said he believes the state’s scoring adjustments this time around were “credible” and said the new tests were harder to prepare for than the pre-Common Core versions. But, he said, the state needs to continue its efforts to improve transparency.
“I don’t think they’ve done a very good job of it in the past,” Pallas said.
The state has also come under fire for its tallying of city students who did not take the tests. The city insists that just under 2,000 students “opted out” of taking the tests, while the state’s number is more than 10 times that figure. Representatives for both the city and state said that they would have more complete details on Tuesday.
Correction: A previous version of this article misstated the number of tests whose raw scores were raised. Raw scores were raised on four tests, not two.