Testing

Why the IPS superintendent isn’t worried that test scores are down

PHOTO: Meghan Mangrum

Indianapolis Public Schools Superintendent Lewis Ferebee has this message for IPS parents who are worried about test scores: Don’t trust the numbers.

The ISTEP scores released last week show the percentage of IPS students passing the state exam declined for a second straight year, but Ferebee said he has little faith in the scores.

That’s because testing in Indiana has gone through so much turmoil in recent years that he says the scores cannot be compared from one year to the next.

“It’s not fuji apple to fuji apple,” he said. “It is very unfair to try to make that comparison.”

Though scores fell across the state, IPS saw more significant declines than the average Indiana district. However, dozens of other districts, including Washington and Warren townships, saw more precipitous declines.

The low scores for IPS were made harder to swallow by the fact that even some of the district’s top schools saw drops in scores. The dip ignited a firestorm of criticism online, in part because the scores were released the same day the IPS board awarded Ferebee a $26,999 bonus.

The bonus was not connected to test scores and was based on previously agreed-upon criteria.

Ferebee is not alone in his criticism of ISTEP. After years of testing turmoil as the state changed standards and switched test makers, many school leaders, policymakers and observers have concluded the ISTEP is simply broken.

But Ferebee was hired three years ago on a promise to turn things around in the struggling district. Test scores are one of the clearest ways to determine whether changes he’s made, such as partnering with charter schools and recruiting new leaders, are bearing fruit.

With those scores on the decline, it’s no surprise that Ferebee would take issue with them. But he argued that the fact that schools across the state saw a second year of declines shows the new test is out of sync with typical exams.

“When you see a drop in the second year, that tells you that something is wrong,” he said. “If it was unique to one or two school districts, that’s understandable. It’s just a few districts being impacted. But it’s a statewide issue.”

Although passing rates dropped in the district, unreleased, early data the district provided about its A-F score suggests that students are making improvements on the test.

Ferebee argues those growth scores are a more useful measure of student learning.

“I’ve said since the onset as we looked at our accountability model that a year’s growth annually is the expectation for all students,” he said. “Given that philosophy and that lens, I believe growth data is a better indicator of school success.”

Indiana has been going through testing turmoil throughout Ferebee’s tenure as superintendent. During his first year leading the district, when the state used an older version of ISTEP, scores increased slightly and the number of schools receiving Fs on the state accountability scale was cut by a third. Those are improvements district leaders have touted, and Ferebee said he has more faith in that earlier version of the test because it was more consistent.

When he arrived, Indiana had adopted the Common Core State Standards and was preparing to switch to the PARCC exam, a national test aligned with the standards. But the state legislature pulled out of the Common Core State Standards in 2014, and the state education department rushed to develop a replacement test.

That new, harder test was plagued by problems and student scores plummeted. But eventually lawmakers concluded the results would be good enough to set a baseline for future years. Although the state switched to a new vendor in 2016, the tests were designed to be comparable and there were few widespread issues with administration or scoring.

Still, the new version of ISTEP has proven so unpopular that lawmakers are aiming to replace it with yet another new test.

Ferebee said that as the state looks for a new exam, he hopes lawmakers find an option that can be used to give teachers feedback on what students know, rather than the current system, in which teachers don’t learn how their students are doing until long after it’s too late to help them. He also said that if the test is used to measure teacher effectiveness, the focus should be on student growth.

“If a teacher has a classroom of students that were academically advanced, and you are basing it on proficiency, that teacher already has a leg up,” he said. “If you are basing it on growth, that can give more insight.”

research report

Three years in, some signs of (slight) academic growth at struggling ‘Renewal’ schools

PHOTO: Patrick Wall
Mayor Bill de Blasio at Brooklyn Generation School — part of the Renewal program

When Mayor Bill de Blasio launched an aggressive and expensive campaign to turn around the city’s lowest performing schools, he made a big promise: Schools would see “fast and intense” improvements within three years.

Almost exactly three years later, and after flooding 78 schools with more than $386 million in new social services and academic support, there are signs that the Renewal program has generated gains in student learning. The evidence is based on two newly updated analyses of test score data: one from Marcus Winters, a fellow at the conservative-leaning Manhattan Institute, and the other from Aaron Pallas, a professor at Teachers College.

But the researchers caution that those improvements are modest — when they exist at all — and don’t yet match the mayor’s lofty promises.

The results may have implications far beyond New York City, as a national and political test case of whether injecting struggling schools with resources is more effective than closing them.

The two researchers previously reviewed the first two years of test score data in elementary and middle schools in the Renewal program: Winters found a positive effect on test scores, while Pallas generally found little to no effect.

Now, as the program reaches its third birthday, the pair of researchers have updated their findings with new test score data from last school year, and largely reaffirmed their earlier conclusions.

“We’re not seeing large increases” in student achievement, Pallas said. “And the reality is it’s hard to get large increases in struggling schools.”

Some advocates have argued that it is too early to expect big shifts in test scores, and that infusing schools with extra social services like mental health counseling and vision screenings is valuable in itself. But de Blasio’s promise of quick academic turnaround has invited questions about Renewal’s effectiveness and whether resources can be more effective in improving low-performing schools than shuttering them.

To assess the program’s academic effect, Pallas compared changes in test scores at Renewal schools with changes at other schools that had similar test results and student demographics when the program started but did not receive extra support.

The biggest gains Pallas found were concentrated at the elementary level.

Over the past three school years, 20 elementary schools in the Renewal program have made larger gains on average in math and reading than 23 similar schools that didn’t get extra resources. The proportion of elementary school students considered proficient in reading at Renewal schools increased from 7 percent in 2014 to 18 percent last year — an 11-point jump. Meanwhile, the comparison schools also saw gains, but only by seven percentage points, giving Renewal schools a four percentage point advantage.

At the middle school level, the results are less encouraging. The 45 Renewal middle schools did not collectively outperform a group of 50 similar schools outside the program in reading or math.

In math, for instance, Renewal school students improved from 5 percent proficient to 7 percent. However, the comparison schools outside the program improved by roughly the same margin, increasing proficiency from 6 to 9 percent (still far below the city average). In reading, Renewal middle schools showed slightly less growth than the comparison group.
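The comparisons above boil down to simple arithmetic: subtract the comparison group's gain from the Renewal group's gain. The sketch below is only an illustration of that calculation using the rounded figures quoted in this article. It is not Pallas' actual model, and the function and variable names are made up for the example.

```python
def gain(before: float, after: float) -> float:
    """Change in the share of proficient students, in percentage points."""
    return after - before


def advantage(renewal_gain: float, comparison_gain: float) -> float:
    """How many more points Renewal schools gained than comparison schools."""
    return renewal_gain - comparison_gain


# Elementary reading: Renewal schools went from 7 to 18 percent proficient.
# The article reports only the comparison group's gain (7 points),
# not its starting and ending rates.
elem_reading_advantage = advantage(gain(7, 18), 7)

# Middle school math: Renewal 5 -> 7 percent, comparison 6 -> 9 percent.
# These inputs are rounded, so the result is approximate.
middle_math_advantage = advantage(gain(5, 7), gain(6, 9))

print(f"Elementary reading advantage: {elem_reading_advantage} points")
print(f"Middle school math advantage: {middle_math_advantage} points")
```

A positive result means Renewal schools gained more than the comparison group. A result near zero or below, as in middle school math, means they did not.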

City officials have argued that Pallas’ findings are misleading partly because Renewal schools and the comparison schools are not actually comparable. Renewal schools, they say, were designated based on a range of factors like school climate or teacher effectiveness, not just student demographics and test scores.

“The schools included in the study are neither similar nor comparable in quality and a comparison of the two dissimilar groups is unreliable at best,” Michael Aciman, an education department spokesman, said in a statement. Aciman added that Renewal schools have made larger gains in reading and math than similar schools across the state, and have made progress in reducing chronic absenteeism and improving instruction.

Pallas notes that there are some limitations to his approach, and acknowledges that he could not account for some differences between the two groups, such as the quality of a school’s principal. He also does not use student-level data, which would allow a more fine-grained analysis of whether the Renewal program is boosting student achievement. But Pallas, and other researchers who have previously reviewed his data, have said his model is rigorous.

The Manhattan Institute’s Winters found more positive trends than Pallas, consistent with his earlier findings. Using an approach that evaluates whether Renewal schools are outperforming historical trends compared with schools outside the program, Winters found that the Renewal program appeared to have a statistically significant effect on both reading and math scores — roughly equivalent to the difference in student achievement between charter schools and traditional district schools in New York City.

Asked about how to interpret the fact that his results tended to be more positive, Winters said either interpretation is plausible.

“It’s hard to tell which of these is exactly right,” he said. But “neither of us are finding results that are consistent with what we would expect if the program is having a large positive effect.”

explainer

Five things to know about the latest brouhaha over Tennessee’s TNReady test

PHOTO: Laura Faith Kebede

Last week’s revelation that nearly 10,000 Tennessee high school tests were scored incorrectly has unleashed a new round of criticism of the standardized test known as TNReady.

Testing company Questar says it muffed some tests this spring after failing to update its scanning software. A year earlier, a series of mistakes got its predecessor, Measurement Inc., fired when Tennessee had to cancel most of TNReady in its first year after a failed transition to online testing.

While the two companies’ glitches are hardly comparable in scope, Questar’s flub has uncorked a tempest of frustration and anger over the standardized assessment and how it’s used to hold teachers accountable.

Here are five things to know about the latest TNReady flap:

1. A relatively small number of students, teachers, and schools are affected.

State officials report that the scoring problem affected only high school tests, not those taken by grade-schoolers. Of the 600,000 high school end-of-course tests, about 9,400 were scored incorrectly. Most of the fixes were so small that fewer than 1,700 tests, or less than three-tenths of 1 percent of the high school tests, saw any change in their overall performance level. A state spokeswoman says the corrected scores have been shared with the 33 impacted districts.

2. But the TNReady brand has taken another huge hit.

Tennessee has sought to rebuild public trust in TNReady under Questar and celebrated a relatively uneventful testing season last spring. But the parade of problems that surfaced during TNReady’s rollout, combined with this year’s drops in student performance under the new test, has made subsequent bumps feel more like sinkholes to educators who already are frustrated with the state’s emphasis on testing. Questar’s scanning problems were also tied to delays in delivering preliminary scores to school systems this spring, another bump that exasperated educators and parents at the end of the school year and led many districts to exclude the data from student report cards.

3. State lawmakers will revisit TNReady — and soon.

House Speaker Beth Harwell asked Monday for a hearing into the latest testing problems, and discussion could happen as early as next week when a legislative study committee is scheduled to meet in Nashville. Meanwhile, one Republican gubernatorial candidate says the state should eliminate student growth scores from teacher evaluations, and a teachers union in Memphis called on Tennessee to invalidate this year’s TNReady results.

4. Still, those talks are unlikely to derail TNReady.

Tennessee is heavily invested in its new assessment as part of its five-year strategic plan for raising student achievement. Changing course now would be a surprise. Last school year was the first time that all students in grades 3-11 took TNReady, a standardized test aligned to the Common Core standards, even though those expectations for what students should learn in math and English language arts have been in Tennessee classrooms since 2012. State officials view TNReady results as key to helping Tennessee reach its goal of ranking in the top half of states on the Nation’s Report Card by 2019.

5. Tennessee isn’t alone in traveling a bumpy testing road.

Questar was criticized this summer for its design of two tests in Missouri. Meanwhile, testing giant Pearson has logged errors and missteps in New York, Virginia, and Mississippi. And in Tennessee and Ohio this spring, the ACT testing company administered the wrong college entrance exam to almost 3,000 juniors from 31 schools. Officials with the Tennessee Department of Education emphasized this week that they expect 100 percent accuracy on scoring TNReady. “We hold our vendor and ourselves to the highest standard of delivery because that is what students, teachers, and families in Tennessee deserve,” said spokeswoman Sara Gast.