Just months after taking office, Mayor Bill de Blasio announced a big bet: He would flood schools that were among the city’s worst with extra social services and academic support and give them three years to improve.
Nearly three years — and $386 million — later, city officials say the program is showing academic benefits, citing increases in test scores and graduation rates among its 86 “Renewal” schools.
But those improvements look very different when the schools are compared against others that are most like them, according to a first-of-its-kind analysis. Schools that had similarly low graduation rates and test scores when the program started — but did not receive an infusion of new resources — posted similar growth on those two metrics.
That finding doesn’t mean the Renewal program isn’t making a difference at those schools. But it does illustrate the pitfalls of the mayor’s promise of “fast and intense” improvement — and how far the city still has to go to prove its school turnaround strategy has been worth the investment.
“With an initiative this ambitious, there’s a lot of political pressure for the administration to demand — and claim — quick academic gains,” said Megan Hester, an organizer at the Annenberg Institute for School Reform. But “deep, sustainable institutional change takes time.”
The new findings come from Aaron Pallas, a Teachers College professor who conducted a review of test score and graduation rate data comparing Renewal schools with others that were similarly low-performing when the program started, but did not get the extra resources. (Chalkbeat interviewed several principals at these under-the-radar schools and found that in many cases, they are charting their own approaches to serving some of the city’s highest-need students.)
Pallas matched the city’s 78 current Renewal schools with 90 others with similar graduation rates, test scores and student demographics — a comparable proportion of black and Hispanic students, for instance, and similar percentages of students with disabilities or coming from low-income families. Each Renewal school was matched with three different configurations of comparison schools based on how similar they were along those categories to avoid the possibility that any single match would skew the results.
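For readers curious how this kind of matching can work, the following is a minimal sketch of nearest-neighbor matching, not Pallas's actual model. The school names, the numbers, the choice of three features, and the use of standardized Euclidean distance are all assumptions made for illustration.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical schools: (graduation rate %, % black and Hispanic, % low-income).
# These figures are invented for illustration; they are not from the analysis.
renewal = {
    "Renewal A": (52.0, 95.0, 88.0),
    "Renewal B": (48.0, 90.0, 92.0),
}
comparison = {
    "Comparison 1": (50.0, 94.0, 87.0),
    "Comparison 2": (49.0, 91.0, 93.0),
    "Comparison 3": (55.0, 80.0, 70.0),
    "Comparison 4": (51.0, 96.0, 90.0),
    "Comparison 5": (47.0, 89.0, 91.0),
}


def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 if a column has no variation, to avoid dividing by zero.
    sds = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, sds))
        for row in rows
    ]


def nearest_neighbors(renewal, comparison, k=3):
    """For each Renewal school, list the k most similar comparison schools."""
    names = list(renewal) + list(comparison)
    rows = list(renewal.values()) + list(comparison.values())
    scaled = dict(zip(names, standardize(rows)))
    matches = {}
    for school in renewal:
        ranked = sorted(
            comparison, key=lambda other: dist(scaled[school], scaled[other])
        )
        matches[school] = ranked[:k]
    return matches


matches = nearest_neighbors(renewal, comparison)
for school, neighbors in matches.items():
    print(school, "->", neighbors)
```

In this toy data, "Comparison 3" has noticeably different demographics and is not selected as a match for either Renewal school, which is the point of matching on more than one characteristic.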
(City officials disputed this method of comparing schools because they said it does not account for differences in school performance beyond test scores and graduation rates.)
First, the graduation rates: City officials have often noted that the 31 current Renewal high schools have boosted graduation rates from an average of 52 to an average of 59 percent since the program started more than two years ago.
But over the same time frame, 23 high schools that are demographically similar to Renewal schools and started with slightly lower graduation rates (averaging 50 percent) made bigger gains. When each Renewal high school is matched to its three “nearest neighbors,” the comparison schools improved to graduate 63 percent of their students — nearly six percentage points more growth than the Renewal schools.
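The arithmetic behind that gap can be checked with the rounded averages reported above. This is only a sketch: the analysis presumably used unrounded figures, which would explain why the rounded numbers give exactly six points while the reported gap is "nearly six." The variable names are illustrative.

```python
# Rounded average graduation rates (percent) as reported in this story.
renewal_start, renewal_end = 52, 59
comparison_start, comparison_end = 50, 63

# Growth for each group, in percentage points.
renewal_gain = renewal_end - renewal_start
comparison_gain = comparison_end - comparison_start

# How much more the comparison schools grew than the Renewal schools.
gap = comparison_gain - renewal_gain

print(f"Renewal gain: {renewal_gain} points")
print(f"Comparison gain: {comparison_gain} points")
print(f"Difference in growth: {gap} points")
```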
That number fluctuates slightly depending on which combination of comparison schools is matched with Renewal schools. But the comparison schools outperformed Renewal schools using three different combinations of matches. After eliminating two comparison schools that made unusually large gains, the comparison schools still outperformed Renewal schools, though only by a small, statistically unreliable margin.
Using the same approach for English and math proficiency in grades 3-8, as measured by state tests, Pallas found there were no significant differences between Renewal schools and the comparison group, which started out with similar scores.
Twenty Renewal elementary schools, for instance, increased the percentage of their students proficient in reading from 7 to 15 percent over the two previous school years, an eight-percentage-point jump. But the 23 comparison schools, once matched to the Renewal schools, jumped seven points, almost the same margin.
Elementary school math proficiency among Renewal schools grew to 10 percent, a three-point bump, but the comparison schools jumped two points, winding up at the same level of student proficiency as the Renewal schools. (Citywide, reading proficiency has jumped almost 10 points over the same period, and math proficiency has increased just over 2 points.)
At the middle school level, there was virtually no measurable difference in growth in Renewal school math and reading scores measured against the comparison group.
“Based on the best evidence I see,” Pallas said, “the program is not having a meaningful impact on academic outcomes.”
The education department disputed that conclusion, arguing the findings are “grossly misleading” because the Renewal schools and the comparison schools aren’t comparable to begin with. City officials said all Renewal schools met a series of criteria — historically low performance, being labeled as “focus” or “priority” schools by the state because of their performance, and posting low scores on the school’s quality review — while none of the comparison schools met all three.
Eric Ashton, the education department’s executive director for school performance, noted that Renewal schools were chosen partly based on qualitative judgements of principal or teacher effectiveness — factors that weren’t directly considered in Pallas’s model.
“Renewal schools are the ones we expected to do worse to begin with,” Ashton said. “Just because you have two schools that start with the same graduation rates and demographics, that doesn’t mean they’re the same quality.”
Pallas acknowledged that there are some limitations to his approach. Unlike an experiment where schools are randomly assigned to the Renewal program, he used a statistical model to match schools with similar starting test scores, graduation rates, and student demographics in an attempt to tease out the effect that Renewal is having compared with schools outside the program.
That means differences in school quality that Pallas didn’t measure — principal effectiveness, for instance — could be important variables in explaining why Renewal schools did not improve more than the comparison schools. Still, Pallas argues his model is strong enough to show that the Renewal program has not yet produced convincing academic gains, a conclusion outside researchers said is reasonable.
“This research shows that positive gains by Renewal schools were not better than schools with similar demographics,” said Jonah Rockoff, an education researcher at Columbia University who has studied New York City schools in the past and reviewed Pallas’s data.
Rockoff echoed the city’s argument that Renewal schools were chosen because they were struggling, making comparisons difficult. But, he said, the findings are still “probably an indication that the Renewal program didn’t have dramatic short-run effects.”
Some supporters of the Renewal approach also point out that many of the city’s efforts, including partnerships with social service providers, took months to implement — leaving Renewal schools with just one full year of extra help so far.
And despite de Blasio’s claim that the schools would show quick gains, the city has also been clear that it sees improvements in attendance, student engagement, and school culture as measures of the program’s success. (City officials pointed out that attendance at Renewal schools has ticked up, and that graduation rates have increased faster since the program started.)
“I think the administration made itself very vulnerable by postulating big expectations for academic growth in Renewal schools,” said Norm Fruchter, a researcher at New York University, and a former de Blasio appointee to the Panel for Educational Policy. “The scale of resources for the system to intervene in poverty is a very hard thing to do, and it’s likely you won’t soon see the results you want.”
Clarification: This story has been updated to clarify that the elimination of two comparison schools that made unusually large gains in graduation rates resulted in a difference that was not statistically reliable.