high stakes

There’s always been confusion surrounding Tennessee’s growth model. With a missing year of data, new questions pile on

PHOTO: Laura Faith Kebede

At a time when scores are about to be used for high-stakes decisions in how to improve Tennessee’s schools, gaps in the state’s data and uncertainty about how scores were derived have left Memphis officials wondering how to interpret the torrent of information.

Last year’s chaotic state testing, which led to the cancellation of the state’s test for grades 3 to 8, left a crucial gap in the data meant to help make decisions about schools and teachers.

School leaders have also said they were puzzled by the state’s methodology in reaching the so-called growth scores upon which districts and schools are judged — particularly by how they arrived at the Memphis district’s low score.

Even those who are paid to sift through the data say they are having trouble getting answers to questions about the growth scores, known as TVAAS. Bill White, chief of planning and accountability for Shelby County Schools, conceded to board members last week that he didn’t know the ins and outs of the complex formula and the changes meant to compensate for the missing data.

“I have personally never been shown all the mathematics behind our data and how this works,” he told board members. “I do know that it has been peer-reviewed and vetted and it’s essentially been held up among those statisticians. But there is a lot that goes on behind the scenes that no one has been able to walk us through.”

The confusion has renewed skepticism about the state’s value-added model, which is supposed to help officials identify the impact that schools and teachers have on student performance. The system relies on the state’s data measuring student growth in districts.

Part of the problem is last year’s botched testing, which is having multiple ripple effects throughout the state.

This year, growth scores are comparing 2016-17 test results with the 2014-15 school year, the most recent data available. That throws a wrench in how to assess which school or teacher is responsible for a child’s growth over a two-year period. And for elementary schools, that means there is no data for fourth graders this year since testing in third grade, the first year students take state tests, was canceled.

In addition, one subject was dropped entirely from TVAAS calculations because social studies questions were a trial run for elementary and middle school students and did not count.

Statisticians for the most part have figured out how to calculate growth even when a state transitions to a new test. But the missing data creates a separate set of challenges that the revisions attempt to account for.

One Memphis charter leader said he still isn’t quite sure how his school even got a score since last year his highest grade level at the school was third grade, the first year of testing.

“It’s such a convoluted formula, it’s hard for us to understand. We’re not sure how we got (our score),” said the charter leader, who declined to be named because he was still seeking answers from the state.

Damian Betebenner, a senior associate at the Center for Assessment who regularly consults with state departments, said missing data on top of a testing transition “muddies the water” on results.

“When you look at growth over two years, so how much the student grew from third to fifth grade, then it’s probably going to be a meaningful quantity,” he said. “But to then assert that it isolates the school contribution becomes a pretty tenuous assertion… It adds another thing that’s changing underneath the scene.”

At the same time, TVAAS scores for struggling schools will be a significant factor in determining which improvement tracks they will be placed on under the state’s new accountability system, as outlined in its plan to comply with the federal Every Student Succeeds Act. For some schools, their TVAAS score will be the difference between continuing under a local intervention model and being eligible to enter the state-run Achievement School District. The school growth scores will also determine which charter schools are eligible for a new pot of state money for facilities.

The state has data analysts based across Tennessee to help districts with their questions and provide data simulations for the complex formula that has been replicated in other states.

“Of course, the reason it is complex is because we want it to be fair for educators and therefore capture as much data and nuance as possible – which is discussed at length in the technical documentation,” said a state department spokeswoman.

The state has also published an overview video of how the formula works and details on the recent changes in a 46-page, formula-packed document from SAS, the private company that calculates teacher and school scores for the state.

But as far as knowing how the state gets from A to Z, White said he still has questions.

“I’ve had some questions about getting access to certain data myself,” said White, who routinely interprets data for the district. “We would like a lot more access to what goes into TVAAS.” (He later declined to elaborate.)

He’s not the only one. When the Tennessee Education Association unsuccessfully sued Knox County Schools over its use of TVAAS in awarding teacher bonuses, access to data on how the scores were calculated was central to the association’s argument that the district denied teachers due process, said Rick Colbert, TEA’s general counsel.

When Colbert attempted to subpoena technical documents on the calculations, SAS blocked it, in part because the request would divulge “trade secrets.”

“When they’re called upon to defend it you get a lot of general statements but you can’t get a lot of information to see if you can back that up,” Colbert said. “There’s so much about TVAAS that can’t be explained.”

Board member Mike Kernell called it a double standard and asked White last week if the district could request a demonstration of the complicated formula.

“I think the state department of education ought to show its work if they’re asking children to show their work,” he said.

research report

Three years in, some signs of (slight) academic growth at struggling ‘Renewal’ schools

PHOTO: Patrick Wall
Mayor Bill de Blasio at Brooklyn Generation School — part of the Renewal program

When Mayor Bill de Blasio launched an aggressive and expensive campaign to turn around the city’s lowest performing schools, he made a big promise: Schools would see “fast and intense” improvements within three years.

Almost exactly three years later, and after flooding 78 schools with more than $386 million in new social services and academic support, there are signs that the Renewal program has generated gains in student learning. The evidence is based on two newly updated analyses of test score data: one from Marcus Winters, a fellow at the conservative-leaning Manhattan Institute, and the other from Aaron Pallas, a professor at Teachers College.

But the researchers caution that those improvements are modest — when they exist at all — and don’t yet match the mayor’s lofty promises.

The results may have implications far beyond New York City, as a national and political test case of whether injecting struggling schools with resources is more effective than closing them.

The two researchers previously reviewed the first two years of test score data in elementary and middle schools in the Renewal program: Winters found a positive effect on test scores, while Pallas generally found little to no effect.

Now, as the program reaches its third birthday, the pair of researchers have updated their findings with new test score data from last school year, and largely reaffirmed their earlier conclusions.

“We’re not seeing large increases” in student achievement, Pallas said. “And the reality is it’s hard to get large increases in struggling schools.”

Some advocates have argued that it is too early to expect big shifts in test scores, and that infusing schools with extra social services like mental health counseling and vision screenings is valuable in itself. But de Blasio’s promise of quick academic turnaround has invited questions about Renewal’s effectiveness and whether resources can be more effective in improving low-performing schools than shuttering them.

To assess the program’s academic effect, Pallas compared changes in Renewal school test scores to other schools that had similar test results and student demographics when the program started, but did not receive extra support.

The biggest gains Pallas found were concentrated at the elementary level.

Over the past three school years, 20 elementary schools in the Renewal program have made larger gains on average in math and reading than 23 similar schools that didn’t get extra resources. The proportion of elementary school students considered proficient in reading at Renewal schools increased from 7 percent in 2014 to 18 percent last year — an 11-point jump. Meanwhile, the comparison schools also saw gains, but only by seven percentage points, giving Renewal schools a four percentage point advantage.

At the middle school level, the results are less encouraging. The 45 Renewal middle schools did not collectively outperform a group of 50 similar schools outside the program in reading or math.

In math, for instance, Renewal school students improved from 5 percent proficient to 7 percent. However, the comparison schools outside the program improved by roughly the same margin — increasing proficiency from 6 to 9 percent (and still far below city average). In reading, Renewal middle schools showed slightly less growth than the comparison group.
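The comparison behind these figures is simple arithmetic: subtract the comparison group's gain from the Renewal group's gain. A minimal sketch follows, using only the rounded percentages reported above; the function names are illustrative and this is not Pallas' actual model, which matches schools on test results and demographics.

```python
def gain(before, after):
    """Change in percent proficient, in percentage points."""
    return after - before


def relative_gain(program_gain, comparison_gain):
    """How many points the program schools gained beyond the comparison schools."""
    return program_gain - comparison_gain


# Elementary reading: Renewal went from 7 to 18 percent proficient;
# the comparison schools' gain is reported as seven points.
elementary_reading = relative_gain(gain(7, 18), 7)

# Middle school math: Renewal went from 5 to 7 percent;
# comparison schools went from 6 to 9 percent (rounded figures).
middle_math = relative_gain(gain(5, 7), gain(6, 9))

print(elementary_reading)  # 4
print(middle_math)         # -1
```

Because the published percentages are rounded, a one-point gap at the middle school level is consistent with the description of the two groups improving by roughly the same margin.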

City officials have argued that Pallas’ findings are misleading partly because Renewal schools and the comparison schools are not actually comparable. Renewal schools, they say, were designated based on a range of factors like school climate or teacher effectiveness, not just student demographics and test scores.

“The schools included in the study are neither similar nor comparable in quality and a comparison of the two dissimilar groups is unreliable at best,” Michael Aciman, an education department spokesman, said in a statement. Aciman added that Renewal schools have made larger gains in reading and math than similar schools across the state, and have made progress in reducing chronic absenteeism and improving instruction.

Pallas notes that there are some limitations to his approach, and acknowledges that he could not account for some differences between the two groups, such as the quality of a school’s principal. He also does not use student-level data, for instance, which would allow a more fine-grained analysis of whether the Renewal program is boosting student achievement. But Pallas, and other researchers who have previously reviewed his data, have said his model is rigorous.

The Manhattan Institute’s Winters found more positive trends than Pallas, consistent with his earlier findings. Using an approach that evaluates whether Renewal schools are outperforming historical trends compared with schools outside the program, Winters found that the Renewal program appeared to have a statistically significant effect on both reading and math scores — roughly equivalent to the difference in student achievement between charter schools and traditional district schools in New York City.

Asked about how to interpret the fact that his results tended to be more positive, Winters said either interpretation is plausible.

“It’s hard to tell which of these is exactly right,” he said. But “neither of us are finding results that are consistent with what we would expect if the program is having a large positive effect.”

explainer

Five things to know about the latest brouhaha over Tennessee’s TNReady test

PHOTO: Laura Faith Kebede

Last week’s revelation that nearly 10,000 Tennessee high school tests were scored incorrectly has unleashed a new round of criticism of the standardized test known as TNReady.

Testing company Questar says it muffed some tests this spring after failing to update its scanning software. A year earlier, a series of mistakes got its predecessor, Measurement Inc., fired when Tennessee had to cancel most of TNReady in its first year after a failed transition to online testing.

While the two companies’ glitches are hardly comparable in scope, Questar’s flub has uncorked a tempest of frustration and anger over the standardized assessment and how it’s used to hold teachers accountable.

Here are five things to know about the latest TNReady flap:

1. A relatively small number of students, teachers, and schools are affected.

State officials report that the scoring problem affected only high school tests, not those taken by grade-schoolers. Of the 600,000 high school end-of-course tests, about 9,400 were scored incorrectly. Most of the fixes were so small that fewer than 1,700 tests, or less than three-tenths of 1 percent, saw any change in their overall performance level. A state spokeswoman says the corrected scores have been shared with the 33 impacted districts.

2. But the TNReady brand has taken another huge hit.

Tennessee has sought to rebuild public trust in TNReady under Questar and celebrated a relatively uneventful testing season last spring. But the parade of problems that surfaced during TNReady’s rollout, combined with this year’s drops in student performance under the new test, has made subsequent bumps feel more like sinkholes to educators who already are frustrated with the state’s emphasis on testing. Questar’s scanning problems were also tied to delays in delivering preliminary scores to school systems this spring, another bump that exasperated educators and parents at the end of the school year and led many districts to exclude the data from student report cards.

3. State lawmakers will revisit TNReady — and soon.

House Speaker Beth Harwell asked Monday for a hearing into the latest testing problems, and discussion could happen as early as next week when a legislative study committee is scheduled to meet in Nashville. Meanwhile, one Republican gubernatorial candidate says the state should eliminate student growth scores from teacher evaluations, and a teachers union in Memphis called on Tennessee to invalidate this year’s TNReady results.

4. Still, those talks are unlikely to derail TNReady.

Tennessee is heavily invested in its new assessment as part of its five-year strategic plan for raising student achievement. Changing course now would be a surprise. Last school year was the first time that all students in grades 3-11 took TNReady, a standardized test aligned to the Common Core standards, even though those expectations for what students should learn in math and English language arts have been in Tennessee classrooms since 2012. State officials view TNReady results as key to helping Tennessee reach its goal of ranking in the top half of states on the Nation’s Report Card by 2019.

5. Tennessee isn’t alone in traveling a bumpy testing road.

Questar was criticized this summer for its design of two tests in Missouri. Meanwhile, testing giant Pearson has logged errors and missteps in New York, Virginia, and Mississippi. And in Tennessee and Ohio this spring, the ACT testing company administered the wrong college entrance exam to almost 3,000 juniors from 31 schools. Officials with the Tennessee Department of Education emphasized this week that they expect 100 percent accuracy on scoring TNReady. “We hold our vendor and ourselves to the highest standard of delivery because that is what students, teachers, and families in Tennessee deserve,” said spokeswoman Sara Gast.