City releases Teacher Data Reports — and a slew of caveats

When the Department of Education’s embargo on the Teacher Data Reports lifted at noon today, news organizations across the city rushed to make the data available.

The Teacher Data Reports are “value-added” assessments of teachers’ effectiveness that were produced from 2008 to 2010 for reading and math teachers in grades 3 to 8.

This morning, department officials including Chancellor Dennis Walcott and Chief Academic Officer Shael Polakow-Suransky met with reporters to urge caution about how the reports are used. They emphasized the reports’ wide margins of error — 35 percentage points for math teachers and 53 percentage points for reading teachers, on average — and noted that the reports reflect only a small portion of teachers’ work.

“We would never advise anyone — parent, reporter, principal, teacher — to draw a conclusion based on this score alone,” Polakow-Suransky said.

Most of the news organizations that filed Freedom of Information Law requests for the ratings plan to publish them in searchable or streamlined databases, with the teachers’ names attached. GothamSchools does not plan to publish the data with teachers’ names or identifying characteristics included because of concerns about the data’s reliability.

At least two other news organizations that cover education are also not publishing the data: the local affiliate of Fox News, according to a representative of Fox, and the nonprofit school information website Insideschools.

Department officials are asking schools not to release the reports to parents. They issued a guide today advising principals about how to handle parents who demand that their child be removed from the class of a teacher rated ineffective.

“Resist changing student/teacher assignments mid-year, as doing so is disruptive to all students’ learning,” the guide advises.

“We definitely will not be moving kids around based on a data point that is two years old,” Polakow-Suransky said today.

Joel Klein, who was chancellor when the reports were launched, was a public champion of the idea that releasing teacher ratings would empower parents to demand better schools and teachers for their children. He argued publicly that the ratings of individual teachers should be released after news organizations filed legal requests for them.

But Walcott began distancing himself from that position almost immediately after becoming chancellor last year. And last fall, his administration announced it had stopped producing the reports, citing a new state teacher evaluation system that will include value-added measures. Walcott explained in a Daily News column today that he is not thrilled that teachers “might be denigrated” as a result of the data dump.

Today, Walcott and Polakow-Suransky argued that Klein’s initial stance was justified — at the time. “What we’re looking at in 2012 is quite different from what Joel was looking at in 2010,” Polakow-Suransky said.

Because New York City adopted value-added measures relatively early, he said, the ratings were among the most sophisticated measures available when they were introduced. But even then, the ratings were not meant for public consumption or for use in high-stakes decisions, he said.

Polakow-Suransky acknowledged one exception to that rule: School officials did use value-added ratings to inform decisions about whether to grant some teachers tenure.

We reported last year that principals said they were being prevented from granting tenure to teachers whose students had low test scores, particularly if the teachers were among the roughly 12,000 who received a data report each year.

Polakow-Suransky said today that 133 teachers up for tenure last year were flagged because of low value-added scores. More than a third of the flagged teachers received tenure after their principals and superintendents determined that other factors carried more weight, he said.

The city released reports for about 18,000 teachers over the three school years in which they were produced. About 12,000 teachers received reports in each of the 2008-2009 and 2009-2010 school years, and about half of them received two reports because they taught both reading and math. (A smaller pilot group received reports in 2007-2008.)

Because many teachers’ students were tested in both math and English, some teachers received two different value-added ratings for a single year — in some cases, below average in one subject and above average in the other.

The reports also indicate whether teachers were found to have done especially well at boosting the test scores of particular types of students, such as students with disabilities, low-income students, or English language learners.

Department officials said 77 percent of the teachers who received reports are still working in the city schools.

Next week, ratings will be released for teachers in schools for students with severe disabilities and for charter school teachers. Releasing the charter school teachers’ reports reverses the department’s earlier decision to withhold them on the grounds that charter schools had opted into the Teacher Data Report program voluntarily.

Officials cautioned that the scores were especially unreliable for teachers with relatively few students. That group includes elementary school teachers, whose scores were typically based on fewer than 30 students.
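The statistical intuition here is simple even if the city’s full model is not: the uncertainty of a class-average score shrinks roughly with the square root of the number of students. A minimal simulation sketch, assuming (for illustration only, not as the DOE’s actual model) that student gains are independent draws around a teacher’s true effect:

```python
# Illustrative sketch only -- not the DOE's value-added model.
# Shows why class-average measures are noisier for small classes:
# the standard error of a mean shrinks roughly as 1/sqrt(n).
import numpy as np

rng = np.random.default_rng(0)
true_teacher_effect = 0.0      # hypothetical exactly-average teacher
student_noise_sd = 1.0         # assumed spread of individual student gains

for n_students in (25, 90):    # small elementary class vs. a larger tested load
    # Simulate 10,000 hypothetical school years; record each class-average gain.
    class_means = rng.normal(true_teacher_effect, student_noise_sd,
                             size=(10_000, n_students)).mean(axis=1)
    print(f"n={n_students}: spread of measured class averages = "
          f"{class_means.std():.3f}")
```

Under these assumptions, the measured average for a 25-student class wanders nearly twice as far from the teacher’s true effect as it does with 90 students, which is the intuition behind the officials’ warning.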

They also said the reports are less reliable for teachers whose students were either mostly high-performing or mostly low-performing. The difference is due to the nature of the state tests, which were designed to distinguish among middle-level students — those just above and below the state’s proficiency cutoff. As a result, small differences among students at the top and bottom had an outsized impact on a teacher’s rating, Polakow-Suransky said.
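The mechanism is a standard one in psychometrics. A minimal sketch using a textbook one-parameter (Rasch) item response model, assumed here purely for illustration and not as the state test’s actual design, shows that a test whose questions are pitched near the proficiency cutoff measures students far from the cutoff much less precisely:

```python
# Illustrative sketch only -- a textbook Rasch (one-parameter IRT) model,
# not the state test's actual psychometrics. A test whose items are pitched
# near the proficiency cutoff (difficulty ~ 0) carries little statistical
# information about students far from the cutoff, so their scores are noisier.
import numpy as np

item_difficulties = np.linspace(-0.5, 0.5, 40)   # 40 items bunched near the cutoff

def ability_standard_error(ability):
    # Probability of a correct answer on each item under the Rasch model
    p = 1.0 / (1.0 + np.exp(-(ability - item_difficulties)))
    information = np.sum(p * (1.0 - p))           # Fisher information of the test
    return 1.0 / np.sqrt(information)             # SE of the estimated ability

for ability in (-2.0, 0.0, 2.0):                  # low, middle, high performer
    print(f"ability {ability:+.1f}: standard error of estimate = "
          f"{ability_standard_error(ability):.2f}")
```

In this toy model, the standard error for a very high or very low performer comes out roughly half again as large as for a middle performer, so the same one-question difference shifts an extreme student’s score more, and a classroom full of such students shifts the teacher’s rating more as well.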

The issue was one of several lessons that Polakow-Suransky said the city has flagged for the State Education Department, which is building its own value-added algorithm to be used as part of the new teacher evaluations.

The state is working with the American Institutes for Research to build its model; the city’s model was designed by a research center at the University of Wisconsin.

Other lessons Polakow-Suransky identified include the importance of publishing value-added scores in the context of their margins of error and the need to create a way for teachers and principals to verify data.

Two years into the Teacher Data Report project, the city introduced a verification system that teachers could use. Fewer than 40 percent of teachers signed in to use it, but a significant number of them found major errors. For example, 3 percent of the teachers who signed in discovered that they had been marked as teaching classes they had not actually taught.

In their lawsuit attempting to stop the ratings from being released, United Federation of Teachers officials identified more than 200 such errors. UFT President Michael Mulgrew said today that the errors showed that the city had not been respectful of teachers in developing or releasing the ratings.

“This was a complete debacle in terms of how the DOE has handled [the reports] and the mismanagement of the data inside of the system,” he said.

Asked whether he thought the state’s model would avoid the pitfalls the UFT identified in the city’s reports, Mulgrew said, “I’m not confident.”

But he said the state’s evaluation requirements are an improvement over the Teacher Data Reports because the new evaluations will present value-added calculations alongside other measures of teacher quality.