data-driven decisionmaking

Why we won't publish individual teachers' value-added scores

Tomorrow’s planned release of 12,000 New York City teacher ratings raises questions for the courts, parents, principals, bureaucrats, teachers — and one other party: news organizations. The journalists who requested the release of the data in the first place now must decide what to do with it all.

At GothamSchools, we joined other reporters in requesting to see the Teacher Data Reports back in 2010. But you will not see the database here, tomorrow or ever, as long as it is attached to individual teachers’ names.

The fact is that we feel a strong responsibility to report on the quality of the work the 80,000 New York City public school teachers do every day. This is a core part of our job and our mission.

But before we publish any piece of information, we always have to ask a question. Does the information we have do a fair job of describing the subject we want to write about? If it doesn’t, is there any additional information — context, anecdotes, quantitative data — that we can provide to paint a fuller picture?

In the case of the Teacher Data Reports, “value-added” assessments of teachers’ effectiveness that were produced in 2009 and 2010 for reading and math teachers in grades 3 to 8, the answer to both those questions was no.

We determined that the data were flawed, that the public might easily be misled by the ratings, and that no amount of context could justify attaching teachers’ names to the statistics. When the city released the reports, we decided, we would write about them, and maybe even release Excel files with names wiped out. But we would not enable our readers to generate lists of the city’s “best” and “worst” teachers or to search for individual teachers at all.

It’s true that the ratings the city is releasing might turn out to be powerful measures of a teacher’s success at helping students learn. The problem lies in that word: might.

Value-added measures do, by many readings, appear to do the job that no measure of a teacher’s quality has done before: They estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability. That is, a teacher judged to be more effective one year by value-added is likely to continue to be judged effective the next year, and the year after that.

But this is not true for every teacher — hardly. Many teachers will be mislabeled; no one disputes this. Value-added scores may be more reliable than existing alternatives, but they are still far from perfectly reliable. It’s completely possible, for instance, that a teacher judged as less effective one year will be judged as very effective the next, and vice versa.

As we reported two years ago, when the NYU economist Sean Corcoran looked at New York City’s value-added data, he found that 31 percent of English teachers who ranked in the bottom quintile of teachers in 2007 had jumped to one of the top two quintiles by 2008. About 23 percent of math teachers made the same jump.

The fluctuation is acknowledged by even the strongest supporters of using value-added measures to evaluate teachers. One of the creators of the city’s original value-added model, the Columbia economist Jonah Rockoff, compares value-added scores to baseball players’ batting averages. One of his reasons: In each case, the year-to-year fluctuations of an individual’s score are about the same.

“If someone hit, you know, .280 last year, that doesn’t guarantee they’re going to hit .280 next year,” Rockoff said today. “However, if you hit .210 last year and I hit .300, there’s a very high likelihood I’m going to hit more than you next year, too. Whereas if you hit .280 and I hit .278, we’re basically the same.”

Another challenge is that many researchers still aren’t convinced that value-added scores are measuring the right sort of teacher impact. The challenge lies in the flaws of the measures on which value-added scores depend — standardized state test scores.

Tests are supposed to measure what a student has learned about a subject, but they can also reflect other things, like how well her teacher prepared her for the test, or how well she mastered the narrow band of the subject the test assessed.

The test-prep concern is magnified by findings that a single teacher can generate two different value-added scores if evaluators use two different student tests to determine them. The Gates Foundation’s Measures of Effective Teaching study calculated value-added scores for teachers based on both state tests and more conceptual tests. They found substantial differences between the two, according to an analysis by the economist Jesse Rothstein of the University of California at Berkeley.

“If it’s right that some teachers are good at raising the state test scores and other teachers are good at raising other test scores, then we have to decide which tests we care about,” Rothstein said today. “If we’re not sure that this is the test that captures what good teaching is, then we might be getting our estimates of teaching quality very wrong.”

Flags about exactly what high value-added ratings reward are also raised by studies that ask how the ratings match up with measures of what teachers actually say and do in the classroom. Heather Hill, a professor at Harvard’s Graduate School of Education, rated math teachers’ teaching quality based on an observation rubric called the Mathematical Quality of Instruction, which looks at factors like whether the teacher made mathematical errors and the quality of her explanations. Then Hill compared the math teaching rating to value-added measures.

Two individual cases stood out: One teacher had made a slew of math errors in her teaching, and the other had failed to connect a class activity to math concepts. But both teachers’ value-added scores put them at the top of their cohort.

There is some reason to think that value-added measures reflect more than test prep. Rockoff points out that while different tests can produce different value-added scores for the same teacher, the two measures are still correlated. Using different tests, he said, is akin to looking at slugging percentage rather than batting average. “I’m sure those two things are positively correlated, but probably not one for one,” he said.

More persuasively, a recent study by Rockoff and two colleagues concluded that value-added measures can actually predict long-term life outcomes, including higher cumulative lifetime income, a reduced chance of teen pregnancy, and living in a high-quality neighborhood as an adult. The study examined a very large, unnamed urban school district that bears several similarities to New York City.

That study targeted another concern about value-added measures: that teachers score consistently well year after year not because of something they are doing, but because they consistently teach students with certain advantages.

Rothstein has used value-added models to conclude that fifth-grade teachers have strong effects on their students’ performance in third grade, something they could not possibly influence unless value-added scores reflect not just teachers’ influence but also advantages brought by students.

Rockoff and his colleagues tested that possibility. If high value-added teachers do well because they get the “better” students of those in their grade, then their students’ high test score growth would be accompanied by mediocre growth in the grade’s other classrooms. That would mean that, when researchers looked at growth for the entire grade, the “better” students’ growth would be canceled out by that of their less lucky peers. But the scores were not canceled out, suggesting that effective teachers did more than just have unusually good students.

None of this means that we won’t write about what the data dump includes or that we might not publish an adapted database that strips out information linking the city’s data to individual teachers. With more than 90 columns in the Excel sheet the city has developed — and more than 17,000 rows, representing the number of reports issued over their two-year lifespan — the release might well enable us to examine the city’s value-added experiment in new ways.

Value-added measures certainly aren’t going away. City officials stopped producing Teacher Data Reports only because they knew the State Education Department was preparing its own. The state’s measures, which are expected to come out in 2013, will make up 25 percent of the evaluation for teachers of math and English in tested grades.

Rise & Shine

While you were waking up, the U.S. Senate took a big step toward confirming Betsy DeVos as education secretary

Betsy DeVos’s confirmation as education secretary is all but assured after an unusual and contentious early-morning vote by the U.S. Senate.

The Senate convened at 6:30 a.m. Friday to “invoke cloture” on DeVos’s embattled nomination, a move meant to end a debate that has grown unusually pitched both within the lawmaking body and in the wider public.

They voted 52-48 to advance her nomination, teeing up a final confirmation vote by the end of the day Monday.

Two Republican senators who said earlier this week that they would not vote to confirm DeVos joined their colleagues in voting to allow a final vote on Monday. In announcing their decisions, Susan Collins of Maine and Lisa Murkowski of Alaska cited DeVos’s lack of experience in public education and the knowledge gaps she displayed during her confirmation hearing last month. Each said feedback from constituents had informed her decision.

Americans across the country have been flooding their senators with phone calls, faxes, and in-person visits to share opposition to DeVos, a Michigan philanthropist who has been a leading advocate for school vouchers but who has never worked in public education.

They are likely to keep up the pressure over the weekend and through the final vote, which could be decided by a tie-breaking vote by Vice President Mike Pence.

Two senators commented on the debate after the vote. Republican Lamar Alexander of Tennessee, who has been a leading cheerleader for DeVos, said he “couldn’t understand” criticism of programs that let families choose their schools.

But Democrat Patty Murray of Washington repeated the many critiques of DeVos that she has heard from constituents. She also said she was “extremely disappointed” in the confirmation process, including the early-morning debate-ending vote.

“Right from the start it was very clear that Republicans intended to jam this nomination through … Corners were cut, precedents were ignored, debate was cut off, and reasonable requests and questions were blocked,” she said. “I’ve never seen anything like it.”

Week In Review

Week In Review: A new board takes on ‘awesome responsibility’ as Detroit school lawsuits advance

PHOTO: Erin Einhorn
The new Detroit school board took the oath and took on the 'awesome responsibility' of Detroit's children

It’s been a busy week for local education news, with a settlement in one Detroit schools lawsuit, a combative new filing in another, a push by a lawmaker to overhaul school closings, a new ranking of state high schools, and the swearing in of the first empowered school board in Detroit since 2009.

“And with that, you are imbued with the awesome responsibility of the children of the city of Detroit.”

—    Judge Cynthia Diane Stephens, after administering the oath to the seven new members of the new Detroit school board

Read on for details on these stories plus the latest on the sparring over Education Secretary nominee Betsy DeVos. Here are the headlines:

 

The board

The first meeting of the new Detroit school board had a celebratory air to it, with little of the raucous heckling that was common during school meetings in the emergency manager era. The board, which put in “significant time and effort” preparing to take office, is focused on building trust with Detroiters. But the meeting was not without controversy.

One of the board’s first acts was to settle a lawsuit that was filed by teachers last year over the conditions of school buildings. The settlement calls for the creation of a five-person board that will oversee school repairs.

The lawyers behind another Detroit schools lawsuit, meanwhile, filed a motion in federal court blasting Gov. Rick Snyder for evading responsibility for the condition of Detroit schools. That suit alleges that deplorable conditions in Detroit schools have compromised children’s constitutional right to literacy — a notion Snyder has rejected.

 

In Lansing

On DeVos

In other news