Why NYC teacher evaluations don’t yield meaningful feedback

I love reflecting with teachers, but I hate the evaluation process.


It’s my favorite moment during a debriefing with teachers. I ask a probing question; they pause, look off to the side, say, “That’s a good question,” and trail off. In these moments, the teacher is thinking and, in the process, they are becoming a better teacher.

As we head into the end of the school year here in New York City, one of the tasks we administrators face is holding end-of-year conferences with each of our teachers. During these meetings, we reflect on their goals for the year and sign the Measures of Teacher Practice, or MoTP, evaluation summary. I love reflecting with teachers, but I hate the evaluation process.


One of the major educational trends in the past decade has been the revamping of teacher evaluations. In New York City, this has meant adopting Charlotte Danielson’s rubric in 2014 and using a four-point scale to evaluate teachers as Ineffective, Developing, Effective, or Highly Effective.

New York City’s education department spends a lot of time, effort, and money supporting teacher evaluation. Our superintendent’s team makes periodic visits to our school, and we instructional supervisors spend a half-day visiting classes together, “norming” our evaluations.

The premise behind this and similar systems is that if we get teacher evaluation right, then teachers will be pushed to develop their practice.

I have come to the conclusion that this premise is wrong. In fact, teacher evaluation makes teacher development more difficult.

Writing in Chalkbeat, Kim Marshall, a leader in teacher evaluation, offered eight criticisms of the current evaluation system back when New York City first adopted it. My experience bears out his concerns.

Evaluation hinders teacher growth. Everyone wants a good grade, and so during an evaluative observation, a teacher goes into defensive mode. They emphasize the positive and minimize the negative, explaining away problems: “That student has been absent for days,” or “You didn’t see the end of the lesson,” or “Those students need basic skills.” The goal is to get a good grade. Improvement be damned.

And then there is the growth stance. When I ask what a teacher thought of the lesson after a non-evaluative observation, they are more likely to be honest about weaknesses and look for ways to improve. When I offer a critical observation, they will often ask what ideas I have to address it. Inviting me to share ideas increases the chance that they will implement them. Sometimes a teacher will ask me to observe something new that they are trying. Taking risks is more likely to happen when evaluation is not in the picture. This is how teachers improve.

One way people grow is by getting feedback, but evaluations yield bad feedback. Arguably, it is not even feedback at all. Grant Wiggins, in his excellent 2012 article “Seven Keys to Effective Feedback,” explains that feedback “is information about how we are doing in our efforts to reach a goal.” If you tell a joke, for example, the feedback is whether people laugh. The best feedback is descriptive, not judgmental. Evaluating teachers with the Danielson rubric attempts to be descriptive, but it is the grade that sticks.

As an instructional supervisor, I do as many non-evaluative observations as possible. I simply pop into classes, invited or not, and then have “coaching” debriefs soon after.

My technique comes mainly from David Rock’s excellent 2007 book “Quiet Leadership,” which argues that the best and most efficient way to improve someone’s performance in any field is to facilitate their own thinking, not to tell them what to do. The reason lies in how the brain works: a person understands an idea only when they form a new synapse in their brain, and no one can form it for them.


And so, when debriefing, I mostly ask questions, looking for an “entry-point” to get the teacher to recognize a good thing that can be expanded or a problem that needs a solution. “What do you think you did well? What would you have done differently? Where do you notice gaps in student answers?”

And I can tell I have hit on something meaningful when the teacher says, “Good question,” and stares off to the side. In these moments, they are generating an idea that they can implement.

Evaluation erodes trust. Yet I have been able to earn the trust of the teachers I supervise, and our end-of-year feedback survey bears this out, by downplaying evaluation in favor of non-rated observation cycles.

I also invite my teachers to observe my classes and give me feedback, a kind of reverse observation. This lets them see in action some of the practices I talk about.

And so, at our end-of-year conferences, teachers will sign their Measures of Teacher Practice summary ratings sheet. The number on that sheet is only a blunt estimate of the quality of a teacher’s practice. But we will also, luckily, have a reflective conversation.

In his Chalkbeat piece from 2014, Kim Marshall argues for a teacher-evaluation system with one evaluative grade at the end of the year, a rating decided collaboratively between teacher and administrator, based on at least 10 unrated observations. Such a system would better support a teacher’s practice and growth.

Why not change the evaluation system to one that actually supports teacher development? That’s a good question.

Jeremy Kaplan is an assistant principal of supervision at High School for Health Professions and Human Services in New York City. He has been a teacher, instructional coach, and assistant principal in New York City since 1994.