A skeptic says class sizes and students’ gender also might affect teacher ratings

‘Why do the girls get all the good teachers?’

That’s the question that Rutgers University education professor Bruce Baker asked himself after he took a different look at the teacher ratings that New York State released last week. When he crunched the data, he found that teachers in elementary schools with more boys than girls were about 15 percent more likely to receive a lower rating on the portion of their evaluation based on state test scores.

“Is it that all the ‘bad’ teachers are teaching in schools with higher percentages of boys?” Baker wrote on his blog School Finance 101. “Or is there something about a class with a larger share of boys that makes it harder to generate comparable gains on fill in the bubble, standardized tests?”

Baker, an outspoken critic of teacher evaluation policies based on student test scores, said he found similar patterns in schools with larger class sizes and in districts that receive the least funding. In elementary schools where the average class size was larger by five students, for instance, teachers were about 20 percent more likely to receive one of the two lowest ratings, “developing” or “ineffective.”

To make his argument, Baker took school-level teacher evaluation data and matched it to student enrollment data for the same schools. He acknowledged weaknesses in his methodology, citing outliers he couldn’t immediately explain. Middle school teachers’ ratings did not show the same gender disparity, for instance, and New York City teachers outperformed those in both affluent and poor districts outside the city.

Still, he said his critique challenged the state’s portrayal of its teacher quality data as a fair depiction of an individual teacher’s performance. The critique comes amid a new debate over how to design the state’s teacher evaluation system, which lawmakers overhauled earlier this year and whose finer details will be determined in the coming weeks.

“Let’s not just assume what a teacher’s effect is on student achievement growth,” he said. “Let’s ask, what are the other factors that might be explaining the difference in the student achievement growth?”

The state’s data release last week came with the conclusion that poor, black and Hispanic students in New York City schools, and in schools across the state, were more likely to have lower-rated teachers. That conclusion came from comparing growth scores of about 40,000 English and math teachers in fourth through eighth grade, using an “enhanced growth model” that’s supposed to measure how much students learn under an individual teacher and control for a variety of outside factors.

But critics like Baker say that even though some student characteristics are accounted for, teachers are still penalized for circumstances that are out of their control and unrelated to their performance. Baker said that would have been evident had the state looked into those factors, such as class-size differences, funding disparities or a classroom’s share of boys, who typically have more behavioral challenges and perform worse on state tests.

The State Education Department declined to comment on Baker’s critique or say whether it had looked at how class sizes or a class’s gender composition could have affected a teacher’s score. Officials said last week that the data was a reliable indicator of teacher performance and could be used to help ensure top teachers are equitably distributed across a district’s entire school system.

Tim Daly, president of TNTP, said the questions Baker was asking were “definitely the type of disaggregation and analysis we need to be doing.” But he said a more comprehensive examination would need to compare ratings based on the class sizes of individual teachers rather than a school’s average.

How the state’s evaluation system is designed is once again up for public debate. Gov. Andrew Cuomo and lawmakers passed legislation under which a teacher’s evaluation next year will be based on two factors, student growth and observations, although the department and Board of Regents will have some discretion to determine the finer details of the evaluation plans. That process is currently underway and will need to be finished within the next eight weeks.