Making the grade

New data show more than half of NYC teachers judged, in part, by test scores they don’t directly affect

PHOTO: Christina Veiga

Just over half of New York City teachers were evaluated in the 2015–16 school year, in part, by tests in subjects or of students they didn’t teach, according to data obtained by Chalkbeat through a public records request.

At 53 percent of city teachers, it’s a significant number, but substantially lower than in previous years, possibly thanks to a moratorium on using state tests that was instituted mid-year.

That figure also highlights a key tension in evaluating all teachers by student achievement, even teachers who work with young students or in subjects like physical education. Being judged by other teachers’ students or subjects has long annoyed some educators and relieved others, who otherwise might have had to administer additional tests.

Supporters say evaluating teachers by group measures — often school-wide scores on standardized tests — helps create a sense of shared mission in a school. But the approach could also push teachers away from working in struggling schools.

“The key point around school-wide measures is that this could serve as a strong disincentive for these teachers in non-tested grades and subjects to stay in lower-performing schools,” said Matthew Steinberg at the University of Pennsylvania, who has studied teacher evaluation systems.

Will Mantell, a spokesperson for the New York City Department of Education, defended the district’s approach.

“Selecting school-wide [or] grade-wide … measures may better measure educators’ practice and support professional development,” he said. “For example, it makes sense for a social studies teacher who emphasizes writing in her classroom to be evaluated partially on an assessment of students’ ELA skills.”

New York’s evaluation system has gone through a number of substantial changes since it was first codified in state law in 2012, part of a nationwide push to connect teacher performance to student test scores, spurred by federal incentives.

Student assessments have made up anywhere from 40 percent of the evaluation to essentially 50 percent, under a matrix system pushed by Governor Andrew Cuomo in 2015. Most recently, New York stopped using grades 3-8 English and math state tests as part of the system, but teachers must continue to be judged based on some assessment.

States across the country have struggled to evaluate teachers in traditionally non-tested grades and subjects. New York City has created a number of exams — known as performance assessments — in non-tested areas and given schools significant flexibility in which measures are used to judge their teachers.

In the 2015-16 school year, 53 percent of teachers were evaluated by a group metric, meaning one not focused on their subject or students. In the two previous years, the number was much higher — around 85 percent. It’s not clear why there was a substantial drop, but a spokesperson for the city’s education department notes that 2015-16 was an “outlier” due to the moratorium on state tests, instituted mid-year.

In all three years, most teachers were also evaluated by at least one individualized measure targeted to teachers’ grade, subject and students.

Data for the most recent school year are not yet available.

It’s also not clear what percentage of a teacher’s rating was based on group measures, and Mantell said this “varies from teacher to teacher.”

The United Federation of Teachers has pushed to give schools more individual options, including the use of more “authentic” assessments, not based on multiple choice questions.

“Right now, we don’t have enough options, which is why our most recent agreement with the DOE seeks to build more authentic assessments for additional grades and subjects,” said Michael Mulgrew, president of the UFT, in a statement.

Group measures offer an alternative to creating exams for each teacher in every grade and subject, which can lead to a proliferation of new tests, though in New York City teachers have often been judged by both group and individual metrics.

The challenge of evaluating teachers in traditionally untested areas is not unique to New York, and a number of states have embraced group or school-wide approaches. An analysis of 32 states, conducted by Steinberg, found that the average teacher in a non-tested grade or subject had about 7 percent of his or her evaluation based on school-wide achievement measures, though this averaged together substantial variation from place to place. Teachers in Tennessee and Florida have sued (unsuccessfully), arguing that it is unfair to evaluate them based on students they didn’t teach.

A more popular option, used in some districts in New York, has been student-learning objectives, in which teachers set goals for students often based on classroom exams. This approach has been praised for helping teachers set specific goals, but criticized as burdensome and easy to manipulate.

Research has found that using school-wide measures of performance tends to bring teachers closer to average performance. An analysis by the Brookings Institution showed that these group measures pulled down ratings of teachers with higher individual ratings at low-performing schools.

year two

Tennessee high schoolers post higher test scores, but some subjects remain a struggle

PHOTO: Marta W. Aldrich
Tennessee Education Commissioner Candice McQueen presents 2017 high school test scores to the State Board of Education.

High school students in Tennessee saw their state test scores rise in 2017, the second year that a new test aligned to the Common Core standards was given in the state.

The increases were modest on average, but sharp for some of the students who have historically struggled most. Just one in five poor students scored at the lowest level on the ninth-grade English exam, for example, compared with one in three last year.

But in most courses, especially in math, students continued to fall far below the state’s expectations. Even as the state estimates that 11,000 more students met the English proficiency bar this year, two-thirds of students still fell below it. And in two advanced math courses, scores actually declined slightly.

The upward trajectory across most subjects puts Tennessee in line with other states that have seen their scores plummet in the first year of new exams, but then rise incrementally afterwards as students and teachers adjust to tougher standards.

Education Commissioner Candice McQueen touted the results Thursday during a brief presentation to the State Board of Education in Nashville.

“These results are encouraging because they show that we’re on the right track,” McQueen said. “As we have moved our standards forward, our teachers and students are meeting those expectations.”

She singled out improvements with historically underserved groups, particularly students with disabilities, and a reduction in the percentage of students performing at the lowest achievement level.

“This positive movement is showing we are taking seriously the work we’re doing with all of our student groups,” McQueen said.

High schoolers scored best on their science exams, which was expected since Tennessee has not yet switched to more rigorous science standards. Those standards will reach classrooms in the fall of 2018.

The statewide scores are the first batch to be released. District- and school-level high school scores come next in August, while results for students in grades 3-8 are due out this fall. Grades 3-8 took TNReady for the first time last school year after their 2016 exams were scuttled amid technical failures.

previewing TNReady

Why Tennessee’s high school test scores, out this week, matter more — and less — than usual

PHOTO: Nic Garcia

When scores dropped last year for most Tennessee high school students under a new state test, leaders spoke of “setting a new baseline” under a harder assessment aligned to more rigorous standards.

This week, Tennesseans will see if last year’s scores, in which nearly three-quarters of high schoolers performed below grade level, were in fact just a reset moment.

Education Commissioner Candice McQueen has scheduled a press conference for Thursday morning to release the highly anticipated second year of high school scores under TNReady, which replaced the state’s TCAP tests in 2015-16. (Students in grades 3-8 will get TNReady scores for the first time this fall; last year, their tests were canceled because of a series of testing failures.)

Here’s what you need to know about this week’s data dump, which will focus on statewide scores.

1. Last year’s low scores weren’t a big surprise.

Not only was it the first time Tennessee students took TNReady, it also was the first time that they were being tested on new academic standards in math and language arts known as the Common Core, which reached Tennessee classrooms in 2012.

Other states that switched to Common Core-aligned exams also saw their scores plummet. In New York, for example, the proportion of students who scored proficient or higher in reading dropped precipitously in 2013 during the first year of a new test for grades 3-8.

McQueen sought last year to prepare Tennessee for the same experience. After all, she said, the state was moving away from a multiple-choice test to one that challenges students’ higher-order thinking skills. Plus, while Tennessee students had been posting strong scores on the state’s own exam, they had struggled on national tests such as the ACT, raising questions about whether the previous state test was a good measure of students’ skills.

“We expected scores to be lower in the first year of a more rigorous assessment,” McQueen said after only 21 percent of high school students scored on or above grade level in math, while 30 percent tested ready in English and reading.

2. It’s expected that this year’s scores will rise … and it will be a bad sign if they don’t.

Over and over, state officials assured Tennesseans that 2016 was just the start.

“[We] expect that scores will rebound over time as all students grow to meet these higher expectations — just as we have seen in the past,” McQueen said.

She was referring to the state’s shift to Diploma Standards in 2009, when passing rates on end-of-course tests dropped by almost half. But in subsequent years, those scores rose steadily in a “sawtooth pattern” that has been documented over and over when states adopt new assessments and students and teachers grow accustomed to them.

That includes New York, where after the worrisome results in 2013, the percentage of students passing started inching up the following year, especially in math.

In Tennessee, this year’s high school scores will provide the first significant data point in establishing whether the state is on the same track. Higher scores would put the state on an upward trajectory, and suggest that students are increasingly proficient in the skills that the test is measuring. Scores that remain flat or go down would raise questions about whether teachers and students are adjusting to more rigorous standards.

3. There are lots more scores to come.

This week’s statewide high school scores will kick off a cascade of other TNReady results that will be released in the weeks and months ahead.

Next come district- and school-level high school scores, which will be shared first with school systems before being released to the public. That’s likely to happen in August.

In the fall, Tennessee will release its scores for students in grades 3-8, who took TNReady for the first time this year after the 2016 testing debacle. While testing went better this year, the state’s new testing company needed extra time to score the exams, because additional work goes into setting “cut scores” each time a new test is given.

A group of educators just concluded the process of reviewing the test data to recommend what scores should fall into the state’s four new categories for measuring performance: below grade level, approaching grade level, on grade level, or mastered. The State Board of Education will review and vote on those recommendations next month.

4. This year’s scores are lower stakes than usual, but that probably won’t last.

For years, Tennessee has been a leader in using test scores to judge students, teachers, and schools. Like most states, it uses the data to determine which schools are so low-performing that they should be closed or otherwise overhauled. It also crunches scores through a complicated “value-added” algorithm designed to assess how much teachers contribute to their students’ learning, an approach that it has mostly stuck with even as value-added measures have fallen out of favor across the nation. And unusually, the state exam scores are also supposed to factor into final student grades, this year counting for 10 percent.

But the rocky road to the new tests has temporarily diminished how much the scores count. Because preliminary scores arrived late this spring, most districts opted to grade students on the basis of their schoolwork alone.

And because of the testing transition, the scores won’t be given as much weight in this year’s teacher evaluations — an adjustment that lawmakers made to alleviate anxiety about the changes. Test scores will contribute only 10 percent to teachers’ ratings. Depending on the subject, that proportion is supposed to rise to between 15 and 25 percent by 2018-19.