a high-stakes evaluation

The Gates Foundation bet big on teacher evaluation. The report it commissioned explains how those efforts fell short.

PHOTO: Brandon Dill/The Commercial Appeal
Sixth-grade teacher James Johnson leads his students in a gameshow-style lesson on energy at Chickasaw Middle School in 2014 in Shelby County. The district was one of three that received a grant from the Gates Foundation to overhaul teacher evaluation.

Barack Obama’s 2012 State of the Union address reflected the heady moment in education. “We know a good teacher can increase the lifetime income of a classroom by over $250,000,” he said. “A great teacher can offer an escape from poverty to the child who dreams beyond his circumstance.”

Bad teachers were the problem; good teachers were the solution. It was a simplified binary, but the idea and the research it drew on had spurred policy changes across the country, including a spate of laws establishing new evaluation systems designed to reward top teachers and help weed out low performers.

Behind that effort was the Bill and Melinda Gates Foundation, which backed research and advocacy that ultimately shaped these changes.

It also funded the efforts themselves, specifically in several large school districts and charter networks open to changing how teachers were hired, trained, evaluated, and paid. Now, new research commissioned by the Gates Foundation finds scant evidence that those changes accomplished what they were meant to: improve teacher quality or boost student learning.  

The 500-plus page report by the Rand Corporation, released Thursday, details the political and technical challenges of putting complex new systems in place and the steep cost — $575 million — of doing so.

The post-mortem will likely serve as validation for the foundation’s critics, who have long complained about Gates’ heavy influence on education policy and what they call its top-down approach.

The report also comes as the foundation has shifted its priorities away from teacher evaluation and toward other issues, including improving curriculum.

“We have taken these lessons to heart, and they are reflected in the work that we’re doing moving forward,” the Gates Foundation’s Allan Golston said in a statement.

The initiative did not lead to clear gains in student learning.

Results were spotty at the three districts and four California-based charter school networks that took part in the Gates initiative: Pittsburgh; Shelby County (Memphis), Tennessee; Hillsborough County, Florida; and the Alliance-College Ready, Aspire, Green Dot, and Partnerships to Uplift Communities networks. The trends over time didn’t look much better than those at similar schools in the same state.

Several years into the initiative, there was evidence that it was helping high school reading in Pittsburgh and at the charter networks, but hurting elementary and middle school math in Memphis and among the charters. In most cases there were no clear effects, good or bad. There was also no consistent pattern of results over time.

A complicating factor here is that the comparison schools may also have been changing their teacher evaluations, as the study spanned from 2010 to 2015, when many states passed laws putting in place tougher evaluations and weakening tenure.

There were also lots of other changes going on in the districts and states — like the adoption of Common Core standards, changes in state tests, the expansion of school choice — making it hard to isolate cause and effect. Studies in Chicago, Cincinnati, and Washington D.C. have found that evaluation changes had more positive effects.

Matt Kraft, a professor at Brown who has extensively studied teacher evaluation efforts, said the disappointing results in the latest research couldn’t simply be chalked up to a messy rollout.

These “districts were very well poised to have high-quality implementation,” he said. “That speaks to the actual package of reforms being limited in its potential.”

Principals were generally positive about the changes, but teachers had more complicated views.

From Pittsburgh to Tampa, Florida, the vast majority of principals agreed at least somewhat that “in the long run, students will benefit from the teacher-evaluation system.”

Source: RAND Corporation

Teachers in district schools were far less confident.

When the initiative started, a majority of teachers in all three districts tended to agree with the sentiment. But several years later, support had dipped substantially. This may have reflected dissatisfaction with the previous system — the researchers note that “many veteran [Pittsburgh] teachers we interviewed reported that their principals had never observed them” — and growing disillusionment with the new one.

Majorities of teachers in all locations reported that they had received useful feedback from their classroom observations and changed their habits as a result.

At the same time, teachers in the three districts were highly skeptical that the evaluation system was fair — or that it made sense to attach high-stakes consequences to the results.

The initiative didn’t help ensure that poor students of color had more access to effective teachers.

Part of the impetus for evaluation reform was the idea, backed by some research, that black and Hispanic students from low-income families were more likely to have lower-quality teachers.  

But the initiative didn’t seem to make a difference. In Hillsborough County, inequity expanded. (Surprisingly, before the changes began, the study found that low-income kids of color actually had similar or slightly more effective teachers than other students in Pittsburgh, Hillsborough County, and Shelby County.)

Districts put in place modest bonuses to get top teachers to switch schools, but the evaluation system itself may have been a deterrent.

“Central-office staff in [Hillsborough County] reported that teachers were reluctant to transfer to high-need schools despite the cash incentive and extra support because they believed that obtaining a good VAM score would be difficult at a high-need school,” the report says.

Evaluation was costly — both in terms of time and money.

The total direct cost of all aspects of the program, across several years in the three districts and four charter networks, was $575 million.

That amounts to between 1.5 and 6.5 percent of district or network budgets, or a few hundred dollars per student per year. Over a third of that money came from the Gates Foundation.

The study also quantifies the strain of the new evaluations on school leaders’ and teachers’ time as costing upwards of $200 per student, nearly doubling the price tag in some districts.

Teachers tended to get high marks on the evaluation system.

Before the new evaluation systems were put in place, the vast majority of teachers got high ratings. That hasn’t changed much, according to this study, which is consistent with national research.

In Pittsburgh, in the initial two years, when evaluations had low stakes, a substantial number of teachers got low marks. That drew objections from the union.

“According to central-office staff, the district adjusted the proposed performance ranges (i.e., lowered the ranges so fewer teachers would be at risk of receiving a low rating) at least once during the negotiations to accommodate union concerns,” the report says.

Morgaen Donaldson, a professor at the University of Connecticut, said the initial buy-in followed by pushback isn’t surprising, pointing to her own research in New Haven.

To some, aspects of the initiative “might be worth endorsing at an abstract level,” she said. “But then when the rubber hit the road … people started to resist.”

More effective teachers weren’t more likely to stay teaching, but less effective teachers were more likely to leave.

The basic theory of action behind evaluation changes is to get more effective teachers into the classroom and keep them there, while getting less effective ones out or helping them improve.

The Gates-commissioned research found that the new initiatives didn’t get top teachers to stick around any longer. But there was some evidence that the changes made lower-rated teachers more likely to leave. In the places where data was available, fewer than 1 percent of teachers were formally dismissed.

After the grants ran out, districts scrapped some of the changes but kept a few others.

One key test of success for any foundation initiative is whether it is politically and financially sustainable after the external funds run out. Here, the results are mixed.

Both Pittsburgh and Hillsborough have ended high-profile aspects of their programs: the merit pay system and the use of peer evaluators, respectively.

But other aspects of the initiative have been maintained, according to the study, including the use of classroom observation rubrics, evaluations that use multiple metrics, and certain career-ladder opportunities.

Donaldson said she was surprised that the peer evaluators didn’t go over well in Hillsborough. Teachers unions have long promoted peer-based evaluation, but district officials said that a few evaluators who were rude or hostile soured many teachers on the concept.

“It just underscores that any reform relies on people — no matter how well it’s structured, no matter how well it’s designed,” she said.

Correction: A previous version of this story stated that about half of the money for the initiative came from the Gates Foundation; in fact, the foundation’s share was 37 percent or about a third of the total.

resentment and hurt

‘We are all educators:’ How the teachers strike opened a rift at one Denver middle school network that will take time to close

PHOTO: Melanie Asmar/Chalkbeat
Students at Kepner Beacon Middle School work on an assignment.

For the first time since this week’s Denver teacher strike exposed divisions in their ranks, the 100 grownups who make the Beacon middle school network run gathered in the same room.

Teachers, some still wearing red for the union cause, came with breakfast burritos to share. Upbeat soul music pumped through the speakers, an attempt to set a positive tone.  

Speaking to the group assembled Friday for a long-scheduled planning day in the cafeteria of Grant Beacon Middle School, Alex Magaña acknowledged the awkwardness and hurt feelings that have taken a toll on a school community that prides itself on a strong culture.  

The network’s two schools — Grant Beacon in east Denver and Kepner Beacon in southwest Denver — aim to provide a high-quality education to some of the city’s neediest students. A day after most teachers returned to work after the three-day strike, Denver students had a day off Friday, giving school leaders the opportunity to begin repairing any damage done.

“It’s never been administration-versus-teachers, district-versus-teachers, in the culture we have created here,” said Magaña, executive principal of the two schools. “We have a lot of good leadership, a lot of input from teachers. But this caught everyone kind of surprise.”

By “this,” Magaña means the tension that developed on the two campuses during the strike over teacher pay that put Denver in an unfamiliar national glare. The 93,000-student district is better known for its unique brand of at times controversial education reform — of which the Beacon network is part — than labor strife and division in the educator ranks.

Against the backdrop of the strike, Magaña realized words matter. Everyone in the building, he thought, not just teachers, ought to be considered educators and referred to as such. That was the role everyone was thrust into — administrators, deans, and district central office staff who through no choice of their own had to cover for absent teachers. Magaña, too. He taught math.

When teachers, administrators, and staff arrived for Friday morning’s meeting, they congregated at tables with colored pencils and “reflection forms.” Everyone was asked to write down answers to two questions: What did you learn about yourself? What did you learn about your colleagues?

“I also brought out the obvious — the elephant in the room,” Magaña said. “There are hurt feelings. There is resentment from teachers to staff to students to parents. That is something we can’t pretend isn’t there, and we put it out there and acknowledge it to move forward.”

Go to the vast majority of public schools in this country and classrooms look largely the same. Not so in Denver Public Schools, which is deep into its second decade of offering a menu of choices at traditional district-run, charter, and hybrid “innovation” schools.

From this approach sprung Grant Beacon Middle School, which opened on the east side of Denver in 2011. The school seeks to build students’ character and promote personalized learning — essentially, using data and technology to tailor instruction to individual students.

Grant Beacon is an innovation school, meaning it doesn’t need to follow all aspects of state law or the teachers union contract.

Using one of its more controversial school improvement strategies, the Denver district began phasing out struggling Kepner Middle School in 2014 and moved to put two schools in the same building: a new Beacon school and an outpost of the STRIVE charter network.  

The Denver district allows charter schools to use extra space in its school buildings essentially at cost, creating shared campuses with district-run schools. It’s an arrangement that would be unfathomable in most U.S. cities where districts and charter schools are in perpetual conflict.

Both schools on the shared campus were “green,” the second-highest ranking, on the district’s most recent school ratings report last fall.

The teacher strike, however, exposed the stark differences between the two Beacon campuses.

Both schools serve a high proportion of low-income students. At Grant Beacon, 80 percent of students qualify for subsidized lunch, a measure of poverty — slightly above the district average. But things are far more challenging at Kepner, where 96 percent of students fit that definition. The school is a refuge where students can be fed and be safe from trauma.

The differences in student attendance and teacher strike participation at the two schools were stark. About half of Grant Beacon students showed up for school during the strike, and six in 10 teachers joined the strike. Four miles and a world away at Kepner Beacon, 90 percent of students showed up for school — and all but a few teachers were out on strike.

At Kepner Beacon, the network’s “all-for-one, one-for-all” culture of togetherness helped unite its relatively young corps of teachers in a shared resolve to go on strike.

That and high student attendance meant Kepner Beacon faced far greater challenges to keep operating, perhaps as much as any of the city’s 147 district-run schools during the strike.

Linsey Cobb had an emotionally wrenching weekend ahead of the strike’s start. She was torn. A special education teacher and the special education team leader at Kepner Beacon, she stood with teachers fighting for a system they believed would pay them a better, fairer wage.

But the third-year teacher decided to report to work as usual Monday morning, feeling too strong of a pull to fulfill her responsibilities supporting the neediest students — those with individualized lesson plans, the complex and sometimes confounding binding documents for students with special needs.

Cobb was not fully prepared for what she experienced that morning.

“Even though I am very close with my students, I felt incredibly isolated,” she said. “I got the weirdest feeling. I got a lot of, ‘Miss, why aren’t you striking? Don’t you believe what teachers are fighting for?’ I was like, ‘I do!’ I had a little bit of an internal struggle.”

Cobb’s Monday ended early enough for her to attend the big teachers union rally at the Capitol. She said she was touched by the camaraderie. She caught up with old friends from her days with the Denver Teachers Residency, an important training ground of the city’s teaching corps.

Taking all of that into consideration, Cobb joined her colleagues picketing the next day. Teachers shared donuts and coffee. Parents brought them hand-warmers in the 20-degree chill.

One teacher sat in her car with the engine running recording a video message to her students, telling them where she was and spelling out the day’s lesson plan before she joined everyone else on the picket line.

Though the district spent $136,000 to prepare makeshift lesson plans for the strike, Beacon teachers prepared their own and uploaded them to the network’s cloud-based system.

On Friday, Cobb was back with all of her colleagues — striking teachers, those who never left the classroom, and staff and administrators who experienced the life of a teacher for three days.

“It’s about trust,” Magaña said. “Some of it was cracked a little bit. There was no contention in the room (Friday). It was really coming in with openness and willingness by everyone to say, ‘It’s done, and we did the right thing for ourselves. Now it’s time to come closer together.’”

“Normalcy will happen,” added Cobb, the special education teacher. “But it might take a bit.”

bonus

Aurora school district numbers show some positive results from hard-to-staff bonus

Students work on algebra problems in a college-level course at Hinkley High School in Aurora.

When the Aurora school district offered some teachers and service providers a bonus for accepting or returning to hard-to-staff positions, the district saw less turnover in those jobs and had more of them filled by the start of the school year.

But the results weren’t consistent across schools, and there were differences in how teachers and other support staff responded to the bonus. Some schools still saw big increases in turnover. And the district still couldn’t fill all positions by the start of the school year.

In a report that district staff will present to the Aurora school board Tuesday, survey responses show the bonus was most influential for new special service providers, such as nurses, occupational therapists, or speech language pathologists. But only 33 percent of new teachers coming into the district said the bonus made an impact on their decision.

Aurora administrators refused to talk about the findings ahead of the board meeting. When the district first announced the bonuses, Superintendent Rico Munn said he had hoped the pilot bonus system would help the district attract more candidates, fill more vacancies, and retain more employees. The union objected to the bonuses. The union and the district begin negotiations next month on how to spend $10 million that voters approved to raise teacher pay.

An arbitrator ruled that the district should have negotiated the terms of the bonuses with the union first, but the school board refused to uphold the finding. District officials had indicated that the results of the pilot incentives would play a role in what changes they propose going forward, and it’s not clear where the school board, a majority of whom were elected with union support, will come down.

On a state and national level, incentives for teachers are being questioned after Denver teachers went on strike, in part over a disagreement about how effective incentives can be and whether that money is better spent on base pay. Ultimately, the tentative agreement that ended the strike on Thursday maintained a number of bonuses, including $2,000 for educators in hard-to-staff positions.

In the Aurora pilot program, the district offered a bonus for special education, secondary math and secondary science teachers at 20 targeted schools. If staff in those positions committed to returning to their job for this year, they could get $3,000. If they returned, but did not give an early commitment, the bonus would be $2,500.

The same rules applied for other positions such as psychologists, nurses, occupational therapists, and speech pathologists, but those employees were eligible at all district schools. New employees in those positions could get $2,500.

To pay for the bonuses, the district had set aside $1.8 million from an unexpected increase in revenue due in part to rising property values. The district only ended up spending about $1.1 million.

Among 229 eligible teachers, 133 returned to their jobs, committing early, and another 29 returned without making an early commitment, meaning about 70 percent of teachers were retained and received the bonus.

Of the 20 schools at which teachers of math, science, and special education received incentives, turnover went down at 13 schools, up at another five, and stayed the same at two.

Among 184 staff members in the other hard-to-staff positions districtwide, 141 returned to their jobs, or 77 percent, all of them committing early and receiving the higher bonus.
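The retention percentages above come from simple division. As a minimal sketch in Python, using only the counts reported here (the function and variable names are illustrative and do not come from the district's report), the figures can be checked like this:

```python
def retention_rate(returned: int, eligible: int) -> float:
    """Return the share of eligible staff who returned, as a percentage."""
    return 100 * returned / eligible

# Counts as reported in the article; variable names are hypothetical.
teachers_eligible = 229
teachers_returned = 133 + 29  # early commitment + no early commitment

other_eligible = 184
other_returned = 141  # all committed early

teacher_rate = retention_rate(teachers_returned, teachers_eligible)
other_rate = retention_rate(other_returned, other_eligible)

print(f"Teachers at the 20 targeted schools: {teacher_rate:.1f}%")  # 70.7%
print(f"Other hard-to-staff positions: {other_rate:.1f}%")          # 76.6%
```

Both results round to the "about 70 percent" and "77 percent" cited above.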

The report doesn’t compare those numbers with previous years’.

Ramie Randles, a math teacher, was at Aurora West Collegiate Prep last year and received the bonus. But, she says, she had already decided to return to the same job this school year even before she learned about the bonus.

“To be honest with you it’s nice to get a little extra, but it’s a very small amount that’s not going to sway me one way or another,” Randles said.

In the second quarter of the school year, she left her job at Aurora West and is now teaching math at North Middle School.

The bonus is offered at both schools, but it wasn’t a factor, she said.

“I just feel like I want to feel valued in a job,” Randles said. “If I’m feeling like I’m happy that affects not just me, it affects my students. It affects my coworkers.”

According to the district, 98.26 percent of those who received a bonus remain in the same position as of this week.

Fill rates, which represent how many of the district’s positions are filled by the start of the school year, show an increase, although often small, among all positions except for school psychologists.

Fill rates over time: Did Aurora have more positions filled at the start of this school year than in the past?

| Position | 16-17 | 17-18 | 18-19 |
| --- | --- | --- | --- |
| Secondary math teachers at 20 schools | 91.5% | 92.6% | 93.4% |
| Secondary science teachers at 20 schools | 93.5% | 93.8% | 94.8% |
| Special education teachers at 20 schools | 92.6% | 89.4% | 90.24% |
| Nurses, district-wide | 87.3% | 94.6% | 98% |
| Occupational therapists, district-wide | 95.4% | 80% | 96.1% |
| Psychologists, district-wide | 94.4% | 96% | 95.4% |
| Speech language pathologists, district-wide | 75% | 81.4% | 85.4% |
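One way to read the table is to compare this year's fill rate with last year's for each position. The short Python sketch below does that with the figures listed above. It assumes "increase" means a change from 17-18 to 18-19 only, and its names are illustrative rather than taken from the district's report.

```python
# Fill rates in percent, as listed in the table: (16-17, 17-18, 18-19).
fill_rates = {
    "Secondary math teachers": (91.5, 92.6, 93.4),
    "Secondary science teachers": (93.5, 93.8, 94.8),
    "Special education teachers": (92.6, 89.4, 90.24),
    "Nurses": (87.3, 94.6, 98.0),
    "Occupational therapists": (95.4, 80.0, 96.1),
    "Psychologists": (94.4, 96.0, 95.4),
    "Speech language pathologists": (75.0, 81.4, 85.4),
}

def change_from_prior_year(rates: tuple) -> float:
    """Percentage-point change from 17-18 to 18-19."""
    return round(rates[2] - rates[1], 2)

changes = {pos: change_from_prior_year(r) for pos, r in fill_rates.items()}
declined = [pos for pos, delta in changes.items() if delta < 0]

for pos, delta in changes.items():
    print(f"{pos}: {delta:+.2f} percentage points")
print("Declined from prior year:", declined)
```

Under that reading, psychologists are the only position whose fill rate fell from the prior year. The special education rate rose from last year but remains below its 16-17 level.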

Another goal of the pilot was to help the district save money by decreasing the use of contract agencies to fill important positions.

The report found that compared with last year, fewer positions were filled through contract agencies.

The Aurora district “was one of the few districts in the metro area that did not provide some form of differentiated pay or incentive for hard-to-fill subject areas,” according to the district. As examples, the report cites Cherry Creek, Denver, and Douglas school districts.

Bruce Wilcox, president of Aurora’s teachers union, said the union has “no interest in pay like Denver does.”

He is against the bonus because he disagrees with setting up different pay for people doing the same jobs in different schools, and because he doubts it will have a long-term effect.

“For some, maybe money was enough to lure them in, but will it be enough to lure them in over a period of time?” Wilcox asked. “Money’s nice and every teacher needs it, let’s be honest, but is it enough to make you continue to work if the leadership and culture aren’t there?”

Tuesday, Aurora staff will also present the school board with an update on overall strategies to improve teacher recruitment and retention. Among those strategies: the development of new training for principals, including on how to motivate and retain high-performing employees.

Another report on the pilot incentives will be prepared this fall with final numbers of how many teachers stayed.

Find turnover rates for the pilot, by school, in the district’s report below. Note: The colors in the second column represent a comparison over the prior year with green showing that it is a lower rate than in the past.