Academic Accountability

Coming soon: Not one, but two ratings for every Chicago school

Starting this month, Chicago schools will have to juggle two ratings — one from the school district, and another from the state.

The Illinois State Board of Education is scheduled to release its annual report cards for schools across the state on October 31. This year, for the first time, each school will receive one of four quality stamps from the state: an “exemplary” or “commendable” rating signals the school is meeting standards, while an “underperforming” or “lowest performing” designation could trigger intervention, according to state board of education spokeswoman Jackie Matthews.

A federal accountability law, the Every Student Succeeds Act, requires these new ratings.

To complicate matters, the city and state ratings are based on different underlying metrics and even different sets of standardized tests. The state ratings, for example, are based on a modified version of the PARCC assessment, while Chicago’s ratings are based largely on the NWEA. The new state ratings, like those the school district issues, can be assigned without observers ever having visited a classroom, which is why critics argue that the approach lacks the qualitative metrics necessary to assess the learning, teaching, and leadership at individual schools.

Patricia Brekke, principal at Back of the Yards College Preparatory High School, said she’s still waiting to see how the ratings will be used, “and how that matters for us,” but that parents at her school aren’t necessarily focused on what the state says.

“What our parents usually want to know is what [Chicago Public Schools] says about us, and how we’re doing in comparison to other schools nearby that their children are interested in,” she said.

Educators at Chicago Public Schools understand the power of school quality ratings. The district already has its own five-tiered rating system: Level 1+ and Level 1 designate the highest-performing schools, Level 2+ and Level 2 denote average and below-average performing schools, respectively, and Level 3, the lowest rating, is for schools in need of “intensive intervention.” The ratings help parents decide where to enroll their children and are supposed to signal to the district that a school needs more support. But the ratings are also a source of angst, used to justify replacing school leaders, closing schools, or opening new schools in neighborhoods where options are deemed inadequate.

In contrast, the state’s school quality designations direct additional federal funding and support to underperforming and lowest-performing schools, with the goal of improving student outcomes. Matthews said schools will work with “school support managers” from the state to conduct a self-inquiry and identify areas for improvement. She described Chicago’s school quality rating system as “a local dashboard that they have developed to communicate with their communities.”

Staff from the Illinois State Board of Education will be traveling around the state next week to meet with district leaders and principals to discuss the new accountability system, including the ratings. They’ll be in Bloomington, Marion, O’Fallon, Chicago, and Melrose Park. The Chicago meeting is Wednesday, Oct. 24, at 5 p.m. at Chicago Public Schools headquarters.

Rae Clementz, director of assessment and accountability at the state board, said a second set of ratings reveals “that there are multiple valid ways to look at school quality and success; it’s a richer picture.”

Under the auspices of the Every Student Succeeds Act, the state ratings for elementary schools released at the end of the month are based 75 percent on academic measures, including English language arts and math test scores, English learner progress as measured by the ACCESS test, and academic growth. The other 25 percent reflects school climate and student success, such as attendance and chronic absenteeism.

Other measures are slated to be phased in over the next several years, including academic indicators like science proficiency and school quality indicators, such as school climate surveys of staff, students, and parents.

High school designations take a similar approach with English and math test scores, but they factor in graduation rates instead of academic growth and also include the percentage of 9th graders on track to graduate — that is, freshmen who earn 10 semester credits and no more than one semester F in a core course.

Critics of Chicago’s school rating system argue that the ratings correlate more with socioeconomic status and race than with school quality, and that they say little about what’s happening in classrooms and how kids are learning. Chicago does try to mitigate these concerns by putting greater emphasis on growth in test scores rather than absolute attainment, by using school climate surveys, and by tracking academic growth for priority groups, such as African-American students, Latino students, English language learners, and students in special education.

Cory Koedel, a professor of economics and public policy at the University of Missouri, said that many rating systems essentially capture poverty status because of their focus on how high or low students score on tests. Chicago’s approach, he said, is fairer than that of many other school systems.

“What I like about this is it does seem to have a high weight on growth and lower weight on attainment levels,” he said.

Morgan Polikoff, a professor at the University of Southern California’s school of education, said that Chicago’s emphasis on student growth is a good thing “if the purpose of the system is to identify schools doing a good job educating kids.”

Chicago puts 50 percent of the rating’s weight on growth; Polikoff said he has seen weights ranging from 35 percent down to as low as 15 percent in other districts. But he said the school district’s reliance on the NWEA test, rather than the PARCC test used in the state school ratings, was atypical.

“It’s not a state test, and though they say it aligns with standards, I know from talking to educators that a lot of them feel the tests are not well aligned with what they are supposed to be teaching,” he said. “It’s just a little odd to me they would have state assessment data, which is what they are held accountable for with the state, but use the other data.”

He’s skeptical about school systems relying too heavily on standardized test scores, whether the SAT, PARCC, or NWEA, because “you worry that now you’re just turning the curriculum to test prep, and that’s an incentive you don’t want to create for educators.”

He said the high school ratings in particular draw on a wide array of indicators, including some that follow students into college, “so I love that.”

“I really like the idea of broadening the set of indicators on which we evaluate schools and encouraging schools to really pay attention to how well they prepare students for what comes next,” he said.

Where’s the Research

Summit Learning declined to be studied, then cited collaboration with Harvard researchers anyway

English teacher Adelaide Giornelli works with ninth-grade students on computers at Shasta charter public high school, part of the Summit Public Schools charter network. (Photo by Melanie Stetson Freeman/The Christian Science Monitor via Getty Images)

Summit Learning, a fast-growing “personalized learning” system, touts a partnership with Harvard researchers even though Summit actually turned down their proposal to study the model.

The online platform is backed by Facebook founder Mark Zuckerberg’s philanthropy and is now being used in 380 schools across the U.S.

The program “is based on collaborations with nationally acclaimed learning scientists, researchers and academics from institutions including the Harvard Center for Education Policy Research,” Summit’s website says. “Summit’s research-backed approach leads to better student outcomes.” Schools have used that seeming endorsement to back up their decision to adopt the model.

In fact, though, there is no academic research on whether Summit’s specific model is effective. And while Summit helped fund a study proposal crafted by Harvard researchers, it ultimately turned them down.

“They didn’t tell us explicitly why,” said Tom Kane, a Harvard education professor and faculty director of the Center for Education Policy Research. “All I can say is that the work that we did for Summit involved planning an evaluation; we have not measured impacts on student outcomes.”

Summit’s founder, Diane Tavenner, said the organization had a number of reasons for not moving forward with the proposed study, including its potential to burden teachers and to limit the platform’s ability to change or grow. The organization’s general approach is backed by other research, she said, and by its track record as a charter network.

As for the mention of the Harvard center on Summit’s website, Tavenner said the organization had learned a lot from the process of developing a potential study. She said that, after Chalkbeat began reporting this story, she offered to change the website’s language, but that Kane had not asked her to do so.

More broadly, Tavenner says she is skeptical of the usefulness of large-scale research of the sort the Harvard team proposed, saying the conclusions might be of interest to journalists and philanthropists, not schools.

“I’m not willing to give up what’s best for kids for those two audiences,” Tavenner told Chalkbeat last month.

It’s a notable stance for Summit, given its ambitious claims and the platform’s wide reach.

As “personalized learning” becomes a more popular idea among those trying to improve America’s schools, Summit’s platform has been adopted for free by schools across the country. That’s thanks largely to the backing of the Chan Zuckerberg Initiative, the philanthropy poised to receive Zuckerberg’s billions. Summit’s model has drawn praise from parents and teachers in some schools, but proven controversial in others.

Regardless, CZI’s support means Summit could continue to grow rapidly — which has some observers wondering when its backers will show that what it’s offering is particularly effective.

“I do think that there is an obligation to provide credible evidence to schools when you’re trying to convince them to adopt things,” said John Pane, a researcher at the RAND Corporation who has extensively studied personalized learning initiatives.

Summit spreads, but research talks with Harvard team fizzle

Summit’s claims about a Harvard collaboration have their roots in conversations that began in late 2016.

Zuckerberg’s wife, pediatrician Priscilla Chan, took a fateful tour of a school in the Summit Public Schools charter network two years earlier. The network soon began working with a Facebook engineering team to build out its technology.

Summit’s model has a number of components: a curriculum in core subjects for grades four through 12; weeks scheduled for students to deeply examine a topic of interest; long-term mentors for students; and a technology platform, which serves as the approach’s organizing structure. The goal is to better engage students and to give them more control over what and how they learn, Summit says.

By the 2016-17 school year, Summit had rolled out its program to more than 100 schools outside its own network. That’s also about when Summit started talks with Kane and fellow Harvard professor Marty West.

An ideal study might have randomly assigned schools or students to use the learning platform, creating two groups that could be compared. That was a non-starter for Tavenner, as it would limit schools’ access to the platform. If 250 schools were assigned to use it, and another 250 expressed interest but were not, for example, that would be bad for students, she said last month while discussing the organization’s approach to research.

“Am I really going to say to 250 people, ‘You know what, we’re not going to actually help you, even though we actually could right now?’” she said.

Kane says they came up with a few alternatives: comparing students using Summit to others not using it in the same school or comparing schools that had adopted Summit to similar schools that hadn’t. They suggested tracking test scores as well as suspensions and attendance, measuring the effectiveness of the support offered to teachers, and using surveys to measure concepts important to Summit, like whether students felt in control of their schoolwork.

But Summit passed on an evaluation. “After many conversations with Harvard and the exploration of multiple options, we came to recognize that external research would need to meet certain baseline criteria in order for us to uphold in good faith our partnership with schools, students, and parents,” Tavenner said.

Metrics were a particular concern. “Standardized tests are not good measures of the cognitive skills,” a Summit spokesperson said, saying the organization had developed better alternatives. “Attendance and discipline are not measures of habits of success, full stop.” Tavenner said she feared that a study could stop Summit from being able to make changes to the program or that it might stop participating schools from adding new grades. (Kane and West say their plan wouldn’t have limited growth or changes.)

Tavenner told Chalkbeat that research of the kind the Harvard team was offering isn’t needed to validate their approach. Summit is based on decades of research on ideas like project-based learning, she said, citing the organization’s report titled “The Science of Summit.”

Dan Willingham, a University of Virginia educational psychologist, said that’s useful, but not the same as knowing whether a specific program helps students.

“You take a noticeable step down in confidence when something is not research-based but rather research-inspired,” he said, while noting that many education initiatives lack hard evidence of success. “There’s a hell of a lot going on in education that’s not being evaluated.”

What about Summit’s original charter network, now 11 schools? Summit cites internal data showing its graduates have success getting into college, but outside research is limited. A 2017 study by the Stanford-based group CREDO found that attending Summit led to modest declines in students’ reading scores and had no clear effect in math, though it looked at only a small portion of the network’s students.

The Summit charter schools are also part of an ongoing study of economically integrated charter schools, and a few were included in two widely cited RAND studies looking at personalized learning, though they didn’t report any Summit-specific information. California’s notoriously limited education data access has stymied more research, Tavenner said.

What does philanthropy owe the public?

Today, Summit’s learning platform has far outpaced its charter network. About 380 schools, with over 72,000 students, use the platform; the national charter network KIPP, by comparison, runs 224 schools serving around 100,000 students.

Summit now gets its engineering help from the Chan Zuckerberg Initiative, not Facebook. That philanthropic partnership has fueled its growth: while CZI has not disclosed how much it’s given to Summit, the Silicon Valley Community Foundation — through which CZI funnels much of its education giving — lists grants to Summit totaling over $70 million in 2016 and 2017.

Summit also netted $2.3 million for the platform from the Bill and Melinda Gates Foundation in 2016 and another $10 million in 2017. (CZI, the Gates Foundation, and the Silicon Valley Community Foundation are all funders of Chalkbeat.)

Some major foundations regularly invest in research to better understand whether their gifts are doing good, noted Sarah Reckhow, a Michigan State professor who studies education philanthropy. In a number of instances, that research comes to unfavorable conclusions, like a Gates-funded study on its teacher evaluation initiative or a Walton Family Foundation-backed evaluation of charter schools’ propensity to screen out students with disabilities. (A Gates spokesperson said that part of its $10 million to Summit was set aside for “measurement and evaluation.”)

Reckhow said she hasn’t yet seen that same inclination from CZI. And she worries that school districts might be less likely to carefully examine programs that are offered free of charge, like Summit.

“If you reduce that barrier, you’re making it potentially more likely to adopt something without as much scrutiny as they otherwise might do,” she said. “That increases the obligation of Summit and CZI to evaluate the work.”

CZI spokesperson Dakarai Aarons said the organization is committed to research and to Summit, and pointed to a number of schools and districts that saw academic improvements after introducing Summit’s platform. “As the program grows, we look forward to expanded research to help measure its long-term impact,” he said.

Tavenner said Summit is exploring other options to prove its approach is working, including talking to researchers who study continuous improvement. “We can’t just keep saying no to [randomized studies],” she said. “We’ve got to have another way, but I don’t have another way yet.”

Researchers Kane and West, for their part, say Summit’s concerns about evaluating its evolving model should also raise questions about Summit’s swift spread.

“The evaluation we proposed would have assessed the impact of the model at that point in time, even if the model continued to evolve,” they wrote in an email. “When a model is still changing so radically that a point in time estimate is irrelevant, it is too early to be operating in hundreds of schools.”

“Unfortunately, Summit is closer to the rule than the exception,” they said.

To Do

Tennessee’s new ed chief says troubleshooting testing is first priority


Penny Schwinn knows that ensuring a smooth testing experience for Tennessee students this spring will be her first order of business as the state’s new education chief.

Even before Gov.-elect Bill Lee announced her hiring on Thursday, she was poring over a recent report by the state’s chief investigator about what went wrong with TNReady testing last spring and figuring out her strategy for a different outcome.

“My first days will be spent talking with educators and superintendents in the field to really understand the scenario here in Tennessee,” said Schwinn, who’s been chief deputy commissioner of academics in Texas since 2016.

“I’ll approach this problem with a healthy mixture of listening and learning,” she added.

Schwinn’s experience with state assessment programs in Texas and in Delaware — where she was assistant secretary of education — is one of the strengths cited by Lee in selecting her for one of his most critical cabinet posts.

The Republican governor-elect has said that getting TNReady right is a must after three straight years of missteps in administration and scoring in Tennessee’s transition to online testing. Last year, technical disruptions interrupted so many testing days that state lawmakers passed emergency legislation ordering that poor scores couldn’t be used to penalize students, teachers, schools, or districts.

Schwinn, 36, recalls dealing with testing headaches during her first days on the job in Texas.

“We had testing disruptions. We had test booklets mailed to the wrong schools. We had answer documents in testing booklets. We had online administration failures,” she told Chalkbeat. “From that, we brought together teachers, superintendents, and experts to figure out solutions, and we had a near-perfect administration of our assessment the next year.”

What she learned in the process: the importance of tight vendor management, including setting clear expectations.

She plans to use the same approach in Tennessee, working closely with people in her new department and Questar Assessment, the state’s current vendor.

“Our job is to think about how to get online testing as close to perfect as possible for our students and educators, and that is going to be a major focus,” she said.

The test itself has gotten good reviews in Tennessee; it’s the online miscues that have many teachers and parents questioning the switch from paper-and-pencil exams. Schwinn sees no choice but to forge ahead online and is quick to list the benefits.

“If you think about how children learn and access information today, many are getting that information from hand-held devices and computers,” she said, “so reflecting that natural experience in our classrooms is incredibly important.”

Schwinn said computerized testing also holds promise for accommodating students with disabilities and can provide a more engaging experience for all students.

“When you look at the multiple-choice tests that we took in school and compare that to an online platform where students can watch videos, perform science experiments, do drag-and-drop and other features, students are just more engaged in the content,” she said.

“It’s a more authentic experience,” she added, “and therefore a better measure of learning.”

Schwinn plans to examine Tennessee’s overall state testing program to look for ways to reduce the number of minutes dedicated to assessment and to increase transparency.

She also will oversee the transition when one or more companies take over the state’s testing program beginning next school year. Former Commissioner Candice McQueen ordered a new request for proposals from vendors to provide paper testing for younger students and online testing for older ones. State officials have said they hope to award the contract by spring.

In Texas, a 2018 state audit criticized Schwinn’s handling of two major education contracts, including a no-bid special education contract that lost the state more than $2 million.

In Tennessee, an evaluation committee that includes programmatic, assessment, and technology experts will help to decide the new testing contract, and state lawmakers on the legislature’s Government Operations Committee plan to provide another layer of oversight.

Spring testing in Tennessee is scheduled to begin on April 15. You can learn more about TNReady on the state education department’s website.

Editor’s note: This story has been updated with new information about problems with the handling of two education contracts in Texas.