Datahead

Denver and Aurora schools showed modest gains on state tests. But gaps still remain.

PHOTO: Helen H. Richardson/Denver Post
Justin Machado, 9, reads on his iPad during his 3rd grade class at Ashley Elementary in 2015.

Two large Colorado school districts with substantial numbers of students living in poverty — Denver Public Schools and Aurora Public Schools — showed modest improvement in the second year of state tests measuring students’ mastery of tougher academic standards.

Despite those gains, both districts still lag behind the state average — and in Aurora’s case, the gap is substantial.

Overall, the state education department’s release Thursday of district- and school-level results from PARCC math and English tests amounted to a mixed bag for most, with scores creeping up in some subjects and grades, slipping in others or remaining steady.

The picture is further muddied by a lack of student growth data, which measure changes in the same group of students over time and which state officials are still compiling. Low test participation rates in higher grades also call into question district- and school-level scores, officials concede.

The state last month released state-level 2016 PARCC results, which showed more elementary school students were meeting expectations, while middle school scores were flat.

For the first time since the state overhauled its annual testing system in 2014 to align to the politically controversial Common Core State Standards, teachers, parents and taxpayers are able to compare year-to-year results in most subjects and grades.

PHOTO: Nicholas Garcia
A student at Vista Peak in Aurora works on an assignment.

The results, along with other measures such as graduation rates, will determine the quality ratings of schools and districts. Those ratings will be released later this fall. For some schools, another round of poor results could mean state intervention — something that has never happened before.

Like last year, anti-testing sentiment ran highest in Boulder and wealthier suburban Denver enclaves, as well as in some rural districts. Also echoing last year’s trends, older students were more likely to skip the tests, while the overwhelming majority of younger students took them.

At Boulder’s Fairview High School, just 72 of 514 ninth graders took the PARCC English test.

Joyce Zurkowski, the state’s director of assessments, said lower participation rates invite scrutiny.

“When I’m looking at a school with a high number of kids who met or exceeded expectations, with 98 percent participation, the confidence I can have in those results is higher than at a school with an even higher number of students who met or exceeded expectations but had only 40 percent participation,” Zurkowski said.

The big five

Results from the state’s five largest school districts mostly mirrored statewide results, which most notably showed gains in elementary school math.

Denver Public Schools, the state’s largest school district, showed gains on all but one test. The district lost ground only in seventh grade math, where the percentage of students who met or exceeded the state’s benchmarks dropped by 2.6 percentage points. The district’s largest leap in the percentage of students who cleared the state’s benchmarks was in fourth grade English, with a jump of 5.5 percentage points.

“A decade ago, we were 25 points behind the rest of the state,” Superintendent Tom Boasberg said. “Now we’re about 3 or 4 points behind the rest of the state. Against that benchmark, we’ve made very consistent, very striking progress.”

Jeffco Public Schools, the state’s second largest school district, saw gains in math in every grade but sixth. But it lost ground across the board on the state’s English tests. Its largest drop, 6.4 percentage points, was in the ninth grade.

“We are pleased to be improving in math given the higher level expectations of the CMAS/PARCC assessments,” Superintendent Dan McMinimee said in a statement. “Reading will continue to be a focus for our district improvement planning and we won’t be satisfied until all of our students are meeting or exceeding state expectations.”

The state’s third largest school district, Douglas County, made gains in math at the elementary school level, but lost ground in middle and high school. The south-suburban school district had wild swings on the English test. Ninth graders gained 5.7 points on that test, but seventh graders lost 7 points.

Strong gains in math were made in Cherry Creek elementary schools. But sixth graders this year lost 4 percentage points. The state’s fourth largest school district had more mixed results on the English test. There was a 2.8 percentage point increase in the number of fourth graders who were at grade level in English. But there was a 2.3 percentage point drop at the eighth grade level.

Julie Skupa, Cherry Creek’s assistant superintendent of assessment and improvement, said the district could attribute its higher math scores in part to a new curriculum.

“It goes beyond knowledge and rote memorization, and requires students to do a lot of problem-solving,” she said.

Aurora Public Schools, the fifth largest school district, saw mostly improvements. Students made gains in every grade on the English test except for grades six and eight. Similarly, Aurora showed increases in the number of students who met state expectations on the math test in every grade except for sixth and seventh.

“We’re seeing the first overall increase in performance since the 2011 school year,” said Superintendent Rico Munn, who has been leading an aggressive school improvement agenda. “We hope we can attribute that to our increase in rigor and relevance, our effort to make sure we have a high-quality teaching staff, and focus on our strategic plan.”

Growing pains

Changes made to Colorado’s testing system during the 2015 legislative session — including who takes the tests and how they take the tests — complicated this year’s release. School leaders voiced their frustration over slow-to-be-released and incomplete data that in some cases can’t be used to make comparisons to last year’s results.

For starters, this year’s upper division math results can’t be compared to last year’s because different grade levels took those tests. In 2015, middle school and high school students were eligible to take the state’s most advanced math tests. However, after lawmakers eliminated testing in the 10th and 11th grades, only seventh through ninth graders were able to take those tests.

Seniors at Fairview High School in Boulder protested a standardized test in November 2014.

The upshot: you can’t compare results between the two years.

And for the second year in a row, Colorado has released results from its tests piecemeal. While schools are getting their results three months earlier than they did last year, the results are still slower than anyone expected.

“What they promised was that through the online system, we’d get results back sooner than later,” Skupa said, adding that it’s difficult for districts to make any meaningful changes after the school year has started. “It’s these bits and pieces that make it difficult to create a big picture view.”

Part of the slowdown this year, Zurkowski said, is that school districts were allowed to use pencil-and-paper tests on a much larger scale. But that’s only part of the problem, she said.

“I believe the PARCC consortium underestimated the complexity of scoring and reporting their assessments, especially in the first few years,” she said. “I do believe that not only Colorado but the PARCC consortium is committed to continue to find ways to improve that turnaround time.”

Paper-and-pencil tests are causing another set of concerns for the state. Last year, the state acknowledged that students who used paper tests performed better than they would have if they had taken the tests online. Tests that were affected included the third grade English test and upper division high school math tests.

Zurkowski said that while the department has theories as to why the bump happened — students could have felt more comfortable writing out equations than keyboarding them — it doesn’t know for certain.

“There was a lot to sort through,” she said. “… Honestly, we don’t know what specifically the issue is.”

To ensure that didn’t happen this year, the state education department ran results from about 16 schools that used paper tests through a series of mathematical procedures ensuring students would get the same score whether they took the test on paper or online, Zurkowski said.

The additional steps make the results more reliable, Zurkowski said, but the department is still urging caution when it comes to looking at those schools.

“We’re not going to over-interpret at this point,” she said.

Search for your school

Use Chalkbeat’s database to search for your school’s individual results on the math and English tests. The green bar represents the number of students who met or exceeded the standards. The yellow bar represents the number of students who took the tests. State officials have cautioned that low participation rates could skew results.

First Person

Two fewer testing days in New York? Thank goodness. Here’s what else our students need

PHOTO: Christina Veiga

Every April, I feel the tension in my fifth-grade classroom rise. Students are concerned that all of their hard work throughout the year will boil down to six intense days of testing — three for math and three for English language arts.

Students know they need to be prepared to sit in a room for anywhere from 90 minutes to three hours with no opportunity to leave, barring an emergency. Many of them are sick to their stomachs, feeling more stress than a 10-year-old ever should, and yet they are expected to perform their best.

Meanwhile, teachers are frustrated that so many hours of valuable instruction have been replaced by testing, and that the results won’t be available until students are moving on to other classrooms.

This is what testing looks like in New York state. Or, at least it did. Last month, state officials voted to reduce testing from three days for each subject to two, to the elation of students, parents, and teachers across New York. It’s an example of our voices being heard — but there is still more to be done to make the testing process truly useful, and less stressful, for all of us.

As a fifth-grade teacher in the Bronx, I was thrilled by the news that testing time would be reduced. Though it doesn’t seem like much on paper, having two fewer days of gut-wrenching stress for students as young as eight means so much for their well-being and education. It gives students two more days of classroom instruction, interactive lessons, and engagement in thought-provoking discussions. Any reduction in testing also means more time with my students, since administrators can pull teachers out of their classrooms for up to a week to score each test.

Still, I know these tests provide us with critical data about how students are doing across our state and where we need to concentrate our resources. The changes address my worries about over-testing, while still ensuring that we have an objective measure of what students have learned across the state.

For those who fear that cutting one-third of the required state testing hours will not provide teachers with enough data to help our students, understand that we assess them before, during, and after each unit of study, along with mid-year tests and quizzes. It is unlikely that one extra day of testing will offer any significant additional insights into our students’ skills.

Also, because we receive students’ state test results months later, at the end of June, we get a snapshot of where our students were rather than where they currently are, and it arrives too late for us to use the information to help them.

That’s where New York can still do better. Teachers need timely data to tailor their teaching to meet student needs. As New York develops its next generation of tests and academic standards, we must ensure that they are developmentally appropriate. And officials need to continue to emphasize that state tests alone cannot fully assess a student’s knowledge and skills.

For this, parents and teachers must continue to demand that their voices be heard. Until then, thank you, New York Regents, for hearing us and reducing the number of testing days.

In my classroom, I’ll have two extra days to help my special needs students work towards the goals laid out in their individualized education plans. I’ll take it.

Rich Johnson teaches fifth grade at P.S. 105 in the Bronx.

A failure of accountability

High-stakes testing may push struggling teachers to younger grades, hurting students

PHOTO: Justin Weiner

Kindergarten, first grade, and second grade are often free of the high-stakes testing common in later grades — but those years are still high-stakes for students’ learning and development.

That means it’s a big problem when schools encourage their least effective teachers to work with their youngest students. And a new study says that the pressure of school accountability systems may be encouraging exactly that.

“Evidence on the importance of early-grades learning for later life outcomes suggests that a system that pushes schools to concentrate ineffective teachers in the earliest grades could have serious unintended consequences,” write study authors Jason Grissom of Vanderbilt and Demetra Kalogrides and Susanna Loeb of Stanford.

The research comes at an opportune time. All 50 states are in the middle of crafting new systems designed to hold schools accountable for student learning. And this is just the latest study to point out just how much those systems matter — for good and for ill.

The study, published earlier this month in the peer-reviewed American Educational Research Journal, focuses on Miami-Dade County schools, the fourth-largest district in the country, from 2003 to 2014. Florida had strict accountability rules during that period, including performance-based letter grades for schools. (Those policies have been promoted as a national model by former Florida Governor Jeb Bush and his national education reform outfit, where Education Secretary Betsy DeVos previously served on the board.)

The trio of researchers hypothesized that because Florida focuses on the performance of students in certain grades and subjects — generally third through 10th grade math and English — less-effective teachers would get shunted to other assignments, like early elementary grades or social studies.

That’s exactly what they found.

In particular, elementary teachers effective at raising test scores tended to end up teaching grades 3-6, while lower-performing ones moved toward early grades.

While that may have helped schools look better, it didn’t help students. Indeed, the study finds that being assigned a teacher in early elementary school who switched from a higher grade led to reduced academic achievement, effects that persisted through at least third grade.

The impact was modest in size, akin to being assigned a novice teacher as opposed to a more experienced one.

The study is limited in that it focuses on just a single district, albeit a very large one — a point the authors acknowledge. Still, the results are consistent with past research in North Carolina and Florida as a whole, and district leaders elsewhere have acknowledged responding to test pressure in the same way.

“There was once upon a time that, when the test was only grades 3 through 12, we put the least effective teachers in K-2,” schools chief Sharon Griffin of Shelby County schools in Memphis said earlier this year. “We can’t do that anymore. We’re killing third grade and then we have students who get in third grade whose challenges are so great, they never ever catch up.”

While the Florida study can’t definitively link the migration of teachers to the state’s accountability system, evidence suggests that it was a contributing factor.

For one, the pattern is more pronounced in F-rated schools, which face the greatest pressure to raise test scores. The pattern is also stronger when principals have more control over staffing decisions — consistent with the idea that school leaders are moving teachers around with accountability systems in mind.

Previous research on policies like No Child Left Behind that threaten to sanction schools with low test scores has found both benefits and downsides. On the positive side, accountability can lead to higher achievement on low-stakes exams and improved instruction; studies of Florida’s system, in particular, have found a number of positive effects. On the negative side, high-stakes testing has led to cheating, teaching to the test, and suspensions of students unlikely to test well.

So how can districts avoid the unintended consequences for young students documented by the Miami-Dade study?

One idea is to emphasize student proficiency in third grade, a proxy for how well schools have taught kids in kindergarten, first and second grades.

Scholars generally say that focusing on progress from year to year is a better gauge of school effectiveness than student proficiency. But a heavily growth-based system could actually give schools an incentive to lower student achievement in early grades.

“These results do make an argument for weighting [proficiency] in those early tests to essentially guard against totally ignoring those early grades,” said Grissom, who also noted that states could make more efforts to directly measure performance of the youngest students.

But Morgan Polikoff, an associate professor at the University of Southern California, was more skeptical of this approach.

“It’s not as if states are going to add grades K-2 testing, so schools and districts will always have this incentive (or think they do),” he told Chalkbeat in an email. “I think measurement is always going to be an issue in those early grades.”