Growing pains

Latest Colorado test results provide long-awaited glimpse at how students are growing academically

Students at Mrachek Middle School in Aurora work to solve a math problem. (Photo by Nicholas Garcia, Chalkbeat)

Newly released state test results measuring students’ academic growth show strong progress for Denver Public Schools in English, slow going for Aurora Public Schools in math and a potentially alarming achievement gap for students with disabilities statewide.

Unlike earlier math and English results that showed students’ proficiency in meeting academic standards, Colorado’s growth report measures how much students learn year-to-year compared to their academic peers. The information, which is the primary component of a school’s and district’s quality rating, is often heralded as providing a more complete picture of how students, especially those who are less likely to be proficient on state tests, are faring in school.

Put simply, the growth numbers show how quickly students are progressing compared to their peers, regardless of whether they have reached proficiency.

“If we can take our most struggling kids as far as they can go, and the highest (performing) kids who don’t get the attention they deserve as far as they can go, we’re succeeding,” said Leslie Nichols, superintendent of the tiny rural Hinsdale County School District in southwest Colorado. Hinsdale students posted higher growth rates in English than students in any other school district in the state.

In 2009, Colorado began using the growth measure, which relies on results from the state’s English and math standardized tests, to supplement basic achievement data.

A student’s growth percentile, which ranges from 1 to 99, indicates how that student’s performance changed over time, relative to students with similar performances on state assessments. School and district growth rates, which make up the greatest share of their quality ratings, are determined by the median growth score from all students in that school or district.

Tuesday’s release marks the end of a multi-year transition from the state’s previous testing system, TCAP, to its current system that includes PARCC English and math tests.

Because of the transition to new tests, Colorado released neither growth data nor school quality ratings last year. While hiccups remain, state education officials say confidence is growing in the exams and the data they provide. But members of the State Board of Education last week signaled they were prepared to upend the entire system all over again.

“We’re glad to have a growth metric again,” said Alyssa Pearson, the state education department’s associate commissioner for accountability. “We believe it provides a really important dimension to understand the quality of a school to go along with achievement.”

State growth results

The state’s median growth percentile is always about 50. Groups of students, schools and districts that have a percentile score higher than 50 are on average learning at a faster rate than their peers. Conversely, a percentile score lower than 50 means on average students are learning at a slower rate than their peers.

Hitting the 50 mark represents about a year’s worth of academic growth.
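For readers who want to see the arithmetic, here is a minimal sketch of how a school’s or district’s growth rate follows from individual student scores, as described above. It assumes each student’s growth percentile (1 to 99) has already been calculated by the state; how the state arrives at those individual percentiles is not covered here. The function names and the sample numbers are hypothetical and are not drawn from the state’s data.

```python
from statistics import median

# The article says the statewide median growth percentile is always about 50.
STATE_MEDIAN = 50


def median_growth_percentile(student_percentiles):
    """Return the median of individual student growth percentiles (1-99)."""
    if not student_percentiles:
        raise ValueError("need at least one student growth percentile")
    for p in student_percentiles:
        if not 1 <= p <= 99:
            raise ValueError(f"growth percentile out of range: {p}")
    return median(student_percentiles)


def describe_growth(mgp, benchmark=STATE_MEDIAN):
    """Interpret a median growth percentile relative to the benchmark of 50."""
    if mgp > benchmark:
        return "faster than academic peers"
    if mgp < benchmark:
        return "slower than academic peers"
    return "about the same as academic peers"


# Hypothetical growth percentiles for five students (illustration only).
hypothetical_school = [35, 48, 62, 71, 90]
mgp = median_growth_percentile(hypothetical_school)
print(mgp, describe_growth(mgp))  # prints: 62 faster than academic peers
```

Because the rate is a median rather than an average, one student with an unusually high or low percentile does not move a school’s number much, which matters for small schools.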

As under the old system, growth gaps exist between the state’s white students from middle-income households and their more at-risk peers. The gap between students with disabilities (those with individualized education plans) and their non-disabled peers was the largest on the state’s English and math tests. The gap between boys and girls also grew, state officials acknowledged, with girls learning at a faster rate.

Only English language learners demonstrated equal growth to their native-English speaking peers on the state’s English test.


While state education officials called attention to the yawning gap between students with disabilities and their peers without disabilities, officials were hesitant to ascribe a cause.

“I don’t know if we can jump to conclusions yet,” Pearson said. “But it’s important to make it visible.”

Similar gaps did not exist with other student groups, including English learners and low-income students.

Growth results from Colorado’s 10 largest school districts were mostly in line with the state’s. Denver Public Schools posted the highest growth rate on the English tests, while the Adams 12 Five Star district posted the highest growth rate on math tests.

Aurora Public Schools posted the lowest growth rate on the state’s math tests. APS also tied the Douglas County School District for the lowest rate on the state’s English tests.


As with previously released achievement data measuring how well students are meeting academic expectations, state officials cautioned that growth results are less reliable at schools with low participation rates.

Tracking growth in high schools was made more difficult by state lawmakers eliminating PARCC 10th and 11th grade tests last school year, leaving only 9th grade growth data.

Getting a full picture of high school performance requires looking at PARCC in 9th grade, the PSAT in 10th grade, the ACT in 11th grade (and starting this academic year, the SAT) and some measure of postsecondary readiness for 12th graders, said Chris Gibbons, CEO of the Denver-based STRIVE Prep charter school network.

“It’s important our view of the performance of a high school – of any school – is informed by the entirety of the school and not just a single grade,” Gibbons said.

Tracking growth in math in higher grades also poses challenges — and in some circumstances, it’s impossible. Starting in the seventh grade, students may take any one of five math tests. Students who took math tests two grade levels higher than their actual grade level did not have growth results in math, said Pearson, of the education department.

‘Critical data’

Nichols, superintendent of the Hinsdale County School District in Lake City, has long been awaiting the state’s release of growth data.

“I’ve been holding my breath for this release of data,” she said, adding that her schools usually have too few students to publicly disclose achievement results.

Hinsdale posted the state’s highest median growth percentile on the English test. On average, the 32 students who took the state’s English test learned at a quicker rate than 82 percent of their academic peers.

Nichols immediately credited her teachers.

“Their expertise in writing and reading instruction is obviously shining through in these results,” she said.

Unlike some other rural superintendents who have been vocal critics of the PARCC tests, Nichols said she and her school district value the critical data the multi-state tests provide.

“I really need that connection to the larger world of education,” she said, adding another change in assessment would prove difficult for her small school district. “I could not do [standards and testing] by myself. I could not write my own. I get tired of everyone saying local is better all the time. It’s OK to measure my kids against something a little bigger.”


Correction: An earlier version of this article incorrectly reported the Adams 12 Five Star district’s growth rate on the math tests. It is 55, making it the highest growth rate among the state’s 10 largest school districts. An earlier version reported Cherry Creek had the highest rate.

First Person

Two fewer testing days in New York? Thank goodness. Here’s what else our students need

PHOTO: Christina Veiga

Every April, I feel the tension in my fifth-grade classroom rise. Students are concerned that all of their hard work throughout the year will boil down to six intense days of testing — three for math and three for English language arts.

Students know they need to be prepared to sit in a room for anywhere from 90 minutes to three hours with no opportunity to leave, barring an emergency. Many of them are sick to their stomachs, feeling more stress than a 10-year-old ever should, and yet they are expected to perform their best.

Meanwhile, teachers are frustrated that so many hours of valuable instruction have been replaced by testing, and that the results won’t be available until students are moving on to other classrooms.

This is what testing looks like in New York state. Or, at least it did. Last month, state officials voted to reduce testing from three days for each subject to two, to the elation of students, parents, and teachers across New York. It’s an example of our voices being heard — but there is still more to be done to make the testing process truly useful, and less stressful, for all of us.

As a fifth-grade teacher in the Bronx, I was thrilled by the news that testing time would be reduced. Though it doesn’t seem like much on paper, having two fewer days of gut-wrenching stress for students as young as eight means so much for their well-being and education. It gives students two more days of classroom instruction, interactive lessons, and engagement in thought-provoking discussions. Any reduction in testing also means more time with my students, since administrators can pull teachers out of their classrooms for up to a week to score each test.

Still, I know these tests provide us with critical data about how students are doing across our state and where we need to concentrate our resources. The changes address my worries about over-testing, while still ensuring that we have an objective measure of what students have learned across the state.

For those who fear that cutting one-third of the required state testing hours will not provide teachers with enough data to help our students, understand that we assess them before, during, and after each unit of study, along with mid-year tests and quizzes. It is unlikely that one extra day of testing will offer any significant additional insights into our students’ skills.

Also, the fact that we receive students’ state test results months later, at the end of June, means that we get a snapshot of where our students were, rather than where they currently are, and it arrives too late for us to use the information to help them.

That’s where New York can still do better. Teachers need timely data to tailor their teaching to meet student needs. As New York develops its next generation of tests and academic standards, we must ensure that they are developmentally appropriate. And officials need to continue to emphasize that state tests alone cannot fully assess a student’s knowledge and skills.

For this, parents and teachers must continue to demand that their voices be heard. Until then, thank you, New York Regents, for hearing us and reducing the number of testing days.

In my classroom, I’ll have two extra days to help my special needs students work towards the goals laid out in their individualized education plans. I’ll take it.

Rich Johnson teaches fifth grade at P.S. 105 in the Bronx.

A failure of accountability

High-stakes testing may push struggling teachers to younger grades, hurting students

PHOTO: Justin Weiner

Kindergarten, first grade, and second grade are often free of the high-stakes testing common in later grades — but those years are still high-stakes for students’ learning and development.

That means it’s a big problem when schools encourage their least effective teachers to work with their youngest students. And a new study says that the pressure of school accountability systems may be encouraging exactly that.

“Evidence on the importance of early-grades learning for later life outcomes suggests that a system that pushes schools to concentrate ineffective teachers in the earliest grades could have serious unintended consequences,” write study authors Jason Grissom of Vanderbilt and Demetra Kalogrides and Susanna Loeb of Stanford.

The research comes at an opportune time. All 50 states are in the middle of crafting new systems designed to hold schools accountable for student learning. And this is just the latest study to point out just how much those systems matter — for good and for ill.

The study, published earlier this month in the peer-reviewed American Educational Research Journal, focuses on Miami-Dade County schools, the fourth-largest district in the country, from 2003 to 2014. Florida had strict accountability rules during that period, including performance-based letter grades for schools. (Those policies have been promoted as a national model by former Florida Governor Jeb Bush and his national education reform outfit, where Education Secretary Betsy DeVos previously served on the board.)

The trio of researchers hypothesized that because Florida focuses on the performance of students in certain grades and subjects — generally third through 10th grade math and English — less-effective teachers would get shunted to other assignments, like early elementary grades or social studies.

That’s exactly what they found.

In particular, elementary teachers effective at raising test scores tended to end up teaching grades 3-6, while lower-performing ones moved toward early grades.

While that may have helped schools look better, it didn’t help students. Indeed, the study finds that being assigned a teacher in early elementary school who switched from a higher grade led to reduced academic achievement, effects that persisted through at least third grade.

The impact was modest in size, akin to being assigned a novice teacher as opposed to a more experienced one.

The study is limited in that it focuses on just a single district, albeit a very large one — a point the authors acknowledge. Still, the results are consistent with past research in North Carolina and Florida as a whole, and district leaders elsewhere have acknowledged responding to test pressure in the same way.

“There was once upon a time that, when the test was only grades 3 through 12, we put the least effective teachers in K-2,” schools chief Sharon Griffin of Shelby County schools in Memphis said earlier this year. “We can’t do that anymore. We’re killing third grade and then we have students who get in third grade whose challenges are so great, they never ever catch up.”

While the Florida study can’t definitively link the migration of teachers to the state’s accountability system, evidence suggests that it was a contributing factor.

For one, the pattern is more pronounced in F-rated schools, which face the greatest pressure to raise test scores. The pattern is also stronger when principals have more control over staffing decisions — consistent with the idea that school leaders are moving teachers around with accountability systems in mind.

Previous research on policies like No Child Left Behind that threaten to sanction schools with low test scores has found both benefits and downsides. On the positive side, accountability can lead to higher achievement on low-stakes exams and improved instruction; studies of Florida’s system, in particular, have found a number of positive effects. On the negative side, high-stakes testing has caused cheating, teaching to the test, and suspensions of students unlikely to test well.

So how can districts avoid the unintended consequences for young students documented by the Miami-Dade study?

One idea is to emphasize student proficiency in third grade, a proxy for how well schools have taught kids in kindergarten, first and second grades.

Scholars generally say that focusing on progress from year to year is a better gauge of school effectiveness than student proficiency. But a heavily growth-based system could actually give schools an incentive to lower student achievement in early grades.

“These results do make an argument for weighting [proficiency] in those early tests to essentially guard against totally ignoring those early grades,” said Grissom, who also noted that states could make more efforts to directly measure performance of the youngest students.

But Morgan Polikoff, an associate professor at the University of Southern California, was more skeptical of this approach.

“It’s not as if states are going to add grades K-2 testing, so schools and districts will always have this incentive (or think they do),” he told Chalkbeat in an email. “I think measurement is always going to be an issue in those early grades.”