It’s been a common phenomenon in the testing world: A new exam is introduced, and scores come in significantly lower than on the old one, only to climb back up in the following years.

This “sawtooth” pattern has long been widely observed and written about, so it stands to reason that it’s also worked its way into the expectations of state education officials closely monitoring test results each year.

But in Indiana, ISTEP scores at the state level have been stagnant. In the two rounds of ISTEP tests since 2015, scores statewide have barely changed — 51.5 percent of students passed both English and math exams this year, compared to 53.5 percent in 2015. That year, with the switch to a tougher new exam, the percentage of students passing was 22 percentage points lower than on the previous test.

“The fact that (passing rates) are as flat as they are is striking,” said Derek Briggs, a test researcher and professor at the University of Colorado Boulder.

The suspicion that the flat scores are the new normal for Indiana — and that the lingering hope for big improvements in a short period of time could be unrealistic — has testing experts and educators speculating about why, and what could be done to nudge scores upward.

“The evidence that there’s a large test score gain after the first year or two, I think, is based on a different era of testing,” said Ed Roeber, a former Michigan testing director and consultant who has worked with Indiana on ISTEP.

Indiana is not alone in seeing flat scores. Michigan appears to be seeing a similar effect, with scores staying largely the same. But other states are still experiencing a sawtooth pattern.

Some experts suspect that the switch in 2015 to a tougher test based on new state standards has effectively ended the days of sawtooth results in Indiana.

Roeber said before No Child Left Behind, state tests and standards received far less fanfare from the public and even from rank-and-file educators. When a test changed, it might have caught schools off-guard and led to score drops. In subsequent years when students became more familiar with the format, scores would improve.

Now, it’s not just the format that’s changing — the questions themselves are more difficult and ask students to think in deeper ways than before. More practice questions or better test-taking strategies aren’t a quick fix.

“People pay far more attention to the tests, so hard tests don’t get easier just by people becoming familiar with them,” Roeber said.

Briggs said the new test requires different skills that defy easy fixes.

“You could argue that if the standards were successful at focusing both instruction and redevelopment of assessments to that higher depth of knowledge, that these would be the kinds of things that are much harder to coach or teach the students how to beat on a test,” Briggs said.

If so, that puts added pressure on teachers and schools, experts said.

“The only way to really change student performance is to make sure they are learning what they are being taught,” Roeber said.

Roeber said teachers need to assess their own teaching and see how they can measure student learning every day, not just once a year. And schools need to give them the freedom, time and support to do it.

He calls these “formative tests” — different from the computerized Northwest Evaluation Association exams Indiana teachers have advocated for. Instead, he’s talking about asking students to do things like answer math questions on whiteboards after a lesson or having teachers randomly call on students to gauge widespread understanding.

At the school level, teachers should also meet within and across grade levels to ensure they are prepared for the students they get each year. They need to know students’ strengths and weaknesses and then adjust their instruction and curriculum accordingly.

“We spend about 95 percent of our resources telling people what kids do or do not know and less than 5 percent helping teachers assess their teaching,” Roeber said. “We keep emphasizing testing kids for how much they’ve learned and think that is going to change how teachers teach and what (kids learn), and it doesn’t.”

To be sure, many teachers and schools already do this. Ayana Wilson-Coles, a second grade teacher at Eagle Creek Elementary School in Pike Township, is one of them.

Although Wilson-Coles now teaches a non-tested grade, she was previously a third-grade teacher. She’s still very involved in conversations happening between teachers, she said, and she knows what to look for in her own students to make sure they are prepared when they do get ready to test.

But the new ISTEP represented “a huge shift in thinking,” Wilson-Coles said. The standards now are “requiring kids to really think critically and have higher level thinking. I think sometimes teachers are not sure of how to do that … that has a lot to do with why we’re not seeing a change. That kind of thinking has to happen genuinely and you can’t force it.”

Wilson-Coles also thinks that the amount of testing being done is having a burnout effect. She remembers her third-graders being overwhelmed with tests — IREAD, then ISTEP, then the NWEA Map test.

“By the time they were doing the computerized tests, they were tired and done,” she said. “They do know how to think, but having them be motivated, and showing that” on still another test is a challenge, she said.

Bob Schaeffer, spokesman for The National Center for Fair and Open Testing, an organization that acts as a testing watchdog, said the flat scores could point to a larger issue — and also show that the accountability system as a whole isn’t achieving its stated purpose of driving improvement.

“It’s worth an investigation to try to see what’s going on and why things are flat,” Schaeffer said. “But (the state) should look at better ways to assess Indiana’s public school students that actually improves academic excellence and equity.”

Damian Betebenner, a consultant with the Center for Assessment who has worked with Indiana on A-F grades, said educators have to keep in mind that test scores are still an important indicator of “whether there’s problems or what types of problems you might have.

“It’s a number, and it’s a very valuable set of numbers, that can be used as part of a larger investigation as to what to do,” Betebenner said.

And, Briggs said, with one more year left of ISTEP, there should still be a way to draw some comparison between its scores and those on the next test, ILEARN, in 2019. It would depend heavily on the test vendor, content and structure, but “it’s not impossible that connections couldn’t be made to trend over time,” he said.

At the same time, Briggs said it may be too soon to fully explain the flat scores. More definitive research is necessary to establish any kind of trend, he said.

“At this point we should have evidence across the country,” Briggs said. “It’d be nice if we had that in a more systematic way.”