Are Children Learning

Explaining the ISTEP debate: 6 reasons why the test ballooned

PHOTO: Alan Petersime
Frustration with repeated ISTEP problems has lawmakers looking for solutions.

The Indiana legislature is moving fast to cut at least three hours from the state ISTEP after two weeks of sharp words and behind-the-scenes negotiations over its length. Lawmakers are expected to rush a bill through both houses for the governor to sign next week to make the changes.

But with kids just days away from taking the exam, some are still asking: What caused the blowup?

The answer is a little complicated, but here are six reasons why ISTEP more than doubled in length from last year:

1. When standards change, tests must also change.

A big fight over Indiana’s academic standards last year ended when the state rapidly changed course and adopted quickly assembled new standards.

That disrupted a carefully coordinated plan, in place since 2010, for Indiana to adopt the Common Core standards along with 45 other states and use a shared exam whose results would be comparable across the country.

When Gov. Mike Pence and state Superintendent Glenda Ritz were elected in 2012, Indiana had already adopted Common Core. Schools were putting it in place grade by grade, and a new Common Core-linked exam was scheduled to replace ISTEP this year.

But Pence was wary of the shared test — called the Partnership for the Assessment of Readiness for College and Careers or PARCC — and ordered the state to withdraw from the consortium creating the test in 2013. Six months later, both Pence and Ritz supported the idea of Indiana dropping out of Common Core and endorsed new locally made standards that were adopted last April.

Like Common Core, Indiana’s new academic standards are more in-depth and ask students to do more analysis and critical thinking.

A test matching those expectations was needed in a hurry. Instead of taking years to adapt to the new standards and create the new exam, Indiana tried to do the whole process in a matter of months. That meant asking a lot of the 2015 ISTEP.

2. This year’s test had two extra goals — add questions to match the new standards and help create a test to replace ISTEP in 2016.

More difficult standards naturally meant Indiana needed a more difficult test. But there wasn’t time to completely overhaul ISTEP this year.

Instead, ISTEP was modified for this year to add several extra features. Many of the new standards were similar to the old standards, so many questions roughly matched the style and difficulty of past ISTEP exams. But new questions were added to also test students on new, tougher concepts included in the new standards, which were designed to make sure they graduate high school ready for college and careers.

The online version of ISTEP, for example, includes more advanced testing methods that ask kids to not only answer multiple-choice questions, but also answer questions in new ways, such as by dragging and dropping points on a graph or using drop-down menus.

Finally, this year’s ISTEP had one more job: Try out some questions that could be used on the 2016 exam.

But there was a problem. Indiana law requires release each year of all essay or short-answer test questions that are used in scoring. This would turn out to be a big factor in the length of the test.

3. A huge number of questions on this year’s test actually don’t count in a student’s score.

When test questions are released to the public they are effectively retired. They can never be used again on ISTEP.

So for this year’s exam, there were two big sets of essay and short-answer questions: one group that counted toward each student’s score and had to be released, plus a large second set being tried out for use in 2016 that wouldn’t count.

Trying out questions is important. Test makers examine how students score on them to look for unexpected surprises. Questions they ask include: Was the question harder or easier for students than predicted? Was there reason to believe it was confusing to children? Was there any evidence the question was unfair to certain groups of students?

Trying out enough questions to be able to make a completely new test for 2016 was the main factor that caused what is normally a six-hour test to swell to more than 12 hours this year. All along, however, this was intended as a one-year problem. Future state exams are expected to be only slightly longer than the six-hour tests of the past.

The legislature appears poised to waive for one year the requirement that all essay and short-answer questions be released. This would allow some of this year’s questions to be reused so there could be far fewer extra questions that don’t count.

4. A longer test means more school days devoted to testing.

Indiana students don’t take all of ISTEP at once. They take sections of the exam in smaller doses over several days.

At its Feb. 4 meeting, the state board increased the number of days schools are allowed to use to give the test. The tests will be given over the course of almost a month, beginning Feb. 25 and ending in late March, followed by another set of testing days over three weeks at the end of April into May.

Schools can choose how to split up the parts of the test. Students might take just one section per day or more, depending on what teachers and principals decide. Danielle Shockey, the state’s deputy superintendent, said a testing day could take many shapes. In some schools, students take one 35-minute test section each day; in others, they spend an hour each day on testing. Other schools may do more.

“They have a long window of time,” Shockey said. “They can take one session a day if they so choose. It’s a local choice.”

5. Test makers had to consider that ISTEP plays a critical role in school A-to-F grades and teacher evaluation ratings.

ISTEP is used to measure two things: how much students know of the content they were expected to learn this year, and how much they’ve improved from a previous year. Both factor into how Indiana measures the quality of schools with its A-to-F grading system, as well as how it evaluates teachers.

To determine a school’s A-to-F grade, the state considers both the percentage of students who pass ISTEP and how much students improved from last year. For teachers, the state expects to see their students’ test scores improve over the prior year.

When tests are roughly the same each year — measuring the same standards and using similar types of questions — it is easier to gauge how much students improved from the prior year. But when the standards change and the questions are crafted differently, test makers have to add extra questions to help determine each student’s improvement from the last test.

This spring’s test will include a few questions in English and math that are specifically designed to estimate roughly on what grade level each student best fits. For example, a fourth grade test might include a few third grade level questions and a few fifth grade level questions. Some students might do well on only the third grade questions but poorly on harder questions. Others might do well on all the questions, even the more challenging fifth grade questions.

Those extra questions help the test makers better estimate whether a student improved a little, a lot or not at all over the prior year. They also lengthen the test, but only by minutes, not hours, said Michele Walker, testing director for the education department. The legislature agreed they were worth keeping: those questions will remain under the plan to shorten ISTEP.

6. Then, there’s the social studies question.

The federal No Child Left Behind Act, signed into law by President Bush in 2002, requires states to test students in English and math each year in grades 3 to 8, and once in high school, and also in science once during elementary, middle and high school.

Noticeably absent? Social studies.

Although Indiana’s social studies ISTEP test is given only to fifth- and seventh-graders each year, accounting for about an hour of testing for those grades, Pence’s test consultants recommended cutting the subject to reduce testing time further. Because social studies testing is required only by state law, not federal law, the legislature could make an exception for this year.

State board members were divided on this idea. Some worried that it would send the message that social studies is not important. Others argued one hour for just two grades doesn’t add much test taking time.

But the legislature liked the idea of reducing test time further this way, so the Indiana Department of Education has told schools to expect the social studies exam to be optional this year. Students will take it if their school decides they should; others will be allowed to skip it, for this year only.

research report

Three years in, some signs of (slight) academic growth at struggling ‘Renewal’ schools

PHOTO: Patrick Wall
Mayor Bill de Blasio at Brooklyn Generation School — part of the Renewal program

When Mayor Bill de Blasio launched an aggressive and expensive campaign to turn around the city’s lowest performing schools, he made a big promise: Schools would see “fast and intense” improvements within three years.

Almost exactly three years later, and after flooding 78 schools with more than $386 million in new social services and academic support, there are signs that the Renewal program has generated gains in student learning. The evidence is based on two newly updated analyses of test score data — one from Marcus Winters, a fellow at the conservative-leaning Manhattan Institute, and the other from Aaron Pallas, a professor at Teachers College.

But the researchers caution that those improvements are modest — when they exist at all — and don’t yet match the mayor’s lofty promises.

The results may have implications far beyond New York City, serving as a national test case of whether injecting struggling schools with resources is more effective than closing them.

The two researchers previously reviewed the first two years of test score data in elementary and middle schools in the Renewal program: Winters found a positive effect on test scores, while Pallas generally found little to no effect.

Now, as the program reaches its third birthday, the pair of researchers have updated their findings with new test score data from last school year, and largely reaffirmed their earlier conclusions.

“We’re not seeing large increases” in student achievement, Pallas said. “And the reality is it’s hard to get large increases in struggling schools.”

Some advocates have argued that it is too early to expect big shifts in test scores, and that infusing schools with extra social services like mental health counseling and vision screenings is valuable in itself. But de Blasio’s promise of quick academic turnaround has invited questions about Renewal’s effectiveness and whether resources can be more effective in improving low-performing schools than shuttering them.

To assess the program’s academic effect, Pallas compared changes in Renewal school test scores to other schools that had similar test results and student demographics when the program started, but did not receive extra support.

The biggest gains Pallas found were concentrated at the elementary level.

Over the past three school years, 20 elementary schools in the Renewal program have made larger gains on average in math and reading than 23 similar schools that didn’t get extra resources. The proportion of elementary school students considered proficient in reading at Renewal schools increased from 7 percent in 2014 to 18 percent last year — an 11-point jump. Meanwhile, the comparison schools also saw gains, but only by seven percentage points, giving Renewal schools a four percentage point advantage.

At the middle school level, the results are less encouraging. The 45 Renewal middle schools did not collectively outperform a group of 50 similar schools outside the program in reading or math.

In math, for instance, Renewal school students improved from 5 percent proficient to 7 percent. However, the comparison schools outside the program improved by roughly the same margin — increasing proficiency from 6 to 9 percent (and still far below city average). In reading, Renewal middle schools showed slightly less growth than the comparison group.

City officials have argued that Pallas’ findings are misleading partly because Renewal schools and the comparison schools are not actually comparable. Renewal schools, they say, were designated based on a range of factors like school climate or teacher effectiveness, not just student demographics and test scores.

“The schools included in the study are neither similar nor comparable in quality and a comparison of the two dissimilar groups is unreliable at best,” Michael Aciman, an education department spokesman, said in a statement. Aciman added that Renewal schools have made larger gains in reading and math than similar schools across the state, and have made progress in reducing chronic absenteeism and improving instruction.

Pallas notes that there are some limitations to his approach: he could not account for some differences between the two groups, such as the quality of a school’s principal, and he does not use student-level data, which would allow a more fine-grained analysis of whether the Renewal program is boosting student achievement. But Pallas, and other researchers who have previously reviewed his data, have said his model is rigorous.

The Manhattan Institute’s Winters found more positive trends than Pallas, consistent with his earlier findings. Using an approach that evaluates whether Renewal schools are outperforming historical trends compared with schools outside the program, Winters found that the Renewal program appeared to have a statistically significant effect on both reading and math scores — roughly equivalent to the difference in student achievement between charter schools and traditional district schools in New York City.

Asked how to interpret the fact that his results tended to be more positive than Pallas’, Winters said either interpretation is plausible.

“It’s hard to tell which of these is exactly right,” he said. But “neither of us are finding results that are consistent with what we would expect if the program is having a large positive effect.”

explainer

Five things to know about the latest brouhaha over Tennessee’s TNReady test

PHOTO: Laura Faith Kebede

Last week’s revelation that nearly 10,000 Tennessee high school tests were scored incorrectly has unleashed a new round of criticism of the standardized test known as TNReady.

Testing company Questar says it muffed some tests this spring after failing to update its scanning software. A year earlier, a series of mistakes got its predecessor, Measurement Inc., fired after Tennessee had to cancel most of TNReady in the test’s first year following a failed transition to online testing.

While the two companies’ glitches are hardly comparable in scope, Questar’s flub has uncorked a tempest of frustration and anger over the standardized assessment and how it’s used to hold teachers accountable.

Here are five things to know about the latest TNReady flap:

1. A relatively small number of students, teachers, and schools are affected.

State officials report that the scoring problem was limited to high school tests, not those taken by grade-schoolers. Of the 600,000 high school end-of-course tests, about 9,400 were scored incorrectly. Most of the fixes were so small that fewer than 1,700 tests — less than three-tenths of 1 percent — saw any change in their overall performance level. A state spokeswoman says the corrected scores have been shared with the 33 affected districts.

2. But the TNReady brand has taken another huge hit.

Tennessee has sought to rebuild public trust in TNReady under Questar and celebrated a relatively uneventful testing season last spring. But the parade of problems that surfaced during TNReady’s rollout, combined with this year’s drops in student performance under the new test, have made subsequent bumps feel more like sinkholes to educators who already are frustrated with the state’s emphasis on testing. Questar’s scanning problems were also tied to delays in delivering preliminary scores to school systems this spring — another bump that exasperated educators and parents at the end of the school year and led many districts to exclude the data from student report cards.

3. State lawmakers will revisit TNReady — and soon.

House Speaker Beth Harwell asked Monday for a hearing into the latest testing problems, and discussion could happen as early as next week when a legislative study committee is scheduled to meet in Nashville. Meanwhile, one Republican gubernatorial candidate says the state should eliminate student growth scores from teacher evaluations, and a teachers union in Memphis called on Tennessee to invalidate this year’s TNReady results.

4. Still, those talks are unlikely to derail TNReady.

Tennessee is heavily invested in its new assessment as part of its five-year strategic plan for raising student achievement. Changing course now would be a surprise. Last school year was the first time that all students in grades 3-11 took TNReady, a standardized test aligned to the Common Core standards, even though those expectations for what students should learn in math and English language arts have been in Tennessee classrooms since 2012. State officials view TNReady results as key to helping Tennessee reach its goal of ranking in the top half of states on the Nation’s Report Card by 2019.

5. Tennessee isn’t alone in traveling a bumpy testing road.

Questar was criticized this summer for its design of two tests in Missouri. Meanwhile, testing giant Pearson has logged errors and missteps in New York, Virginia, and Mississippi. And in Tennessee and Ohio this spring, the ACT testing company administered the wrong college entrance exam to almost 3,000 juniors from 31 schools. Officials with the Tennessee Department of Education emphasized this week that they expect 100 percent accuracy on scoring TNReady. “We hold our vendor and ourselves to the highest standard of delivery because that is what students, teachers, and families in Tennessee deserve,” said spokeswoman Sara Gast.