Speaking Out

Testmaker: What went wrong with TNReady

The head of the company that created TNReady accepts blame for this year’s botched rollout of Tennessee’s new standardized online assessment, but says the subsequent delays in delivering printed testing materials were unavoidable.

Measurement Inc. president and founder Henry Scherich says Tennessee Education Commissioner Candice McQueen’s decision to scrap the online assessment on the first day of testing in February set in motion a chain of logistical quagmires that were impossible to overcome.

Once McQueen ordered districts to switch back to paper tests, his company found the sudden task of printing and delivering up to 5 million documents this spring overwhelming, if not impossible.

Henry Scherich

“I understand the frustration of superintendents and the state department,” Scherich said. “Having said all of that, this was a huge job that we took on and there’s been no testing company in the country — in the world, probably — who has taken on the task of printing and shipping this many tests in this short a period of time, and we really struggled with it.”

Last week, the Tennessee Department of Education informed district leaders that many of the testing materials wouldn’t arrive in time for the opening of this week’s final TNReady testing window — the latest in a series of delivery delays that has wreaked havoc in districts and classrooms across the state. State leaders placed the blame squarely on Measurement Inc.

In an interview this week with Chalkbeat, Scherich acknowledged that developing and delivering TNReady in a new online platform was the biggest job that his 36-year-old Durham, N.C.-based company has ever undertaken — perhaps too big given the one-year deadline.

He offered a behind-the-scenes look at the snafus and challenges. At the same time, he insisted that TNReady is a strong test and — once its delivery platform is fixed — the assessment can help Tennessee reach its accountability goals.

Measurement Inc. won the bid to create Tennessee’s math and English language arts tests for grades 3-11 in October 2014, only months after a vote by the Tennessee legislature prompted the Department of Education to pull out of PARCC, a consortium of states with a shared Common Core-aligned assessment. The company would have a year to develop a test for Tennessee. A small number of high school students on block schedules would take the test in the fall of 2015, with the bulk of students in grades 3-11 taking it the following spring.

TNReady marked an unprecedented shift for Tennessee and, like PARCC, was supposed to be online and aligned with the current Common Core State Standards.

"It was a failure in some respects because we were supposed to design a system that would take 100,000 students in at one time." Henry Scherich

It was also an unprecedented task for Measurement Inc., which had never before developed and delivered a state’s entire online testing program.

But on Feb. 8, the very first day of statewide online testing, the test buckled as more and more students logged on. Even so, leaders of Measurement Inc. were surprised when McQueen quickly pulled the plug on the online assessment, and announced that the state would switch to paper-and-pencil versions.

Here’s what happened, according to Scherich:

Online ‘crash’

Scherich says that, first of all, the system never “crashed” on the first day. Students’ screens never went blank. Instead, he calls what happened “infrastructure saturation.” As more and more students logged on, their cursors began to spin, signaling that the test was taking longer to load than it should have.

What was the problem? Ultimately, Scherich says, there weren’t enough servers for the volume of students online, causing the system to clog up as more and more students logged in. He declined to speculate on how long it would have taken to fix the problem and add more primary servers, but said that it would have been possible to get back on track.

“We could have duplicated the system,” he said. “We would have said to half of the state, you work on these 64 servers and the other half work on another set.”

About 48,000 students logged on that day, and about 18,000 submitted assessments. It’s unknown how many students weren’t having trouble with the test but stopped after McQueen sent an email instructing districts to halt testing.

“It was a failure in some respects because we were supposed to design a system that would take 100,000 students in at one time… We had a problem with 48,000,” Scherich said.

Printing delays

Scherich says the subsequent delays come down to this: There were a lot of tests to be printed, and not a lot of printers available on short notice. Overall, the switch to printing meant Measurement Inc. had to scramble to print answer sheets and test booklets for grades 3-11 amounting to 5 million documents — when only weeks before, they hadn’t planned on printing any.

"There’s been no testing company in the country — in the world, probably — who has taken on the task of printing and shipping this many tests in this short a period of time ..."

Measurement Inc. worked with the Department of Education to transfer different versions of the tests from computer to paper. Each test had several versions with different field test items embedded within.

“You can’t just push the button on the computer and have the test be printed out,” he said. “The formatting is all different.”

In the meantime, Measurement Inc. sought out printers who were able to fulfill the large order quickly. Through 36 years in testing, the company had a lot of connections, but only three printing plant operators said they were up to the task. Eventually, two backed out, leaving Measurement Inc. with one: RR Donnelley based in Chicago.

“It’s a large printing company, and they had plants all over the U.S. They printed one or more of the tests or the answer documents in 11 different printing plants around the country,” he said. “So we were getting tests from Minnesota, Missouri. They ran a lot of night shifts to do that for us.”

Once the tests and answer sheets arrived at Measurement Inc., they had to be sorted and distributed to schools. That’s 5 million tests, spread across nearly 1,000 schools.

The last documents arrived from the printer last Saturday, and Measurement Inc. is rushing to get them out in the next two to three days, Scherich said.

Tight timeline

Measurement Inc. had about a year to develop the test and roll out an online system for the entire state. In comparison, PARCC, the online assessment that Tennessee originally was slated to use, was developed in about five years.

Though Measurement Inc. had been working on its online platform for six years and used it previously in other states, including Tennessee for its writing test, the company had never before undertaken a state’s entire testing program.

Measurement Inc. not only developed the TNReady tests for math and English language arts, but also put the content for science and social studies on its online platform, known as MIST.

He said a lot was done right in developing TNReady, including the recruitment of 400 Tennessee teachers to help write test questions designed to measure critical thinking skills.

“I think that our staff and the state of Tennessee staff did an excellent job in building an assessment,” he said. “The math test is a good test. (English language arts) is a good test. Tennessee has a good catalog, a good library of test items for the future.”

Measuring Up

Civil rights and community groups: Adjust inflated Denver elementary school ratings

PHOTO: Helen H. Richardson, The Denver Post

The leaders of six community groups issued a joint letter Thursday calling on the Denver school board to immediately correct what they called misleading and inflated elementary school ratings.

“Parents rely on the accuracy of the district’s school rating system, and providing anything short of that is simply unacceptable,” says the letter, which noted that Denver Public Schools families will soon begin making choices about where to send their children to school next year.

Superintendent Tom Boasberg said the district plans to address the issue the group is raising but would not change this year’s School Performance Framework ratings, which were released in October.

The letter was signed by leaders from groups that advocate for people of color: the Urban League of Metropolitan Denver, the NAACP Denver Branch, the African Leadership Group, Together Colorado, Padres y Jovenes Unidos and Alpha Phi Alpha, Inc., the nation’s first African-American fraternity.

“The methods used to calculate school scores in the 2017 SPF have, as acknowledged in meetings between the superintendent and the undersigned, resulted in inflated performance rankings,” the letter says. “Specifically, the district is significantly overstating literacy gains, which distorts overall academic performance across all elementary schools.”

The School Performance Framework awards schools points based on various metrics. The points put them in one of five color categories: blue (the highest), green, yellow, orange and red. A record number of schools earned blue and green ratings this year.

The district increased the number of points elementary schools could earn this year if their students in kindergarten through third grade did well on state-required early literacy tests, the most common of which is called iStation.

The increase came at the same time schools across Denver saw big jumps in the number of students scoring at grade-level on iStation and similar tests. While the district celebrated those gains and credited an increased focus on early literacy, some community leaders and advocates questioned whether the scores paint an accurate picture of student achievement.

At some schools, there was a big gap between the percentage of third-graders reading at grade-level as measured by the early literacy tests and the percentage of third-graders reading and writing at grade-level according to the more rigorous PARCC tests. The state and the district consider the PARCC tests the gold standard measure of what students should know.

For example, 73 percent of third-graders at Castro Elementary in southwest Denver scored on grade-level on iStation, but just 17 percent did on PARCC.

Boasberg has acknowledged the misalignment. To address it, the district announced this fall that it plans to raise the early literacy test cut points starting in 2019 for the purposes of the School Performance Framework, which means it will be harder for schools to earn points. The delay in raising the cut points is to give schools time to get used to them, Boasberg said.

But the letter authors don’t want to wait. They’re asking the district to issue a “correction of the early literacy measure” before its school choice window opens in February.

“We call on the Denver Public Schools Board and Superintendent to re-issue corrected 2017 school performance results for all affected schools to ensure parents have honest information to choose the schools that are best for their students,” the letter says.

But Boasberg said changing the ratings now would be “fundamentally unfair and make very little sense.”

“If you’re going to change the rules of the game, it’s certainly advisable to change them before the game starts,” he said.

In an interview, Sean Bradley, the president of the Urban League of Metropolitan Denver, said, “This is not an attempt to come after the district. The Urban League has had a longstanding partnership with DPS. We work together on a lot of issues that really impact our community.

“But when our organizations see things that may not be in the full best interest of our communities,” Bradley said, “we have a real responsibility to talk about it and work with the district to rectify it.”

The concern about early literacy scores was one of several expressed by advocates and educators related to this year’s school ratings. Others complained the district’s new “academic gaps indicator” unfairly penalized schools that serve a diverse population.

Read the letter in its entirety below.

Digging In

We read all 279 pages of reports about grade changes in Memphis. Here are five big takeaways.

PHOTO: Seth McConnell/The Denver Post
At least 53 students who graduated from Trezevant High School shouldn’t have received their diplomas due to improper grade alterations, according to a report.

Reports detailing how grades were falsified at Trezevant High School have called into question whether grade changes happening at other Memphis high schools are legitimate.

Shelby County Schools released the results last week of a six-month investigation into how grades are handled at all 41 high schools in Tennessee’s largest district. The probe launched after a new Trezevant principal reported inconsistencies between report cards and transcripts at his school in September 2016.

We read all 279 pages of the reports by legal and accounting firms hired to look into the matter. Here are five takeaways:

1. Some of the allegations have merit.

Complaints that some grades had been changed on transcripts at Trezevant High ring true, according to the report, and there’s cause for suspicion at some other high schools, too.

A team of investigators led by former U.S. attorney Ed Stanton said at least 53 students who graduated from Trezevant shouldn’t have received their diplomas due to improper grade changes.

But Trezevant might not be alone. A separate report by a North Carolina accounting firm found a high rate of grade changes at six other high schools within Shelby County Schools. The average number of grade changes across all high schools was 53, but Trezevant had 461 and Kirby logged 582 between 2012 and 2016. A deeper probe into those schools has been ordered.

2. District leaders weren’t caught totally off guard.

While expressing surprise at the findings, district administrators began building in safeguards to prevent illicit grade changes months before Trezevant Principal Ronnie Mackin reported finding discrepancies. Under a 2016 change, Shelby County Schools began requiring all teachers to use the same electronic grading database known as SMS.

“The District implemented this policy in an effort to effectuate a uniform and consistent method for grade entry which was designed to ensure truthful grading data,” the report said. “As an additional safeguard, SCS also required school principals to implement grading protocols aimed at ensuring the accuracy of the grades that teachers entered into SMS.”

Additionally, Mackin told investigators that Superintendent Dorsey Hopson and Chief of Schools Sharon Griffin “informed him that there was an ‘adult culture problem’ and ‘a financial mess’ that needed to be ‘cleaned up’ at Trezevant.”

And this week, Hopson said rumors of grade-changing have been floating around for years.

“As a Memphian, who went to school here, far back as high school, I would always hear rumors of people changing people’s grades,” he told reporters. “That’s persisted for a long period of time.”

3. Grade floors and grade tampering aren’t the same thing.

Around the same time that Mackin turned over evidence of falsified grades, he implemented a “grade floor” policy in which Trezevant students don’t receive grades below a certain threshold.

So if a student was failing a class, Mackin discouraged teachers from giving that student a grade below 60 percent because “there is a mathematical impossibility of scoring high enough to make up the grade in the future,” he wrote in an email to then-supervisor Tonye Smith-McBride. Such low grades would contribute to a lack of student motivation and behavior issues, he argued.

Mackin told investigators that he was referring to future grades, but the timing of his directive appeared to contribute to confusion about grading policies at Trezevant, making some teachers think that their principal was instructing them to retroactively change failing grades to passing ones.

Trezevant isn’t alone in having grade floors. Hopson said other schools have similar practices and that he would like a uniform policy on the issue. Stanton’s report makes that recommendation.

4. Investigators found no evidence to support other complaints that were not about academics.

Mackin’s six-page resignation letter on June 1 accused Shelby County Schools of a cover-up and said that he was being painted as a scapegoat for questionable finances at Trezevant.

“Our investigation has determined that no cover up occurred,” the report read, adding that investigators found no evidence that Mackin was “wrongfully targeted” either as the district looked into finances.

In fact, the report noted that, in several public statements, district leaders hailed Mackin for unearthing suspicious activity on grades. As for a cover-up, Hopson alerted the State Department of Education in a timely manner that the district was conducting an internal review into Mackin’s concerns.

Investigators also found no evidence to support Mackin’s allegations that Trezevant’s football coach misreported the school’s enrollment to state athletic officials and that his supervisor had sexually harassed him.

5. There are still lots of questions to be answered.

The accounting firm hired to review transcript changes at Memphis high schools found that 10 schools had more than 200 instances from 2012 to 2016. However, the review team could not determine if any were fraudulent and concluded that “additional investigation around grade changes is warranted.”

Investigators also were hampered from getting to the truth at Trezevant without the subpoena power that compels witnesses to speak up.

For example, investigators could not locate several people that Mackin claimed had evidence that would incriminate football coach Teli White, who has since been fired, regarding allegations that he paid student-athletes. They also could not search White’s email or bank accounts to look into allegations of financial fraud.

Hopson told reporters this week that his administration is considering turning over a list of former school administrators to Shelby County’s district attorney, who would have subpoena power in the matter. The superintendent, who is an attorney, said the findings of the first external review may merit a criminal investigation.

The full report by Butler Snow & Dixon Hughes Goodman is available here.

The full Ogletree Deakins report is available here.