When Colorado high school student Isabel Castaneda checked her final grades for the International Baccalaureate program in July, she was shocked.
Despite being one of the top-ranking students at her public school, she failed several courses, including higher-level Spanish, her native language.
The International Baccalaureate (IB) program – a global standard of educational testing that also allows US high-school students to obtain college credit – canceled its exams in May due to the coronavirus pandemic.
Instead of sitting final exams, which usually account for the majority of students’ scores, students were assigned their marks based on a mathematical “awarding model,” as described by the IB program.
“I come from a low-income family – and my entire last two years were driven by the goal of getting as many college credits as I could to save money on school,” Castaneda said in a phone interview. “When I saw those scores, my heart sank.”
The COVID-19 pandemic has disrupted exams all over the world, and educational institutions have adapted in a range of ways, from moving tests online to asking students to wear protective gear during testing.
Relying on an algorithm to help determine results comes with its own specific risks, researchers warn.
Depending on the kinds of data the model considers, and how it makes predictions, it has the potential to reproduce – or even exacerbate – existing patterns of inequality for low-income and minority students, they say.
About 160,000 students, including nearly 90,000 in the US, take IB courses every year. Almost 60 percent of public schools that offer IB in the US are “Title I” schools, with significant low-income student populations, according to the program.
“The choice to use a statistical model in place of a traditional examination warrants several concerns,” said Esther Rolf, a PhD candidate at the University of California, Berkeley, who studies algorithmic fairness.
“Using historical records… often leads to bias against individuals from historically underprivileged groups.”
IB spokesman Dan Rene shared with Reuters an explanation of the model, which relied on three main components: coursework, teachers’ predictions of how students would perform on the exam, and the “school context,” which included historical data on the accuracy of predicted results and on past coursework performance in each subject.
“This process was subjected to rigorous testing by educational statistics specialists,” the spokesman said in an emailed statement.
IB also released a statistical bulletin in May showing that average scores in 2020 were in line with previous years, and said it had a process to “review extraordinary cases.”
Lost college offers
In previous years, students’ grades were generated by combining final exams graded by the IB with coursework marked by their teachers, which the IB spot-checked, according to its website.
Teachers also make predictions about their students’ final grades, which students can use to secure provisional college admissions before taking their final exams.
“School’s own record was built into the model” by using “historical data to model predicted grade accuracy, as well as the record of the school to do better or worse on examinations compared with coursework,” the IB’s statement noted.
Although the IB insists its model is not an algorithm, experts say it is.
Joe Lumsden, secondary principal at Stonehill International School in Bangalore, India, worried that an entire school’s record might not be an accurate indicator for an individual student’s performance or potential.
“If there are bright students at a struggling school that’s never performed well before, the algorithm could have pulled their scores down – we just don’t know,” said Lumsden.
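The concern Lumsden describes can be made concrete with a toy calculation. The sketch below is entirely hypothetical: the IB has not published its model's formula, and the component weights here are invented for illustration. It shows how any grading scheme that blends a student's own record with a school-level historical average will pull a strong student's grade toward the school's past performance.

```python
def awarded_grade(teacher_prediction, coursework, school_historical_avg,
                  weights=(0.4, 0.3, 0.3)):
    """Hypothetical weighted blend of the three components the IB said
    it used. The 0.4/0.3/0.3 weights are invented for illustration and
    are NOT the IB's actual weights."""
    w_pred, w_course, w_school = weights
    return (w_pred * teacher_prediction
            + w_course * coursework
            + w_school * school_historical_avg)

# IB grades run from 1 to 7. Consider a top student (predicted 7,
# coursework 7) at a school whose past cohorts averaged 4 out of 7:
grade_weak_school = awarded_grade(7, 7, 4)      # roughly 6.1
# The identical student at a school that historically averaged 7:
grade_strong_school = awarded_grade(7, 7, 7)    # 7.0
```

Under this toy model, the same individual performance yields a lower grade purely because of the school's history, which is exactly the mechanism critics say can disadvantage bright students at under-resourced schools.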
Several students, as well as parents and teachers, have told Reuters that university offers contingent on certain scores have been rescinded since the final results were published.
Testing in crisis
Many testing services have been forced to change their procedures as a result of the coronavirus pandemic.
The College Board, the US non-profit that runs the Advanced Placement exams, which allow high-school students to earn credit for some US college courses, moved the process online.
The ACT, another exam used in US college admissions, has postponed its testing. Other tests – including a number of state bar exams – have also been moved online.
Iris Palmer, a senior advisor with the Education Policy program at New America, a Washington-based think tank, said she had never heard of a statistical model being used to assign grades.
“The way we use algorithms in education can be especially problematic if there is bias,” she said. “The results can determine the course of the rest of your life.”
She was particularly worried about how the algorithm may have weighed the historical performance of a school when assigning students’ grades in 2020.
“In schools with a lot of turnover, or without a lot of resources, this could really not work well,” she said.
“Students… who are black or low-income are probably at a disadvantage from the algorithm,” Palmer added.
“You have to start with the assumption that the algorithm is flawed,” said Nicol Turner Lee, director of the Center for Technology Innovation at the Brookings Institution think tank. She agreed it can be hard to build a fair model out of past educational data, given the inequality already baked into the educational system.
For Castaneda, her final IB results mean she will not receive the college credits she was expecting when she attends Colorado State University in the fall.
A security guard wearing a face shield checks a visitor’s temperature amid concerns over coronavirus, before a college entrance exam in Banda Aceh, Indonesia on July 5. Photo: AFP