To put it in O’Neil’s words: “A WMD is a predictive algorithm predicting success, making important decisions for a lot of people – like who gets a job, a loan or a college offer. It’s secret and it is unfair.” In her book, she defined a WMD as having three components: opacity, scale and damage.
She said: “They increase inequality because essentially naive algorithms just propagate the past because they are trained on historical data. They offer opportunities to people that have gotten opportunities in the past and they deny opportunities to people who were denied opportunities in the past. They essentially make lucky people luckier and make unlucky people unluckier.”
This propagation of luckiness or unluckiness serves to increase, or at least maintain, inequality. In her view, the threat to democracy comes from the newsfeed algorithms of social media platforms such as Facebook, which serve up non-factual content and misinformation that stoke outrage. In addition, they allow political campaigns to manipulate users through microtargeting.
O’Neil is fundamentally against WMDs because she believes they fail the individual and they fail society; she goes as far as to say that they both increase inequality and threaten democracy. Furthermore, the fact that they are not trustworthy is undermining people’s trust in science.
The data scientist made the case for her argument during a keynote presentation at Teradata Universe. She also put forward suggestions that could help make those who create algorithms, and those who implement them, more accountable and more aware of the risks the algorithms pose and the harm they could do to different stakeholders, so that the algorithms can be redesigned to minimise that harm and risk.
“We should ask of algorithms: ’For whom might this fail?’”
O’Neil said the first questions we should ask of algorithms are: “For whom might this fail and what does it mean to fail or succeed?” She then introduced the audience to the paradigm of the ethical matrix. “The ethical matrix provides us with a way of surfacing the moral judgements and the values that we are embedding into our algorithm. Right now we are just doing it implicitly, through our definitions of success, through our objective function, but we should do this explicitly, especially when the stakes are very high.”
An ethical matrix is essentially a grid with stakeholders along one axis and outcomes along the other. The grid squares where a stakeholder and an outcome intersect are then filled with green, amber or red to denote a positive, neutral or negative impact.
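To make that structure concrete, here is a minimal sketch of how such a matrix could be represented in code. The stakeholders, concerns and ratings below are illustrative assumptions, not taken from O’Neil’s actual matrices.

```python
# A minimal sketch of an ethical matrix: stakeholders along one axis,
# outcomes/concerns along the other, each cell rated green/amber/red.
# Stakeholders, concerns and ratings here are purely illustrative.
ethical_matrix = {
    # stakeholder -> {concern -> rating}
    "defendants":   {"false positives": "red",   "false negatives": "amber", "transparency": "red"},
    "court system": {"false positives": "amber", "false negatives": "red",   "transparency": "amber"},
    "public":       {"false positives": "amber", "false negatives": "red",   "transparency": "amber"},
}

def flag_serious_concerns(matrix):
    """Surface every (stakeholder, concern) cell marked red."""
    return [(who, concern)
            for who, concerns in matrix.items()
            for concern, rating in concerns.items()
            if rating == "red"]

if __name__ == "__main__":
    for who, concern in flag_serious_concerns(ethical_matrix):
        print(f"Serious concern for {who}: {concern}")
```

The point of writing it down, in code or on paper, is that every red cell becomes a question the builders have to answer explicitly rather than bury in an objective function.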
A famous example of algorithmic bias is Northpointe’s Compas recidivism tool, which calculates a crime risk score: the likelihood of a defendant being rearrested within two years. Those given higher scores by the algorithm tended to receive longer sentences, as the scores were given to judges to help them decide. The tool was analysed by ProPublica, which looked at 10,000 criminal defendants in Broward County, Florida. ProPublica found that black defendants were twice as likely as white defendants to be incorrectly judged to be at high risk of recidivism, while white defendants were twice as likely as black defendants to be erroneously assigned a low risk of recidivism. So the algorithm was producing both false positives and false negatives, and at different rates for each group.
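The disparity ProPublica reported is a difference in error rates between groups. A sketch of that kind of check might look like the following, using a toy table with hypothetical columns (not ProPublica’s actual data or schema):

```python
import pandas as pd

# Toy records standing in for defendant data; the column names and values
# are assumptions for illustration only.
df = pd.DataFrame({
    "race":                ["black", "black", "black", "white", "white", "white"],
    "predicted_high_risk": [True,    True,    False,   False,   True,    False],
    "reoffended":          [False,   True,    True,    False,   True,    False],
})

def error_rates(group):
    """False positive rate and false negative rate for one group."""
    fp = (group.predicted_high_risk & ~group.reoffended).sum()
    fn = (~group.predicted_high_risk & group.reoffended).sum()
    negatives = (~group.reoffended).sum()  # people who did not reoffend
    positives = group.reoffended.sum()     # people who did reoffend
    return fp / negatives, fn / positives

for race, group in df.groupby("race"):
    fpr, fnr = error_rates(group)
    print(f"{race}: false positive rate={fpr:.2f}, false negative rate={fnr:.2f}")
```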
The algorithm assumed that the past set the perfect model to follow, and as such ignored the injustice of black people being more likely to be arrested, or to be given harsher sentences than white people for committing the same crime. It also ignored the nuances and blind spots in the crime data, which used arrests as a proxy for crime. Arrest data is a poor proxy because the number of crimes committed far outweighs the number of arrests made; arrest data is instead a reflection of policing.
ProPublica concluded that Northpointe’s algorithm was racist, and O’Neil created an ethical matrix to illustrate that as “the artefact of the conversation.” She said: “ProPublica had three stakeholders in mind: the court system itself, black defendants and white defendants. Their point was that, based on their results, black defendants had a serious concern around their false positive rates. It’s red because it means there is likely a major problem here, maybe a civil rights violation.”
Northpointe produced a rebuttal which, according to O’Neil, essentially said: “We don’t define racist as you do, we think we are being fair and we define fairness this way, through something called predictive parity and something called accuracy equity.” She produced another ethical matrix that corresponded to Northpointe’s response.
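Predictive parity asks a different question from ProPublica’s: among the people labelled high risk, does roughly the same share go on to reoffend in each group? A short sketch of that check, reusing the same toy data and assumed column names as above, could be:

```python
import pandas as pd

# Same illustrative toy data and assumed columns as in the previous sketch.
df = pd.DataFrame({
    "race":                ["black", "black", "black", "white", "white", "white"],
    "predicted_high_risk": [True,    True,    False,   False,   True,    False],
    "reoffended":          [False,   True,    True,    False,   True,    False],
})

# Predictive parity: among those labelled high risk, is the fraction who
# actually reoffended roughly equal across groups?
for race, group in df.groupby("race"):
    flagged = group[group.predicted_high_risk]
    ppv = flagged.reoffended.mean() if len(flagged) else float("nan")
    print(f"{race}: precision of the high-risk label = {ppv:.2f}")
```

The two fairness definitions measure different things, which is why each side could claim, from the same scores, that it was in the right.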
O’Neil went on to give further examples of algorithms that fail the people, or the stakeholders, they are trying to help. One, reported in Virginia Eubanks’ Automating Inequality, dealt with a child abuse hotline in a county in Pennsylvania. The algorithm was used to triage cases reported to the hotline and decide which cases warranted a social worker making a family visit. As with the previous example, the stakes were extremely high: a false negative would mean a child remaining in an abusive home, while a false positive would mean a child being taken from their home and put into foster care unnecessarily. The human social worker, or the algorithm, has to decide how many false positives are acceptable in order to have as few false negatives as possible.
“The ethical matrix sets up a monitor.”
The ethical matrix, according to O’Neil, encourages people to think about those questions before building and deploying the algorithm. She said: “The ethical matrix sets up a monitor for that question so that we can see whether we are getting the target ratio. I don’t know what that number should be either but my point is that if you are going to use an algorithm, you actually have to make that choice. Either you don’t admit it or you do admit it. And I think it makes more sense if we do admit it.”
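As a rough sketch of what such a monitor could look like in code, assuming decisions and eventual outcomes are logged: the function name and the target ratio below are invented for illustration, and choosing that target is exactly the judgement O’Neil says must be made openly.

```python
def monitor_error_tradeoff(decisions, outcomes, target_fp_per_fn=3.0):
    """Compare the observed false-positive / false-negative trade-off with an
    explicitly chosen target ratio.

    decisions: list of bools (True = intervene, e.g. send a social worker)
    outcomes:  list of bools (True = intervention was actually warranted)
    target_fp_per_fn: how many false positives we are willing to accept per
                      false negative (an assumed, illustrative value)
    """
    false_positives = sum(d and not o for d, o in zip(decisions, outcomes))
    false_negatives = sum(o and not d for d, o in zip(decisions, outcomes))
    ratio = false_positives / false_negatives if false_negatives else float("inf")
    return {
        "false_positives": false_positives,
        "false_negatives": false_negatives,
        "fp_per_fn": ratio,
        "within_target": ratio <= target_fp_per_fn,
    }

if __name__ == "__main__":
    # Toy log: five triage decisions and what later turned out to be true.
    decisions = [True, True, False, True, False]
    outcomes  = [True, False, True, False, False]
    print(monitor_error_tradeoff(decisions, outcomes))
```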
Another example she mentioned was the Value Added Model used by the New York City Department of Education to get rid of bad teachers. The controversial model assessed teachers on the increase in their students’ test scores over an academic year. Teachers who scored badly risked losing their jobs, but the code behind the algorithm was confidential. A journalist published the teachers’ scores following an FOI request, and a high school maths teacher found in the data that the same teacher could have wildly different scores for different classes of pupils. Essentially, the scores were almost random. O’Neil had guessed that the results would be noisy and random but was unable to prove it because the terms of the licensing agreement prevented her, and even the NYC DoE, from seeing the code.
O’Neil illustrated how these egregious examples of unfairness, secrecy and lack of accountability in algorithmic decision-making lead to a grave risk of harm being done to black defendants, poor children and powerless teachers.
“The matrix shows you who is in power.”
Of the ethical matrix, O’Neil said: “This might not be a perfect approach to thinking about all the different stakeholders and concerns but this is my attempt.” She has set up a company, ORCAA, which helps other companies with algorithmic auditing. “The matrix is what shows you who is in power. We have to make sure the ethical matrix is done for the people that don’t have a seat at the table.”