Professional Insight
On the Shelf

The Darker Side of Data

Weapons of Math Destruction by Cathy O’Neil (Crown, 2016. 259 pp. $26.00)

In 2011, Sarah Wysocki, a fifth-grade teacher in Washington, D.C., was fired from her job — apparently by an algorithm. Wysocki, who had received excellent reviews from her principal and her students’ parents, was given a low score that year by a model that evaluates teachers by crunching their students’ test scores. When she questioned the decision, Wysocki was unable to learn how her low score was calculated because the algorithm was proprietary.

Data scientist Cathy O’Neil loves numbers, but as she points out, “Analyzing the test scores of only 25 or 30 students is statistically unsound, even laughable.” As a result, she calls these types of big data algorithms run amok “Weapons of Math Destruction,” or WMDs for short. In her first book, O’Neil invites the reader on a tour of the damage WMDs can inflict, writing, “Welcome to the dark side of data.”
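
To see why a single classroom is such a small sample, consider a rough simulation. It is only a sketch under stated assumptions, not the proprietary model that scored Wysocki: the average score of 70, the student-to-student spread of 15 points, and the function names are all hypothetical and do not come from the book. Every simulated class is taught equally well, so any difference between class averages is pure chance.

```python
import math
import random
import statistics

# All numbers below are illustrative assumptions, not figures from the book.
TRUE_MEAN = 70.0    # hypothetical average test score
STUDENT_SD = 15.0   # hypothetical student-to-student spread


def simulate_class_averages(class_size, n_classes=500, seed=0):
    """Average score of n_classes simulated classes taught equally well."""
    rng = random.Random(seed)
    return [
        statistics.fmean(
            rng.gauss(TRUE_MEAN, STUDENT_SD) for _ in range(class_size)
        )
        for _ in range(n_classes)
    ]


def spread_of_averages(class_size):
    """Standard deviation of the simulated class averages."""
    return statistics.stdev(simulate_class_averages(class_size))


def expected_spread(class_size):
    """Textbook standard error of a mean: sd / sqrt(n)."""
    return STUDENT_SD / math.sqrt(class_size)


if __name__ == "__main__":
    for size in (25, 900):
        print(size, round(spread_of_averages(size), 2), expected_spread(size))
```

Under these assumptions, the textbook formula puts the chance variation in a 25-student class average at about 3 points, compared with about half a point for a group of 900. A teacher's score built on one classroom can therefore move noticeably from year to year with no change in the teaching.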

As a tour guide, O’Neil is a uniquely qualified insider. With a Ph.D. in math, she got a front-row seat to the workings of big data when she joined the Wall Street hedge fund D.E. Shaw as a quantitative analyst in 2007. In 1996 Fortune magazine called Shaw “the most intriguing and mysterious force on Wall Street today … [with] market-beating algorithms so secret, even limited partners … aren’t entirely sure what’s going on behind the curtain.”

Watching the 2008 financial crisis unfold left O’Neil both disillusioned and determined to pull back the curtain. “The crash made it clear that mathematics, once my refuge, was not only deeply entangled in the world’s problems but also fueling many of them,” she explains in the book. “The housing crisis, the collapse of major financial institutions … all had been aided and abetted by mathematicians wielding magic formulas.” After leaving the hedge fund in 2009, O’Neil worked as a data scientist for an e-commerce start-up. She began blogging about the problems she saw with big data under the moniker MathBabe, then quit her job in 2012 to write the book she calls “a wake-up call.”

O’Neil identifies three major problems with the magic formulas that turn into WMDs: opacity, scale and damage. As the fired teacher discovered, algorithms are often protected as intellectual property — and an algorithm you can’t analyze is one you can’t argue with.

Even algorithms that aren’t secret can become a problem when their use spreads exponentially. Credit scores, for example, are now commonly used to judge job applicants. FICO scores are widely considered a fair and objective measure of whether someone will pay their bills on time. They’re built from relevant data: a person’s payment history. But they were never designed to predict whether someone is conscientious or hard working.

The result? Because medical bills are the most common cause of bankruptcy in the U.S., O’Neil argues, a bad credit score can easily stem from a past medical emergency, not a lack of responsibility. The damage inflicted by this misreading of data then escalates by creating a destructive feedback loop. The person trying to recover from bankruptcy now has trouble finding a job … which leads to more financial trouble, and a lower credit score.

Paved with Good Intentions

Ironically, O’Neil believes that many WMDs are born of good intentions and the hope that relying on data instead of human judgment can eliminate bias. “I don’t want to ascribe blame to people,” says O’Neil. “They’re using these risk scores because they want to do things fairly. But the problem is … that they’re not actually fair.” According to O’Neil, one flaw common to many WMDs is confusing correlation with causation. Recidivism scores are a case in point.

In many states, prisoners awaiting sentencing fill out a questionnaire, which is used to model their risk of becoming repeat offenders or violating parole or probation. Their answers can affect the length of their sentences. But among the questions are some that relate to factors beyond a person’s control, like whether any relatives have criminal convictions. “We might have something that’s very predictively accurate, like recidivism scores,” says O’Neil, “but that doesn’t mean that individuals should be held accountable for their risk. Because if my risk is high because I’m a black man living in a poor neighborhood, not because of criminal acts in my past, that’s not fair … that is a confusion between correlation and causation.”

While data analysts understand math and statistics, not all decision-makers do. O’Neil thinks that adds an extra layer to the WMD problem. “They’re both afraid of math, and they trust math,” she says. “There’s an element of math that prevents people from scrutinizing and interrogating the actual decision-making process.” As a result, she says, “the verdicts from WMDs land like dictates from the algorithmic gods.”

Insurance: The Big Explosion

O’Neil sums up the collision of big data and health insurance this way: “Insurance meets big data and there’s a big explosion.” The problem, she says, is that “The big data movement is essentially incompatible with insurance. Insurance is pooled risk … in order for insurance to work, you kind of need to be ignorant in certain ways. In particular, you don’t know exactly who’s going to need the money.”

With car insurance, O’Neil sees a different kind of damaging algorithm at work. The book cites a Consumer Reports study on car insurance prices that found adults in Florida with clean driving records, but poor credit scores, paid more — an average of $1,552 more — than drivers with drunk-driving convictions who happened to have excellent credit.

The result, according to O’Neil, is a pricing model that’s fundamentally unfair and punishes the poor. “The other thing that’s really frightening about [car] insurance,” O’Neil says, “is that people are being charged more for risk in advance of actually doing something risky, in advance of getting in trouble or having a car accident, because of big data.”

Calling All Actuaries

What can actuaries do about the intersection of big data and insurance? “I would call for them to make their own suggestions on how to deal with this,” says O’Neil. “Because I would like to know what experts think. It’s really hard. The example I gave in the book about understanding people’s future health risk … and how that could play out for good or for evil … if your doctor had [data] to help you stay healthy, [versus] if an employer had it to prevent you from getting a job, because you pose expenses on a health plan. I actually don’t know how to deal with that in a fair way.”

O’Neil thinks that professional organizations can use their expertise to defuse damaging models. She points to a statement released by the American Statistical Association in 2014 on the shortcomings of using value-added models (VAMs) to evaluate teachers like Sarah Wysocki. One sentence in the seven-page statement reads, “VAMs typically measure correlation, not causation: Effects — positive or negative — attributed to a teacher may actually be caused by other factors that are not captured in the model.”

These statements can be effective, O’Neil says, because “the people building the models are relatively politically powerless … and they’re working for their bosses. If an individual can point to a statement made by their society, and say, ‘I’m just following the consensus in my industry,’ then that gives the individual more power.”

The Future of Big Data

Now that her book is finished, O’Neil is starting a new company called O’Neil Risk Consulting and Algorithmic Auditing. In its first phase, the company will consult with organizations about the risks they take on by using algorithms internally. The next phase, says O’Neil, “will be to build a tool that I could use more generally — and I want it to be open source, and I want people to understand what this tool is doing and how it tries to measure fairness and discriminatory trends. And ultimately I would want this tool to be used by regulators, not myself.”

Regulation, O’Neil believes, could hold the key to disarming WMDs. One example she cites as effective is the provision in the Affordable Care Act that prohibits insurers from charging people with preexisting conditions higher rates. “What happens here when there’s a law,” she says, “is that people who aren’t yet sick pay a little bit more, and people who are sick don’t have to pay as much as they would otherwise. So there’s a little bit of a leveling of the playing field.”

Another possibility for regulation: The U.S. could move closer to the European model in which it’s illegal to sell user data. O’Neil would also like to see more transparency. E-scores, which are measures of creditworthiness based on factors like the geographic location of a user’s computer, web browsing and purchase history, could be accessible to all consumers on an app.

O’Neil says she wrote the book “to warn the public,” but now that it’s finished, she says, “I have gotten a little more optimistic over the last four years … because more and more people are starting to step up and realize what’s going on. I certainly have found a lot of people who are interested in talking about this,” including sociologists, data journalists and researchers at Princeton who recently launched the Web Transparency and Accountability Project, which tracks bias in search engines and job placement sites.

Data scientists concerned about big data are a little like Dorothy at the end of The Wizard of Oz, when she pulls back the curtain to reveal an ordinary person, scolding him, “You’re a very bad man!” “Oh no,” he replies. “I’m a very good man. I’m just a very bad wizard.”

O’Neil’s book makes it clear: Big data has the potential to be helpful or harmful — the good man or the bad wizard. What people decide to do with the data will make all the difference.


Laurie McClellan is a freelance writer and photographer living in Arlington, Virginia. She is on the faculty of Johns Hopkins University, where she teaches in the M.A. in Science Writing program.