Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy

2016 • 274 pages

Ratings: 99

Average rating: 3.8

Cathy O'Neil is one of my heroes; I love listening to her on Slate Money, and her less pop/more technical book, Doing Data Science, occupies a prominent place on my desk at work - and a special place in my heart. Going beyond the usual data science manuals of what a random forest is, or how to account for large, sparse data, O'Neil and her coauthor - in that book - make an almost anthropological survey of the data science “field”/profession/whatever. It's enlightening, and it's very real; less a textbook, more a professional guide.

So if Doing Data Science is the “how to do your job” manual, and Slate Money is the occasional soapbox, Weapons of Math Destruction is the manifesto. With clarity, conviction and intelligence, Cathy O'Neil scalpels away at the utopian, misty-eyed tech worship that has contributed to the data science/big data bubble. She identifies the ways in which algorithms encode and perpetuate existing prejudices, and are hidden from plain sight by the obscuring (both intentional and not) use of Fancy Math.

She notes how anti-human “market efficiency” can be, how it has often been shorthand for unjust and unequal systems, and how big data is just scaling those dumb systems way, way up. Examples include the use of social network data to “enrich” lending platforms, and recidivism models that ask where a former convict grew up, whether his/her friends are in jail, and how many encounters with the police he/she has had. These algorithms - because they chase correlations to make predictions (rather than identifying causal factors) - create big-data-driven poverty traps. They are also currently beyond the purview of the government (and, given the new administration, will probably stay that way for quite some time). Just as your FICO score is ONLY about how often you pay your credit card bills, and NOT your demographic group, these algorithms should be scrutinized and regulated so that they judge people on their own behavior.

Increasingly, O'Neil notes, the “human touch” comes at a price; in upper income groups, you get more real, live service - hearing about a job through connections, informational interviews with alumni, blah - while lower income groups are serviced by increasingly mechanized, machine learning code. The hiring and HR practices of CVS, McDonald's, and other low-wage places were especially striking (and severe): mindless, well-meaning algorithms deployed to effectively punish workers whose BMI (itself a stupid measure of health) is too high, or who seem “anti-social” according to a creepy questionnaire, or whatever.

I think my favorite chapter - and, of course, the most relevant one - was about the toxic brew that can be politics and data science. It was especially powerful to think through the way that, as opposed to the past, where we had “agreed upon facts” given by carpet bombing-style hegemonic culture (e.g. four news channels that everyone watched), now we have hyper-tailored DARE I SAY “alternative facts” where, by the power of marketers (who prey on confirmation bias), your neighbor's political beliefs are the result of a very specific and private relationship between them, their web browser cookies, and some political marketer. This means you may have NO IDEA why your swingy neighbor in swingy state X thinks Candidate Y is running an underground child sex ring literally under the ground of a DC pizza shop. O'Neil (and this book was written Before The Madness) notes that these sorts of micro-targeted ads that follow you around the web are very smartly tailored; the marketers know who would be repulsed by such a stupid conspiracy theory, and who would be intrigued. And that's just terrifying.

Anywayyyyy. From a data perspective, my (cold dead) economist heart reminds me that OF COURSE we're going to scale up injustice when we mindlessly maximize profit instead of human well-being. We need to go back to our roots: to the philosophy and debates of Jeremy Bentham, and welfare economics, and what “utility” really means. Efficiency is not the be-all and end-all, despite what our current economic system seems to dictate. O'Neil also makes a good point about the repeated, erroneous conflation of correlation and causation; something even the fanciest of linear algebra stochastic process meddlers fall prey to. That's something that I see a LOT in data science: the field is built on the premise of perfect prediction - and prediction doesn't need causation, it just needs correlation.
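
To make the "prediction just needs correlation" point concrete, here is a toy simulation. It is an illustration only, not something from the book: the variable names (`hidden`, `proxy`, `outcome`) and every number in it are made-up assumptions. A hidden factor drives both a measurable proxy and the outcome, so the proxy predicts the outcome well (the correlation should come out near 0.8 under these assumptions), yet forcing the proxy to a different value changes nothing.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, standard library only."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def simulate(n=5000, seed=0, forced_proxy=None):
    """Simulate n hypothetical people.

    `hidden` is an unobserved cause (an assumption of this toy model).
    It drives both the proxy a model would see and the outcome.
    If `forced_proxy` is given, the proxy is set by hand (an intervention)
    instead of following the hidden cause.
    """
    rng = random.Random(seed)
    proxies, outcomes = [], []
    for _ in range(n):
        hidden = rng.gauss(0, 1)
        # Always draw both noise terms so every run sees the same random stream.
        proxy_noise = rng.gauss(0, 0.5)
        outcome_noise = rng.gauss(0, 0.5)
        proxy = hidden + proxy_noise if forced_proxy is None else forced_proxy
        proxies.append(proxy)
        # The outcome depends on the hidden cause only, never on the proxy.
        outcomes.append(hidden + outcome_noise)
    return proxies, outcomes


# Observational data: the proxy looks like a great predictor.
proxies, outcomes = simulate()
observed_corr = pearson(proxies, outcomes)

# Intervention: force the proxy low or high for the same simulated people.
_, outcomes_low = simulate(forced_proxy=-2.0)
_, outcomes_high = simulate(forced_proxy=2.0)
intervention_effect = statistics.fmean(outcomes_high) - statistics.fmean(outcomes_low)

print(f"observed correlation: {observed_corr:.2f}")
print(f"effect of forcing the proxy: {intervention_effect:.2f}")
```

A model trained on the observational data would happily score people on the proxy, even though changing the proxy does nothing to the outcome. That is roughly the trap the review describes.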

OK, I will step off my soapbox. Highly recommended.

edited to add: OMG and I just read her delightful 2017 resolutions post, and had much LOLs. Cathy O'Neil, you da best.

January 26, 2017