How Big Data Increases Inequality and Threatens Democracy
Ratings: 99
Average rating: 3.8
Read this for a book club. The author has clearly done her research and presents a story to zoom in on a few big ways in which inequality is perpetuated and society is controlled and manipulated by unethical and lazy uses of statistics. It's really important shit to know about and try to fix. But honestly, I already knew of most of these things, maybe not so in depth though. I really didn't need a zoomed in view of all the reasons I already know I hate capitalism.
It's interesting to read this book and think about the media trying to scare us about China's “oppressive social credit score” system.
Meanwhile we have a patchwork of far less transparent black box systems that control...
• if you get into college
• if you get offered a job
• if you get a mortgage
• if you get targeted by scam universities or scam credit systems
• if you get approved to rent a home
• if you get fired or promoted
• if you get stopped by the police
• if you get bail
• if you get a longer criminal sentence
• if you get probation
And more. Existing systemic bias is coded into these algorithms, resulting in a veneer of “science” and “objectivity” used to justify further systemic oppression.
Racist cops find more crime in poor non-white neighborhoods → algorithms designed to find “where crime might happen” take this garbage data and output garbage results → Cops further oppress these neighborhoods, locking up more poor people → An algorithm looks at the material conditions of a defendant and determines that since he's poor, his friends and family have had run-ins with the law, and he has few professional prospects, he is likely to reoffend, so he gets a more stringent sentence.
This feedback loop reinforces our racist, classist criminal justice system while claiming to use “scientific, non-biased” tools. This is just one of the many examples of “big data” run amok outlined in this book.
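Below is a minimal, made-up simulation of that loop, just to show the mechanics; the neighborhoods, rates, and patrol counts are invented and not taken from the book. Two neighborhoods have the same true crime rate, but patrols follow past recorded crime, and crime only gets recorded where patrols go.

```python
import random

# Minimal, invented simulation (not from the book) of the feedback loop above:
# two neighborhoods with identical true crime rates, but patrols follow past
# *recorded* crime, and crime is only recorded where patrols go.
random.seed(0)

TRUE_RATE = {"A": 0.10, "B": 0.10}   # same underlying rate in both places
recorded = {"A": 20, "B": 10}        # A starts with more recorded incidents
TOTAL_PATROLS = 100

for year in range(10):
    total = sum(recorded.values())
    for hood in recorded:
        patrols = int(TOTAL_PATROLS * recorded[hood] / total)
        # Each patrol records an incident with the (equal) true probability,
        # so more patrols -> more records -> more patrols next year.
        recorded[hood] += sum(1 for _ in range(patrols) if random.random() < TRUE_RATE[hood])

print(recorded)  # the gap in *recorded* crime widens even though the true rates are equal
```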
Many more examples involve leveraging big data to suck as much money out of poor people as companies can possibly get away with. Because when we have a global economic system primarily driven by profit instead of helping people, the newest revolutionary technological tools will be used not to push humanity forward, but to suck up all our personal information to serve us targeted ads, many of which are ads designed to scam us.
Great book. Highly recommended.
I love how in-depth Cathy O'Neil goes into her journey from working in academia as a mathematics professor to working at a hedge fund, and then leaving after the 2008 recession. I love how accessible the book is to a wide variety of audiences.
Informative, but at times very dense with information. I really liked the many examples that were given.
3.5 Stars. Nothing revolutionary, and a lot of the basic ideas are covered better by books like Automating Inequality, but I'm being a little generous because this was one of the first books to start the conversation on this topic.
This, I think, is one of the most important books I've read this year. For one cannot expect to grasp even the sketchiest outline of our socio-economic reality if one is not familiar with the now-prevailing currency, namely data.
Computers are good at doing things fast, really fast. So when they err, they err like the Flash, resulting in a gigantic accumulation of errors. It shouldn't be surprising that big data (a match made between statistics and computer science), with its inbuilt inaccuracies paired with the shortcomings of mathematical models that don't sufficiently mirror reality, will create tools of horrible injustice.
However, it is not always easy to notice. Technical difficulties and self-fulfilling feedback loops can deceive us quickly.
The writer herself, however, has been deep inside these systems and has seen these things up close. With her deep knowledge and a very conscientious mind, she is well equipped to discuss the matter with great depth and honesty.
Medium: audiobook narrated by the author
WMD is like a long podcast. It is filled with case studies and instances of how badly created statistical devices, built using terrible proxy indicators, affect real communities and oppress them.
This book is a very easy read (listen). Cathy doesn't dwell on the details of how these models are created, something I wanted to learn more about. This is basically a curation of important stories about how mathematical models lacking societal context hurt the very people they were made to help. I wish she had described these black box models more and discussed the math a bit. But overall, it was a very informative and interesting read.
I really enjoyed the book and I would definitely recommend it to others. I actually have been actively trying to get more people to read it.
The book demystifies big data and statistics and raises awareness about the topic. Through the chapters, Cathy shows how deeply intertwined it is with public policies and day-to-day opportunities like buying a car or an apartment, getting affordable and good education for you or your children, or even being stopped by the police on the sole basis of ethnicity.
I think this is a good first book about the subject. Most of the data is from the US and a bit from Europe, but the consequences are global, so it would be great to have more data from other countries as well.
Math! And social justice! Two of my favorite things! What's not to like?
Unfortunately, kind of a lot. Look: people who read math books for fun are math nerds. Dumbing down math concepts with cutesy terms is not needed. It will not make people who would not otherwise read math for fun read your book, and it will piss off the rest of us. Also, it's lazy. And it's bad math – O'Neil uses the term “weapon of math destruction” (over and over) very vaguely, so that she doesn't have to define exactly what she's talking about. Oh, she claims that she has a clear definition, but then she calls things like Racial Profiling a WMD (cringe). Racial Profiling isn't an algorithm; it's a cognitive heuristic, and it doesn't rely on Big Data.
More problematically, I think she uses this term to obscure that a lot of her points are actually about cognitive biases, racial inequality, and socioeconomic inequality, rather than the data science used to enforce them. She herself acknowledges that some things (e.g., racial profiling) happened to exactly the same degree long before data science was available.
Overall, I found her approach really shallow. She's a former tenured Ivy League math professor! I wanted her to write a book that only she could write – full of nuance and equations I needed a scratchpad to struggle through.
Nonetheless, I think some of her points were good: that machine-learning algorithms are dense and require supervision and critical thinking as to their results rather than blind trust. It's an important book for the math-phobic.
If you're looking for hard data or a deep exploration into mathematical algorithms, this book will disappoint. It is, however, an eye-opening, bird's-eye view of a field that is quietly taking over quite a few parts of our lives. I applaud the author for expressing such a high level of empathy for people whose plights she does not share, and for providing such a well-written overview that even the layperson can understand.
For those that are being introduced to this topic, I highly recommend this book (my only criticism is the term Weapons of Math Destruction - or WMD - itself, and how often it is overused within the book). If you are interested in learning more about the specific ways in which machine learning and mathematical algorithms are wreaking havoc in different parts of society, other books are better poised to teach on the details of those topics, such as The New Jim Crow and Automating Inequality.
The book reads like a continuation of Ted Kaczynski's manifesto ‘Industrial Society and Its Future', this time focusing on machine learning and its use in coercing behavior change and in discriminating against the poor and disadvantaged. From the examples provided in the book, there are three categories of Weapons of Math Destruction (WMDs).
The first is poor statistics: incorrectly calculated stats that are used to infer human behavior and performance. Behind them is a lack of understanding of how certain statistics should be interpreted or validated. A good example is proxy variables, such as geography used to infer purchasing power, reoffending propensity, et cetera.
The second category is correct statistics that are misused. These seem to be the majority of the cases. It is more of an ethical issue than machines taking over our lives. When a company uses zip codes to steer customers to high-interest loans, that is an ethical problem with the use of machine learning output, not necessarily anything wrong with the machine learning process itself.
The last category is datasets. From the book, certain attributes within data should never be used for prediction purposes, e.g. race, gender, income, and zip code, since they are likely to correlate with outputs connected to discrimination.
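As a rough, hedged sketch of why merely dropping a sensitive attribute is not enough, here is a toy example with synthetic data (none of it from the book): a model that only ever sees a zip code can still recover group membership when housing is segregated.

```python
import random

# Toy illustration with synthetic data (not from the book): even if a protected
# attribute is withheld from a model, a correlated proxy such as zip code can
# carry nearly the same information.
random.seed(1)

rows = []
for _ in range(10_000):
    group = random.choice(["g1", "g2"])
    # Assume heavy residential segregation: 90% of each group lives in "its" zip code.
    home, other = ("11111", "22222") if group == "g1" else ("22222", "11111")
    zipcode = home if random.random() < 0.9 else other
    rows.append((group, zipcode))

# A trivial "model" that only ever sees the zip code still recovers the group.
guess = {"11111": "g1", "22222": "g2"}
accuracy = sum(guess[z] == g for g, z in rows) / len(rows)
print(f"zip code alone recovers group membership {accuracy:.0%} of the time")
```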
In the end, machine learning is hailed as a tool that can be used for social good, with several examples provided.
Excellent review of a lot of cases where big data is failing us right now. O'Neil terms them Weapons of Math Destruction, they are the algorithms and filters and data crunching methods that help people make decisions on who to hire, who to fire, who to give a loan to and how much to charge you. They are oversimplified, non-transparent and static, and they usually end up being feedback engines that help the rich get richer and discriminate against the poor. Not that humans before them weren't terribly biased and greedy in their decision making process, but now it happens on a larger scale without us necessarily noticing, because everyone trusts algorithms, because algorithms are fair, right?
Any decisions outsourced to big data will never be completely fair, the same way humans can never be completely fair. But raising awareness and having these discussions now is super important, so we learn how to fine-tune these tools so they'll be as fair and transparent as they can be.
O'Neil's chapter on micro-targeting of citizens with political ads on Facebook is very on-point these days.
An interesting topic which deserves better treatment than a collection of Vox-style op-eds. This is not a book that wants to teach you how mathematical models can fail, it's a book that wants you to feel OUTRAGED about UNFAIRNESS.
Here's how it works. There's some area that's supposed to be improved by using a mathematical model (say, teacher evaluation in public schools). But after implementing this system there are some casualties (say, an unfairly fired teacher who was well liked and respected by both students and parents), which is bad and leads to a lengthy discussion of the perils of capitalism.
Don't get me wrong, all the things discussed in the book (which include recidivism, future job performance, and insurance) are indeed hard to model, but that's not a good way to discuss these models. One of the book's ideas is that you should forgo some of the model's accuracy to make it more fair. However, it's hard to talk about trade-offs without talking about how much we gain in accuracy and utility. Did this teacher evaluation model improve overall school performance? If it did, would it be fair to students to make them go back to their horribly unimproved previous school performance? Or was it actually not that bad, and their test results improved simply because of better lunches (or even less lead in the water)?
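For what it's worth, here is a back-of-the-envelope way to make that question concrete, using synthetic scores and invented numbers rather than anything from the book: for each decision threshold, report both accuracy and the gap in selection rates between two groups.

```python
import numpy as np

# Back-of-the-envelope sketch with synthetic data (all numbers invented) of the
# trade-off discussed above: for each decision threshold, what accuracy do we
# get, and how far apart are the selection rates of two groups?
rng = np.random.default_rng(0)
n = 20_000
group = rng.integers(0, 2, n)                    # two groups, 0 and 1
score = rng.normal(0.5 + 0.1 * group, 0.15, n)   # group 1 happens to score higher
truth = score + rng.normal(0, 0.1, n) > 0.6      # noisy "true" outcome

for threshold in (0.50, 0.55, 0.60, 0.65):
    picked = score > threshold
    accuracy = (picked == truth).mean()
    gap = abs(picked[group == 0].mean() - picked[group == 1].mean())
    print(f"threshold {threshold:.2f}: accuracy {accuracy:.2f}, selection-rate gap {gap:.2f}")

# In a setup like this, the threshold that maximizes accuracy is generally not
# the one that minimizes the gap -- which is exactly the trade-off to quantify.
```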
The chapter on credit scores grudgingly admits that human curation wasn't perfect (painting an expected picture of a banker discussing credit with his golf partners). Skip ten pages, and there's a friendly woman who helps clean up the mess made by an automated system that confused a client with a criminal namesake. Humans are winning again!
Except that they still have their own models, which are also bad (albeit in a different way). However, it is much easier to fix biases in algorithms and data when you're dealing with computers. One of the common complaints in the book is that computers can only project past data onto the future, preserving all those biases. But that's not a problem that can't be fixed. Humans are.
More generally, it may be fun to complain about the issues with a model, but it's only useful if you compare it to the alternatives. An implicit message of the book seems to be that we should ban the use of some algorithms and data (and, as expected, there's no discussion of second-order effects: if credit becomes more expensive, what will happen to the economy? Is this trade-off worth it?). However, we can't simply ban things and forget about them; we can only replace them with something else.
I don't think that a book that is strictly about the negative sides of something should necessarily strive to be objective. However, I would like to see fewer diatribes against greed and more interviews with the people who designed the models. What do they think about these problems?
(By the way, if you explain something by greed, you're already wrong.)
Some quotes are amazing, though.
fairness is squishy and hard to quantify. It is a concept. And computers, for all of their advances in language and logic, still struggle mightily with concepts. They “understand” beauty only as a word associated with the Grand Canyon, ocean sunsets, and grooming tips in Vogue magazine. They try in vain to measure “friendship” by counting likes and connections on Facebook. And the concept of fairness utterly escapes them. Programmers don't know how to code for it, and few of their bosses ask them to.
But I would argue that the chief reason has to do with profits. If an insurer has a system that can pull in an extra $1,552 a year from a driver with a clean record, why change it?
A WMD, a Weapon of Math Destruction, is an algorithm that is a black box (opaque), used at scale, and damaging to people's lives, generally those of poor minorities. Cathy O'Neil goes into a lot of detail describing several of these WMDs and how they are ruining people's lives. Hate Clopening (working at Closing and then Opening up the next morning)? It's likely an algorithm that created that schedule. Hate the fact that employers now use opaque personality tests to look for mental illness while you're applying for a job? Another WMD.
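To make the scheduling point concrete, here is a deliberately toy sketch; the names, wages, and greedy rule below are invented and are not O'Neil's description of any real scheduling system. The only objective is labor cost, so clopening shifts fall out on their own.

```python
# Toy scheduler, entirely invented (names, wages, and rule are not from the book),
# showing how clopening falls out when the only objective is labor cost and
# nothing in the objective accounts for rest between shifts.
SHIFTS = [("Mon", "close"), ("Tue", "open"), ("Tue", "close")]  # close ends 11pm, open starts 7am
WAGE = {"Ana": 11.50, "Bea": 14.00}    # hypothetical hourly wages
MAX_SHIFTS = {"Ana": 2, "Bea": 2}      # cap per worker for the week

assigned = {w: 0 for w in WAGE}
schedule = {}
for shift in SHIFTS:
    # Cost-minimizing rule: the cheapest worker with remaining capacity gets the
    # shift. It never notices that Mon-close followed by Tue-open leaves only
    # about eight hours between shifts.
    worker = min((w for w in WAGE if assigned[w] < MAX_SHIFTS[w]), key=WAGE.get)
    schedule[shift] = worker
    assigned[worker] += 1

print(schedule)
# {('Mon', 'close'): 'Ana', ('Tue', 'open'): 'Ana', ('Tue', 'close'): 'Bea'}
```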
This book is important, and I think it should be read by anyone concerned about how Big Data can be used to harm us all. As someone whose future career depends upon algorithmic learning, statistics, and mathematics, I can say this book was eye opening. I'm used to hearing about the power of algorithms and modeling, but really, a model is not the thing that it models (as every mathematician knows).
This book is a lot more accessible than Derman's Models.Behaving.Badly, even if it is in the same vein. It has a much clearer focus, and it very clearly explains the traps mathematical modeling has created. I highly recommend this book to everyone. It doesn't require an understanding of math (there are no models or equations in this book). Just an understanding of how algorithms can contain bias through the use of proxies. Read it and share it.
Cathy O'Neil is one of my heroes; I love listening to her on Slate Money, and her less pop/more technical book, Doing Data Science, occupies a prominent place on my desk at work - and a special place in my heart. Going beyond the usual data science manuals of what a random forest is, or how to account for large, sparse data, O'Neil and her coauthor - in that book - make an almost anthropological survey of the data science “field”/profession/whatever. It's enlightening, and it's very real; less a textbook, more a professional guide.
So if Doing Data Science is the “how to do your job” manual, and Slate Money is the occasional soapbox, Weapons of Math Destruction is the manifesto. With clarity, conviction and intelligence, Cathy O'Neil scalpels away at the utopian, misty-eyed tech worship that has contributed to the data science/big data bubble. She identifies the ways in which algorithms encode and perpetuate existing prejudices, and are hidden from plain sight by the obscuring (both intentional and not) use of Fancy Math.
She notes how anti-human “market efficiency” is, and how it's often been short-hand for unjust and unequal systems, and big data is just scaling those dumb systems way, way up. For example, the use of social network data to “enrich” lending platforms; or recidivism models that ask about where a former convict grew up, whether his/her friends are in jail, how many encounters with the police he/she has had. These algorithms - because they chase correlations to make predictions (rather than identifying causal factors) - create big-data-driven poverty traps. They are also beyond the purview of the government (and, given the new administration, will probably be for quite some time): just like your FICO score is ONLY about how often you pay your credit card bills, and NOT your demographic group, so too should these algorithms be scrutinized and regulated.
Increasingly, O'Neil notes, the “human touch” comes at a price; in upper income groups, you get more real, live service - hearing about a job through connections, informational interviews with alumni, blah - while lower income groups are serviced by increasingly mechanized, machine learning code. The hiring and HR practices of CVS, McDonald's, and other low wage places were especially striking (and severe): mindless, well-meaning algorithms deployed to effectively punish workers whose BMIs (itself a stupid measure of health) are too high, or who seem “anti-social” according to a creepy questionnaire, or whatever.
I think my favorite chapter - and, of course, the most relevant one - was about the toxic brew that can be politics and data science. It was especially powerful to think through the way that, as opposed to the past, where we had “agreed upon facts” given by carpet bombing-style hegemonic culture (e.g. four news channels that everyone watched), now we have hyper-tailored DARE I SAY “alternative facts” where, by the power of marketers (who prey on confirmation bias), your neighbor's political beliefs are the result of a very specific and private relationship between them, their web browser cookies, and some political marketer. This means you may have NO IDEA why your swingy neighbor in swingy state X thinks Candidate Y is running an underground child sex ring literally under the ground of a DC pizza shop. O'Neil (and this book was written Before The Madness) notes that these sorts of micro-targeted ads that follow you around the web are very smartly tailored; the marketers know who would be repulsed by such a stupid conspiracy theory, and who would be intrigued. And that's just terrifying.
Anywayyyyy. From a data perspective, my (cold dead) economist heart reminds me that OF COURSE we're going to scale up injustice when we mindlessly maximize profit instead of human well-being. We need to go back to our roots: to the philosophy and debates of Jeremy Bentham, and welfare economics, and what “utility” really means. Efficiency is not the end all, be all, despite what our current economic system seems to dictate. O'Neil also makes a good point about the repeated erroneous conflation of correlation and causation; something even the fanciest of linear algebra stochastic process meddlers fall prey to. That's something that I see a LOT in data science: the field is built on the premise of perfect prediction - and prediction doesn't need causation, it just needs correlation.
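Here is a tiny synthetic illustration of that correlation-versus-causation point (all numbers invented): a feature with no causal effect on the outcome still predicts it well because both are driven by the same hidden factor.

```python
import numpy as np

# Small synthetic illustration (numbers invented) of the correlation-vs-causation
# point: a feature with no causal effect on the outcome can still predict it well
# if both are driven by the same hidden factor.
rng = np.random.default_rng(42)
n = 50_000

hidden = rng.normal(size=n)                        # confounder, e.g. household income
proxy = hidden + rng.normal(scale=0.5, size=n)     # correlated feature, no causal role
outcome = hidden + rng.normal(scale=0.5, size=n)   # outcome driven only by the confounder

print(f"corr(proxy, outcome) = {np.corrcoef(proxy, outcome)[0, 1]:.2f}")
# ~0.80: plenty good for prediction, useless (and misleading) as a lever for changing the outcome.
```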
OK, I will step off my soapbox. Highly recommended.
edited to add: OMG and I just read her delightful 2017 resolutions post, and had much LOLs. Cathy O'Neil, you da best.
Changed my mind with respect to various number-crunching instances, especially in cases where bias is baked into the institution developing the algorithms.
This book is a powerful critique and warning about the dangers of big data: how the broad-scale use of algorithms throughout our society to inform hiring decisions, financial offerings, policing, etc. can increase inequality and ruin the lives of vulnerable individuals. As big data becomes more ubiquitous, this book provides a compelling argument for creating accountability and applying these analyses in a thoughtful way, to harness their potential for good and counter their potential to do harm.
A short book about how “WMDs” pose a great threat to society. The book actually makes some good arguments, its subject is relevant to a thesis I'm writing on the use of machine learning in public policy, and I'm actually on board with the author's critique. However, I don't think the critique goes far enough. The problem is not the encroachment of mathematics in our lives, but the existing social and economic inequalities that are amplified by the use of sophisticated mathematical models. The author also offers no alternatives; we can hardly step back from our data-intensive society. I may be overly harsh, however, as the alternatives posed by authors usually range from very useless to less useless.
EDIT: I must admit that I wrote the above before reading the concluding arguments, where the author (mostly adequately) addresses the above concerns. As such, I've revised my rating up to 4 stars. Recommended to all, including the non-technical reader.