Thus endeth my lunchtime reading book. I intermittently read this, over the course of many months, usually over a sandwich at lunch. For this style of reading, it holds up well: the chapters are discrete packets of data science chat. That said, I agree with other critiques of this book: if you're an aspiring data scientist, this book is NOT sufficient to get you off the ground. It's not a good beginner's book. It's maybe a good “pop data science” book, a pre-beginner's book. It's very light on the technical stuff, and, if anything, it's more like an anthropological survey of the state of the field.
Each chapter covers a technique or common challenge or strategy, describes the general gist of what's going on, and then points you in the direction of papers, other books, or tutorials online. Early chapters have some “exercises”, though they're more like general pointers of “oh, you could try this, I guess?” Later chapters don't even bother.
For an O'Reilly book, I was disappointed that the GitHub repo didn't have, for example, the code examples mentioned in the book, or the exercises and toy datasets. (What? Are we supposed to manually copy down several pages of R code?!) Or even just a README.md with a bibliography (given how many shortened Google links are used as citations)? This makes it a starkly UNFRIENDLY book, which is weird since O'Reilly books (well, the good ones) can be very, very rich resources. This, instead, felt thin - and the repo is basically pointless.
I will say that I enjoyed the banter-y tone of the book, and some of the discussions of techniques (e.g. there was a great, intuitive explanation of Principal Component Analysis) and “real world” issues (e.g. how Kaggle competitions are basically data science in a vacuum; what it's like to be a lady data scientist) were quite good. But, overall, yeah, this isn't really a “good enough” data science book.