Data Algorithms

Recipes for Scaling Up with Hadoop and Spark

2014 • 778 pages

Learn the algorithms and tools you need to build MapReduce applications with Hadoop and Spark for processing gigabyte-, terabyte-, or petabyte-sized datasets on clusters of commodity hardware. With this practical book, author Mahmoud Parsian, head of the big data team at Illumina, takes you step by step through the design of machine-learning algorithms, such as Naive Bayes and Markov Chain, and shows you how to apply them to clinical and biological datasets, using MapReduce design patterns.

Apply MapReduce algorithms to clinical and biological data, such as DNA-Seq and RNA-Seq.

Use the most relevant regression/analytical algorithms used for different biological data types.

Apply t-test, joins, top-10, and correlation algorithms using MapReduce/Hadoop and Spark.

Genre

Computers
