Tuesday, February 23, 2016

Mahout "Samsara" Book Is Out

After many months of work, our book is finally out.

Here is our formal announcement:

We are happy to announce that Mahout Samsara is finally documented in print. The book title is “Apache Mahout: Beyond MapReduce.”

Like other books on computer science or on languages such as R, Python, or C++, this book aims to put the reader on the path to designing and creating his or her own algorithms. Also included are tutorials and practical usage information about newer Mahout algorithms.

The emphasis is on machine learning algorithm design in the context of the Apache Mahout “Samsara” releases (0.10, 0.11). If there were another suitable name for this book, it might be “Beyond the Black Box”: the discussions go beyond the Mahout environment and Mahout algorithms and touch on more general concepts for devising algorithms for massive inputs and computations, using Mahout Samsara as the medium in which the solutions are expressed.

Mahout “Samsara” currently targets H2O, Apache Flink and Apache Spark as backend execution options. As work on the Apache Flink backend is still in progress, all examples, while being execution engine agnostic (with one intended exception), are set up to run with the Apache Spark backend.

This work has been greatly helped by valuable reviews and ideas from other Mahout committers, contributors, and industry professionals, as indicated in the “Acknowledgments” section of the preface (Thank you!).

This book does not discuss legacy MapReduce-based algorithms.

Thank you for using Mahout!

Dmitriy Lyubimov (@dlieuOfTwit)
Andrew Palumbo (@andy_palumbo)

Technical info:

There are two editions of the book: a black-and-white paperback and a full-color Kindle textbook.

The Kindle textbook completely preserves the layout of the paperback edition. It is enrolled in the Amazon “Matchbook” program (free with the purchase of a print copy when ordered via Amazon.com).

The paperback edition has been optimized specifically for black-and-white print. The format is 7 x 10 in. (a common textbook size).

Code examples are available on GitHub.



In the US, a paperback-only copy can also be purchased here with the 25%-off code QLZ8DLPL.

Post Scriptum 

Thank you for reading!

See also: Book prerequisites, Errata updates


  1. This comment has been removed by the author.

    1. Cherry, I think this is too little information. I suggest asking on the Mahout user or dev list, and please provide more specifics: Mahout version, Spark version, etc.

      It looks like you are missing the Mahout classes. If you are running off the source tree, make sure you've compiled everything first.