Mahout 0.10.x is coming. It is a new generation of Mahout that entirely rethinks philosophy of previous Mahout line of releases.
First, and foremost, to state the obvious, Mahout abandons MapReduce based algorithms. We still support them (and they are now moved to mahoutmr artifact). But we do not create any new ones.
Second, we shift the focus from being a collection of things to being a programming environment, as it stands, for Scala.
Third, we support different distributed engines, aka "backs". As it stands, we run on Spark and H20, but hope to add Apache Flink translation as well, provided we get help there.
Many people are coming with obvious question: so how it is different from stuff like MLLib? Here, I'll try to emphasize key philosophical points of this work.
1. We want to help people to create their own math, as opposed to just taking and plugging in an offtheshelf blackbox solution.
Indeed, having a fairly welladjusted and complicated black box solutions is a good thing. People spent quite a bit of time researching and optimizing them.
Problem is, very often offtheshelf is just not enough. We want to customize and build our own models, our own features, our own specific rules and ensembles. And we want to do it quickly and try it now.
Let's look at some intentionally simple examples. Suppose we want to compute a columnwise variances on a matrix. For that we are going to use simple formula
\[
\mathrm{var}\left(x\right)=\mathbb{E}\left(x^{2}\right)\left[\mathbb{E}\left(x\right)\right]^{2}
\]
applied columnwise on our big, distributed matrix $\mathbf{X}$. Here is how it is going to look in new Mahout enviornment:
val mu = X colMeans
val variance = (X * X colMeans) = mu * mu
That's it. All in fully distributed fashion. On Spark or H20.
Let's take on a little bit more complicated case. What if we want to compute ndimensional columnwise covariance matrix of the dataset X?
Assuming that
\[
\boldsymbol{x}_{i}\triangleq\mathbf{X}_{i*},
\]
i.e. every row is a point in the dataset, we can use multivariate generalization of the previous formula:
\[\mathrm{cov}\left(\mathbf{X}\right)=\mathbb{E}\left(\boldsymbol{xx}^{\top}\right)\boldsymbol{\mu\mu}^{\top},\]
where
\[\boldsymbol{\mu}=\mathbb{E}\left(\boldsymbol{x}\right).\]
Here is the Mahout code for that formula:
val mu = X.colMeans()
val mxCov = (X.t %*% X).collect /= X.nrow = (mu cross mu)
This code is actually a so called "thin" procedure, i.e. the one that assumes that while the $m\times n$ input matrix $\mathbf{X}$ is too big to be an incore matrix, the $n\times n$ covariance matrix will fit into memory in one chunk. In other words, it assumes $n\ll m$. The code for "wide" distributed covariance computation needs couple more lines but is just as readable.
So, what is the difference between Mahout 0.10.x and MLLib? Well, what is the difference between R package and R? What is the difference between libsvm and Julia? What is difference between Weka and Python? And how to think about it  R for Spark? Hive for math? Choose your poison.
Bottom line, we want to help you create your own math at scale. (And hopefully, come back to us and share it with us).
2. We want to simplify dialect learning. We actually build the environment in image of R. Initially, we also had DSL dialect (enabled via separate import) for Matlab, but unfortunately Scala operator limits do not make it possible to implement entire Matlab operator set verbatim, so this work received much less attention, and instead, we focused on R side of things.
What it means is it should be easily readable by R programmers. E.g. A %*% B is matrix multiplication,
A * B is elementwise Hadamard product, methods like colMeans, colSums are following R naming for those.
Among other things, arguably,
math written in Rlike fashion is easier to understand and maintain than the same things written in other basic procedural or functional environments.
3. Environment is backendagnostic. Indeed, Mahout is not positioning itself as Sparkspecific. You can think of it that way if you use Spark, but if you use H20, you could think of it as H2ospecifc (or, hopefully, "Apache Flinkspecifc" in the future) just as easily.
All the examples above are not containing a single Spark (or h20) imported dependency. They are written once but run on any of supported backs.
Not all things can be written with the subset of backindependent techniques of course, more on that below. But quite a few can  and the majority can leverage at least some as the backbone. E.g. imagine that dataset $\mathbf{X}$ above is result of embarrassingly parallel statistical Monte Carlo technique (which is also backendindependent). And just like that we perhaps get a backendagnostic Gibbs sampler.
4. "Vs. mllib" is a false dilemma. Mahout is not taking away any capabilities of the backend. Instead, one can think of it as an "addon" over e.g. Spark and all its technologies. (same for h2o).
Indeed, it is quite common to expect that just Algebra and stats are not enough to make the ends meet. In Spark's case, one can embed algebraic pipelines by importing Sparkspecific capabilities. Import MLLib and all the goodies are here. Import GraphX and all the goodies are here as well. Import DataFrame(or SchemaRDD) and use languageintegrated QL. And so on.
I personally use one mllib method as a part of my program along with Mahout algebra. On the same Spark context session as the Mahout Algebra session. No problem here at all.
5. Mahout environment is more than just a Scala DSL. Behind the scenes, there is an algebraic optimizer that creates logical and physical execution plans, just like in a database.

Algebraic Optimizer Stack 
For example, we can understand that a thing like
dlog(X * X + 1).t %*% dlog(X * X + 1)
is in fact (in simplified physical operator pseudo code)
self_square(X.map(x => log(x * x + 1)).
6. Offtheshelf methods. Of course we want to provide offtheshelf experience too. Logical thing here is to go with stuff that complements, but not repeats other offtheshelf things available. Refer to the Mahout site what's in and what's in progress.
But even use of offtheshelf methods should be easily composable via programming environment and scripting.
Future plans. Work is by no means complete. In fact, it probably just started.
In more or less near future we plan to extend statistical side of things of the environment quite a bit.
We plan to work on performance issues, especially for incore performance of the algebraic API.
We want to port most of MR stuff to run in the new environment, preferably in a completely backendagnostic way (and where it is not plausible, we would isolate and add strategies for individual backs for the pieces that are not agnostic).
We plan to add some musthave offtheshelf techniques such as hyperparameter search.
And of course, there are always bugs and bug fixes.
Longer term, we think it'd be awesome to add some visualization techniques similar to ggplot2 in R, but we don't have capacity to develop anything similar from the ground up, and we are not yet sure what might be integrated in easily and which also works well with JVM based code. Any ideas are appreciated.