Here it goes, so i don't lose it. Also attached to the issue.

Still, more work to come in MAHOUT-797, although not tremendously a lot...

## Thursday, September 8, 2011

## Thursday, September 1, 2011

### #Mahout #MapReduce #SSVD version gets power iterations, fixes

I have been working on MAHOUT-796 this week (current work is in https://github.com/dlyubimov/mahout-commits/tree/MAHOUT-796).

Nathan Halko pointed to me that power iterations in practice improve precision and reduce noise.

After a little discussion, that's the route we went :

###

Nathan Halko pointed to me that power iterations in practice improve precision and reduce noise.

After a little discussion, that's the route we went :

### Original formula

### \[\mathbf{Q}=\mbox{qr}\left[\left(\mathbf{AA^{\top}}\right)^q\mathbf{A}\boldsymbol{\Omega}\right].\mathbf{Q},\]

### \[\mathbf{B}=\mathbf{Q}^{\top}\mathbf{A}.\]

### Modified version

### \[\mathbf{Y}=\mathbf{A}\boldsymbol{\Omega},\]

### \[\mathbf{B}_{0}=\left[\mbox{qr}\left(\mathbf{Y}\right).\mathbf{Q}\right]^{\top}\mathbf{A},\]

### \[\mathbf{B}_{i}=\left[\mbox{qr}\left(\mathbf{AB}_{i-1}^{\top}\right).\mathbf{Q}\right]^{\top}\mathbf{A},\, i\in\left[1..q\right].\]

### Notation \(\mbox{qr}\left(\cdot\right).\mathbf{Q}\) means "compute QR decomposition of the argument and retain Q as a result".

### Current combination of QJob and BtJob is essentially producing \(\mathbf{B}_{0}^{\top}=\mathbf{A}^{\top}\mbox{qr}\left(\mathbf{A}\boldsymbol{\Omega}\right).\mathbf{Q}\). Intermediate QJob results are QR blocks, not a final Q, so QJob is not terribly meaningful without BtJob.

### The task boils down to figuring out alternative pipeline modifications necessary to produce \(\mathbf{B}_{i}^{\top}\). After that, algorithm proceeds as before with assumption of \[\mathbf{B}\equiv\mathbf{B}_{q}.\]

### The existing processing will be equivalent to \({q}=0\).

### \(\mathbf{B}_{i}\) pipeline (some new code)

### \(\mathbf{B}_{i}\) pipeline produces \[\mathbf{B}_{i}^{\top}=\mathbf{A}^{\top}\mbox{qr}\left(\mathbf{A}\mathbf{B}^{\top}\right).\mathbf{Q}.\]

### this is very similar to \(\mathbf{B}_{0}\) pipeline with specfics being full multiplication of \(\mathbf{A}\mathbf{B}^{\top}\) in the first job and first pass qr pushdown to the reducer of the first job:

###
- map1 : \(\mathbf{A}\mathbf{B}^{\top}\) outer products are produced
- combiner1 : presumming outer products of \(\mathbf{A}\mathbf{B}^{\top}\).
- reducer1 finalizes summing up outer products of \(\mathbf{A}\mathbf{B}^{\top}\) and starts \(\mbox{qrFirstPass}\left(\mathbf{A}\mathbf{B}^{\top}\right)\to\mbox{qr blocks}\). (partitioning must be done by A block # it seems).
- mapper2, combiner2, reducer2 proceed exactly as mapper2, combiner2, reducer2 in B pipeline and output final \(\mathbf{B}_{i}^{\top}\) with blocks corresponding to initial splits of A input.

### Thus, this pipeline is 2 MR jobs with 2 sorts (map1 + combiner1 + shuffle and sort + reducer1 + map2 + combiner2 + shuffle and sort + reducer2).

### Integration of Cholesky trick route for computing power iterations

Ted Dunning pointed out that the whole B pipeline could avoid QR step.

### When power iterations are concened, note that whatever route is chosen to calculate \(\mathbf{B}_{i}=g\left(\mathbf{Y}_{i}\right)\), mathematically it should still be valid \(\forall i\in\left[0..q\right]\) for as long as we assume

### \[\mathbf{Y}_{0}=\mathbf{A}\boldsymbol{\Omega}, i=0;\]

### \[\mathbf{Y}_{i}=\mathbf{A}\mathbf{B}_{i-1}^{\top}, i>{0}.\]

### So, if Cholesky trick allows to produce \(\mathbf{B}_{0}\) efficiently, I see no reason why it could not be applied to producing the \(\mathbf{B}_{i}\).

I will be pursuing MR option of running Cholesky decompositon of \(\mathbf{Y}^{\top}\mathbf{Y}\) in MAHOUT-797.

If that holds, then power iterations with Cholesky decomposition will likely be similar.

\(\mathbf{LL}^{\top}=\mathbf{Y}^{\top}\mathbf{Y}=\mathbf{R}^{\top}\mathbf{R}\Rightarrow\) we know how to compute \(\mathbf{R}\);

\[\mathbf{B}_{i}=\left(\mathbf{R}^{-1}\right)^{\top}\mathbf{Y_{i}}^{\top}\mathbf{A};\]

### \[\mathbf{Y}_{0}=\mathbf{A}\boldsymbol{\Omega}, i=0;\]

### \[\mathbf{Y}_{i}=\mathbf{A}\mathbf{B}_{i-1}^{\top}, i>{0}.\]

**Tests**

Tests with local MR QR-based solver seem to be encouraging. There's clear improvement on precision.

With surrogate random input with predefined singular values 10,4,1,(0.1...), toy input 2000x1000 and k=3,p=10 I seem to be getting improvement after just one additional power iteration:

q=0:

--SSVD solver singular values:

svs: 9.998472 3.993542 0.990456 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000

svs: 9.998472 3.993542 0.990456 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000

q=1: (+2 more MR sequential steps):

--SSVD solver singular values:

svs: 10.000000 4.000000 0.999999 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000

--SSVD solver singular values:

svs: 10.000000 4.000000 0.999999 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000

After some tests and cleanup, I guess it is going to become a commit really soon.

**Bug fixes**

In addition to all this, MAHOUT-796 branch gets a lot of bug fixes (in particular, better handling sparse inputs which seemed to have still been broken in current version).

Subscribe to:
Posts (Atom)