Mahout Update
It’s been a while since I reported anything on Mahout (here’s why), but I thought I would give an update. I know it’s been promised before, but the committers have been diligently working on a 0.1 release, which should be out very soon. I think I have all the Maven release stuff in place and am now testing and verifying the release candidate. Once that’s done, I’ll post an RC for a vote, and then we should be able to release.
Going forward, several new algorithms should be going in after 0.1: Isabel Drost has added some code for Winnow and Perceptron implementations, and I believe Karl Wettin has some work on hierarchical clustering in place. I also believe Ted Dunning and Jeff Eastman are working on Dirichlet clustering. Finally, Sean Owen is always rocking on Taste’s collaborative filtering capabilities, so there will no doubt be more goodness in that regard. As for me, I’m working on integrating clustering (Carrot2 and Mahout) into Solr and will be writing a chapter on text clustering for Taming Text.





