Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog

k-means and other EM-like algorithms are trivial to parallelize because all the heavy computations in the inner loops are independent.

via Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog.

This is exactly what Apache Mahout does. We have parallelized versions of a bunch of clustering algorithms, including k-means.
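To make the "independent inner loops" point concrete, here is a minimal sketch of one k-means iteration split into a map step and a reduce step. It is not Mahout's API or implementation; the names `assign_chunk` and `kmeans_step` are made up for illustration, and it uses a local thread pool only to show that each chunk of points can be processed without looking at any other chunk. In CPython, threads will not speed up this pure-Python arithmetic, so a real speedup would need processes or a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_chunk(points, centroids):
    """Map step (hypothetical helper): for one chunk of points, find the
    nearest centroid of each point and accumulate per-cluster sums and counts.
    It reads only its own chunk and the shared centroids."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda k: squared_distance(p, centroids[k]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    return sums, counts


def kmeans_step(points, centroids, workers=2):
    """One k-means iteration: chunks are mapped independently, then the
    partial sums and counts are reduced into new centroids."""
    chunk_size = max(1, -(-len(points) // workers))  # ceiling division
    chunks = [points[i:i + chunk_size]
              for i in range(0, len(points), chunk_size)]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: assign_chunk(c, centroids), chunks))

    # Reduce step: add up the partial results from every chunk.
    dim = len(centroids[0])
    total_sums = [[0.0] * dim for _ in centroids]
    total_counts = [0] * len(centroids)
    for sums, counts in partials:
        for k in range(len(centroids)):
            total_counts[k] += counts[k]
            for d in range(dim):
                total_sums[k][d] += sums[k][d]

    # A cluster that received no points keeps its old centroid.
    new_centroids = []
    for k in range(len(centroids)):
        if total_counts[k] == 0:
            new_centroids.append(list(centroids[k]))
        else:
            new_centroids.append([s / total_counts[k] for s in total_sums[k]])
    return new_centroids


if __name__ == "__main__":
    pts = [[0, 0], [0, 2], [10, 10], [10, 12]]
    print(kmeans_step(pts, [[0, 1], [9, 9]]))
```

Because the map step only ever sees its own chunk plus the current centroids, the same split works whether the chunks go to threads, processes, or separate machines; only the reduce step needs to see all the partial results.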

2 Responses to “Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog”

  1. Is there a release of Mahout? I don’t see any javadoc or actual release linked from the home page:

    http://lucene.apache.org/mahout/

    It looks like I could just check out a version from the subversion archive.

    PS: Thanks for the link; whenever a more popular blog links to ours we see a huge uptick in traffic.

  2. We are in the process of voting on the 0.1 release as I type. From that, we will publish javadocs, etc.
