<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Grant's Grunts: Lucene Edition &#187; kMeans clustering</title>
	<atom:link href="http://lucene.grantingersoll.com/category/kmeans-clustering/feed/" rel="self" type="application/rss+xml" />
	<link>http://lucene.grantingersoll.com</link>
	<description>Thoughts on Apache Lucene, Mahout, Solr, Tika and Nutch</description>
	<lastBuildDate>Mon, 06 Feb 2012 12:07:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog</title>
		<link>http://lucene.grantingersoll.com/2009/03/18/speeding-up-k-means-clustering-with-algebra-and-sparse-vectors-%c2%ab-lingpipe-blog/</link>
		<comments>http://lucene.grantingersoll.com/2009/03/18/speeding-up-k-means-clustering-with-algebra-and-sparse-vectors-%c2%ab-lingpipe-blog/#comments</comments>
		<pubDate>Wed, 18 Mar 2009 14:40:38 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[clustering]]></category>
		<category><![CDATA[kMeans clustering]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=168</guid>
		<description><![CDATA[k-means and other EM-like algorithms are trivial to parallelize because all the heavy computations in the inner loops are independent. via Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog. This is exactly what Apache Mahout does.  We have parallelized versions of a bunch of clustering algorithms, including k-means.]]></description>
			<content:encoded><![CDATA[<p>k-means and other EM-like algorithms are trivial to parallelize because all the heavy computations in the inner loops are independent.</p>
<p>via <a href="http://lingpipe-blog.com/2009/03/12/speeding-up-k-means-clustering-algebra-sparse-vectors/">Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog</a>.</p>
<p>This is exactly what Apache Mahout does.  We have parallelized versions of a bunch of clustering algorithms, including k-means.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2009/03/18/speeding-up-k-means-clustering-with-algebra-and-sparse-vectors-%c2%ab-lingpipe-blog/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Mahout: k-means Clustering</title>
		<link>http://lucene.grantingersoll.com/2008/03/01/mahout-k-means-clustering/</link>
		<comments>http://lucene.grantingersoll.com/2008/03/01/mahout-k-means-clustering/#comments</comments>
		<pubDate>Sat, 01 Mar 2008 13:00:18 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[kMeans clustering]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/2008/03/01/mahout-k-means-clustering/</guid>
		<description><![CDATA[I committed a first crack at k-means clustering to Mahout last night, thanks again to Jeff Eastman&#8217;s excellent work.  This means Mahout now has two clustering algorithms designed to run on Hadoop&#8217;s MapReduce framework, so they should be able to scale up to very large data sets. To learn more about k-means, see the [...]]]></description>
			<content:encoded><![CDATA[<p>I committed a first crack at k-means clustering to <a href="http://lucene.apache.org/mahout">Mahout</a> last night, thanks again to Jeff Eastman&#8217;s excellent <a href="https://issues.apache.org/jira/browse/MAHOUT-5">work</a>.  This means Mahout now has two clustering algorithms designed to run on <a href="http://hadoop.apache.org">Hadoop</a>&#8217;s MapReduce framework, so they should be able to scale up to very large data sets.</p>
<p>To learn more about k-means, see the Mahout <a href="http://cwiki.apache.org/MAHOUT">wiki</a>, specifically our page on <a href="http://cwiki.apache.org/MAHOUT/k-means.html">k-means</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/03/01/mahout-k-means-clustering/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

