<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Grant's Grunts: Lucene Edition &#187; Map Reduce</title>
	<atom:link href="http://lucene.grantingersoll.com/category/map-reduce/feed/" rel="self" type="application/rss+xml" />
	<link>http://lucene.grantingersoll.com</link>
	<description>Thoughts on Apache Lucene, Mahout, Solr, Tika and Nutch</description>
	<lastBuildDate>Mon, 06 Feb 2012 12:07:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SXSW 2012 &#8211; Apache Mahout: Bringing Intelligence to Your App</title>
		<link>http://lucene.grantingersoll.com/2011/08/15/sxsw-2012-apache-mahout-bringing-intelligence-to-your-app/</link>
		<comments>http://lucene.grantingersoll.com/2011/08/15/sxsw-2012-apache-mahout-bringing-intelligence-to-your-app/#comments</comments>
		<pubDate>Mon, 15 Aug 2011 19:43:38 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=441</guid>
		<description><![CDATA[It&#8217;s that time of year again: time to vote for SXSW talks.  Last year I did a talk with RC Johnson of BazaarVoice on Solr as NoSQL; this year I thought I would try to fly solo and submitted a talk on Apache Mahout. So, if you are so inclined to do the whole crowdsourcing [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright" title="SXSW Panel Picker" src="http://panelpicker.sxsw.com/img/sxsw/my_SXSW_idea_2012.png" alt="" width="200" height="120" />It&#8217;s that time of year again: time to vote for <a href="http://www.sxsw.com">SXSW</a> talks.  Last year I did a talk with RC Johnson of <a href="http://www.bazaarvoice.com">BazaarVoice</a> on Solr as NoSQL; this year I thought I would try to fly solo and submitted a talk on <a href="http://mahout.apache.org">Apache Mahout</a>.</p>
<p>So, if you are so inclined to do the whole crowdsourcing thing, please go vote for my talk at <a href="http://panelpicker.sxsw.com/ideas/view/9001">SXSW 2012 &#8211; Apache Mahout: Bringing Intelligence to Your App</a> and then maybe I will see you at SXSW in 2012.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2011/08/15/sxsw-2012-apache-mahout-bringing-intelligence-to-your-app/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Surprise and Coincidence &#8211; musings from the long tail: Real-time decision making using map-reduce</title>
		<link>http://lucene.grantingersoll.com/2009/01/15/surprise-and-coincidence-musings-from-the-long-tail-real-time-decision-making-using-map-reduce/</link>
		<comments>http://lucene.grantingersoll.com/2009/01/15/surprise-and-coincidence-musings-from-the-long-tail-real-time-decision-making-using-map-reduce/#comments</comments>
		<pubDate>Thu, 15 Jan 2009 12:01:01 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=139</guid>
		<description><![CDATA[Ted Dunning has a nice blurb on &#8220;scale free&#8221; development and Mahout/Hadoop/Map Reduce that is worth the quick read: Surprise and Coincidence &#8211; musings from the long tail: Real-time decision making using map-reduce]]></description>
			<content:encoded><![CDATA[<p>Ted Dunning has a nice blurb on &#8220;scale free&#8221; development and Mahout/Hadoop/Map Reduce that is worth the quick read:</p>
<p><a href="http://tdunning.blogspot.com/2009/01/real-time-decision-making-using-map.html">Surprise and Coincidence &#8211; musings from the long tail: Real-time decision making using map-reduce</a></p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2009/01/15/surprise-and-coincidence-musings-from-the-long-tail-real-time-decision-making-using-map-reduce/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>BarCamp wiki / BarCampRDU</title>
		<link>http://lucene.grantingersoll.com/2008/08/01/barcamp-wiki-barcamprdu/</link>
		<comments>http://lucene.grantingersoll.com/2008/08/01/barcamp-wiki-barcamprdu/#comments</comments>
		<pubDate>Fri, 01 Aug 2008 16:22:54 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[BarCampRDU]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>
		<category><![CDATA[Nutch]]></category>
		<category><![CDATA[Raleigh]]></category>
		<category><![CDATA[Triangle]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=91</guid>
		<description><![CDATA[BarCamp wiki / BarCampRDU I&#8217;ll be at BarCampRDU tomorrow.  I proposed two sessions, one on Hadoop and Mahout and one on Lucene and Solr.  I don&#8217;t think I really want to do both, but I would like to do at least one, so we&#8217;ll see what other people are interested in. If you&#8217;re around and [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://barcamp.org/BarCampRDU">BarCamp wiki / BarCampRDU</a></p>
<p>I&#8217;ll be at BarCampRDU tomorrow.  I proposed two sessions, one on Hadoop and Mahout and one on Lucene and Solr.  I don&#8217;t think I really want to do both, but I would like to do at least one, so we&#8217;ll see what other people are interested in.</p>
<p>If you&#8217;re around and you want to talk about any of these things, track me down.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/08/01/barcamp-wiki-barcamprdu/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>HP, Intel and Yahoo To Research Cloud Computing &#8211; Yahoo News</title>
		<link>http://lucene.grantingersoll.com/2008/07/30/hp-intel-and-yahoo-to-research-cloud-computing-yahoo-news/</link>
		<comments>http://lucene.grantingersoll.com/2008/07/30/hp-intel-and-yahoo-to-research-cloud-computing-yahoo-news/#comments</comments>
		<pubDate>Wed, 30 Jul 2008 20:58:10 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=89</guid>
		<description><![CDATA[HP, Intel and Yahoo To Research Cloud Computing &#8211; Yahoo News Boy, this could really come in handy in Open Source, especially projects like Mahout, Nutch and distributed Solr.  I find my biggest personal challenge on Mahout is access to computing resources.  I personally don&#8217;t have the financial backing to buy much time on Amazon [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://news.yahoo.com/s/nf/20080729/bs_nf/61031">HP, Intel and Yahoo To Research Cloud Computing &#8211; Yahoo News</a></p>
<p>Boy, this could really come in handy in Open Source, especially projects like Mahout, Nutch and distributed Solr.  I find my biggest personal challenge on Mahout is access to computing resources.  I personally don&#8217;t have the financial backing to buy much time on Amazon EC2.  I have been scraping by, here and there, but find myself constantly wanting access to more capabilities.</p>
<p>Sigh.  Maybe I should put more ads on this site and use the funds for buying EC2 time.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/07/30/hp-intel-and-yahoo-to-research-cloud-computing-yahoo-news/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Apache Hadoop Wins Terabyte Sort Benchmark (Hadoop and Distributed Computing at Yahoo!)</title>
		<link>http://lucene.grantingersoll.com/2008/07/03/apache-hadoop-wins-terabyte-sort-benchmark-hadoop-and-distributed-computing-at-yahoo/</link>
		<comments>http://lucene.grantingersoll.com/2008/07/03/apache-hadoop-wins-terabyte-sort-benchmark-hadoop-and-distributed-computing-at-yahoo/#comments</comments>
		<pubDate>Thu, 03 Jul 2008 12:57:55 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Map Reduce]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/2008/07/03/apache-hadoop-wins-terabyte-sort-benchmark-hadoop-and-distributed-computing-at-yahoo/</guid>
		<description><![CDATA[Apache Hadoop Wins Terabyte Sort Benchmark (Hadoop and Distributed Computing at Yahoo!) Congrats to the Hadoop team!  Score one for Open Source!]]></description>
			<content:encoded><![CDATA[<p><a href="http://developer.yahoo.com/blogs/hadoop/2008/07/apache_hadoop_wins_terabyte_sort_benchmark.html">Apache Hadoop Wins Terabyte Sort Benchmark (Hadoop and Distributed Computing at Yahoo!)</a></p>
<p>Congrats to the Hadoop team!  Score one for Open Source!</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/07/03/apache-hadoop-wins-terabyte-sort-benchmark-hadoop-and-distributed-computing-at-yahoo/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Taste is now committed</title>
		<link>http://lucene.grantingersoll.com/2008/05/15/taste-is-now-committed/</link>
		<comments>http://lucene.grantingersoll.com/2008/05/15/taste-is-now-committed/#comments</comments>
		<pubDate>Thu, 15 May 2008 11:18:51 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>
		<category><![CDATA[Taste]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=80</guid>
		<description><![CDATA[I haven&#8217;t tried it yet (pesky day job :-) ) but I see that Taste is now committed to Mahout.  In fact, I think Sean has already started on some parallelization efforts!  Very cool.]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t tried it yet (pesky day job <img src='http://lucene.grantingersoll.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />   ) but I see that Taste is now committed to <a href="http://lucene.apache.org/mahout">Mahout</a>.  In fact, I think Sean has already started on some parallelization efforts!  Very cool.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/05/15/taste-is-now-committed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mahout News</title>
		<link>http://lucene.grantingersoll.com/2008/05/06/mahout-news/</link>
		<comments>http://lucene.grantingersoll.com/2008/05/06/mahout-news/#comments</comments>
		<pubDate>Tue, 06 May 2008 11:02:03 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=79</guid>
		<description><![CDATA[Wow!  Mahout has just got me pumped up.  I feel like we&#8217;ve got a lot of positive momentum and that we are starting to get the various pieces of our suite of machine learning libraries in place.  Various news items include: Ted Dunning is now a committer!  Welcome Ted! I put up a patch for [...]]]></description>
			<content:encoded><![CDATA[<p>Wow!  <a href="http://lucene.apache.org/mahout">Mahout</a> has just got me pumped up.  I feel like we&#8217;ve got a lot of positive momentum and that we are starting to get the various pieces of our suite of machine learning libraries in place.  Various news items include:</p>
<ol>
<li>Ted Dunning is now a committer!  Welcome Ted!</li>
<li>I put up a patch for a map-reduce ready (MRR?) version of a Naive Bayes classifier.  Still needs some work, but feedback is appreciated.  It includes a sample of running it against the 20 Newsgroups data.</li>
<li>Seems like there is someone new providing comments, etc. every day.  The mailing list continues to grow.</li>
<li>We have slotted 3 Google Summer of Code participants and had many excellent proposals.  It was very difficult to decide.</li>
<li>The <a href="http://taste.sf.net">Taste</a> project is now officially accepted into Mahout and will be running in distributed mode in no time, I&#8217;m sure.</li>
<li>Karl is working on a hierarchical clustering implementation, amongst other things.</li>
</ol>
<p>We are also trying to obtain compute resources for committers to use so that we can do testing/benchmarking on a cluster as opposed to some hodge-podge of resources any individual has access to at any given point in time.  If you are in a position to donate cloud time, please drop me a line.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/05/06/mahout-news/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>BarCampRDU</title>
		<link>http://lucene.grantingersoll.com/2008/04/23/barcamprdu/</link>
		<comments>http://lucene.grantingersoll.com/2008/04/23/barcamprdu/#comments</comments>
		<pubDate>Wed, 23 Apr 2008 13:23:44 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[BarCampRDU]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>
		<category><![CDATA[Apache Hadoop]]></category>
		<category><![CDATA[Apache Mahout]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=75</guid>
		<description><![CDATA[BarCamp wiki / BarCampRDU Threw my name in the ring for BarCamp RDU today.  Haven&#8217;t been to BarCamp before, but Erik Hatcher suggested I go and check it out. Also put in a Proposed Session of &#8220;Apache Mahout and Hadoop &#8211; Having fun with Map Reduce and distributed computing&#8221;.  Figure we&#8217;ll talk about the basics of [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://barcamp.org/BarCampRDU">BarCamp wiki / BarCampRDU</a></p>
<p>Threw my name in the ring for BarCamp RDU today.  Haven&#8217;t been to BarCamp before, but <a href="http://code4lib.org/erikhatcher">Erik Hatcher</a> suggested I go and check it out.</p>
<p>Also put in a <a href="http://barcamp.pbwiki.com/BarCampRDUsessions">Proposed Session</a> of &#8220;<a href="http://lucene.apache.org/mahout">Apache Mahout</a> and <a href="http://hadoop.apache.org/">Hadoop</a> &#8211; Having fun with Map Reduce and distributed computing&#8221;.  Figure we&#8217;ll talk about the basics of M/R, Hadoop and Mahout programming, look at some Mahout examples, run some code, maybe even go nuts and try setting up a distributed job across laptops just for the fun of it.  Was also thinking it might be fun to talk about Lucene/Solr, but figured one session was enough, especially since I am a BarCamp virgin.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/04/23/barcamprdu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Mahout Machine Learning Fun</title>
		<link>http://lucene.grantingersoll.com/2008/04/20/mahout-machine-learning-fun/</link>
		<comments>http://lucene.grantingersoll.com/2008/04/20/mahout-machine-learning-fun/#comments</comments>
		<pubDate>Sun, 20 Apr 2008 12:40:15 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[ApacheCon]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=74</guid>
		<description><![CDATA[It&#8217;s been an interesting few months over in Mahout land. First off, I am psyched about the response the project has been getting. Seems like there is a pent-up demand for large-scale machine learning these days.  I figured we would do all right in the early months, but I didn&#8217;t think we would [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been an interesting few months over in <a href="http://lucene.apache.org/mahout">Mahout</a> land.  First off, I am psyched about the response the project has been getting.  Seems like there is a pent-up demand for large-scale machine learning these days.  I figured we would do all right in the early months, but I didn&#8217;t think we would have as many subscribers and participants as we do this early.  Furthermore, the code contributions have started to come in and we had 15 or so applicants (for 2 or 3 spots) for our <a href="http://code.google.com/soc/2008/">Google Summer of Code</a> CFP.</p>
<p>Additionally, there were a fair number of inquiries about Mahout at <a href="http://www.eu.apachecon.com">ApacheCon EU</a> and I got to meet Karl Wettin and <a href="http://www.isabel-drost.de/">Isabel Drost</a> there (two Mahout committers).  I went well over 3 years in <a href="http://lucene.apache.org/java/">Lucene Java</a> land before meeting any of the other committers on Lucene.</p>
<p>Next, we added <a href="http://jeffeastman.blogspot.com/">Jeff Eastman</a> as a committer.  Jeff has jumped in head first and is already helping out a lot.</p>
<p>Sean Owen has donated the <a href="http://taste.sf.net">Taste</a> collaborative filtering project.  We also made Sean a committer, and he is already contributing in other areas.  We are still waiting to clear the legal hurdles here, but I think you will see Taste in the Mahout code base within the month.</p>
<p>Finally, it&#8217;s not official yet, but keep your eyes on Mahout for what should be another significant announcement of an NLP/ML project joining Mahout.  I know, I know, I&#8217;m such a tease&#8230; <img src='http://lucene.grantingersoll.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/04/20/mahout-machine-learning-fun/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Jeff Eastman&#8217;s Marvelous Cloud Computing Adventure</title>
		<link>http://lucene.grantingersoll.com/2008/03/28/jeff-eastmans-marvelous-cloud-computing-adventure/</link>
		<comments>http://lucene.grantingersoll.com/2008/03/28/jeff-eastmans-marvelous-cloud-computing-adventure/#comments</comments>
		<pubDate>Fri, 28 Mar 2008 11:51:22 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Apache]]></category>
		<category><![CDATA[clustering]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/2008/03/28/jeff-eastmans-marvelous-cloud-computing-adventure/</guid>
		<description><![CDATA[Jeff Eastman&#8217;s Marvelous Cloud Computing Adventure Mahout&#8217;s newest committer, Jeff Eastman, has a new blog on Mahout and Hadoop&#8230;]]></description>
			<content:encoded><![CDATA[<p><a href="http://jeffeastman.blogspot.com/">Jeff Eastman&#8217;s Marvelous Cloud Computing Adventure</a></p>
<p>Mahout&#8217;s newest committer, Jeff Eastman, has a new blog on Mahout and Hadoop&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2008/03/28/jeff-eastmans-marvelous-cloud-computing-adventure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

