<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Grant's Grunts: Lucene Edition &#187; Latent Dirichlet Allocation</title>
	<atom:link href="http://lucene.grantingersoll.com/category/latent-dirichlet-allocation/feed/" rel="self" type="application/rss+xml" />
	<link>http://lucene.grantingersoll.com</link>
	<description>Thoughts on Apache Lucene, Mahout, Solr, Tika and Nutch</description>
	<lastBuildDate>Mon, 06 Feb 2012 12:07:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Apache Mahout Status</title>
		<link>http://lucene.grantingersoll.com/2009/06/16/apache-mahout-status/</link>
		<comments>http://lucene.grantingersoll.com/2009/06/16/apache-mahout-status/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 20:11:41 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[Latent Dirichlet Allocation]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Mahout]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=202</guid>
		<description><![CDATA[It&#8217;s been a while since I&#8217;ve said much about Mahout, but it seems like things are trending upwards. From a subjective standpoint, it just feels like there is more going on both on mahout-user@lucene.a.o and mahout-dev@lucene.a.o. It also feels like people are starting to kick the tires, which is vital.  Furthermore,  it also seems like [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been a while since I&#8217;ve said much about <a href="http://lucene.apache.org/mahout">Mahout</a>, but it seems like things are trending upwards.  From a subjective standpoint, it just feels like there is more going on on both mahout-user@lucene.a.o and mahout-dev@lucene.a.o.  It also feels like people are starting to kick the tires, which is vital.  Furthermore, it seems like we&#8217;ve gotten a few more contributors of late, which is, of course, key for any open source project.  The coming months will be really important for us existing contributors to make sure we help newcomers have a good experience and address their issues.</p>
<p>In looking at the <a href="http://lucene.apache.org/mahout/mailinglists.html">mailing list</a> subscriptions, the user list is over 300  subscribers now (I believe a few months ago it was just under 250) and the dev list is up a little bit, although not a lot.  Of course, subscription rates come and go, but we&#8217;ve been pretty solid around these numbers for a while, which is a good sign that people are still interested, IMO.</p>
<p>From a personal standpoint, I&#8217;ve had a bit more time to work on it, which gets me jazzed up.  Right now, I&#8217;m working on getting some examples up for clustering documents (see https://issues.apache.org/jira/browse/MAHOUT-126 and https://issues.apache.org/jira/browse/MAHOUT-65), collaborative filtering (using Taste in Mahout) and categorization.  The clustering stuff is likely still pretty naive on my part, but it&#8217;s a start.  The clustering and categorization work will also feed into my book <a href="http://www.manning.com/affiliate/idevaffiliate.php?id=1069_148">Taming Text</a>.</p>
<p>We also have several Google Summer of Code students involved, which is always enjoyable and a learning experience.  I even got to <a href="http://www.meetup.com/SFBay-Lucene-Solr-Meetup/">meet the student</a> I&#8217;m mentoring (David Hall) in person this year, which was pretty cool.  David is implementing <a href="https://issues.apache.org/jira/browse/MAHOUT-123">Latent Dirichlet Allocation</a> on MapReduce for Mahout.  I&#8217;m not sure I understand it all just yet, but I trust David will make it clear to me by the end of the summer.</p>
<p>Speaking of meeting people, at that same meetup I finally got to meet fellow committers Jeff Eastman and Ted Dunning.  It&#8217;s always nice to put faces to names, having worked with them on Mahout now for well over a year.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2009/06/16/apache-mahout-status/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>SF Bay Area Lucene/Solr Meetup</title>
		<link>http://lucene.grantingersoll.com/2009/06/04/sf-bay-area-lucenesolr-meetup/</link>
		<comments>http://lucene.grantingersoll.com/2009/06/04/sf-bay-area-lucenesolr-meetup/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 17:49:35 +0000</pubDate>
		<dc:creator>grant_ingersoll</dc:creator>
				<category><![CDATA[canopy clustering]]></category>
		<category><![CDATA[Droids]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Latent Dirichlet Allocation]]></category>
		<category><![CDATA[Lucene]]></category>
		<category><![CDATA[Lucid Imagination]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[Mahout]]></category>
		<category><![CDATA[Open Relevance]]></category>
		<category><![CDATA[Real Time Search]]></category>
		<category><![CDATA[relevance]]></category>
		<category><![CDATA[Solr]]></category>
		<category><![CDATA[Tika]]></category>
		<category><![CDATA[Meetup]]></category>

		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=197</guid>
		<description><![CDATA[Just wanted to follow up on last night&#8217;s Lucene/Solr Meetup in San Francisco. First off, special thanks to all the speakers (Jason Rutherglen, Michael Busch, Erik Hatcher and everyone who gave lightning talks).  We had a lot of excellent talks ranging from low level Lucene details on payloads and real time search to high level discussions [...]]]></description>
			<content:encoded><![CDATA[<p>Just wanted to follow up on last night&#8217;s Lucene/Solr <a href="http://www.meetup.com/SFBay-Lucene-Solr-Meetup/">Meetup</a> in San Francisco.</p>
<p>First off, special thanks to all the speakers (Jason Rutherglen, Michael Busch, Erik Hatcher and everyone who gave lightning talks).  We had a lot of excellent talks ranging from low level Lucene details on payloads and real time search to high level discussions on new features in Solr and best practices for working with stopwords and relevance.  We also had intros to <a href="http://lucene.apache.org/mahout">Mahout</a>, <a href="http://lucene.apache.org/tika">Tika</a> and the new <a href="http://www.lucidimagination.com/search/document/84205d273f3753c2/open_relevance_project_kickoff">Open Relevance</a> project at Lucene.  I&#8217;ll post the slides on the Meetup site when they are available (I am still waiting to get them from the speakers).</p>
<p>Second, I really enjoyed engaging with so many people about what they are working on in Lucene/Solr.  It is always fun to hear all the different ways people are (ab)using Lucene/Solr to do cool things.  It was especially good to meet some fellow Mahout committers (Ted Dunning and Jeff Eastman) for the first time, as well as one of Mahout&#8217;s Google Summer of Code students, David Hall, who is working on adding <a href="http://www.lucidimagination.com/search/?q=Latent+Dirichlet">Latent Dirichlet Allocation</a>.</p>
<p>Finally, I look forward to doing more of these.  Right now, I&#8217;m gauging interest in Raleigh, NC, but I know we&#8217;ll likely have another one in the Bay Area soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://lucene.grantingersoll.com/2009/06/04/sf-bay-area-lucenesolr-meetup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

