Archive for the 'Hadoop' Category

ApacheCon Goodness this Week

Lots of goodness this week at ApacheCon, at least when it comes to Lucene, Solr, Mahout, Tika and Hadoop (i.e. the Lucene ecosystem). There are two full days on Hadoop, with lots of coverage of all the pieces that go into Hadoop. There’s also a full day of Lucene-related talks, plus Erik and I are [...]

ZooKeeper/Tao – Hadoop Wiki

I like ZooKeeper already, and I just started looking at it… Hopefully the code lives up to the Tao.

BarCamp wiki / BarCampRDU

I’ll be at BarCampRDU tomorrow. I proposed two sessions, one on Hadoop and Mahout and one on Lucene and Solr. I don’t think I really want to do both, but I would like to do at least one, so we’ll see what other people are interested in. If you’re around and [...]

HP, Intel and Yahoo To Research Cloud Computing – Yahoo News

Boy, this could really come in handy in Open Source, especially for projects like Mahout, Nutch and distributed Solr. I find my biggest personal challenge on Mahout is access to computing resources. I personally don’t have the financial backing to buy much time on Amazon [...]

Apache Hadoop Wins Terabyte Sort Benchmark (Hadoop and Distributed Computing at Yahoo!)

Congrats to the Hadoop team! Score one for Open Source!

Mahout News

Wow!  Mahout has just got me pumped up.  I feel like we’ve got a lot of positive momentum and that we are starting to get the various pieces of our suite of machine learning libraries in place.  Various news items include: Ted Dunning is now a committer!  Welcome Ted! I put up a patch for [...]

Manning: Taming Text

Scary… I guess it is real!

BarCampRDU

BarCamp wiki / BarCampRDU: Threw my name in the ring for BarCampRDU today. Haven’t been to BarCamp before, but Erik Hatcher suggested I go and check it out. Also put in a Proposed Session of “Apache Mahout and Hadoop – Having fun with Map Reduce and distributed computing”. Figure we talk about the basics of [...]

Mahout Machine Learning Fun

It’s been an interesting few months over in Mahout land. First off, I am psyched about the response the project has been getting. Seems like there is a pent-up demand for large-scale machine learning these days. I figured we would do all right in the early months, but I didn’t think we would [...]

Jeff Eastman’s Marvelous Cloud Computing Adventure

Mahout’s newest committer, Jeff Eastman, has a new blog on Mahout and Hadoop…