It’s back-to-school time, and for those with academic interests in machine learning, it’s great to see Apache Mahout catching on in academic circles as well as commercial ones. The interest is no doubt due to its open code, active community and focus on real-world machine learning techniques. The first class based on Mahout that I am aware of was Mahout committer Isabel Drost’s class at TU Berlin. Of course, with all due respect to Isabel, she is a bit biased towards Mahout, so I was pleasantly surprised when Dr. David Grossman (see his excellent book Information Retrieval: Algorithms and Heuristics (The Information Retrieval Series), 2nd Edition, for starters) from the Illinois Institute of Technology contacted me last spring about putting together a class on Mahout for the fall at IIT. Well, that class has finally come to fruition as CS 422: Data Mining Course Homepage, and it looks to have nice coverage of the things near and dear to Mahout: clustering, classification, pattern mining and recommenders, along with the requisite theoretical underpinnings.
I’ve also heard from a few other professors who are working on adding Mahout to their coursework, and I would love to hear from more. So, if you are teaching a class on machine learning and are interested in using Mahout for teaching, either let me know (gsingers@apache.org) or drop a line to the Mahout community mailing list: user@mahout.apache.org.
August 31st, 2010 | Posted in Apache, Mahout, machine learning | No Comments
Erik Hatcher and I are once again offering our Lucene and Solr training classes, but this time there are two opportunities to participate. The first will be at Lucene Revolution on October 5 and 6. The second is on Nov. 1st and 2nd at ApacheCon NA 2010. Both classes are designed to get people up to speed on either Lucene or Solr as quickly as possible. If you have any questions, feel free to drop me an email at trainer@lucenebootcamp.com.
August 30th, 2010 | Posted in Apache, Lucene, Lucene Boot Camp, Solr | No Comments
The next TriHUG meeting has been announced: Sept. 14. There will be two speakers:
Wei Wei on Practical Hadoop Security, and me on Hadoop and Lucene and Solr.
For more info and to RSVP, see Triangle Hadoop Users Group.
August 19th, 2010 | Posted in Apache, Hadoop, Java, Lucene | No Comments
My podcast with Lucene In Action 2 (http://lucene.li/e) co-authors Erik Hatcher, Michael McCandless and Otis Gospodnetic is now live:
Authors’ Podcast: Lucene in Action, Second Edition | Lucid Imagination.
Hope you enjoy!
August 2nd, 2010 | Posted in Apache, Lucene, Lucid Imagination | No Comments
I’m pleased to announce that a few of us Apache Hadoop users in the Triangle (Raleigh, Durham, Chapel Hill, North Carolina) have finally reached critical mass, over a year after I sent an email to the Hadoop mailing list asking for interested people. We’ve found a place to meet and discuss the Hadoop ecosystem, so please join us at our first meeting on July 20th at Bronto Software. Find out more details at: Triangle Hadoop Users Group.
We are starting with some introductory talks, but we will no doubt get much deeper very quickly. If you’d like to give a talk or sponsor or just attend, drop on over to the TriHUG website for more information.
July 8th, 2010 | Posted in Apache, Hadoop, Mahout | No Comments
Thanks to committer Robin Anil for putting together Mahout’s shiny new website: Apache Mahout:: Scalable machine-learning and data-mining library.
June 28th, 2010 | Posted in Apache, Java, Mahout | No Comments
ccri – Blog – Latent Semantic Analysis in Solr using Clojure. I’ve been watching the GitHub site for a while, so it’s nice to see some more written up on it.
June 16th, 2010 | Posted in Apache, Lucene, Mahout, Solr | No Comments
Amazon has a nice Guardian Case Study: Amazon Web Services up on their website about how the Guardian is using EC2 for scaling their Open Platform API. If you’ve been following along, the little boxes in the picture (the blue and gold ones) are actually Apache Solr. If you want to read more about the Solr part of it, check out the Lucid Imagination press release, which will lead you to more details on it.
June 11th, 2010 | Posted in Apache, Java, Lucene, Lucid Imagination, Solr | No Comments
May 17th, 2010 | Posted in Apache, Java, Lucene, Lucene Connector Framework, Lucid Imagination, Mahout, Nutch, Open Relevance, Performance, Real Time Search, Solr, Tika, spatial | No Comments