Archive for the 'machine learning' Category
My first article on Apache Mahout was just published on IBM developerWorks. It’s targeted at people just getting started with machine learning and Mahout. You can read the article at Introducing Apache Mahout. Feedback welcome.
September 8th, 2009 | Posted in Apache, Mahout, machine learning | 2 Comments
Sean Owen, an Apache Mahout committer, has put up a brief post stating that Mippin is using Apache Mahout for its recommendation system.
Read it: Mippin Blog: May We Recommend….
September 3rd, 2009 | Posted in Apache, Mahout, machine learning | No Comments
Natural Language Processing Virtual Reading Group | Google Groups
A few people in the NLP LinkedIn group have decided to start up a virtual reading group. All are welcome.
July 7th, 2009 | Posted in machine learning | No Comments
Just wanted to follow up on last night’s Lucene/Solr Meetup in San Francisco.
First off, special thanks to all the speakers (Jason Rutherglen, Michael Busch, Erik Hatcher, and everyone who gave a lightning talk). We had a lot of excellent talks, ranging from low-level Lucene details on payloads and real-time search to high-level discussions on [...]
June 4th, 2009 | Posted in Droids, Hadoop, Java, Latent Dirichlet Allocation, Lucene, Lucid Imagination, Mahout, Open Relevance, Real Time Search, Solr, Tika, canopy clustering, machine learning, relevance | No Comments
Copying TREC is the Wrong Track for the Enterprise | The Noisy Channel.
Daniel Tunkelang has written up an interesting post on the new Open Relevance Project that a few other Lucene people and I are starting up, and I thought I would respond here:
Little late to the conversation, but I think maybe we should back [...]
May 18th, 2009 | Posted in Apache, Lucene, Mahout, Open Relevance, Performance, Solr, machine learning, relevance | 2 Comments
Interview with Mike Klaas | Lucid Imagination.
Looks like my interview with Solr committer Mike Klaas is up. Check it out: there is some interesting discussion about Worio’s use of Solr and about machine learning in the context of search.
March 25th, 2009 | Posted in Solr, machine learning | No Comments
k-means and other EM-like algorithms are trivial to parallelize because all the heavy computations in the inner loops are independent.
via Speeding up K-means Clustering with Algebra and Sparse Vectors « LingPipe Blog.
This is exactly what Apache Mahout does. We have parallelized versions of a bunch of clustering algorithms, including k-means.
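To make the "independent inner loops" point concrete, here is a minimal Python sketch of one k-means iteration split into a map-style step and a reduce-style step. It is not Mahout's code: the function names (`nearest`, `partial_sums`, `kmeans_step`) and the toy data are made up for illustration, and a thread pool stands in for the Hadoop mappers. Each chunk of points is processed without looking at any other chunk, and only the small per-cluster sums and counts need to be merged.

```python
from concurrent.futures import ThreadPoolExecutor

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))

def partial_sums(chunk, centroids):
    """'Map' step: per-cluster coordinate sums and counts for one chunk of points."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        k = nearest(point, centroids)
        counts[k] += 1
        for d, value in enumerate(point):
            sums[k][d] += value
    return sums, counts

def kmeans_step(chunks, centroids):
    """One iteration: process chunks independently, then merge ('reduce')."""
    # Threads are only a stand-in here (assumption); any worker pool would do.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda ch: partial_sums(ch, centroids), chunks))
    new_centroids = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in results)
        if total == 0:
            new_centroids.append(list(old))  # keep the centroid of an empty cluster
            continue
        new_centroids.append([sum(sums[k][d] for sums, _ in results) / total
                              for d in range(len(old))])
    return new_centroids

# Hypothetical toy data: two chunks of 2-D points and two starting centroids.
chunks = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
centroids = [(1.0, 1.0), (9.0, 9.0)]
updated = kmeans_step(chunks, centroids)
```

Repeating `kmeans_step` until the centroids stop moving gives the full algorithm; the only data shared between workers in each round is the current list of centroids.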
March 18th, 2009 | Posted in Mahout, clustering, kMeans clustering, machine learning | 2 Comments
Hadoop, Analytical Software, Finds Uses Beyond Search – NYTimes.com.
Nice writeup on Hadoop in the NYT today. Of course, Hadoop is often used to power machine learning, too, which is the premise behind using it in Apache Mahout.
March 17th, 2009 | Posted in Hadoop, Mahout, machine learning | No Comments
Lucid Imagination » Add our Lucene Ecosystem Search Engine to Firefox
Mark Miller shows how to add Lucid’s Lucene ecosystem search as a Firefox plugin. Now you can search all the Lucene project (and subproject) archives, websites, and wikis from the comfort of your browser.
March 3rd, 2009 | Posted in Lucene Boot Camp, Mahout, North Carolina, Tika, machine learning, wpSearch | No Comments
SummerOfCode2009 – General Wiki
It’s that time of year again. Time for students to sign up for Google Summer of Code. Gist of it: Get paid to work in Open Source for the summer.
I’ve signed up to mentor for Apache Mahout. We are looking for students interested in implementing cutting-edge machine learning algorithms, optionally using Hadoop [...]
February 18th, 2009 | Posted in Apache, Google Summer of Code, Lucene, Mahout, Solr, Tika, machine learning | No Comments