Archive for the 'Java' Category
Some of you may have noted that I’ve been quieter than usual lately. Well, the reason is I was preparing for the launch of the new company I helped found: Lucid Imagination. Now, I don’t blog too often about what I do for work on this site, other than it is Lucene, Solr and [...]
January 26th, 2009 | Posted in Apache, Java, Lucene, Lucid Imagination, Mahout, Solr, Tika | 4 Comments
Congratulations to Apache Tika (never mind the incubator address, it’s still in the process of migrating) for graduating from Incubation! And welcome to the Lucene project! Tika is a content extraction framework that wraps many other content extraction libraries such as PDFBox, POI, and others into a single, easy-to-use framework that makes it easy [...]
November 13th, 2008 | Posted in Apache, clustering, Java, Lucene, machine learning, Mahout, Manning, OpenNLP, Search, Solr, Taming Text, Tika | 3 Comments
What’s new with Apache Solr. My latest article on Apache Solr, titled “What’s New with Apache Solr”, is now available over at IBM developerWorks. It covers some of the new features like spell checking, Data Import Handler, distributed search, editorial results placement (a.k.a. “paid placement”), SolrJ and a variety of other pieces. Hope it is [...]
November 5th, 2008 | Posted in Indexing, Java, Lucene, Search, Solr, spell checking | 1 Comment
Charlotte JUG » October Slides Available – Search & Analysis Had a lot of fun at my recent talk at the Charlotte JUG. They’ve got a good core of people and there was a lot of good discussion about the topic. Even managed to give away some free eBooks of “Taming Text”. Wish I would [...]
October 24th, 2008 | Posted in Charlotte, Java, Lucene, machine learning, Mahout, Manning, Taming Text | No Comments
Just a quick reminder that there is just over one week left before Lucene Boot Camp at this year’s ApacheCon. This year, it is a two-day training, but those who want to can sign up for the first day of Lucene Boot Camp, and then attend Solr Boot Camp on the second [...]
October 23rd, 2008 | Posted in Apache, ApacheCon, Indexing, Java, Lucene, Lucene Boot Camp, Search, Solr | 4 Comments
I’ve had a chance recently to work on some things in Solr that I think can, in the right circumstances, really enhance Solr. First off is SOLR-651, which implements what I am calling a Term Vector Component. The basic gist of it is that Solr can now serve up term vectors from Lucene. For [...]
October 23rd, 2008 | Posted in Apache, clustering, Java, Lucene, machine learning, Mahout, Manning, Search, Solr, spell checking, Taming Text, term vectors, tokenization | 1 Comment
Welcome to Lucene! Boy, I must be slipping, but Lucene 2.4.0 is out. See the link for more details.
October 17th, 2008 | Posted in Apache, Java, Lucene | No Comments
Lucene Boot Camp (ApacheCon site) Lucene Boot Camp (http://www.lucenebootcamp.com) is scheduled this year for ApacheCon US on November 3rd and 4th in New Orleans. This year, I am doing a two-day event, as I felt the one-day event was just not enough time to get in all the goodness that is Lucene (not [...]
August 20th, 2008 | Posted in Apache, ApacheCon, Indexing, Java, Lucene, Lucene Boot Camp, Lucid Imagination, Search | No Comments
BarCamp wiki / BarCampRDU I’ll be at BarCampRDU tomorrow. I proposed two sessions, one on Hadoop and Mahout and one on Lucene and Solr. I don’t think I really want to do both, but I would like to do at least one, so we’ll see what other people are interested in. If you’re around and [...]
August 1st, 2008 | Posted in Apache, BarCampRDU, Hadoop, Java, Lucene, machine learning, Mahout, Map Reduce, Nutch, Raleigh, Triangle | 5 Comments
HP, Intel and Yahoo To Research Cloud Computing – Yahoo News Boy, this could really come in handy in Open Source, especially projects like Mahout, Nutch and distributed Solr. I find my biggest personal challenge on Mahout is access to computing resources. I personally don’t have the financial backing to buy much time on Amazon [...]
July 30th, 2008 | Posted in Apache, Hadoop, Java, Lucene, machine learning, Mahout, Map Reduce | 2 Comments