[#LUCENE-1313] Ocean Realtime Search - ASF JIRA
Jason Rutherglen has been up to some interesting things with Lucene lately concerning real-time search. Real-time search is one of those Lucene features that some people have long needed, but it has never reached the critical mass at which someone tackles it. Looks like Jason [...]
June 24th, 2008 | Posted in Apache, Lucene, Real Time Search, Search | No Comments
For a while now, I have been trying to get my hands on TREC data for the Lucene project. For those who aren’t familiar, TREC is an annual competition for search engines that provides a common set of documents to index, queries to execute and judgments to check your answers to see how good an [...]
May 18th, 2008 | Posted in Apache, Java, Lucene, Nutch, Performance, Search, Solr, TREC, relevance | 8 Comments
Lucid Imagination
Well, the cat is out of the bag. In case you haven’t heard, a few Lucene/Solr/Mahout committers (Erik Hatcher and Yonik Seeley) and I have teamed up with some other longtime search veterans (Marc Krellenstein from Northern Light and former CTO of Reed Elsevier, amongst others) to build a company around providing product, [...]
May 2nd, 2008 | Posted in Lucene, Lucid Imagination, Mahout, Solr | No Comments
How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data | High Scalability
Nice article on how the Lucene/Hadoop/Solr stack was used to solve a really big problem. Someday, I hope (when we have actual code), they can add Mahout to the equation and do even more interesting things with the data.
February 1st, 2008 | Posted in Apache, Hadoop, Indexing, Java, Lucene, Mahout, Search, Solr, database | No Comments
Apache Mahout - Overview
It’s official! Mahout is now an official subproject of Lucene at the Apache Software Foundation. Mahout’s goal is to create a suite of practical, scalable machine learning libraries.
January 25th, 2008 | Posted in Apache, Java, Mahout, machine learning | No Comments
Coderspiel / The right tool for the slob
This guy’s comment system wasn’t working at the moment, so I will leave my comment here. This won’t make much sense without reading the post first:
It’s funny you mention Wikipedia as an example, since they are running Lucene. As are Technorati and the Internet Archive. [...]
January 19th, 2008 | Posted in Apache, Indexing, Java, Lucene, Nutch, Search, Solr | 2 Comments
I have set up a new site to support my Lucene Boot Camp training. Check it out at http://lucenebootcamp.com. From there, you can download training setup information, read the class outline, etc.
November 6th, 2007 | Posted in ApacheCon, Indexing, Java, Lucene, Search | No Comments
Lots of good things happening in Lucene land lately, all of which should benefit users with faster indexing and searching capabilities. Most notably, Lucene 2.3 (hopefully released this quarter) has some major changes in indexing memory management and performance. I have personally clocked indexing using release 2.2 at about 400 rec/s (single threaded, Mac Pro [...]
November 2nd, 2007 | Posted in Indexing, Java, Lucene, Performance, Search, term vectors | No Comments