Archive for the 'Indexing' Category

Lucene Indexing Performance: Managing RAM while Indexing

https://issues.apache.org/jira/browse/LUCENE-843
This patch, by Michael McCandless, pretty much sums up what I love about Lucene and what makes Lucene an extraordinary open source project.
Take Lucene, which already has a pretty strong reputation for being fast, add in a motivated committer (of which Lucene has many, IMO), and out comes a patch that, at [...]

More ApacheCon Info

As I posted earlier, I will be giving a talk and a tutorial at ApacheCon Europe this year on the Apache Lucene Java project. My talk is titled “Advanced Lucene”. Here is the abstract:
Lucene Java is a high-performance, scalable, cross-platform search engine that contains many advanced features that are often underutilized [...]

ApacheCon 2007 Europe Talks

I have received official word from ApacheCon that two of my proposals have been accepted. I will be giving the “Advanced Lucene” talk on Wednesday, May 2nd, 2007. This talk will focus on advanced querying capabilities, term vectors and Lucene performance. I will also be giving a full-day tutorial on Lucene [...]

[#LUCENE-753] Use NIO positional read to avoid synchronization in FSIndexInput - ASF JIRA

Want a crash course in NIO and fast IO in Java? Then take a look at this issue for Lucene and then go do your homework. The transferTo() trick is something I hadn’t seen before, and I was a bit blown away by it. I [...]
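For readers who have not met the idea in the issue title, here is a minimal sketch of an NIO positional read. It is not the LUCENE-753 patch; the class name PositionalReadSketch and the helper readAt are made up for illustration. The point is that FileChannel.read(ByteBuffer, long) takes the file offset as an argument and leaves the channel's own position alone, so several threads can read from one channel without wrapping a seek-then-read pair in a synchronized block.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadSketch {

    // Hypothetical helper: reads up to `length` bytes starting at `offset`.
    // The offset is passed explicitly on every call, so the channel's shared
    // position is never modified and no external locking is needed.
    static byte[] readAt(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        long pos = offset;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos); // positional read
            if (n < 0) {
                break; // reached end of file before filling the buffer
            }
            pos += n;
        }
        byte[] out = new byte[buf.position()];
        buf.flip();
        buf.get(out);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("positional", ".bin");
        try {
            Files.write(tmp, "hello lucene".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                byte[] bytes = readAt(ch, 6, 6);
                System.out.println(new String(bytes, StandardCharsets.US_ASCII));
                System.out.println("channel position: " + ch.position());
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Compare this with a RandomAccessFile, where seek() followed by read() has to be guarded by a lock if the file handle is shared between threads.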

IBM OmniFind Yahoo! Edition - Simple Search Just Got Easier

FYI: Uses Lucene under the hood

Ferret (a.k.a. Ruby Lucene)

Interesting interview with the creator of Ferret at http://on-ruby.blogspot.com/2006/10/ruby-hacker-interview-dave-balmain.html
He talks about some of the performance changes he has made in the C version of Ferret to make it run a lot faster than Java Lucene. He says he doubts they can be ported to Java, but I wonder if the Java version might still benefit.

Lucene Benchmarking

There is some effort under way to implement a standard benchmarking contribution for Lucene. It is chronicled at http://issues.apache.org/jira/browse/LUCENE-675. The goal is to provide a way for developers to see whether changes they are making are worthwhile. By running the benchmarks before and after applying a patch, it should become obvious whether [...]