Archive for the 'Indexing' Category

Advanced Lucene slides from ApacheCon Europe 2007

The latest version of my slides for “Advanced Lucene” is located at http://www.cnlp.org/presentations/present.asp?show=conference The talk covered term vectors, various query types, and Lucene performance tips and tricks.

Atlassian and Lucene

Nice presentation on Atlassian’s use of Lucene at http://blogs.atlassian.com/rebelutionary/archives/2007/04/my_serverside_java_symposium_2007_presen.html

ApacheCon Europe “Advanced Lucene” slides

My (slightly old) slides for ApacheCon Europe are now available in the conference proceedings at http://eu.apachecon.com/downloads/materials.zip I will post the latest version soon, but there is very little difference between the two. Topics covered include Lucene performance, term vectors, and query tips and tricks. Feedback is always welcome.

Lucene Indexing Performance: Managing RAM while Indexing

https://issues.apache.org/jira/browse/LUCENE-843 This patch, by Michael McCandless, pretty much sums up what I love about Lucene and what makes Lucene an extraordinary open source project. Take Lucene, which already has a pretty strong reputation for being fast, add in a motivated committer (of which Lucene has many, IMO), and out comes a patch [...]

More ApacheCon Info

As I posted earlier, I will be giving a talk and a tutorial at ApacheCon Europe this year on the Apache Lucene Java project. My talk is titled “Advanced Lucene”. Here is the abstract: Lucene Java is a high-performance, scalable, cross-platform search engine that contains many advanced features that are often underutilized by [...]

ApacheCon 2007 Europe Talks

I have received official word from ApacheCon that two of my proposals have been accepted. I will be giving the “Advanced Lucene” talk on Wednesday, May 2nd, 2007. This talk will focus on advanced querying capabilities, term vectors, and Lucene performance. I will also be giving a full-day tutorial on Lucene Java on May [...]

[#LUCENE-753] Use NIO positional read to avoid synchronization in FSIndexInput – ASF JIRA

Want a crash course in NIO and fast I/O in Java? Then take a look at this Lucene issue and go do your homework. The transferTo() trick is something I hadn’t seen before, and I was a bit blown away by it. [...]
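For readers who have not used NIO positional reads, here is a minimal sketch of the general idea named in the issue title. It is not the code from the LUCENE-753 patch, and it does not touch FSIndexInput; the class name `PositionalReadSketch` and the helper `readAt` are made up for illustration, and it assumes a modern JDK (Java 8 or later). The point is that `FileChannel.read(ByteBuffer, long)` reads at an explicit offset and leaves the channel's own position alone, so threads sharing one channel do not need a lock around a seek-then-read pair.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadSketch {

    // Hypothetical helper: reads `length` bytes starting at `offset`.
    // The positional read does not modify the channel's current position,
    // so no shared "seek, then read" state needs to be synchronized.
    static byte[] readAt(FileChannel channel, long offset, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        long position = offset;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, position); // positional read
            if (n < 0) {
                throw new EOFException("read past end of file");
            }
            position += n;
        }
        return buffer.array();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("positional-read", ".bin");
        try {
            Files.write(file, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                String[] results = new String[2];
                // Two threads read different regions through the same channel.
                Thread head = new Thread(() -> results[0] = readAsText(channel, 0, 10));
                Thread tail = new Thread(() -> results[1] = readAsText(channel, 10, 6));
                head.start();
                tail.start();
                head.join();
                tail.join();
                System.out.println(results[0] + " / " + results[1]);
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    // Wraps readAt so it can be called from a lambda without checked exceptions.
    private static String readAsText(FileChannel channel, long offset, int length) {
        try {
            return new String(readAt(channel, offset, length), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Whether this is actually faster than a synchronized seek-and-read depends on the platform and JDK, which is part of what the issue discussion digs into.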

IBM OmniFind Yahoo! Edition – Simple Search Just Got Easier

FYI: Uses Lucene under the hood.

Ferret (a.k.a. Ruby Lucene)

Interesting interview with the creator of Ferret at http://on-ruby.blogspot.com/2006/10/ruby-hacker-interview-dave-balmain.html He talks about some of the performance changes he has made in the C version of Ferret to make it run a lot faster than Java Lucene. He says he doubts they can be ported to Java, but I wonder if the Java version might still benefit.

Lucene Benchmarking

There is an effort underway to implement a standard benchmarking contribution for Lucene. It is chronicled at http://issues.apache.org/jira/browse/LUCENE-675. The goal is to provide a way for developers to see whether the changes they are making are worthwhile. By running the benchmarks before and after applying a patch, it should become obvious whether the patch adversely [...]