Archive for the 'Search' Category

Atlassian and Lucene

Nice presentation on Atlassian’s use of Lucene at http://blogs.atlassian.com/rebelutionary/archives/2007/04/my_serverside_java_symposium_2007_presen.html

ApacheCon Europe “Advanced Lucene” slides

My (slightly old) slides for ApacheCon Europe are now available in the conference proceedings at http://eu.apachecon.com/downloads/materials.zip
I will post the latest version soon, but there is very little difference between this version and the latest.
Topics covered include Lucene performance, term vectors and query tips and tricks.
Feedback is always welcome.

Payloads

Michael Busch recently committed some code that enables Lucene to store payloads at the term level (see https://issues.apache.org/jira/browse/LUCENE-755), and I have started working on enabling these payloads to be incorporated into search and scoring (see http://wiki.apache.org/lucene-java/Payload_Planning and https://issues.apache.org/jira/browse/LUCENE-834).
So, you might be asking yourself, what exactly are payloads good for?  Naturally, the answer is a lot!  [...]

More ApacheCon Info

As I posted earlier, I will be giving a talk and a tutorial at ApacheCon Europe this year on the Apache Lucene Java project. My talk is titled “Advanced Lucene”. Here is the abstract:
Lucene Java is a high-performance, scalable, cross-platform search engine that contains many advanced features that are often underutilized [...]

ApacheCon 2007 Europe Talks

I have received official word from ApacheCon that two of my proposals have been accepted. I will be giving the “Advanced Lucene” talk on Wednesday, May 2nd, 2007. This talk will focus on advanced querying capabilities, term vectors and Lucene performance. I will also be giving a full-day tutorial on Lucene [...]

IBM OmniFind Yahoo! Edition - Simple Search Just Got Easier

FYI: Uses Lucene under the hood

Query Parser Badly Broken

Interesting discussion on the Lucene User list about the Query Parser being badly broken.  One alternative is available here.  However, as anyone who follows the QP will tell you, there is no perfect solution available for the wide range of Query types that Lucene supports.  I think most users of more advanced apps will say [...]

Ferret (a.k.a. Ruby Lucene)

Interesting interview with the creator of Ferret at http://on-ruby.blogspot.com/2006/10/ruby-hacker-interview-dave-balmain.html
He talks about some of the performance changes he has made in the C version of Ferret to make it run a lot faster than Java Lucene. He says he doubts they can be ported to Java, but I wonder if the Java version might still benefit.

Lucene Benchmarking

There is some effort under way to implement a standard benchmarking contribution for Lucene. It is chronicled at http://issues.apache.org/jira/browse/LUCENE-675. The goal is to provide a way for developers to see whether changes they are making are worthwhile. By running the benchmarks before and after applying a patch, it should become obvious whether [...]

Scoring Documentation

From my java-dev mailing list post this morning:
Steve Rowe and I have added scoring.xml (with some contributions from Karl Wettin, Chris Hostetter and others) to the xdocs directory (and scoring.html to the docs directory). Our goals in writing this document were:
1. To better understand scoring
2. To document how scoring works for the Lucene community, [...]