Archive for the 'term vectors' Category
I’ve recently had a chance to work on some things in Solr that I think can, in the right circumstances, really enhance it.
First off is SOLR-651, which implements what I am calling a Term Vector Component. The gist is that Solr can now serve up term vectors from Lucene. For those [...]
October 23rd, 2008 | Posted in Apache, Java, Lucene, Mahout, Manning, Search, Solr, Taming Text, clustering, machine learning, spell checking, term vectors, tokenization | 1 Comment
Lots of good things happening in Lucene land lately, all of which should benefit users with faster indexing and searching capabilities. Most notably, Lucene 2.3 (hopefully released this quarter) has some major changes in indexing memory management and performance. I have personally clocked indexing using release 2.2 at about 400 rec/s (single threaded, Mac Pro [...]
November 2nd, 2007 | Posted in Indexing, Java, Lucene, Performance, Search, term vectors | No Comments
The latest version of my slides for “Advanced Lucene” is located at http://www.cnlp.org/presentations/present.asp?show=conference
The talk covered term vectors, various query types, and Lucene performance tips and tricks.
May 7th, 2007 | Posted in ApacheCon, Europe, Indexing, Java, Lucene, Performance, Search, payloads, queries, term vectors | No Comments
My (slightly old) slides for ApacheCon Europe are now available in the conference proceedings at http://eu.apachecon.com/downloads/materials.zip
I will post the latest version soon, but there is very little difference between this version and the latest.
Topics covered include Lucene performance, term vectors, and query tips and tricks.
Feedback is always welcome.
May 3rd, 2007 | Posted in ApacheCon, Europe, Indexing, Java, Lucene, Performance, Search, payloads, queries, term vectors | No Comments