First Real Commit

Ooh boy, I guess this is real now.  I just made my first real commit to Lucene, for Issue 545 in JIRA.

Issue 545 adds the ability to customize how Fields get loaded when retrieving a Document and introduces the concept of lazy field loading, which is useful when large amounts of data are stored in the index in a particular field. I think the main use case is a Document that has several fields, many of which are metadata fields, along with one or two that contain the original document (Word, PDF, XML, whatever) that you want to keep with the Document for display purposes.  When you did a search and displayed the Hits the old way, you had to load every field into memory regardless of whether you used that field for display.  Now, with the FieldSelector interface, you can decide whether to load a field at all, and you can also delay loading a field until you actually access it.
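To make the idea concrete, here is a small self-contained sketch of per-field selection with lazy loading. It is not Lucene's code: `Choice`, `StoredField`, and `loadDocument` are hypothetical stand-ins I made up for illustration, and the "index" is just a map of suppliers that pretend to read stored data.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

class LazyFieldDemo {
    /** Hypothetical per-field decision made when a document is retrieved. */
    enum Choice { LOAD, LAZY_LOAD, NO_LOAD }

    /** A field whose value is read either up front or on first access. */
    static final class StoredField {
        private final Supplier<String> reader; // stands in for reading from the index
        private String value;
        private boolean loaded;

        StoredField(Supplier<String> reader, boolean eager) {
            this.reader = reader;
            if (eager) {
                stringValue(); // read immediately
            }
        }

        boolean isLoaded() {
            return loaded;
        }

        String stringValue() {
            if (!loaded) { // the first access triggers the read
                value = reader.get();
                loaded = true;
            }
            return value;
        }
    }

    /** Builds a document, asking the selector what to do with each field name. */
    static Map<String, StoredField> loadDocument(Map<String, Supplier<String>> stored,
                                                 Function<String, Choice> selector) {
        Map<String, StoredField> doc = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<String>> e : stored.entrySet()) {
            Choice choice = selector.apply(e.getKey());
            if (choice == Choice.NO_LOAD) {
                continue; // field is left out of the document entirely
            }
            doc.put(e.getKey(), new StoredField(e.getValue(), choice == Choice.LOAD));
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> stored = new LinkedHashMap<>();
        stored.put("title", () -> "Quarterly report");
        stored.put("author", () -> "someone");
        stored.put("body", () -> "...imagine several megabytes of original text...");

        // Metadata is loaded eagerly; the large body is deferred.
        Map<String, StoredField> doc = loadDocument(stored,
                name -> name.equals("body") ? Choice.LAZY_LOAD : Choice.LOAD);

        System.out.println("title loaded up front: " + doc.get("title").isLoaded());
        System.out.println("body loaded up front:  " + doc.get("body").isLoaded());
        doc.get("body").stringValue(); // only now is the body read
        System.out.println("body loaded on access: " + doc.get("body").isLoaded());
    }
}
```

A results page that only shows the title and author would never touch the body field, so in this sketch its supplier is never invoked.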

This issue still has one main inefficiency, due to a minor issue with stored Strings in the index that prevents Lucene from skipping past certain kinds of Fields.  However, I did implement a skip-ahead function that minimizes the amount of processing required to move past these fields.  There are a couple of other proposals in JIRA to implement better seek functionality for these Fields.
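For a rough picture of what skipping ahead can look like, here is a sketch under a stated assumption: a stored string records its length in characters, and each character takes one to three bytes, so the byte offset of the next field cannot be computed directly. Under that assumption the cheapest option is to inspect only the lead byte of each character and hop over the rest without decoding anything. `SkipCharsDemo` and its `skipChars` method are illustrative names of my own and not the code that was committed.

```java
import java.nio.charset.StandardCharsets;

class SkipCharsDemo {
    /**
     * Returns the offset just past charCount encoded characters, starting at offset.
     * Assumes each character is encoded in one, two, or three bytes (UTF-8 style,
     * no four-byte sequences). No character is decoded and no String is built.
     */
    static int skipChars(byte[] data, int offset, int charCount) {
        int pos = offset;
        for (int i = 0; i < charCount; i++) {
            int lead = data[pos] & 0xFF;
            if ((lead & 0x80) == 0) {
                pos += 1; // 0xxxxxxx: single-byte character
            } else if ((lead & 0xE0) == 0xC0) {
                pos += 2; // 110xxxxx: two-byte character
            } else {
                pos += 3; // 1110xxxx: three-byte character
            }
        }
        return pos;
    }

    public static void main(String[] args) {
        // One-, two-, three-, and one-byte characters, in that order.
        byte[] data = "a\u00e9\u20acb".getBytes(StandardCharsets.UTF_8);
        int afterThree = skipChars(data, 0, 3);
        System.out.println("offset after skipping 3 chars: " + afterThree);
        System.out.println("total bytes: " + data.length);
    }
}
```

The loop is still linear in the number of characters, which is the remaining inefficiency; a length stored in bytes would allow a single seek instead.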
