Lucene and the Corporate Environment

I love statements like these…

Enterprise Search: Lucene/Solr Meet-Up in New York City 7/22
The catch is that the project technology is really enterprise scale; the packaging still leaves something to be desired to really succeed in the corporate environment.

Apache Lucene/Solr is used in more companies than a large majority, if not all, of the commercial vendors out there combined, and yet it is somehow perceived as not “succeeding” in the corporate environment due to lack of “packaging” (at least the quote is right about the enterprise scale nature of Lucene/Solr). If the list of companies using Lucene is not a list of “corporate” environments, then I don’t know what corporate means. If by corporate packaging you mean it has a lot of bloat and charges exorbitant license fees, then no, unfortunately, Lucene is not ready to succeed in the corporate environment. If by corporate environment you mean one where it is used to save time/money/energy, then Lucene should break out the khakis and button-down shirt and start punching the clock.

Not a week goes by (often it’s more frequent than that, and that’s just little old me) that I’m not on a call to replace one of the supposed “corporate” solutions with Lucene and Solr. In most cases the work involved is helping them model their domain using Lucene or Solr, not holding their hand due to lack of “packaging”. The fact is, that modeling is required by all the vendors due to the nature of search. Frankly, sometimes I think the analysts make up the “packaging” bit just so they can have something to talk about, while the rest of us needing search in the real world just go along on our merry way, solving the real problem of making our content findable using Lucene and Solr.

7 Responses to “Lucene and the Corporate Environment”

  1. [...] read an article posted on the Lucene blog – “Lucene and the Corporate Environment” If the list of companies using Lucene are not “corporate” environments, then I don’t [...]

  2. I would interpret “packaging” as being more than just the arrangement of bits and the wrapping around it. Deploying and maintaining a search engine is hardly a trivial operation, and having a proper deployment project followed by maintenance service tied up with the “purchase” of the product is really valuable in environments where people don’t have the luxury of the time to really understand the technology or to hang around on mailing lists asking questions and picking up advice.

    So, here’s how I would package an “enterprise search” product based on Apache Lucene:

    * The bits from official Apache Lucene releases
    * Set of useful extra tools and plugins (proprietary and/or open source)
    * Good integration for a smooth OOTB experience
    * Guides for getting started with, using and administering the search engine
    * List of supported platforms with recommended scale (backed by an organized QA process)
    * Option to do a deployment project (at cost X/day) with a (certified?) Lucene search expert
    * Option for a maintenance service (at costs per level) backed by an organized support team with a ticketing system

    That’s quite a bit of added value on top of the plain open source project, and I’m sure there are many places that are happy to pay for that value.

    As far as I can tell, Lucid seems to be on a good track to providing all of these items. When will we see a product coming out?

  3. Yeah, Jukka, I know. My point is mostly that this notion of “packaging” seems to be a bit overrated, given all the people (and that number is very high) who have somehow gotten past it. Like me when I started on Lucene, they aren’t necessarily IR experts by training.

    Lucene, of course, is a very different beast from Solr. Lucene is and will always be a Java library. Of course that takes more work.

    As for all the other bullet points, Lucid (http://www.lucidimagination.com) offers all of them.

  4. > given all the people that have gotten past it

    Agreed, though as shown by the original statement and the trackback above, not everyone is in this position, for either real or perceived reasons. And often those people can and want to work around the issue by throwing money at it.

    In such cases it’s often more attractive to throw that money at an established vendor with a “packaged” solution than to hire someone to set up a custom system based on an open source project. Even when the latter would be technically superior and much cheaper!

    So yeah, while the notion of “packaging” may well be overrated, there’s still a good point here. Instead of “really succeed” in the quote above, I’d say “better succeed”.

  5. Sure, I get where the “packaging” thing comes from, especially with regard to pure Java Lucene, which is just a library, i.e. a bunch of APIs. However, the point I’m trying to make is that the people who throw out the “packaging” argument make it sound like it is so hard to do that you need a PhD in Information Retrieval to implement Lucene or Solr. That simply isn’t true, as is evidenced by the VERY LARGE number of people who have already done it.

    I’ve now seen enough installations of “packaged” vendors that I feel comfortable saying Solr, in most cases, requires roughly the same amount of work as any of them, if not less.

  6. This is a sentiment I see in other open source projects too. It somehow has to do with the fact that a product that the engineers can download and use for free often sneaks in the back door and becomes part of the landscape. A proprietary package has to go through a formal vetting process, and hardly ever arrives unnoticed. And yes, the open source products often deliver more value, albeit with less fanfare.

    You said something, though, that helped me put a finer point on why the Drupal integration with Solr is so cool: “The fact is, that modeling is required by all the vendors due to the nature of search.” Yes! And in Drupal, you do the modeling while you build your site, and Drupal then conveys this model to Solr automatically, so in most cases, the site builder doesn’t need to pick a special model for the search – it’s already there as a result of deciding how to build the site. Thanks for stating it so succinctly.

    (the Drupal module: http://drupal.org/project/apachesolr )

  7. [...] This blog post includes the following very relevant thought: Apache Lucene/Solr is used in more companies than a large majority, if not all, of the commercial vendors out there combined [...]
