Lucene Revolution in May 2012

In case you haven’t signed up already, Lucene Revolution is returning to Boston in May of this year, albeit at a different venue.  You can learn more at www.lucenerevolution.org.  I was just reviewing the submitted talks, and it looks to be another good conference.

The Art of Harnessing Unwieldy Data Big & Small

I’ll be doing a live panel interview today with DM Radio on The Art of Harnessing Unwieldy Data Big & Small.  Click the link to register.  Looks to be an interesting discussion on dealing with unstructured content.

Looking for a Research Engineer

I’m looking for a Research Engineer with Hadoop and Solr experience to work on next generation search and big data problems.  If you are interested or know someone who is, please take a look at Careers – Research Engineer | Lucid Imagination.

Berlin Buzzwords 2012

In case you haven’t heard, and are in Europe this June (or want to be), you should check out the Berlin Buzzwords conference.  It’s a great conference for all things related to Lucene, Solr, Hadoop, Mahout, NoSQL and generally scaling.  The CFP is open now through March 11.

Taming Text Update

Drew, Tom and I are feverishly working away on finishing up Taming Text.  We are currently addressing the feedback we got from our final review and should have updates up soon.  I have also posted all of the book’s source code on GitHub under the Taming Text user.  The source includes, amongst other things, a simple Question Answering system using Solr and OpenNLP, as well as analyzers for Lucene that use OpenNLP for sentence detection, part of speech tagging and Named Entity Recognition.  As with most books, these examples are meant to be just that, examples.

Mahout in Action Review

I’ve posted my review of “Mahout in Action” on Lucid’s website: Mahout in Action Review.

TriHUG Next Meeting featuring Josh Patterson of Cloudera set for Oct. 11

Just a few more days until the next Triangle Hadoop Users Group meeting.  Get the details and sign up via Triangle Hadoop Users Group, TriHUG Next Meeting featuring Josh Patterson of Cloudera set for Oct. 11.

Lucid Imagination » Flexible ranking in Lucene 4

For those who have wanted other scoring models in Lucene/Solr (Okapi BM25, among others), more details can be found on Lucid’s blog: Lucid Imagination » Flexible ranking in Lucene 4.

R in Action

Just ordered “R in Action” from Manning.  Looking forward to learning more about R, as it comes up often when discussing how to solve problems smaller than what is appropriate for Apache Mahout.  Hopefully, I will have time to post a review in the coming weeks.

TriHUG Next Meeting: Sept. 13 @ Bronto Software

Triangle Hadoop Users Group, Next Meeting: Sept. 13 @ Bronto Software.

Ted Dunning of Mahout fame will be speaking at the next TriHUG meeting on MapR and its relationship with Hadoop, etc.