Archive for the 'Apache' Category

How Rackspace Now Uses MapReduce and Hadoop to Query Terabytes of Data | High Scalability

Nice article on how the Lucene/Hadoop/Solr stack was used to solve a really big problem. Someday (when we have actual code), I hope they can add Mahout to the equation and do even more interesting things with the data.

Good Math, Bad Math : Databases are hammers; MapReduce is a screwdriver.

Well-stated response to a criticism of MapReduce. Adding my own two cents, I once used Hadoop, a free, open source implementation of MapReduce (M/R), in a proof-of-concept implementation to automatically translate (as in machine translation) a large (in my [...]

Apache Mahout - Overview

It’s official!  Mahout is now an official subproject of Lucene at the Apache Software Foundation.  Mahout’s goal is to create a suite of practical, scalable machine learning libraries.

Coderspiel / The right tool for the slob

This guy’s comment system wasn’t working at the moment, so I will leave my comment here. This won’t make much sense without reading the post first:
It’s funny you mention Wikipedia as an example, since they are running Lucene. As are Technorati and the Internet Archive. [...]