Good Math, Bad Math : Databases are hammers; MapReduce is a screwdriver.

A well-stated response to a criticism of MapReduce.  Adding my own two cents: I once used Hadoop, a free, open source implementation of MapReduce (M/R), in a proof of concept to machine-translate a large (by my standards) collection of documents from Arabic to English.  That’s something that would be really hard to do in a database.  Besides, I had a bunch of dumb old machines lying around, while I didn’t have a $1 million-plus Oracle license lying around.

Other things M/R is nice for: crawling (see Nutch), parallel indexing for search engines, log analysis, machine learning, etc.
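
To make the log analysis case a bit more concrete, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It runs in a single process and does not use Hadoop; the log format (`path status`) and all function names are assumptions made up for illustration.

```python
from collections import defaultdict


def map_line(line):
    """Map step: emit (status_code, 1) for one log line.

    Assumes a hypothetical log format of "path status".
    Lines with fewer than two fields are skipped.
    """
    fields = line.split()
    if len(fields) >= 2:
        yield fields[1], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_line(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


# Made-up sample data in the assumed "path status" format.
SAMPLE_LOG = [
    "/index.html 200",
    "/missing 404",
    "/index.html 200",
]

if __name__ == "__main__":
    print(run_job(SAMPLE_LOG))
```

In a real M/R framework the map calls would be spread across many machines and the shuffle would move data over the network, but the shape of the program stays roughly the same.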

Finally, I first started doing parallel programming a fair number of years ago (remember the CM-5 from Thinking Machines?), and we used the Message Passing Interface (MPI) APIs, among others.  As the author of the article above stresses, M/R is good for SOME large-scale programs (see the new Mahout project at Apache for some examples).  There are some problems that are really large and just don’t fit the M/R model.  As with anything you do in life, take the time to figure out which tool is right for your problem.  You may have to rise above yourself and learn something new.
