Archive for July, 2008

HP, Intel and Yahoo To Research Cloud Computing - Yahoo News

Boy, this could really come in handy in Open Source, especially for projects like Mahout, Nutch and distributed Solr. My biggest personal challenge on Mahout is access to computing resources; I don’t have the financial backing to buy much time on Amazon EC2. [...]

MySQL, Solr and “Communications link failure”

So, I was indexing 10+ million records from MySQL into Solr and kept coming across the following odd MySQL exception:
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
Last packet sent to the server was 4467745 ms ago

at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1074)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2985)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2871)

In my code, I loop over a JDBC ResultSet and add the records to Solr per the Solr field schema, mapping [...]

Apache Hadoop Wins Terabyte Sort Benchmark (Hadoop and Distributed Computing at Yahoo!)

Congrats to the Hadoop team!  Score one for Open Source!