MySQL, Solr and “Communications link failure”
So, I was indexing a 10+ million records in MySQL into Solr and kept coming across the following odd MySQL exception:
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure Last packet sent to the server was 4467745 ms ago ...com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1074) at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2985) at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2871) at
In my code, I loop over a JDBC ResultSet and add the records to Solr per the Solr field schema, mapping columns to fields, etc. This would happen after getting through something like 9M+ records. After some tracking down, hypothesizing, talking with others, we came to the conclusion that the issue was a combination of having the autocommit value in Solr set and MySQL timing out the ResultSet, such that when Lucene had to do a large merge (even in the background), Solr had to wait for said merge to finish, thus keeping the ResultSet open too long w/o activity. Now, these large merges can take some time. They can happen in the background, but Solr can’t refresh it’s IndexReader until the merge finishes, AIUI. Thus, we’re stuck in the middle of a ResultSet loop, holding the cursor open past MySQL’s default setting (600 seconds, more on that later), causing MySQL to kill the connection, and rightfully so. On the MySQL side of things, we are streaming the results, since it’s JDBC driver does not support setFetchSize() (ugh!). As it turns out MySQL has a Streaming timeout value named netTimeoutForStreamingResults (see http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html) which defaults to 600 seconds.
Long story short, I have at least two options:
- Turn off autocommit, meaning user’s won’t be able to see documents as soon as they may like
- Increase the netTimoutForStreamingResults value. This is great for MySQL and I have verified it works, but is not a generic value for other DBs, which our code supports
I am still deciding on what to do, and also thinking of some other options that can decouple DB retrieval from the indexing process. At any rate, I wanted to post the cause of my seeing this exception, because I did not see anyone else with this exception whose cause was due to a timeout during ResultSet processing and hopefully it will save them some time.

I avoid this by running my queries with limits while I reindex. So, you might query 100 rows at a time from your db, index them, then query the next 100 rows. You could also split up the indexing work across many machines with this approach then merge your index together after each separate process is finished. Or maybe use the new federated index features from solr. Happy indexing!
We use a queue (MSMQ) between database and Lucene, as well as between web crawlers and Lucene, that way indexing is asynchronous and you never get any such time out errors. Load up the queue from the database, and let SOLR index from the queue. You only need a small service to watch for new items in queue and submit to SOLR.
This is one of the reasons DataImportHandler re-opens connections inactive for more than 10 seconds because there is no JDBC driver-independent way of managing time-outs.
Bob, I am looking into the queue approach, as that seems most useful. Shalin, you are definitely right on connection timeouts, but this was on the ResultSet, so you’d have to reexecute the query (or automatically page, as Archie suggests) I think.