<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: MySQL, Solr and &#8220;Communications link failure&#8221;</title>
	<atom:link href="http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/feed/" rel="self" type="application/rss+xml" />
	<link>http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/</link>
	<description>Thoughts on Apache Lucene, Mahout, Solr, Tika and Nutch</description>
	<pubDate>Tue, 06 Jan 2009 02:40:33 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.5</generator>
		<item>
		<title>By: Glen Newton</title>
		<link>http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/#comment-6172</link>
		<dc:creator>Glen Newton</dc:creator>
		<pubDate>Fri, 21 Nov 2008 00:15:45 +0000</pubDate>
		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=87#comment-6172</guid>
		<description>I ran into the same problem with LuSql.

Originally I solved it with paging, taking something like 5000 records at a time.
However, for large ResultSets, MySQL takes longer and longer with each subsequent query, as it seems to run through all the preceding records on the server end. After ~500,000 records, each query starts taking tens of seconds...

I settled on counting the records as they go by, catching the exception when it is thrown, then re-issuing the original query with a LIMIT offset set to the last count.</description>
		<content:encoded><![CDATA[<p>I ran into the same problem with LuSql.</p>
<p>Originally I solved it with paging, taking something like 5000 records at a time.<br />
However, for large ResultSets, MySQL takes longer and longer with each subsequent query, as it seems to run through all the preceding records on the server end. After ~500,000 records, each query starts taking tens of seconds&#8230;</p>
<p>I settled on counting the records as they go by, catching the exception when it is thrown, then re-issuing the original query with a LIMIT offset set to the last count.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: grant_ingersoll</title>
		<link>http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/#comment-6081</link>
		<dc:creator>grant_ingersoll</dc:creator>
		<pubDate>Fri, 18 Jul 2008 19:55:39 +0000</pubDate>
		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=87#comment-6081</guid>
		<description>Bob, I am looking into the queue approach, as that seems most useful.  Shalin, you are definitely right on connection timeouts, but this failure was on the ResultSet, so I think you'd have to re-execute the query (or automatically page, as Archie suggests).</description>
		<content:encoded><![CDATA[<p>Bob, I am looking into the queue approach, as that seems most useful.  Shalin, you are definitely right on connection timeouts, but this failure was on the ResultSet, so I think you&#8217;d have to re-execute the query (or automatically page, as Archie suggests).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Shalin</title>
		<link>http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/#comment-6080</link>
		<dc:creator>Shalin</dc:creator>
		<pubDate>Fri, 18 Jul 2008 14:56:45 +0000</pubDate>
		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=87#comment-6080</guid>
		<description>This is one of the reasons DataImportHandler re-opens connections that have been inactive for more than 10 seconds: there is no JDBC driver-independent way of managing time-outs.</description>
		<content:encoded><![CDATA[<p>This is one of the reasons DataImportHandler re-opens connections that have been inactive for more than 10 seconds: there is no JDBC driver-independent way of managing time-outs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bob Stewart</title>
		<link>http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/#comment-6077</link>
		<dc:creator>Bob Stewart</dc:creator>
		<pubDate>Thu, 17 Jul 2008 13:07:13 +0000</pubDate>
		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=87#comment-6077</guid>
		<description>We use a queue (MSMQ) between the database and Lucene, as well as between the web crawlers and Lucene. That way indexing is asynchronous and you never get any such timeout errors.  Load up the queue from the database, and let Solr index from the queue.  You only need a small service to watch for new items in the queue and submit them to Solr.</description>
		<content:encoded><![CDATA[<p>We use a queue (MSMQ) between the database and Lucene, as well as between the web crawlers and Lucene. That way indexing is asynchronous and you never get any such timeout errors.  Load up the queue from the database, and let Solr index from the queue.  You only need a small service to watch for new items in the queue and submit them to Solr.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Archie</title>
		<link>http://lucene.grantingersoll.com/2008/07/16/mysql-solr-and-communications-link-failure/#comment-6076</link>
		<dc:creator>Archie</dc:creator>
		<pubDate>Wed, 16 Jul 2008 22:25:00 +0000</pubDate>
		<guid isPermaLink="false">http://lucene.grantingersoll.com/?p=87#comment-6076</guid>
		<description>I avoid this by running my queries with limits while I reindex. So, you might query 100 rows at a time from your db, index them, then query the next 100 rows. With this approach you could also split up the indexing work across many machines, then merge the indexes together after each separate process has finished. Or maybe use the new federated index features from Solr. Happy indexing!</description>
		<content:encoded><![CDATA[<p>I avoid this by running my queries with limits while I reindex. So, you might query 100 rows at a time from your db, index them, then query the next 100 rows. With this approach you could also split up the indexing work across many machines, then merge the indexes together after each separate process has finished. Or maybe use the new federated index features from Solr. Happy indexing!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
