<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>techfounder &#187; range select</title>
	<atom:link href="http://www.techfounder.net/tag/range-select/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.techfounder.net</link>
	<description>Blog about web development and Internet entrepreneurship</description>
	<lastBuildDate>Mon, 21 Jun 2010 19:41:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Selecting closest values in MySQL</title>
		<link>http://www.techfounder.net/2009/02/02/selecting-closest-values-in-mysql/</link>
		<comments>http://www.techfounder.net/2009/02/02/selecting-closest-values-in-mysql/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 03:12:54 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Web development]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[range select]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=213</guid>
		<description><![CDATA[Sometimes the need arises to select several values in the vicinity of a certain value, preferably ordered by proximity. The values might be dates, zip-codes or any other meaningfully ordered values that can be represented as numerical values. How can we pull this off in MySQL? We can't use a simple ORDER BY, since we [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes the need arises to select several values in the vicinity of a certain value, preferably ordered by proximity. The values might be dates, zip-codes or any other meaningfully ordered values that can be represented as numerical values. How can we pull this off in MySQL?<br />
<span id="more-213"></span><br />
We can't use a simple ORDER BY on the column itself, since we want values both larger and smaller than our selected value. We can, however, order by an expression that calculates the distance from our selected value.</p>
<p>Suppose we want to find the 6 numbers closest to 2500 (2500 itself included, if it is present) in a numbers table:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> number, ABS<span style="color: #66cc66;">&#40;</span> number - <span style="color: #cc66cc;">2500</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> distance
<span style="color: #993333; font-weight: bold;">FROM</span> numbers
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> distance
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span></pre>
<p>This returns:</p>
<table id="table_results" class="data" border="0">
<thead>
<tr>
<th>number</th>
<th>distance</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>2502</td>
<td>2</td>
</tr>
<tr class="even">
<td>2494</td>
<td>6</td>
</tr>
<tr class="odd">
<td >2508</td>
<td>8</td>
</tr>
<tr class="even">
<td >2489</td>
<td>11</td>
</tr>
<tr class="odd">
<td >2513</td>
<td>13</td>
</tr>
<tr class="even">
<td >2487</td>
<td>13</td>
</tr>
</tbody>
</table>
<blockquote><p>(6 total, Query took 2.2792 sec)</p></blockquote>
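<p>Note that 2513 and 2487 are tied at a distance of 13, and MySQL does not guarantee the order of rows with equal sort keys. If a stable order matters, one option (a sketch, not one of the timed queries in this post) is to add the number itself as a secondary sort key:</p>
<pre class="sql">-- Same query as above, with `number` added as a tie-breaker
SELECT number, ABS( number - 2500 ) AS distance
FROM numbers
ORDER BY distance, number
LIMIT 6</pre>
<p>With this ordering, the smaller of two equally distant numbers is listed first.</p>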
<p>This works nicely, and is reasonably fast for what it does when the number column is indexed. MySQL still has to calculate the distance for every number in the table and sort the results, yet on a ~2 million row table the query completes in just over 2 seconds.</p>
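<p>One way to check how MySQL executes the query is to prefix it with EXPLAIN. The exact output depends on the MySQL version and the table definition, so treat this as a sketch and not as a record of the original run:</p>
<pre class="sql">-- Shows the execution plan instead of running the query
EXPLAIN SELECT number, ABS( number - 2500 ) AS distance
FROM numbers
ORDER BY distance
LIMIT 6</pre>
<p>"Using filesort" in the Extra column would indicate that MySQL sorts the computed distances itself instead of reading rows in index order, and the rows column gives an estimate of how many rows it expects to examine.</p>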
<p>This kind of performance would be quite enough for smaller tables, but often in real time applications we would like faster response times than 2 seconds for completing a query.</p>
<p>Since we know exactly what we need, we can help MySQL by limiting the range of numbers it has to calculate the distance for. Since we want the 6 closest numbers, we can be sure they are all among the 6 nearest values below our selected value, the 6 nearest values above it, and the value itself if it exists in the table.</p>
<p>If we can do that, then the calculation would run for only 13 numbers, hopefully leading to much improved performance. Selecting those numbers can be done using a union on a couple of SELECT statements:</p>
<pre class="sql"><span style="color: #66cc66;">&#40;</span>
   <span style="color: #993333; font-weight: bold;">SELECT</span> number
   <span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
   <span style="color: #993333; font-weight: bold;">WHERE</span> number &gt;= <span style="color: #cc66cc;">2500</span>
   <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number
   <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">7</span>
<span style="color: #66cc66;">&#41;</span> UNION <span style="color: #993333; font-weight: bold;">ALL</span> <span style="color: #66cc66;">&#40;</span>
   <span style="color: #993333; font-weight: bold;">SELECT</span> number
   <span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
   <span style="color: #993333; font-weight: bold;">WHERE</span> number &lt; <span style="color: #cc66cc;">2500</span>
   <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number <span style="color: #993333; font-weight: bold;">DESC</span>
   <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span>
<span style="color: #66cc66;">&#41;</span></pre>
<p>Notice that the first SELECT includes the selected value itself (in case it exists), and hence has one more row in its LIMIT clause.</p>
<p>Combining this with our previous successful attempt produces the following ungainly query:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> number, ABS<span style="color: #66cc66;">&#40;</span> number - <span style="color: #cc66cc;">2500</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> distance <span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #66cc66;">&#40;</span>
	<span style="color: #66cc66;">&#40;</span>
		<span style="color: #993333; font-weight: bold;">SELECT</span> number
		<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
		<span style="color: #993333; font-weight: bold;">WHERE</span> number &gt;=<span style="color: #cc66cc;">2500</span>
		<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number
		<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">7</span>
	<span style="color: #66cc66;">&#41;</span> UNION <span style="color: #993333; font-weight: bold;">ALL</span> <span style="color: #66cc66;">&#40;</span>
		<span style="color: #993333; font-weight: bold;">SELECT</span> number
		<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
		<span style="color: #993333; font-weight: bold;">WHERE</span> number &lt; <span style="color: #cc66cc;">2500</span>
		<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number <span style="color: #993333; font-weight: bold;">DESC</span>
		<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span>
	<span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> n
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> distance
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span></pre>
<p>It does return the same results:</p>
<table id="table_results" class="data" border="0">
<thead>
<tr>
<th class="condition">number</th>
<th>distance</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="condition nowrap" align="right">2502</td>
<td class="nowrap" align="right">2</td>
</tr>
<tr class="even">
<td class="condition nowrap" align="right">2494</td>
<td class="condition nowrap" align="right">6</td>
</tr>
<tr class="odd">
<td class="condition nowrap" align="right">2508</td>
<td class="condition nowrap" align="right">8</td>
</tr>
<tr class="even">
<td class="condition nowrap" align="right">2489</td>
<td class="condition nowrap" align="right">11</td>
</tr>
<tr class="odd">
<td class="condition nowrap" align="right">2513</td>
<td class="condition nowrap" align="right">13</td>
</tr>
<tr class="even">
<td class="condition nowrap" align="right">2487</td>
<td class="condition nowrap" align="right">13</td>
</tr>
</tbody>
</table>
<p>And it is much faster:</p>
<blockquote><p>(6 total, Query took 0.0011 sec)</p></blockquote>
<p>And there you have it.</p>
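<p>The same pattern should carry over to other ordered values, such as the dates mentioned at the start. Here is a minimal sketch, assuming a hypothetical <code>events</code> table with an indexed <code>event_date</code> column of type DATE (neither name comes from the queries above), with DATEDIFF supplying the distance in days:</p>
<pre class="sql">-- Hypothetical table and column: events.event_date (DATE, indexed)
SELECT event_date, ABS( DATEDIFF( event_date, '2009-02-02' ) ) AS distance FROM (
	(
		-- Nearest dates on or after the target
		SELECT event_date
		FROM `events`
		WHERE event_date &gt;= '2009-02-02'
		ORDER BY event_date
		LIMIT 6
	) UNION ALL (
		-- Nearest dates before the target
		SELECT event_date
		FROM `events`
		WHERE event_date &lt; '2009-02-02'
		ORDER BY event_date DESC
		LIMIT 6
	)
) AS e
ORDER BY distance
LIMIT 6</pre>
<p>Both inner SELECTs are range conditions on the indexed column, so only a handful of candidate rows should reach the outer distance calculation.</p>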
<p>A couple of notes:<br />
- All queries were run with SQL_NO_CACHE at least 5 times to ensure the timings were indicative of the performance.<br />
- The queries were run against a table of ~2 million rows filled with randomly generated values.<br />
- An index was created on the number column.</p>
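<p>For anyone who wants to reproduce the test, the setup described in these notes might look roughly like this. The column type and the index name are assumptions, since only the table name and the indexed number column are given above:</p>
<pre class="sql">-- Assumed schema: INT column type and index name are illustrative
CREATE TABLE numbers (
	number INT NOT NULL,
	INDEX idx_number (number)
);

-- SQL_NO_CACHE keeps the query cache from serving or storing this result
SELECT SQL_NO_CACHE number, ABS( number - 2500 ) AS distance
FROM numbers
ORDER BY distance
LIMIT 6;</pre>
<p>The query cache was removed in MySQL 8.0, where SQL_NO_CACHE is deprecated and has no effect.</p>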
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2009/02/02/selecting-closest-values-in-mysql/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
