<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>techfounder &#187; MySQL</title>
	<atom:link href="http://www.techfounder.net/category/webdev/mysql/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.techfounder.net</link>
	<description>Blog about web development and Internet entrepreneurship</description>
	<lastBuildDate>Mon, 22 Aug 2011 08:36:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1</generator>
		<item>
		<title>Database optimization techniques you can actually use</title>
		<link>http://www.techfounder.net/2011/03/25/database-profiling-and-optimizing-your-database-the-generic-version/</link>
		<comments>http://www.techfounder.net/2011/03/25/database-profiling-and-optimizing-your-database-the-generic-version/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 01:20:35 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Web development]]></category>
		<category><![CDATA[optimization]]></category>
		<category><![CDATA[profiling]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=626</guid>
		<description><![CDATA[I just saw an article on Smashing Magazine titled "Speeding up your website's database". I love Smashing's contribution to the webdev community, but their articles are getting longer and more basic at the same time. I understand the need for simplicity because of the wide audience of Smashing Magazine, but I'd wish they'd give something [...]]]></description>
			<content:encoded><![CDATA[<p>I just saw an article on Smashing Magazine titled "<a target="blank" rel="nofollow" href="http://www.smashingmagazine.com/2011/03/23/speeding-up-your-websites-database/">Speeding up your website's database</a>". I love Smashing's contribution to the webdev community, but their articles are getting longer and more basic at the same time. </p>
<p>I understand the need for simplicity because of the wide audience of Smashing Magazine, but I wish they'd give something more than the absolute basics you could find on almost any other site out there. I also didn't like some of the methods mentioned there for profiling (or the code itself), so here is my starter guide to optimizing database performance.<br />
<span id="more-626"></span></p>
<h3>When do we optimize the database?</h3>
<p>As is noted in the article, page load speed is important. It affects the user-experience as well as Google pagerank when it gets too slow. There are so many variables to account for when trying to improve page load, including page download weight (including all the various assets such as images, javascript and CSS), network latency, browser cache and server headers, server load (requests per second and memory and CPU usage) among others. Yahoo has a <a target="blank" rel="nofollow" href="http://developer.yahoo.com/performance/rules.html">very nice guide</a> for client-side performance tips. We're going to suspect the database as the culprit for the purposes of this article, but you should first observe the complete picture before deciding on what to optimize.</p>
<p>Aside from page load speed, a busy database can affect the rest of the server as well, meaning parts that don't use the database or have very fast running queries could start to slow down.</p>
<h3>Profile first, optimize last</h3>
<p>The basic rule of optimization is to never assume - always verify, using actual data. The process of collecting performance metrics and determining performance issues is called <a target="blank" rel="nofollow"  href="http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29">profiling</a>. We want to know whether database performance is responsible for a significant part of our page load time.</p>
<p>Referring again to the Smashing Magazine article, the author suggests a profiling method that is basically correct, but the implementation leaves a lot to be desired. We won't go into why using globals and outputting inside functions is not good practice. The author even mentions that this could seriously mess up the layout of the site or the sessions, and yet makes no attempt to offer a better solution.</p>
<p>We want to time how long queries take to run. There are plenty of timing solutions out in the open, such as <a target="blank" rel="nofollow" href="http://pear.php.net/package/Benchmark/docs/latest/li_Benchmark.html">PEAR_Benchmark</a>, so there is simply no need to build your own unless you want the exercise. The concept is simple - store microtime() values before and after the query for later observation; the difference between them is the query's run time, with good accuracy.</p>
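<p>As a minimal sketch of that concept, timing a single call with microtime() could look something like this. The timeCall() helper and the usleep() stand-in for a query are my own illustration and are not part of PEAR_Benchmark or any other library:</p>
<pre class="php">// Hypothetical helper: runs $callback and returns its result together with the elapsed time
function timeCall($callback) {
    $start = microtime(true); // float seconds with microsecond resolution
    $result = call_user_func($callback);
    $elapsed = microtime(true) - $start;
    return array('result' =&gt; $result, 'seconds' =&gt; $elapsed);
}

// Stand-in for a real query: sleeps for about 10ms and returns a made-up row
$timing = timeCall(function () {
    usleep(10000);
    return array('id' =&gt; 1);
});
printf("Query took %.4f seconds\n", $timing['seconds']);</pre>
<p>Passing true to microtime() returns a float, which saves parsing the default "msec sec" string before subtracting.</p>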
<p>If you are using a database abstraction class (and you should), incorporating a timer to profile every query should be a piece of cake - so there is no need to hunt down every query and modify the code around it as suggested in the SM article. Wrap the query method of your abstraction class with the timer and use the queries as the keys in the timing array. We used the Zend Framework for all of our previous projects and for our current <a href="http://www.binpress.com">startup</a>, and it comes with built-in support for profiling which makes it a breeze to get started.</p>
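<p>If your abstraction class has no profiler of its own, wrapping its query method could be sketched roughly as follows. The ProfiledDb class is hypothetical, and the executor passed to it is a stand-in for whatever actually runs the SQL in your application:</p>
<pre class="php">// Hypothetical wrapper: $executor is any callable that takes an SQL string and runs it
class ProfiledDb {
    private $executor;
    private $timings = array(); // query text =&gt; list of run times in seconds

    public function __construct($executor) {
        $this-&gt;executor = $executor;
    }

    public function query($sql) {
        $start = microtime(true);
        $result = call_user_func($this-&gt;executor, $sql);
        // The query text is the key, so repeated queries are grouped together
        $this-&gt;timings[$sql][] = microtime(true) - $start;
        return $result;
    }

    public function getTimings() {
        return $this-&gt;timings;
    }
}

// Stand-in executor; a real one would call your database adapter
$profiled = new ProfiledDb(function ($sql) {
    return array();
});
$profiled-&gt;query('SELECT id,name FROM categories');
$profiled-&gt;query('SELECT id,name FROM categories');
foreach ($profiled-&gt;getTimings() as $sql =&gt; $runs) {
    printf("%s: %d runs, %.6f seconds total\n", $sql, count($runs), array_sum($runs));
}</pre>
<p>Keeping a list of run times per query (rather than a single value) also shows how often each query runs on a page, which is handy for spotting queries that are executed in a loop.</p>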
<p>Example code using <a target="blank" rel="nofollow" href="http://framework.zend.com/manual/en/zend.db.profiler.html">Zend_Db_Profiler</a></p>
<pre class="php"><span style="color: #0000ff;">$db</span> = Zend_Db::<span style="color: #006600;">factory</span><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'PDO_MYSQL'</span>, <span style="color: #0000ff;">$config</span><span style="color: #66cc66;">&#41;</span>; <span style="color: #808080; font-style: italic;">//Set up the database object</span>
 <span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">getProfiler</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>-&gt;<span style="color: #006600;">setEnabled</span><span style="color: #66cc66;">&#40;</span><span style="color: #000000; font-weight: bold;">true</span><span style="color: #66cc66;">&#41;</span>; <span style="color: #808080; font-style: italic;">// turn on profiler</span>
&nbsp;
<span style="color: #808080; font-style: italic;">//Queries are performed on the page</span>
<span style="color: #808080; font-style: italic;">//...</span>
&nbsp;
<span style="color: #808080; font-style: italic;">// Where we want to show the results</span>
<span style="color: #0000ff;">$profiles</span> = <span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">getProfiler</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> -&gt; <span style="color: #006600;">getQueryProfiles</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>; <span style="color: #808080; font-style: italic;">// An array of query profiles, fetched from the profiler attached to $db</span></pre>
<p>If you are using Firebug (which you should if you are running Firefox) you can install the FirePHP plugin for completely unobtrusive output. The Zend Framework comes with a FirePHP profiler that can send query results directly into your Firebug console.</p>
<pre class="php"><span style="color: #0000ff;">$profiler</span> = <span style="color: #000000; font-weight: bold;">new</span> Zend_Db_Profiler_Firebug<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'All DB Queries'</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #0000ff;">$profiler</span> -&gt; <span style="color: #006600;">setEnabled</span><span style="color: #66cc66;">&#40;</span><span style="color: #000000; font-weight: bold;">true</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">setProfiler</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$profiler</span><span style="color: #66cc66;">&#41;</span>;
&nbsp;
<span style="color: #808080; font-style: italic;">//Queries are performed on the page</span>
<span style="color: #808080; font-style: italic;">//...</span>
<span style="color: #808080; font-style: italic;">// No need to output, query profiles will appear in your Firebug console</span></pre>
<p>Pretty convenient. We can go over page by page without disturbing content or messing up sessions and even run this in a production environment, provided we load the profiler only for specific users. The output looks something like this:</p>
<p><img title="firePHP results" src="http://www.techfounder.net/wp-content/uploads/2011/03/firephp.png" alt="" /></p>
<p><b>It's very important to profile using a relevant dataset</b>. If you profile on a development machine that has very different data (and probably much smaller tables) than your production machine, you will get very different results. You should create a test machine that resembles your live dataset as much as possible to get relevant data (as I've shown in an <a href="http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/">old article about profiling</a>).</p>
<p>Another important note is to avoid looking at cached results. MySQL will cache certain queries - so verify the results of your profiling by running the queries while avoiding caches using <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/query-cache-in-select.html">SQL_NO_CACHE</a> and <a target="blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2007/09/12/query-profiling-with-mysql-bypassing-caches/">other means</a>. Compare the first run with subsequent runs of the same query to be sure you are seeing non-cached results.</p>
<p>Aside from profiling the queries in real time, we can also profile queries that are used by daemons and cron jobs and log the results to a file. MySQL has a <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html">built-in feature</a> that can log slow queries for us while the database daemon is running. As of MySQL 5.1.21 we can get microsecond timing on queries (previously only one-second jumps were supported), so we can get very good measurements with the slow-query log.</p>
<p>The slow query log should be used for monitoring and be checked periodically for possible problems. The rate at which the log fills up also gives an indication of how much your database is slowing down over time and how much time you have left before you need to optimize.</p>
<h2>Optimizing performance</h2>
<p>Suppose we found some problematic queries on slow pages. There are 4 basic ways to optimize query performance:</p>
<ul>
<li>Rewrite the queries</li>
<li>Change indexing strategy</li>
<li>Change schema</li>
<li>Use an external cache</li>
</ul>
<h3>Examining query execution plans (EXPLAIN)</h3>
<p>Before trying to optimize a slow query, we need to understand what makes it slow. For this purpose MySQL has a query examination tool called <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/explain.html">EXPLAIN</a>. Add the reserved word 'EXPLAIN' at the beginning of your query to get the execution plan for the query. The execution plan literally 'explains' to us how the database intends to execute the query: which tables it reads, in what order, and which indexes it uses. The MySQL manual has <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.0/en/explain-output.html">a full reference guide</a> to the different values that appear in the plan, and you can see a full walkthrough of using EXPLAIN to optimize a query in <a target="blank" rel="nofollow" href="http://www.slideshare.net/phpcodemonkey/mysql-explain-explained">this slideshow on slideshare</a> (as well as in the profiling article I linked to earlier).</p>
<p>This is a very useful tool, but like all other tools it should be used while <a target="blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2006/07/24/mysql-explain-limits-and-errors/">being aware of its limitations</a>.</p>
<h3>Common optimizations</h3>
<p><b>1. Looping queries</b></p>
<p>The most basic performance issues often will not be the fault of the database itself. One of the most common mistakes is to query in a loop without need. Most looped SELECT queries can likely be rewritten as a JOIN -</p>
<pre class="php">&nbsp;
<span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">'SELECT id,name FROM categories'</span>;
<span style="color: #0000ff;">$rows</span> = <span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">fetchAll</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #b1b100;">foreach</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$rows</span> <span style="color: #b1b100;">as</span> <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
     <span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">'SELECT id,name FROM sub_categories WHERE category_id='</span> . <span style="color: #66cc66;">&#40;</span>int<span style="color: #66cc66;">&#41;</span> <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#91;</span><span style="color: #ff0000;">'id'</span><span style="color: #66cc66;">&#93;</span>;
     <span style="color: #0000ff;">$subCategories</span> = <span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">fetchAll</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;
     <span style="color: #808080; font-style: italic;">//...</span>
<span style="color: #66cc66;">&#125;</span></pre>
<p>(I'm using Zend Framework syntax since we already assumed we are using a database abstraction class)</p>
<p>This could be rewritten as a join -</p>
<pre class="php">&nbsp;
<span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">'SELECT categories.id,categories.name,
sub_categories.id AS subcat_id,sub_categories.name AS subcat_name
FROM categories
LEFT JOIN sub_categories ON categories.id=sub_categories.category_id'</span>;
<span style="color: #0000ff;">$rows</span> = <span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">fetchAll</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #b1b100;">foreach</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$rows</span> <span style="color: #b1b100;">as</span> <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
   <span style="color: #808080; font-style: italic;">// Require a bit of additional logic to format the results, but we have one query instead of many</span>
<span style="color: #66cc66;">&#125;</span></pre>
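<p>The "additional logic" for formatting the joined rows could be sketched as below. The sample rows are made up to mirror the column aliases used in the JOIN query, and the nested array layout is just one possible shape:</p>
<pre class="php">// Sample rows in the shape returned by the JOIN query (values are made up)
$rows = array(
    array('id' =&gt; 1, 'name' =&gt; 'Fruit', 'subcat_id' =&gt; 10, 'subcat_name' =&gt; 'Apples'),
    array('id' =&gt; 1, 'name' =&gt; 'Fruit', 'subcat_id' =&gt; 11, 'subcat_name' =&gt; 'Pears'),
    array('id' =&gt; 2, 'name' =&gt; 'Empty', 'subcat_id' =&gt; null, 'subcat_name' =&gt; null),
);

$categories = array();
foreach ($rows as $row) {
    $id = $row['id'];
    // Create the category entry the first time we see it
    if (!isset($categories[$id])) {
        $categories[$id] = array('name' =&gt; $row['name'], 'sub_categories' =&gt; array());
    }
    // LEFT JOIN yields NULL sub-category columns for categories without children
    $subId = $row['subcat_id'];
    if ($subId !== null) {
        $categories[$id]['sub_categories'][$subId] = $row['subcat_name'];
    }
}</pre>
<p>The result is keyed by category id, with each category holding its own list of sub-categories, which is the same structure the looped version produced with one query per category.</p>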
<p>Inserting and updating rows in a loop can have major overhead as well. Those queries are generally slower than simple SELECT queries (since indexes often need to be updated), and they affect the performance of other queries since they hold table / row locks while the data is written (this differs <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/internal-locking.html">depending on the table engine</a>). I wrote an article almost two years ago on <a href="http://www.techfounder.net/2009/05/14/multiple-row-operations-in-mysql-php/" target="blank" rel="nofollow">multiple row operations</a> that covers how to rewrite looped INSERT / UPDATE queries and includes some benchmarks to show how it improves performance.</p>
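<p>To give a rough idea of what replacing a looped INSERT looks like, here is a sketch that builds a single multi-row INSERT with placeholders. The buildMultiInsert() helper is hypothetical (it is not the code from the linked article), and it assumes the table and column names come from trusted code, since they are concatenated directly into the SQL:</p>
<pre class="php">// Hypothetical helper: builds one multi-row INSERT for use with a prepared statement
function buildMultiInsert($table, array $columns, array $rows) {
    // One "(?,?,...)" group per row
    $rowPlaceholder = '(' . implode(',', array_fill(0, count($columns), '?')) . ')';
    $sql = 'INSERT INTO ' . $table . ' (' . implode(',', $columns) . ') VALUES '
         . implode(',', array_fill(0, count($rows), $rowPlaceholder));
    // Flatten the row values in the same order as the placeholders
    $params = array();
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }
    return array($sql, $params);
}

list($sql, $params) = buildMultiInsert('sub_categories', array('category_id', 'name'), array(
    array('category_id' =&gt; 1, 'name' =&gt; 'Apples'),
    array('category_id' =&gt; 1, 'name' =&gt; 'Pears'),
));
// $sql:    INSERT INTO sub_categories (category_id,name) VALUES (?,?),(?,?)
// $params: array(1, 'Apples', 1, 'Pears')</pre>
<p>The statement and parameters can then be handed to whatever prepared-statement method your database class provides, so the rows are written in one round trip instead of one per row.</p>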
<p><b>2. Picking only needed columns</b></p>
<p>It is common to see a wildcard used to pick all columns ('SELECT * FROM ... '). This, however, is not efficient. Depending on the number of participating columns and their type (especially large types such as the TEXT variants), we could be selecting much more data from the database than we actually need. The query will take longer to return since it needs to transfer more data (from the hard-disk if it doesn't hit the cache) and it will take up more memory doing so.</p>
<p>Picking only the needed columns is a good general practice to use, and avoids those problems.</p>
<p><b>3. Filtering rows correctly and using indexes</b></p>
<p>Our main goal is to select only the rows we need and to do so in the fastest way possible. We want to filter rows using indexes, and in general we want to avoid full table scans unless they are absolutely needed (aside from edge cases where <a target="blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2006/06/02/indexes-in-mysql/">it actually improves performance</a>). The MySQL manual <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/where-optimizations.html">has some great information</a> on optimizing the WHERE clause, and I'll dive into a bit more detail - </p>
<p>Filtering conditions include the WHERE, ON (for joins) and HAVING clauses. As much as possible, <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/how-to-avoid-table-scan.html">we want those clauses to hit indexes</a> - unless we are selecting a very large amount of rows, index lookup is much faster than a full table scan. Those clauses should be used along with the LIMIT clause if relevant to filter the amount of rows / data returned by the query. The LIMIT clause itself can lend some important optimizations for queries <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/limit-optimization.html">if used correctly</a>.</p>
<p>Since our goal is to hit indexes with our WHERE clause, an important rule is to avoid using calculations there. When the filtering condition has to be calculated for each row, the WHERE clause cannot use an index.</p>
<p>Example - fetching users created in the last 4 weeks:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> id,name <span style="color: #993333; font-weight: bold;">FROM</span> users <span style="color: #993333; font-weight: bold;">WHERE</span> created - NOW<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> &lt; INTERVAL <span style="color: #cc66cc;">4</span> WEEK</pre>
<p>Since the value of the `created` column changes from row to row, we now have a calculation in the left-hand side of the condition. This could be rewritten so that the calculation doesn't change and thus will only be performed once:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> id,name <span style="color: #993333; font-weight: bold;">FROM</span> users <span style="color: #993333; font-weight: bold;">WHERE</span> created &gt; NOW<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> - INTERVAL <span style="color: #cc66cc;">4</span> WEEK</pre>
<p>This query can use an index on the `created` column and should perform much better.</p>
<p>Another less common calculation but a very problematic one is <a target="blank" rel="nofollow" href="http://en.wikipedia.org/wiki/Correlated_subquery">correlated subqueries</a> in the WHERE clause.</p>
<p>Selecting the lowest priced fruit from several fruit types:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> type, variety, price
<span style="color: #993333; font-weight: bold;">FROM</span> fruits
<span style="color: #993333; font-weight: bold;">WHERE</span> price = <span style="color: #66cc66;">&#40;</span>
    <span style="color: #993333; font-weight: bold;">SELECT</span> MIN<span style="color: #66cc66;">&#40;</span>price<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">FROM</span> fruits <span style="color: #993333; font-weight: bold;">AS</span> f <span style="color: #993333; font-weight: bold;">WHERE</span> f.type = fruits.type
<span style="color: #66cc66;">&#41;</span></pre>
<p>(Example taken from the <a target="blank" rel="nofollow" href="http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/">excellent Xaprb article</a> on the topic - you should read it). This query could be rewritten as a JOIN, moving the subquery out of the WHERE clause - </p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> f.type, f.variety, f.price
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #66cc66;">&#40;</span>
   <span style="color: #993333; font-weight: bold;">SELECT</span> type, MIN<span style="color: #66cc66;">&#40;</span>price<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> minprice
   <span style="color: #993333; font-weight: bold;">FROM</span> fruits
   <span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> type
<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> minfruits
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> fruits <span style="color: #993333; font-weight: bold;">AS</span> f <span style="color: #993333; font-weight: bold;">ON</span> f.type = minfruits.type <span style="color: #993333; font-weight: bold;">AND</span> f.price = minfruits.minprice</pre>
<p>Ideally, MySQL would've optimized both of those the same, but since it is usually not the case, rewriting correlated subqueries as joins is preferred.</p>
<p><b>4. Indexing correctly</b></p>
<p>Whether optimizations are needed or not is dependent on the EXPLAIN results we mentioned previously. If the execution plan indicates an index is not being used or a non-selective index has been picked, we need to understand why and change our indexing strategy accordingly (or use <a target="_blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/index-hints.html">index hints</a>). </p>
<p>MySQL will generally use one index per table alias in a query, so we need to plan our indexes to maximize their effectiveness. Using more indexes than is necessary can have adverse effects, as it slows down INSERT and UPDATE queries while taking up more memory. Some indexes can even <a target="_blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2007/08/28/do-you-always-need-index-on-where-column/">slow down performance</a> depending on their selectivity.</p>
<p>If we use ordering clauses such as ORDER BY and GROUP BY, our indexes should often be composite indexes (indexes covering more than one column) to allow for both the filtering and ordering to use an index.</p>
<p><b>5. Picking the right engine for your data</b></p>
<p>MySQL has <a target="_blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.5/en/storage-engines.html">a pluggable engine design</a>, which allows you to use different engine types to store your data, each with its own advantages and drawbacks. The two main engines are MyISAM and InnoDB, and the differences between them affect much more than just performance - InnoDB is an ACID compliant transactional engine, while MyISAM sacrifices some integrity and consistency features for simplicity and performance. Having said that, <a href="http://www.mysqlperformanceblog.com/2006/05/29/join-performance-of-myisam-and-innodb/">MyISAM is not necessarily faster</a>, only in some cases.</p>
<p>I use InnoDB for all of my tables unless I need a full-text index (a MyISAM only feature), since my tables usually have a lot of write activity. InnoDB uses row-level locks which really helps performance for such use, while MyISAM uses table locks for write operations. It also optimizes write operations by <a rel="nofollow" target="_blank" href="http://www.xaprb.com/blog/2006/07/04/how-to-exploit-mysql-index-optimizations/">treating indexes differently</a> than MyISAM - InnoDB uses <a target="_blank" rel="nofollow" href="http://www.xaprb.com/blog/2006/05/10/when-to-avoid-and-when-to-use-surrogate-keys-in-innodb-tables/">clustered indexes</a>, so picking the right primary key is critical.</p>
<h3>Caching</h3>
<p>If our optimizations do not yield sufficient performance benefits (either due to technical reasons or our own skill level), <a target="blank" rel="nofollow" href="http://en.wikipedia.org/wiki/Cache">caching</a> is a viable strategy to reduce database load. You should always try to optimize the database itself first, since caching will add another layer of complexity to our application.</p>
<p>MySQL has an internal query cache that caches results from frequently running queries if they <a target="blank" rel="nofollow" href="http://dev.mysql.com/doc/refman/5.1/en/query-cache-operation.html">meet certain requirements</a>. If our queries are cached by MySQL (this can be verified by running the queries several times), there is no need to cache them ourselves - we just need to be aware that a MySQL service restart could cause a noticeable slowdown while the cache is being primed again. You can read more about the <a target="blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2010/09/23/more-on-dangers-of-the-caches/">dangers of the internal cache</a> on the excellent MySQL performance blog (a must read for any serious MySQL user).</p>
<p>There are many caching strategies and that is the topic for another post. Common options include caching to disk (files) or caching to memory (using solutions such as <a target="blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2006/09/27/apc-or-memcached/">memcache or APC</a>). Another form of caching is to the database - by de-normalizing the schema to store data that is the result of expensive to run queries.</p>
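<p>Whichever storage you pick, the basic pattern is usually the same: check the cache first and only run the query on a miss. Here is a minimal sketch of that pattern. The ArrayCache class is a made-up in-memory stand-in for memcache, APC or files, and the cached() helper and the loader closure are illustrative names only:</p>
<pre class="php">// Hypothetical in-memory cache standing in for memcache / APC / files
class ArrayCache {
    private $items = array();

    public function get($key) {
        if (isset($this-&gt;items[$key]) &amp;&amp; $this-&gt;items[$key]['expires'] &gt; time()) {
            return $this-&gt;items[$key]['value'];
        }
        return null; // miss or expired
    }

    public function set($key, $value, $ttl) {
        $this-&gt;items[$key] = array('value' =&gt; $value, 'expires' =&gt; time() + $ttl);
    }
}

// Returns the cached value if present, otherwise runs $loader and stores its result
function cached($cache, $key, $ttl, $loader) {
    $value = $cache-&gt;get($key);
    if ($value === null) {
        $value = call_user_func($loader);
        $cache-&gt;set($key, $value, $ttl);
    }
    return $value;
}

$cache = new ArrayCache();
$calls = 0;
$loader = function () use (&amp;$calls) {
    $calls++; // stands in for an expensive query
    return array('Fruit', 'Vegetables');
};
$first = cached($cache, 'category_names', 60, $loader);
$second = cached($cache, 'category_names', 60, $loader);
// $calls is 1 here: the second call was served from the cache</pre>
<p>Note that this simple version treats a stored null as a miss, and it says nothing about invalidating entries when the underlying data changes, which is where most of the added complexity comes from.</p>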
<h3>Server tuning and beyond</h3>
<p>Everything covered here is just the tip of the iceberg - it gets rapidly more advanced as you dig deeper, including tuning MySQL server variables (and you should - <a target="blank" rel="nofollow" href="http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/">at least the basics</a>), the server itself (hardware / software) and using related tools such as <a href="http://sphinxsearch.com/">sphinx</a> and <a href="http://lucene.apache.org/java/docs/index.html">lucene</a> to offload some of the work. I tried to give a good starting point and as many references as possible for getting your database in shape.</p>
<p>I linked to several excellent resources in this article, such as the <a href="http://dev.mysql.com/doc/refman/5.5/en/">MySQL manual</a>, <a href="http://www.mysqlperformanceblog.com/">MySQL performance blog</a> and <a href="http://www.xaprb.com/blog/">Xaprb</a> (the last two are of <a href="http://www.percona.com/">Percona</a> fame - world-class experts on MySQL). I suggest you start visiting those regularly as they offer excellent advice.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2011/03/25/database-profiling-and-optimizing-your-database-the-generic-version/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Fetching specific rows from a group with MySQL</title>
		<link>http://www.techfounder.net/2010/03/12/fetching-specific-rows-from-a-group-with-mysql/</link>
		<comments>http://www.techfounder.net/2010/03/12/fetching-specific-rows-from-a-group-with-mysql/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 01:25:29 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Web development]]></category>
		<category><![CDATA[derived tables]]></category>
		<category><![CDATA[group by]]></category>
		<category><![CDATA[latest post]]></category>
		<category><![CDATA[subquery]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=476</guid>
		<description><![CDATA[In a normalized database, we store separate entities in separate tables and define relationships between them, those being - one-to-one, one-to-many and many-to-many. We often want to fetch data which is split over two or more related tables, which is dead simple when fetching the extra data from the 'one' side of a relationship (ie, [...]]]></description>
			<content:encoded><![CDATA[<p>In a normalized database, we store separate entities in separate tables and define relationships between them, those being - one-to-one, one-to-many and many-to-many. We often want to fetch data which is split over two or more related tables, which is dead simple when fetching the extra data from the 'one' side of a relationship (ie, from a "parent" table):</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> product_images.src,products.name
<span style="color: #993333; font-weight: bold;">FROM</span> product_images
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> products <span style="color: #993333; font-weight: bold;">ON</span> product_images.parent_id=products.id</pre>
<p>Literally: Get the name of the parent product for each row in the product_images table.</p>
<p>Going the other direction is not as trivial. If we just want all the referenced rows in the child table, the query seen before would do the trick. But if we wanted a specific row to be found - for example, the latest inserted image, it becomes somewhat more complicated.<br />
<span id="more-476"></span><br />
Let's explore a common scenario - fetching the latest comment for each row in a list of posts in a forum. </p>
<p><a href="http://forums.devnetwork.net/viewforum.php?f=1"><img src="http://www.techfounder.net/wp-content/uploads/2010/03/latest-posts.png" alt="" title="latest-posts" width="570" height="145" class="alignnone size-full wp-image-480" /></a><br />
Latest posts from the <a href="http://forums.devnetwork.net">devnet forums</a></p>
<p>Dealing with the following simplified schema for a forum:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #ff0000;">`threads`</span> <span style="color: #66cc66;">&#40;</span>
   <span style="color: #ff0000;">`id`</span> INT <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>  <span style="color: #993333; font-weight: bold;">AUTO_INCREMENT</span>,
   <span style="color: #ff0000;">`title`</span> VARCHAR<span style="color: #66cc66;">&#40;</span> <span style="color: #cc66cc;">250</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> ,
   <span style="color: #ff0000;">`created`</span> TIMESTAMP <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> CURRENT_TIMESTAMP ,
   <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #ff0000;">`id`</span> <span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> ENGINE = InnoDB;
&nbsp;
<span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #ff0000;">`posts`</span> <span style="color: #66cc66;">&#40;</span>
   <span style="color: #ff0000;">`id`</span> INT <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>  <span style="color: #993333; font-weight: bold;">AUTO_INCREMENT</span>,
   <span style="color: #ff0000;">`thread_id`</span> INT <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> ,
   <span style="color: #ff0000;">`content`</span> TEXT <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> ,
   <span style="color: #ff0000;">`created`</span> TIMESTAMP <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> CURRENT_TIMESTAMP ,
   <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #ff0000;">`id`</span> <span style="color: #66cc66;">&#41;</span> ,
   <span style="color: #993333; font-weight: bold;">INDEX</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #ff0000;">`thread_id`</span> <span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> ENGINE = InnoDB;</pre>
<p>A naive approach to fetching 10 threads with the latest post for each thread would be something like:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> threads.*, posts.*
<span style="color: #993333; font-weight: bold;">FROM</span> threads
<span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> posts <span style="color: #993333; font-weight: bold;">ON</span> posts.thread_id=threads.id
<span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> threads.id
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> posts.created <span style="color: #993333; font-weight: bold;">DESC</span>
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">10</span></pre>
<p>This query might appear to work as long as there is only one post per thread. Once there are more, you will start getting unexpected results (go ahead, try it). Let's try to understand why:</p>
<p>In the SQL standard, a query with a GROUP BY clause can select only columns that are either aggregated or guaranteed to be the same for the entire group. MySQL extends this and, unless the ONLY_FULL_GROUP_BY SQL mode is enabled, allows selecting columns that are neither grouped nor aggregated, instead of throwing an error. </p>
<p>However, values from those columns could come from any row in the group and we have no control over which row it will be.</p>
<blockquote><p>... all rows in each group should have the same values for the columns that are ommitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same.<br />
<cite><a href="http://dev.mysql.com/doc/refman/5.1/en/group-by-hidden-columns.html" target="_blank">The MySQL manual on GROUP BY modifiers</a></cite></p></blockquote>
<p>In addition, grouping occurs before the order clause is even applied. After the grouping, unpredictable values for the order column (posts.created) would exist for each group, making the sorting afterward somewhat meaningless (since we wanted the latest post for each thread).</p>
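<p>A quick way to see this on your own server is to enable the ONLY_FULL_GROUP_BY SQL mode for the current session, which makes MySQL reject the naive query instead of silently picking values. The snippet below is a minimal sketch; note that it replaces any other SQL modes set for the session, so treat it as an experiment rather than a configuration recommendation:</p>
<pre class="sql">-- Replaces the session's SQL modes with ONLY_FULL_GROUP_BY (experiment only)
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';
&nbsp;
-- Expected to fail: posts.* is neither aggregated nor part of the GROUP BY
SELECT threads.*, posts.*
FROM threads
LEFT JOIN posts ON posts.thread_id=threads.id
GROUP BY threads.id
ORDER BY posts.created DESC
LIMIT 10;</pre>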
<p>A more experienced SQL user might think: why not just use an aggregate function instead? That leads to the following query:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> threads.*, posts.*, MAX<span style="color: #66cc66;">&#40;</span>posts.created<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> maxcreated
<span style="color: #993333; font-weight: bold;">FROM</span> threads
<span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> posts <span style="color: #993333; font-weight: bold;">ON</span> posts.thread_id=threads.id
<span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> threads.id
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> maxcreated <span style="color: #993333; font-weight: bold;">DESC</span>
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">10</span></pre>
<p>This is somewhat better, as it does retrieve the timestamp of the latest post in each thread. However, surprisingly (or not, if you've been following closely), the rest of the non-grouped columns from the posts table are still indeterminate and do not necessarily belong to the row with the latest timestamp.</p>
<p>The solution to this issue is not completely trivial, but it makes sense once you understand the limitations. The key is to use a GROUP BY query to compute the filtering criteria (the latest timestamp per thread from the posts table) without letting the grouping affect the final result set directly.</p>
<p>First, the filtering criteria - </p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> thread_id,MAX<span style="color: #66cc66;">&#40;</span>created<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> created
<span style="color: #993333; font-weight: bold;">FROM</span> posts
<span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> posts.thread_id</pre>
<p>Here we return only aggregate (latest timestamp) and unique (thread_id) data for each group. Now we have the timestamp of the latest post from each thread, along with the matching thread identifier. </p>
<p>We will move the filtering criteria into a subquery, and JOIN that against the threads and posts tables respectively.</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> threads.*, posts.*
<span style="color: #993333; font-weight: bold;">FROM</span> threads
<span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #66cc66;">&#40;</span>
   <span style="color: #993333; font-weight: bold;">SELECT</span> thread_id,MAX<span style="color: #66cc66;">&#40;</span>created<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> created <span style="color: #993333; font-weight: bold;">FROM</span> posts
   <span style="color: #993333; font-weight: bold;">GROUP</span> <span style="color: #993333; font-weight: bold;">BY</span> thread_id
<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> latest <span style="color: #993333; font-weight: bold;">ON</span> latest.thread_id = threads.id
<span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> posts <span style="color: #993333; font-weight: bold;">ON</span> latest.created=posts.created <span style="color: #993333; font-weight: bold;">AND</span>
latest.thread_id=posts.thread_id
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> posts.created <span style="color: #993333; font-weight: bold;">DESC</span>
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">10</span></pre>
<p>Literally: Fetch rows from the threads table, (left) join that with the timestamp from the latest post in each thread, and then (left) join that with the posts table with the condition that the timestamp and thread_id match. This returns the rows of the threads table (limited to 10 here) along with the latest post from each thread; threads with no posts still appear, with NULL in the post columns.</p>
<p>People tend to avoid subqueries due to performance concerns, but in this case it is a <b>derived</b> subquery (also referred to as a derived table), which is computed once for the entire query, as opposed to a correlated subquery, which is computed once per row ( = bad performance with many rows).</p>
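<p>For comparison, here is roughly what a correlated form might look like. This is only a sketch to illustrate the difference, not a recommendation: the inner query references the outer row, so it is conceptually evaluated per row. Note that this variant uses an inner join and therefore drops threads without posts:</p>
<pre class="sql">SELECT threads.id, threads.title, posts.content, posts.created
FROM threads
INNER JOIN posts ON posts.thread_id = threads.id
WHERE posts.created = (
   -- correlated: depends on the thread of the current outer row
   SELECT MAX(p2.created)
   FROM posts AS p2
   WHERE p2.thread_id = posts.thread_id
)
ORDER BY posts.created DESC
LIMIT 10;</pre>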
<p>* Note - in real-world situations you would want to specify the returned columns explicitly, since both tables have id and created columns and their names collide in the result set.</p>
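<p>As a sketch of what that might look like, the variant below names the columns explicitly and aliases the colliding ones. It also matches on MAX(id) instead of the timestamp, under the assumption that the AUTO_INCREMENT id grows with insertion order; this avoids returning two rows for a thread whose two newest posts share the same timestamp. The aliases (thread_created, post_id, post_created, last_post_id) are illustrative names:</p>
<pre class="sql">SELECT threads.id, threads.title, threads.created AS thread_created,
       posts.id AS post_id, posts.content, posts.created AS post_created
FROM threads
LEFT JOIN (
   -- one row per thread: the highest (assumed newest) post id
   SELECT thread_id, MAX(id) AS last_post_id
   FROM posts
   GROUP BY thread_id
) AS latest ON latest.thread_id = threads.id
LEFT JOIN posts ON posts.id = latest.last_post_id
ORDER BY posts.created DESC
LIMIT 10;</pre>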
<p>For more advanced usage, I suggest you read Baron Schwartz's article on <a href="http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/">selecting the first/least/max row per group in SQL</a></p>
	]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2010/03/12/fetching-specific-rows-from-a-group-with-mysql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Multiple row operations in MySQL / PHP</title>
		<link>http://www.techfounder.net/2009/05/14/multiple-row-operations-in-mysql-php/</link>
		<comments>http://www.techfounder.net/2009/05/14/multiple-row-operations-in-mysql-php/#comments</comments>
		<pubDate>Thu, 14 May 2009 01:43:43 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Web development]]></category>
		<category><![CDATA[insert]]></category>
		<category><![CDATA[multiple rows]]></category>
		<category><![CDATA[on duplicate key]]></category>
		<category><![CDATA[update]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=229</guid>
		<description><![CDATA[Multiple row operations are in common use in normalized application databases, as one database entity is often linked to multiple sub-entities (for example a user and his tags). By row operations I'm referring to write queries, namely UPDATE and INSERT queries (DELETE is less interesting so I'll leave it out for now). Too often [...]]]></description>
			<content:encoded><![CDATA[<p>Multiple row operations are in common use in normalized application databases, as one database entity is often linked to multiple sub-entities (for example a user and his tags). By row operations I'm referring to write queries, namely UPDATE and INSERT queries (DELETE is less interesting so I'll leave it out for now). </p>
<p>Too often I've seen such queries run in long loops one at a time, which is very bad for performance (as I will show here) and sometimes equally bad for integrity (if the process is interrupted). So what are the alternatives?<br />
<span id="more-229"></span></p>
<h2>Inserting multiple rows</h2>
<p>Insertion of multiple rows comes about often in batch jobs, database migrations and handling table relationships. A naive approach, via PHP, would be to loop over the data to be inserted, inserting one row at a time. Suppose the data is already properly filtered and quoted:</p>
<pre class="php"><ol><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #b1b100;">foreach</span><span style="color: #66cc66;">&#40;</span> <span style="color: #0000ff;">$data</span> <span style="color: #b1b100;">as</span> <span style="color: #0000ff;">$row</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">      <span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">&quot;INSERT INTO `test_table` (user_id,content)&quot;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">          . <span style="color: #ff0000;">&quot; VALUES (&quot;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">          . <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#91;</span><span style="color: #ff0000;">'user_id'</span><span style="color: #66cc66;">&#93;</span> . <span style="color: #ff0000;">&quot;,&quot;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">          . <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#91;</span><span style="color: #ff0000;">'content'</span><span style="color: #66cc66;">&#93;</span> . 
<span style="color: #ff0000;">&quot;)&quot;</span>;</div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">      <a href="http://www.php.net/mysql_query"><span style="color: #000066;">mysql_query</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #66cc66;">&#125;</span></div></li></ol></pre>
<p>Depending on the number of rows to be inserted, this can be a costly process. A better approach would be to concatenate the values into one insert query and then execute it:</p>
<pre class="php"><ol><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #0000ff;">$values</span> = <a href="http://www.php.net/array"><span style="color: #000066;">array</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #b1b100;">foreach</span><span style="color: #66cc66;">&#40;</span> <span style="color: #0000ff;">$data</span> <span style="color: #b1b100;">as</span> <span style="color: #0000ff;">$row</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">    <span style="color: #0000ff;">$values</span><span style="color: #66cc66;">&#91;</span><span style="color: #66cc66;">&#93;</span> =  <span style="color: #ff0000;">&quot;(&quot;</span> . <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#91;</span><span style="color: #ff0000;">'user_id'</span><span style="color: #66cc66;">&#93;</span> . <span style="color: #ff0000;">&quot;,&quot;</span> . <span style="color: #0000ff;">$row</span><span style="color: #66cc66;">&#91;</span><span style="color: #ff0000;">'content'</span><span style="color: #66cc66;">&#93;</span> . 
<span style="color: #ff0000;">&quot;)&quot;</span>;</div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #66cc66;">&#125;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #b1b100;">if</span><span style="color: #66cc66;">&#40;</span> !<a href="http://www.php.net/empty"><span style="color: #000066;">empty</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$values</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">    <span style="color: #0000ff;">$query</span> = <span style="color: #ff0000;">&quot;INSERT INTO `test_table` (user_id,content) VALUES &quot;</span></div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">             . 
<a href="http://www.php.net/implode"><span style="color: #000066;">implode</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">','</span>,<span style="color: #0000ff;">$values</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;">    <a href="http://www.php.net/mysql_query"><span style="color: #000066;">mysql_query</span></a><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span>;</div></li><li style="font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;"><div style="font-family: 'Courier New', Courier, monospace; font-weight: normal;"><span style="color: #66cc66;">&#125;</span></div></li></ol></pre>
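<p>To make the difference concrete, for three rows the second snippet would send a single statement along these lines (the values are made up for illustration). Keep in mind that a very large statement can run into the server's max_allowed_packet limit, so splitting huge batches into chunks may be necessary:</p>
<pre class="sql">-- One round trip instead of three
INSERT INTO `test_table` (user_id,content)
VALUES (1,'first row'),(2,'second row'),(3,'third row');</pre>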
<p>What's the difference? Let's see some benchmarks:<br />
<img src="http://www.techfounder.net/wp-content/uploads/2009/05/time.png" alt="time" title="time" width="485" height="205"  /><br />
A single query completes <b>much</b> faster than looping through multiple queries. At 2560 rows inserted, it took the loop ~36 seconds to complete, yet the single query took just 0.14 seconds.</p>
<p>Memory consumption shows a reverse trend however:<br />
<img src="http://www.techfounder.net/wp-content/uploads/2009/05/memory.png" alt="memory" title="memory" width="481" /><br />
Since the values are concatenated to create the single query, it consumes more and more memory as the number of rows grows, because it needs to hold a larger query string. At 2560 rows inserted, the single query approach consumed ~800 KB of memory, while the loop consumed just ~60 KB.</p>
<p>Since memory is much cheaper than CPU cycles and database connections, I'd usually opt for the single query approach. It's important though to be aware of the implications.</p>
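<p>If memory is a concern, a possible middle ground (not benchmarked here, so treat it as an assumption to verify) is to keep the individual statements but wrap them in a single transaction on a transactional engine such as InnoDB. This also addresses the integrity concern, since an interrupted run can be rolled back as a whole:</p>
<pre class="sql">START TRANSACTION;
INSERT INTO `test_table` (user_id,content) VALUES (1,'first row');
INSERT INTO `test_table` (user_id,content) VALUES (2,'second row');
-- ... remaining rows ...
COMMIT;</pre>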
<h2>Inserting / Updating (multiple) values</h2>
<p>Another use-case of multiple row operations is when we want to insert several rows that might exist already, and in case they exist we want to update existing values instead. Of course, we could check first which rows exist, update the ones that do and insert the ones that don't - for a total of at least three separate queries. </p>
<p>Inserting and updating in the same operation is done using <a href="http://dev.mysql.com/doc/refman/5.1/en/insert-on-duplicate.html" target="_blank">INSERT ... ON DUPLICATE KEY UPDATE</a>.</p>
<p>The KEY in this statement should be a unique key in the table you are inserting into (otherwise duplicates would be allowed and this operation would be moot). The syntax of this statement lets you pick which columns are updated in the case of matching rows. For example, suppose I'm adding translations for content pages. I have the following table schema:</p>
<pre class="sql">&nbsp;
pages
 - id
 - content
 - ...
&nbsp;
pages_translations
 - page_id
 - lang_id
 - content
&nbsp;</pre>
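<p>For concreteness, the translations table could be declared roughly as follows. The column types are assumptions; the part that matters is the composite primary key, since that is the key ON DUPLICATE KEY UPDATE checks for duplicates:</p>
<pre class="sql">CREATE TABLE `pages_translations` (
   `page_id` INT NOT NULL,
   `lang_id` INT NOT NULL,
   `content` TEXT NOT NULL,
   PRIMARY KEY ( `page_id`, `lang_id` )
) ENGINE = InnoDB;</pre>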
<p>The pages_translations table has a primary (unique) key on ( page_id , lang_id ). When adding / updating a translation, the query would look something like:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">INSERT</span> <span style="color: #993333; font-weight: bold;">INTO</span> <span style="color: #ff0000;">`pages_translations`</span> <span style="color: #66cc66;">&#40;</span>page_id,lang_id,content<span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">VALUES</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">5</span>,<span style="color: #cc66cc;">2</span>,<span style="color: #ff0000;">'the brown fox ...'</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ON</span> DUPLICATE <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #993333; font-weight: bold;">UPDATE</span> content=<span style="color: #993333; font-weight: bold;">VALUES</span><span style="color: #66cc66;">&#40;</span>content<span style="color: #66cc66;">&#41;</span>
&nbsp;</pre>
<p>This will create a translation if one does not exist and update the content field if it does exist. We can combine this statement with the previous approach for multiple insertion to insert / update multiple rows with one statement.</p>
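<p>Combined, a multi-row version might look like the sketch below (the values are illustrative). Each row is inserted or updated independently, and VALUES(content) refers to the value that row would have inserted. Newer MySQL releases (8.0.20 and later) deprecate the VALUES() function in this position in favor of a row alias, so check the manual for your version:</p>
<pre class="sql">INSERT INTO `pages_translations` (page_id,lang_id,content)
VALUES (5,1,'the quick brown fox ...'),
       (5,2,'the brown fox ...'),
       (6,1,'another page ...')
ON DUPLICATE KEY UPDATE content=VALUES(content);</pre>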
<p>Updating multiple rows can be done using this technique, or through a CASE .. WHEN condition for each row. This method is a bit more involved and usually results in bigger queries but it's worth noting. Example:</p>
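<pre class="sql"><span style="color: #993333; font-weight: bold;">UPDATE</span> <span style="color: #ff0000;">`pages`</span> <span style="color: #993333; font-weight: bold;">SET</span> content=<span style="color: #66cc66;">&#40;</span>CASE
    WHEN id=<span style="color: #cc66cc;">1</span> THEN <span style="color: #ff0000;">'first page content ...'</span>
    WHEN id=<span style="color: #cc66cc;">2</span> THEN <span style="color: #ff0000;">'second page content ...'</span>
END<span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">WHERE</span> id <span style="color: #993333; font-weight: bold;">IN</span> <span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">1</span>,<span style="color: #cc66cc;">2</span><span style="color: #66cc66;">&#41;</span></pre>
<p>We need to repeat the WHEN clause for each row, which results in a somewhat verbose query. The WHERE clause matters: without it, the CASE expression evaluates to NULL for every row that matches no WHEN branch, so the content of all other pages would be overwritten.</p>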
<h2>Inserting / Updating with multiple tables</h2>
<p>There are some cases where we would like to use data in one or several tables to update / insert into another table. </p>
<p>With insertion the <a href="http://dev.mysql.com/doc/refman/5.1/en/insert-select.html" target="_blank">syntax is relatively straightforward</a> - replace the VALUES part of the statement with a SELECT statement. </p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">INSERT</span> <span style="color: #993333; font-weight: bold;">INTO</span> <span style="color: #ff0000;">`my_table`</span> <span style="color: #66cc66;">&#40;</span>col1,col2,col3<span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">SELECT</span> col4,col5,col6
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`another_table`</span></pre>
<p>The ON DUPLICATE KEY UPDATE clause can be used with this statement as well.</p>
<p>Updating rows across tables is done with a multi-table UPDATE, where the join condition goes in the WHERE clause (the less declarative, comma-style join).</p>
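<p>A sketch of that combination, assuming `my_table` has a unique key covering (col1, col2) so that duplicates can be detected:</p>
<pre class="sql">INSERT INTO `my_table` (col1,col2,col3)
SELECT col4,col5,col6
FROM `another_table`
-- on a key collision, overwrite col3 with the newly selected value
ON DUPLICATE KEY UPDATE col3=VALUES(col3);</pre>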
<pre class="sql"><span style="color: #993333; font-weight: bold;">UPDATE</span> <span style="color: #ff0000;">`my_table`</span>,<span style="color: #ff0000;">`other_table`</span>
   <span style="color: #993333; font-weight: bold;">SET</span> <span style="color: #ff0000;">`my_table`</span>.col1 = <span style="color: #ff0000;">`other_table`</span>.col2
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #ff0000;">`my_table`</span>.parent_id = <span style="color: #ff0000;">`other_table`</span>.id</pre>
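<p>The same update can also be written with an explicit JOIN, which some find easier to read. This should be equivalent to the statement above:</p>
<pre class="sql">UPDATE `my_table`
INNER JOIN `other_table` ON `my_table`.parent_id = `other_table`.id
   SET `my_table`.col1 = `other_table`.col2;</pre>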
<p>It's important to first run the SELECT part of such INSERT / UPDATE statements separately, to determine how many rows it will select and how fast it will complete. INSERT / UPDATE operations are relatively costly and can be a severe bottleneck for the database if not managed correctly.</p>
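<p>For the cross-table update above, such a dry run might look like this sketch: count the row pairs the join would match, then ask MySQL how it plans to execute the join:</p>
<pre class="sql">-- How many row pairs does the join match?
SELECT COUNT(*)
FROM `my_table`
INNER JOIN `other_table` ON `my_table`.parent_id = `other_table`.id;
&nbsp;
-- Which indexes (if any) would the join use?
EXPLAIN SELECT `my_table`.col1, `other_table`.col2
FROM `my_table`
INNER JOIN `other_table` ON `my_table`.parent_id = `other_table`.id;</pre>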
	]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2009/05/14/multiple-row-operations-in-mysql-php/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Selecting closest values in MySQL</title>
		<link>http://www.techfounder.net/2009/02/02/selecting-closest-values-in-mysql/</link>
		<comments>http://www.techfounder.net/2009/02/02/selecting-closest-values-in-mysql/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 03:12:54 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Web development]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[range select]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=213</guid>
		<description><![CDATA[Sometimes the need arises to select several values in the vicinity of a certain value, preferably ordered by proximity. The values might be dates, zip-codes or any other meaningfully ordered values that can be represented as numerical values. How can we pull this off in MySQL? We can't use a simple ORDER BY, since we [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes the need arises to select several values in the vicinity of a certain value, preferably ordered by proximity. The values might be dates, zip-codes or any other meaningfully ordered values that can be represented as numerical values. How can we pull this off in MySQL?<br />
<span id="more-213"></span><br />
We can't use a simple ORDER BY, since we want values both larger and smaller than our selected value. We can, however, order by an expression that calculates the distance from our selected value.</p>
<p>Suppose we want to find the 6 numbers closest to 2500 (inclusive) from a numbers table:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> number, ABS<span style="color: #66cc66;">&#40;</span> number - <span style="color: #cc66cc;">2500</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> distance
<span style="color: #993333; font-weight: bold;">FROM</span> numbers
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> distance
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span></pre>
<p>This returns:</p>
<table id="table_results" class="data" border="0">
<thead>
<tr>
<th>number</th>
<th>distance</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>2502</td>
<td>2</td>
</tr>
<tr class="even">
<td>2494</td>
<td>6</td>
</tr>
<tr class="odd">
<td >2508</td>
<td>8</td>
</tr>
<tr class="even">
<td >2489</td>
<td>11</td>
</tr>
<tr class="odd">
<td >2513</td>
<td>13</td>
</tr>
<tr class="even">
<td >2487</td>
<td>13</td>
</tr>
</tbody>
</table>
<blockquote><p>(6 total, Query took 2.2792 sec)</p></blockquote>
<p>This works nicely, and is actually relatively performant if the number column is indexed. Despite having to run a full table scan in order to calculate the distance for every number in the table, running this on a ~2 million row table completes in just over 2 seconds.</p>
<p>This kind of performance would be quite enough for smaller tables, but real-time applications often need queries to complete in well under 2 seconds.</p>
<p>Since we know exactly what we need, we can help MySQL by limiting the range of numbers it has to calculate the distance for. Since we want the 6 closest numbers, we can be sure they'll be at most in a range of 6 numbers lower and 6 numbers higher than our selected value.</p>
<p>If we can do that, then the calculation would run for only 13 numbers, hopefully leading to much improved performance. Selecting those numbers can be done using a union on a couple of SELECT statements:</p>
<pre class="sql"><span style="color: #66cc66;">&#40;</span>
   <span style="color: #993333; font-weight: bold;">SELECT</span> number
   <span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
   <span style="color: #993333; font-weight: bold;">WHERE</span> number &gt;= <span style="color: #cc66cc;">2500</span>
   <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number
   <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">7</span>
<span style="color: #66cc66;">&#41;</span> UNION <span style="color: #993333; font-weight: bold;">ALL</span> <span style="color: #66cc66;">&#40;</span>
   <span style="color: #993333; font-weight: bold;">SELECT</span> number
   <span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
   <span style="color: #993333; font-weight: bold;">WHERE</span> number &lt; <span style="color: #cc66cc;">2500</span>
   <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number <span style="color: #993333; font-weight: bold;">DESC</span>
   <span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span>
<span style="color: #66cc66;">&#41;</span></pre>
<p>Notice the first one includes the actual value (in case it exists) and hence has one more value to select in its limit clause.</p>
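<p>Both branches rely on an index on the number column to be fast: with one in place, MySQL can typically walk the index from 2500 in each direction and stop once the limit is reached. If the column is not indexed yet, something like the following would add one (the index name idx_number is an arbitrary choice):</p>
<pre class="sql">ALTER TABLE `numbers` ADD INDEX idx_number ( `number` );</pre>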
<p>Combining this with our previous successful attempt produces the following ungainly query:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> number, ABS<span style="color: #66cc66;">&#40;</span> number - <span style="color: #cc66cc;">2500</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> distance <span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #66cc66;">&#40;</span>
	<span style="color: #66cc66;">&#40;</span>
		<span style="color: #993333; font-weight: bold;">SELECT</span> number
		<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
		<span style="color: #993333; font-weight: bold;">WHERE</span> number &gt;=<span style="color: #cc66cc;">2500</span>
		<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number
		<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span>
	<span style="color: #66cc66;">&#41;</span> UNION <span style="color: #993333; font-weight: bold;">ALL</span> <span style="color: #66cc66;">&#40;</span>
		<span style="color: #993333; font-weight: bold;">SELECT</span> number
		<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`numbers`</span>
		<span style="color: #993333; font-weight: bold;">WHERE</span> number &lt; <span style="color: #cc66cc;">2500</span>
		<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> number <span style="color: #993333; font-weight: bold;">DESC</span>
		<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span>
	<span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> n
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> distance
<span style="color: #993333; font-weight: bold;">LIMIT</span> <span style="color: #cc66cc;">6</span></pre>
<p>It does return the same results:</p>
<table id="table_results" class="data" border="0">
<thead>
<tr>
<th class="condition">number</th>
<th>distance</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="condition nowrap" align="right">2502</td>
<td class="nowrap" align="right">2</td>
</tr>
<tr class="even">
<td class="condition nowrap" align="right">2494</td>
<td class="condition nowrap" align="right">6</td>
</tr>
<tr class="odd">
<td class="condition nowrap" align="right">2508</td>
<td class="condition nowrap" align="right">8</td>
</tr>
<tr class="even">
<td class="condition nowrap" align="right">2489</td>
<td class="condition nowrap" align="right">11</td>
</tr>
<tr class="odd">
<td class="condition nowrap" align="right">2513</td>
<td class="condition nowrap" align="right">13</td>
</tr>
<tr class="even">
<td class="condition nowrap" align="right">2487</td>
<td class="condition nowrap" align="right">13</td>
</tr>
</tbody>
</table>
<p>And it is much more performant:</p>
<blockquote><p>(6 total, Query took 0.0011 sec)</p></blockquote>
<p>And there you have it.</p>
<p>A couple of notes:<br />
- All queries were run with SQL_NO_CACHE at least 5 times to ensure the timings were indicative of the performance.<br />
- The queries were run against a table of ~2 million rows filled with randomly generated values.<br />
- An index was created on the number column.</p>
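<p>As a rough sketch of that setup (the index name idx_number is a placeholder and not taken from the tests above; only the table and column names are):</p>
<pre class="sql">-- Index on the searched column (the index name is hypothetical)
ALTER TABLE `numbers` ADD INDEX `idx_number` (`number`);

-- SQL_NO_CACHE asks MySQL not to answer from the query cache,
-- so every run measures real work
SELECT SQL_NO_CACHE number
FROM `numbers`
WHERE number &gt;= 2500
ORDER BY number
LIMIT 6;</pre>
<p>On MySQL 8.0 and later the query cache no longer exists, so the modifier should have no effect there.</p>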
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2009/02/02/selecting-closest-values-in-mysql/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Optimizing OR (union) operations in MySQL</title>
		<link>http://www.techfounder.net/2008/10/15/optimizing-or-union-operations-in-mysql/</link>
		<comments>http://www.techfounder.net/2008/10/15/optimizing-or-union-operations-in-mysql/#comments</comments>
		<pubDate>Wed, 15 Oct 2008 06:12:32 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=126</guid>
		<description><![CDATA[In my last post on database optimization, I focused on improving query performance by optimizing schema - exploring indexing strategies by reading the execution plan. In this post I'll show how different query structures can also have a major impact on performance. Dealing with OR operators in a WHERE clause of an SQL statement in [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/" title="Profiling MySQL queries with Zend_Db, optimizing by hand">last post</a> on database optimization, I focused on improving query performance by optimizing schema - exploring indexing strategies by reading the execution plan. In this post I'll show how different query structures can also have a major impact on performance.<br />
<span id="more-126"></span><br />
Dealing with OR operators in a WHERE clause of an SQL statement in MySQL can be tricky. Up until recently, MySQL could only use one index per table referenced in a query. A multi-column index can be used for filtering conditions with an AND operator (which is more restrictive by nature), but a condition added by OR must use a separate index because of the logical nature of the operator (a <a href="http://en.wikipedia.org/wiki/Union_(set_theory)" target="_blank">union</a>, as opposed to an <a href="http://en.wikipedia.org/wiki/Intersection_(set_theory)">intersection</a> that the AND represents).</p>
<p>MySQL 5.0 added the <a href="http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html" target="_blank">index_merge</a> access method, which allows the query optimizer to possibly select several indexes from a single table and merge them to improve query performance. I say possibly, since leaving such decisions to the optimizer is risky at best. In fact, as I will show next, you are sometimes left with no indexes selected out of several possible options, resulting in a full table scan.</p>
<p>Continuing from my last post, I'll use a real-world example to show the different paths the query optimizer can take when preparing an execution plan for our queries.</p>
<p>I'll actually be working with the same query I profiled last time, with a minor change. The relevant table structure is as follows:</p>
<pre class="sql"><span style="color: #808080; font-style: italic;">--</span>
<span style="color: #808080; font-style: italic;">-- Table structure for table `tasks`</span>
<span style="color: #808080; font-style: italic;">--</span>
&nbsp;
<span style="color: #993333; font-weight: bold;">CREATE</span> <span style="color: #993333; font-weight: bold;">TABLE</span> <span style="color: #993333; font-weight: bold;">IF</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">EXISTS</span> <span style="color: #ff0000;">`tasks`</span> <span style="color: #66cc66;">&#40;</span>
  <span style="color: #ff0000;">`id`</span> int<span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">13</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">AUTO_INCREMENT</span>,
  <span style="color: #ff0000;">`list_id`</span> int<span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">13</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #ff0000;">`user_id`</span> int<span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">13</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #ff0000;">`task`</span> varchar<span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">255</span><span style="color: #66cc66;">&#41;</span> collate utf8_unicode_ci <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #ff0000;">`due`</span> timestamp <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #ff0000;">`created`</span> timestamp <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> CURRENT_TIMESTAMP,
  <span style="color: #ff0000;">`checked`</span> timestamp <span style="color: #993333; font-weight: bold;">NULL</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #ff0000;">`assigned`</span> int<span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">13</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #ff0000;">`done`</span> enum<span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'1'</span><span style="color: #66cc66;">&#41;</span> collate utf8_unicode_ci <span style="color: #993333; font-weight: bold;">DEFAULT</span> <span style="color: #993333; font-weight: bold;">NULL</span>,
  <span style="color: #993333; font-weight: bold;">PRIMARY</span> <span style="color: #993333; font-weight: bold;">KEY</span>  <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">`id`</span><span style="color: #66cc66;">&#41;</span>,
  <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #ff0000;">`user_id`</span> <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">`user_id`</span>,<span style="color: #ff0000;">`due`</span><span style="color: #66cc66;">&#41;</span>,
  <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #ff0000;">`assigned`</span> <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">`assigned`</span>,<span style="color: #ff0000;">`due`</span><span style="color: #66cc66;">&#41;</span>,
  <span style="color: #993333; font-weight: bold;">KEY</span> <span style="color: #ff0000;">`due`</span> <span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">`due`</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> ENGINE=InnoDB</pre>
<p>And the query itself:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`id`</span>,
           <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`task`</span>,
           UNIX_TIMESTAMP<span style="color: #66cc66;">&#40;</span>due<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`time`</span>,
           <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`name`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_name`</span>,
           <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`tasks`</span>
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`lists`</span> <span style="color: #993333; font-weight: bold;">ON</span> lists.id=tasks.list_id
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.user_id=<span style="color: #ff0000;">'1'</span> <span style="color: #993333; font-weight: bold;">OR</span> assigned=<span style="color: #ff0000;">'1'</span><span style="color: #66cc66;">&#41;</span>
     <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.done <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
     <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.due <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`due`</span> <span style="color: #993333; font-weight: bold;">ASC</span></pre>
<p>This query is almost the same as the one I worked with last time. The only difference is the first condition in the WHERE clause, which involves an OR operator:</p>
<blockquote><p>(tasks.user_id='1' OR assigned='1')</p></blockquote>
<p>Just to understand what I'm doing here - the query selects from the tasks table with several filtering criteria. The OR condition expresses that the selected tasks either belong to a specific user (i.e. were created by him) or are assigned to him (the two can coincide).</p>
<p>For testing purposes I will be running the queries against a testing database I set up with plenty of mock data. The database is around 1 GB in total size, with the tasks table at about 1.8 million rows. It's not very large, but it is large enough to produce meaningful measurements while still allowing relatively quick changes to the schema (modifying keys takes <em>only</em> around 4 minutes to complete).</p>
<p>Running the query in its original form gives (average of 10 runs):</p>
<blockquote><p>(304 total, Query took 6.84 sec)</p></blockquote>
<p>6.84 sec is way too long for running a query in a typical web application. If you recall, I last got this query running at <strong>0.0008 sec</strong> without the additional OR condition, meaning it is now running roughly 8,500 times slower (yikes). In such a case you would suspect the indexes are not selective enough, and sure enough an EXPLAIN reveals:</p>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>index</td>
<td>user_id,assigned,due</td>
<td>due</td>
<td>5</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1875432</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
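<p>For reference, a plan like the one above is what MySQL prints when the statement is prefixed with EXPLAIN. A minimal sketch using the same query:</p>
<pre class="sql">-- EXPLAIN shows the execution plan instead of running the query
EXPLAIN
SELECT tasks.id,
       tasks.task,
       UNIX_TIMESTAMP(tasks.due) AS `time`,
       lists.name AS list_name,
       lists.id AS list_id
FROM tasks
INNER JOIN lists ON lists.id = tasks.list_id
WHERE (tasks.user_id = 1 OR tasks.assigned = 1)
  AND tasks.done IS NULL
  AND tasks.due IS NOT NULL
ORDER BY tasks.due ASC;</pre>
<p>The columns to watch are type and key (how the table is accessed and through which index) and rows (the optimizer's estimate of how many rows it has to examine).</p>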
<p>MySQL performs a regular index scan rather than the index_merge we'd like, and in doing so it picks the worst possible index - the only one that doesn't filter the result set. As a result, all 1.8M rows are scanned and the query underperforms badly. Trying to force the issue, I first add an IGNORE INDEX(due) to remove it from the equation:</p>
<blockquote><p>(304 total, Query took 1.41 sec)</p></blockquote>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>ALL</td>
<td>user_id,assigned</td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1870838</td>
<td>Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>The query optimizer has decided not to use an index at all, despite several good candidates. Query performance still improves, since a plain full table scan turns out to be cheaper here than walking the entire due index. It's still way too slow, though, and we'd like it to use a filtering index so it can avoid a full table scan. Trying to force the other indexes doesn't work, so we're left with no choice but to try alternative query structures.</p>
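<p>The hinted statements aren't listed here, so the following is only a sketch of what such index hints look like, assuming they were applied to the tasks table of the original query. In MySQL an index hint follows the table name in the FROM clause:</p>
<pre class="sql">-- Take the `due` index out of consideration
SELECT tasks.id, tasks.task, UNIX_TIMESTAMP(tasks.due) AS `time`
FROM tasks IGNORE INDEX (due)
INNER JOIN lists ON lists.id = tasks.list_id
WHERE (tasks.user_id = 1 OR tasks.assigned = 1)
  AND tasks.done IS NULL
  AND tasks.due IS NOT NULL
ORDER BY tasks.due ASC;

-- Ask for the two filtering indexes explicitly
SELECT tasks.id, tasks.task, UNIX_TIMESTAMP(tasks.due) AS `time`
FROM tasks FORCE INDEX (user_id, assigned)
INNER JOIN lists ON lists.id = tasks.list_id
WHERE (tasks.user_id = 1 OR tasks.assigned = 1)
  AND tasks.done IS NULL
  AND tasks.due IS NOT NULL
ORDER BY tasks.due ASC;</pre>
<p>The index names are the ones from the table definition above. A hint only narrows or steers the optimizer's choices; as seen here, it does not guarantee an index_merge plan.</p>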
<p>Our first candidate is replacing our OR condition with two UNION'ed select statements (the inspiration is from a <a href="http://www.mysqlperformanceblog.com/2007/09/18/possible-optimization-for-sort_merge-and-union-order-by-limit/" target="_blank">couple</a> of <a href="http://www.mysqlperformanceblog.com/2007/10/05/union-vs-union-all-performance/" target="_blank">posts</a> over at the MySQL performance blog). Breaking the original query into a UNION form results in:</p>
<pre class="sql"><span style="color: #66cc66;">&#40;</span><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`id`</span>,
            <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`task`</span>,
            UNIX_TIMESTAMP<span style="color: #66cc66;">&#40;</span>due<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`time`</span>,
            <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`name`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_name`</span>,
            <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`tasks`</span>
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`lists`</span> <span style="color: #993333; font-weight: bold;">ON</span> lists.id=tasks.list_id
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.done <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.due <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.user_id=<span style="color: #ff0000;">'1'</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> UNION <span style="color: #993333; font-weight: bold;">ALL</span> <span style="color: #66cc66;">&#40;</span>
 <span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`id`</span>,
            <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`task`</span>,
            UNIX_TIMESTAMP<span style="color: #66cc66;">&#40;</span>due<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`time`</span>,
            <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`name`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_name`</span>,
            <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`tasks`</span>
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`lists`</span> <span style="color: #993333; font-weight: bold;">ON</span> lists.id=tasks.list_id
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.done <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.due <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
    <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>assigned=<span style="color: #ff0000;">'1'</span> <span style="color: #993333; font-weight: bold;">AND</span> tasks.user_id!=<span style="color: #ff0000;">'1'</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`time`</span> <span style="color: #993333; font-weight: bold;">ASC</span>
&nbsp;</pre>
<p>This unsightly query looks much more complex, but it actually gets the job done:</p>
<blockquote><p>(304 total, Query took 0.0021 sec)</p></blockquote>
<p>A big improvement to say the least (roughly 3,250 times faster than the original query).<br />
The explain shows why:</p>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>tasks</td>
<td>range</td>
<td>user_id,due</td>
<td>user_id</td>
<td>9</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">304</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
<tr class="odd">
<td class="nowrap" align="right">2</td>
<td>UNION</td>
<td>tasks</td>
<td>range</td>
<td>user_id,assigned,due</td>
<td>assigned</td>
<td>10</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">2</td>
<td>UNION</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
<tr class="odd">
<td align="right"><em>NULL</em></td>
<td>UNION RESULT</td>
<td>&lt;union1,2&gt;</td>
<td>ALL</td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td align="right"><em>NULL</em></td>
<td>Using filesort</td>
</tr>
</tbody>
</table>
<p>Only 304 rows are scanned (as opposed to the entire 1.8M table before). Despite the filesort, the query is now fast enough to be usable.</p>
<p>Another approach would be to determine why index_merge isn't taking place and how we can enforce it. By reducing the WHERE clause to include only the columns covered by an index, index_merge kicks in, resulting in performance similar to the union, albeit with less selective filtering:</p>
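<p>One detail in the UNION form is worth spelling out: the second branch carries an extra tasks.user_id!='1' condition, so a task that is both owned by and assigned to the user comes back from the first branch only. That keeps the two branches disjoint, which is what makes UNION ALL safe. A sketch of the alternative, dropping that condition and letting a plain UNION remove the duplicates instead (shortened to two columns and without the join, for brevity):</p>
<pre class="sql">-- Plain UNION deduplicates the combined rows, at the cost of an extra pass
(SELECT tasks.id, UNIX_TIMESTAMP(tasks.due) AS `time`
 FROM tasks
 WHERE tasks.done IS NULL
   AND tasks.due IS NOT NULL
   AND tasks.user_id = 1)
UNION
(SELECT tasks.id, UNIX_TIMESTAMP(tasks.due) AS `time`
 FROM tasks
 WHERE tasks.done IS NULL
   AND tasks.due IS NOT NULL
   AND tasks.assigned = 1)
ORDER BY `time` ASC;</pre>
<p>This variant was not timed in the tests above, so treat it as an illustration of the trade-off rather than a measured result.</p>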
<blockquote><p>(368 total, Query took 0.0018 sec)</p></blockquote>
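<p>The reduced statement isn't listed here. Presumably it is the original query with the done and due conditions dropped from the WHERE clause, along these lines (an assumption, not necessarily the exact statement that was timed):</p>
<pre class="sql">-- Only the OR condition remains; each side is the leading column of its own index
SELECT tasks.id,
       tasks.task,
       UNIX_TIMESTAMP(tasks.due) AS `time`,
       lists.name AS list_name,
       lists.id AS list_id
FROM tasks
INNER JOIN lists ON lists.id = tasks.list_id
WHERE (tasks.user_id = 1 OR tasks.assigned = 1)
ORDER BY tasks.due ASC;</pre>
<p>The 368 rows include completed tasks and tasks without a due date, which is why the count is higher than the 304 rows of the fully filtered query.</p>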
<p>We could filter our result set in the application, but I'd like to keep it in the SQL where it belongs. In order to integrate the rest of the filtering statements, I use an IN condition on another index I have available - the primary key. This results in:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`id`</span>,
           <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`task`</span>,
           UNIX_TIMESTAMP<span style="color: #66cc66;">&#40;</span>due<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`time`</span>,
           <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`name`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_name`</span>,
           <span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`tasks`</span>
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`lists`</span> <span style="color: #993333; font-weight: bold;">ON</span> lists.id=tasks.list_id
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.user_id=<span style="color: #ff0000;">'1'</span> <span style="color: #993333; font-weight: bold;">OR</span> assigned=<span style="color: #ff0000;">'1'</span><span style="color: #66cc66;">&#41;</span>
   <span style="color: #993333; font-weight: bold;">AND</span> tasks.id <span style="color: #993333; font-weight: bold;">IN</span>
     <span style="color: #66cc66;">&#40;</span><span style="color: #993333; font-weight: bold;">SELECT</span> id
      <span style="color: #993333; font-weight: bold;">FROM</span> tasks
      <span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.done <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
        <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.due <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span>
     <span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`due`</span> <span style="color: #993333; font-weight: bold;">ASC</span>
&nbsp;</pre>
<p>Which brings us back to the performance levels of the UNION form:</p>
<blockquote><p>(304 total, Query took 0.0021 sec)</p></blockquote>
<p>And we can see in the EXPLAIN that it is indeed an index_merge operation:</p>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>tasks</td>
<td>index_merge</td>
<td>user_id,assigned</td>
<td>user_id,assigned</td>
<td>4,5</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">369</td>
<td>Using sort_union(user_id,assigned); Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
<tr class="odd">
<td class="nowrap" align="right">2</td>
<td>DEPENDENT SUBQUERY</td>
<td>tasks</td>
<td>unique_subquery</td>
<td>PRIMARY,due</td>
<td>PRIMARY</td>
<td>4</td>
<td>func</td>
<td class="nowrap" align="right">1</td>
<td>Using where</td>
</tr>
</tbody>
</table>
<p>So there you have it. From 6.84 seconds to 0.0021 seconds with some tweaking to the query structure. I am still not completely satisfied with the fact that it's using filesort, but I couldn't get it to use another index for the operation. Without the sorting the query is twice as fast:</p>
<blockquote><p>(304 total, Query took 0.0010 sec)</p></blockquote>
<p>So if it ever becomes an issue I could move sorting to the application code. Hopefully by then index_merge is better implemented in MySQL.</p>
<p>Regarding the query structure - personally I use the subquery format since it is more compact and maintainable. It is always good, however, to be familiar with all the alternatives.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/10/15/optimizing-or-union-operations-in-mysql/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Profiling queries with Zend_Db and optimizing them by hand</title>
		<link>http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/</link>
		<comments>http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/#comments</comments>
		<pubDate>Sun, 12 Oct 2008 05:42:55 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Open Source]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Web development]]></category>
		<category><![CDATA[Zend Framework]]></category>
		<category><![CDATA[profiling]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=125</guid>
		<description><![CDATA[Database performance is one of the major bottlenecks for most web applications. Most web developers are not database experts (and I'm no exception), but there are several basic methods to analyze and optimize database performance without resorting to expert consultants (such as those, whose founders' blogs are an invaluable source of MySQL knowledge). The Performance [...]]]></description>
			<content:encoded><![CDATA[<p>Database performance is one of the major bottlenecks for most web applications. Most web developers are not database experts (and I'm no exception), but there are several basic methods to analyze and optimize database performance without resorting to expert consultants (such as <a href="http://www.percona.com/">those</a>, whose founders' blogs are an <a href="http://www.mysqlperformanceblog.com/">invaluable</a> <a href="http://www.xaprb.com/blog/">source</a> of MySQL knowledge).<br />
<span id="more-125"></span></p>
<h3>The Performance Equation</h3>
<p>Database performance is affected by many different variables - the running machine specs, OS, database engine and configuration, table schema and the queries running against it. Since I'm dealing mostly (only?) with MySQL, this article covers it mainly (though it is probably relevant to a large degree for other engines). As for OS and machine specs, I'll take them out of the equation as I'm interested in optimizing <strong>relative</strong> performance on the same machine.</p>
<p>Basically I'm interested in optimizing:</p>
<ul>
<li>The structure of my database tables (schema)</li>
<li>The structure of my application-level queries (SELECT, UPDATE, INSERT, DELETE)</li>
</ul>
<h3>Profiling Database Performance</h3>
<p>Blindly optimizing queries and database schema is counter-productive.  First we should <em>know</em> what we should be optimizing, and for that we need data. The act of gathering data for optimization is called profiling or <a href="http://en.wikipedia.org/wiki/Performance_analysis">performance analysis</a>.</p>
<p>The first step I take when profiling database performance for a web app is to measure the running time of all the queries running in it. The absolute run time of a query is not necessarily a good measure of how optimized / performant it is, since some queries are naturally more complex or pull more data. It does, however, give me a good idea of where to start improving the response time of the application I'm optimizing. My main goal is not specific query performance, but overall system performance.</p>
<p>Measuring the run time of a query can be done with simple timers using <a title="PHP: microtime()" href="http://www.php.net/microtime">microtime()</a> calls (Check out <a href="http://www.coderholic.com/php-profile-class/">this post</a> for an abstraction), running it in a tool that automatically provides such statistics (phpMyAdmin for example) or using an integrated profiler.</p>
<p>I am using the <a title="Zend_Db_Profiler" href="http://framework.zend.com/manual/en/zend.db.profiler.html">Zend_Db_Profiler</a>, which is convenient for me since I'm using the Zend Framework and all database access converges to a Zend_Db_Adapter connection. The profiler basically uses the microtime() approach but integrates it transparently into all of the queries without me having to wrap them one by one.</p>
<p>Usage is pretty simple. First you need to pass an extra parameter to your Zend_Db_Adapter instance to activate profiling:</p>
<pre class="php"><span style="color: #0000ff;">$params</span> = <a href="http://www.php.net/array"><span style="color: #000066;">array</span></a><span style="color: #66cc66;">&#40;</span>
    <span style="color: #ff0000;">'host'</span>     =&gt; <span style="color: #ff0000;">'localhost'</span>,
    <span style="color: #ff0000;">'username'</span> =&gt; <span style="color: #ff0000;">'dbusername'</span>,
    <span style="color: #ff0000;">'password'</span> =&gt; <span style="color: #ff0000;">'dbpassword'</span>,
    <span style="color: #ff0000;">'dbname'</span>   =&gt; <span style="color: #ff0000;">'dbname'</span>,
    <span style="color: #ff0000;">'profiler'</span> =&gt; <span style="color: #000000; font-weight: bold;">true</span>  <span style="color: #808080; font-style: italic;">// turn on profiler, disabled by default</span>
<span style="color: #66cc66;">&#41;</span>;
&nbsp;
<span style="color: #0000ff;">$db</span> = Zend_Db::<span style="color: #006600;">factory</span><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">'PDO_MYSQL'</span>, <span style="color: #0000ff;">$params</span><span style="color: #66cc66;">&#41;</span>;</pre>
<p>The Zend Framework manual shows some advanced usage (such as profiling directly into Firebug, which is very convenient), but for our purposes we simply want to dump the queries and their run times.</p>
<pre class="php"><span style="color: #0000ff;">$profiler</span> = <span style="color: #0000ff;">$db</span> -&gt; <span style="color: #006600;">getProfiler</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;
&nbsp;
<span style="color: #0000ff;">$profile</span> = <span style="color: #ff0000;">''</span>;
<span style="color: #b1b100;">foreach</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$profiler</span> -&gt; <span style="color: #006600;">getQueryProfiles</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> <span style="color: #b1b100;">as</span> <span style="color: #0000ff;">$query</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
	<span style="color: #0000ff;">$profile</span> .= <span style="color: #0000ff;">$query</span> -&gt; <span style="color: #006600;">getQuery</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> . <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>
	     . <span style="color: #ff0000;">'Time: '</span> . <span style="color: #0000ff;">$query</span> -&gt; <span style="color: #006600;">getElapsedSecs</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> . <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>;
<span style="color: #66cc66;">&#125;</span>
&nbsp;
<a href="http://www.php.net/echo"><span style="color: #000066;">echo</span></a> <span style="color: #0000ff;">$profile</span>;</pre>
<p>This snippet should of course run after the view has been rendered and all queries have executed, otherwise the profiler will miss some of them.</p>
<h3>Using a relevant dataset</h3>
<p>It's important to know whether the data you are profiling against is representative of the point at which you expect to start hitting performance issues. Granted, you can never really know when that will be, but it is always preferable to work with a dataset that resembles your (projected) production environment.</p>
<p>I will walk through a specific use-case I recently went through while preparing for the beta release of my startup, business platform Octabox (since defunct).</p>
<p>Running the profiler on one of the views in the application produced the following output on my development machine:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`id`</span>, <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`type`</span>, <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`action`</span>, <span style="color: #ff0000;">`tags`</span>.<span style="color: #ff0000;">`color`</span>, <span style="color: #ff0000;">`tags`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`tag_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`history`</span>
<span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`tags`</span>
<span style="color: #993333; font-weight: bold;">ON</span> tags.id=history.tag_id <span style="color: #993333; font-weight: bold;">AND</span> tags.user_id=<span style="color: #cc66cc;">1</span>
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>history.user_id=<span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>history.logged_on &gt;= <span style="color: #ff0000;">'2008-09-27 00:00:00'</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`logged_on`</span> <span style="color: #993333; font-weight: bold;">DESC</span></pre>
<p><strong>Time: 0.00269</strong></p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`id`</span>, <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`task`</span>,<span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`tasks`</span>
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`lists`</span> <span style="color: #993333; font-weight: bold;">ON</span> lists.id=tasks.list_id
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.done <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.due <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.user_id=<span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`due`</span> <span style="color: #993333; font-weight: bold;">ASC</span></pre>
<p><strong>Time: 0.000592</strong></p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> COUNT<span style="color: #66cc66;">&#40;</span>*<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`count`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`dispatch`</span>
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>dispatch.user_id=<span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>dispatch.<span style="color: #993333; font-weight: bold;">STATUS</span>=<span style="color: #cc66cc;">0</span><span style="color: #66cc66;">&#41;</span></pre>
<p><strong>Time: 0.00044</strong></p>
<p>At first sight it would appear that the first query is our immediate suspect for optimization, though the gains would not be great (it completes in just over 0.002 seconds). After switching to a database I had prepared beforehand with plenty of dummy data (~1.1GB, some tables over 5M rows), the output looks very different:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`id`</span>, <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`type`</span>, <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`action`</span>, <span style="color: #ff0000;">`tags`</span>.<span style="color: #ff0000;">`color`</span>, <span style="color: #ff0000;">`tags`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`tag_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`history`</span>
<span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`tags`</span>
<span style="color: #993333; font-weight: bold;">ON</span> tags.id=history.tag_id <span style="color: #993333; font-weight: bold;">AND</span> tags.user_id=<span style="color: #cc66cc;">1</span>
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>history.user_id=<span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>history.logged_on &gt;= <span style="color: #ff0000;">'2008-09-27 00:00:00'</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`history`</span>.<span style="color: #ff0000;">`logged_on`</span> <span style="color: #993333; font-weight: bold;">DESC</span></pre>
<p><strong>Time: 0.000426</strong></p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`id`</span>, <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`task`</span>,<span style="color: #ff0000;">`lists`</span>.<span style="color: #ff0000;">`id`</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`list_id`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`tasks`</span>
<span style="color: #993333; font-weight: bold;">INNER</span> <span style="color: #993333; font-weight: bold;">JOIN</span> <span style="color: #ff0000;">`lists`</span> <span style="color: #993333; font-weight: bold;">ON</span> lists.id=tasks.list_id
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>tasks.done <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.due <span style="color: #993333; font-weight: bold;">IS</span> <span style="color: #993333; font-weight: bold;">NOT</span> <span style="color: #993333; font-weight: bold;">NULL</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>tasks.user_id=<span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span>
<span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> <span style="color: #ff0000;">`tasks`</span>.<span style="color: #ff0000;">`due`</span> <span style="color: #993333; font-weight: bold;">ASC</span></pre>
<p><strong>Time: 58.16</strong> <em>//Hmmm... problem?</em></p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> COUNT<span style="color: #66cc66;">&#40;</span>*<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> <span style="color: #ff0000;">`count`</span>
<span style="color: #993333; font-weight: bold;">FROM</span> <span style="color: #ff0000;">`dispatch`</span>
<span style="color: #993333; font-weight: bold;">WHERE</span> <span style="color: #66cc66;">&#40;</span>dispatch.user_id=<span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AND</span> <span style="color: #66cc66;">&#40;</span>dispatch.<span style="color: #993333; font-weight: bold;">STATUS</span>=<span style="color: #cc66cc;">0</span><span style="color: #66cc66;">&#41;</span></pre>
<p><strong>Time: 0.000322</strong></p>
<p>A query that took just over 0.0005 seconds on a small database took almost a minute to complete on a bigger one. Surprisingly enough, the other queries remained blazingly fast (even faster than on my development machine with a small database, which is a credit to our server's power).</p>
<h3>Optimizing The Errant Query</h3>
<p>So I've found a very problematic query to say the least. Taking 58.1 seconds to complete is obviously not acceptable for any real time application. Running EXPLAIN against the query revealed the following execution plan:</p>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>index</td>
<td>user_id</td>
<td>user_id</td>
<td>9</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1868081</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>The EXPLAIN results show the most immediate problem with the query - it was performing a join against every row in a 1.8M row table.</p>
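<p>For reference, a plan like the one above comes from prefixing the query with EXPLAIN. The sketch below is simply the second profiled query with the keyword added; nothing else is assumed:</p>
<pre class="sql">-- Ask MySQL for the execution plan instead of running the query
EXPLAIN
SELECT `tasks`.`id`, `tasks`.`task`, `lists`.`id` AS `list_id`
FROM `tasks`
INNER JOIN `lists` ON lists.id = tasks.list_id
WHERE (tasks.done IS NULL) AND (tasks.due IS NOT NULL) AND (tasks.user_id = 1)
ORDER BY `tasks`.`due` ASC;</pre>
<p>The columns I tend to read first are type, key and rows: together they show how each table is accessed, which index (if any) is used, and roughly how many rows MySQL expects to examine.</p>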
<p>Examining the details of the query plan, it appears both selects are using an index, the first one limited by a WHERE (on the surface at least - since all rows were scanned). Since the entire table was scanned very slowly despite the index, I wanted to see how the query would respond if the index was ignored. Using IGNORE INDEX(user_id), I re-ran the query (this time in the MySQL command line):</p>
<blockquote><p>304 rows in set (<strong>1.53 sec</strong>)</p></blockquote>
<p>Wow, that's a big difference <strong>for not using</strong> an index. Running EXPLAIN on this reveals:</p>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>ALL</td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1866244</td>
<td>Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>Despite using filesort to sort the results, the query ran almost <strong>40 times</strong> faster than before (1.53 seconds instead of 58.16). By forcing it not to use an index which was not selective enough (i.e. not at all), the query ran a full table scan instead and completed much faster due to a better execution plan.</p>
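<p>As a sketch of where the IGNORE INDEX hint goes: it follows the name of the table it applies to, before any joins. The index name user_id is the one reported in the key column of the first EXPLAIN output:</p>
<pre class="sql">-- Same query as before, but MySQL is told not to consider the user_id index on tasks
SELECT `tasks`.`id`, `tasks`.`task`, `lists`.`id` AS `list_id`
FROM `tasks` IGNORE INDEX (user_id)
INNER JOIN `lists` ON lists.id = tasks.list_id
WHERE (tasks.done IS NULL) AND (tasks.due IS NOT NULL) AND (tasks.user_id = 1)
ORDER BY `tasks`.`due` ASC;</pre>
<p>A hint like this is useful for experiments such as this one. I would be wary of leaving it in application code, since it keeps overriding the optimizer even after the underlying problem is fixed.</p>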
<p>So we've improved greatly, but still not enough for my needs. Having one query taking over a second to complete when all the others are completing in sub 0.01 second times means it will be a bottleneck as the database grows (remember, the same query completed in just over 0.0005 seconds on a much smaller database).</p>
<p>Examining the structure of the table in question (tasks) turned up another surprise - there is <strong>no index</strong> on the user_id column. So what happened? It appears there used to be an index on the user_id column, but it was removed just before all the dummy data was inserted into the database. Running ANALYZE TABLE tasks brought the key information in line with the current state (no index on user_id).</p>
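<p>A minimal sketch of this kind of check, using standard MySQL statements against the tasks table (these are one way to inspect the table, not necessarily the exact commands used at the time):</p>
<pre class="sql">-- List the indexes MySQL currently knows about for the table
SHOW INDEX FROM `tasks`;

-- Show the full table definition, including all keys
SHOW CREATE TABLE `tasks`;

-- Re-analyze the table and store an up-to-date key distribution
ANALYZE TABLE `tasks`;</pre>
<p>If the index list does not match what EXPLAIN reports in possible_keys, that is a good hint that the optimizer is working from stale information.</p>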
<p>I wanted to try the original query, this time with an actual index on the filtering column (user_id). After adding the index, I re-ran the query:</p>
<blockquote><p>(304 total, Query took <strong>0.0021 sec</strong>)</p></blockquote>
<p>Much better! But I was not done. Running EXPLAIN yet again reveals the following execution plan:</p>
<table class="data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>ref</td>
<td>user_id</td>
<td>user_id</td>
<td>4</td>
<td>const</td>
<td class="nowrap" align="right">368</td>
<td>Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>So indeed, using <strong>an actual index</strong> was a big help, filtering the set to only 368 rows before processing the rest of the query. However, you'll notice it is still running a filesort, which we'd like to avoid. Ideally, a single index would both filter and sort the rows.</p>
<p>Creating an index composed of both the 'user_id' and 'due' columns hit the nail on the head, leading to:</p>
<blockquote><p>(304 total, Query took <strong>0.0008 sec</strong>)</p></blockquote>
<p>Not as dramatic as the previous improvements, but still about 2.6 times faster than the last iteration.</p>
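<p>The two indexes added along the way can be sketched as follows. The name user_id matches the key shown in the EXPLAIN output, while user_id_due is a hypothetical name picked for this example:</p>
<pre class="sql">-- Single-column index: lets MySQL filter tasks by user without scanning the table
ALTER TABLE `tasks` ADD INDEX `user_id` (`user_id`);

-- Composite index (hypothetical name): filter on user_id first,
-- then read the matching rows already ordered by due
ALTER TABLE `tasks` ADD INDEX `user_id_due` (`user_id`, `due`);</pre>
<p>Column order matters here: the equality filter (user_id) comes first and the sort column (due) second. Since MySQL can use the leftmost prefix of a composite index, the single-column index on user_id is probably redundant once the composite one exists.</p>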
<h3>Final Words</h3>
<p>The entire process took several hours of head scratching to arrive at all the results I've shown here. I managed to take a query that would cripple our server (58 seconds for one completion!) to an extremely fast one (0.0008 seconds against a 1.8M row table), through the use of some basic profiling and examining the execution plan.</p>
<p>This was an edge case (a non-existent index used to filter against a 1.8M row table...), but it gave me valuable experience with potential pitfalls and with reading between the lines of the execution plan.</p>
<p>Some technical gibberish:</p>
<p>All queries were run with SQL_NO_CACHE to prevent caching from tampering with the results. I ran each query at least 5 times to make sure I was getting the right readings.</p>
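<p>As a sketch, SQL_NO_CACHE goes right after the SELECT keyword. Here it is applied to the third profiled query:</p>
<pre class="sql">-- Bypass the query cache so every run measures actual execution
SELECT SQL_NO_CACHE COUNT(*) AS `count`
FROM `dispatch`
WHERE (dispatch.user_id = 1) AND (dispatch.status = 0);</pre>
<p>This matters on the MySQL 5.0 server used here. Treat it as version-specific: the query cache was removed in MySQL 8.0, where the hint no longer has any effect.</p>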
<p>The specs of the server machine that ran the queries are as follows:</p>
<ul>
<li>Intel \ 2.4 GHz 1066FSB - Conroe \ Xeon 3060 (Dual Core)</li>
<li>2 x Generic \ 1024 MB \ DDR2 667 ECC</li>
<li>2 x Maxtor \ 146GB:SAS:10K RPM \ Atlas 10K - SAS</li>
<li>CentOS Enterprise Linux - x86_64 - OS ES 5.0</li>
<li>Running PHP 5.2.6 with MySQL 5.0.45 on Apache 2.2.4</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
		</item>
		<item>
		<title>Octabox launched and I&#8217;m back to blogging</title>
		<link>http://www.techfounder.net/2008/10/10/octabox-launched-and-im-back-to-blogging/</link>
		<comments>http://www.techfounder.net/2008/10/10/octabox-launched-and-im-back-to-blogging/#comments</comments>
		<pubDate>Fri, 10 Oct 2008 05:42:26 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[techfounder]]></category>
		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=123</guid>
		<description><![CDATA[I've been super-busy the last couple of months - I've come across a stupendous amount of work that I couldn't refuse in addition to the effort towards the release of my own startup, business platform Octabox. Things are finally calming down, and I'll be getting back to blogging, writing about plenty of things I've learned [...]]]></description>
			<content:encoded><![CDATA[<p>I've been super-busy the last couple of months - I've come across a stupendous amount of work that I couldn't refuse in addition to the effort towards the release of my own startup, <a href="http://www.octabox.com" title="Octabox Business Platform">business platform Octabox</a>. Things are finally calming down, and I'll be getting back to blogging, writing about plenty of things I've learned / implemented / experimented with over the last two months.</p>
<p><span id="more-123"></span></p>
<p>First I'd like to expand on the (soft) release of Octabox - Octabox is a web platform for professional individuals and small businesses, providing a complete information management and collaboration environment. The concept was born in the mind of my good friend and Octabox co-founder, <a href="http://adambe.com">Adam Benayoun</a>, as the result of accumulated experience in building information management systems for many small businesses over the years.</p>
<p>Octabox is a cross between the vast functionality offered by application platform <a title="Salesforce" href="http://www.salesforce.com/">Salesforce</a> and the simple and usable approach of <a title="Basecamp project management" href="http://www.basecamphq.com/">Basecamp</a> (of <a title="37Signals" href="http://www.37signals.com/">37signals</a>). It is an application platform in the sense that it offers many on-demand applications for managing information (such as task management, contact relationship management, whiteboard/brainstorming and much more), but the focus is on keeping everything as simple and usable as possible to accommodate the needs of small business operations (all the way down to single individuals - freelancers, consultants and so forth). It is our belief that this sector is feeling the need for more powerful tools, yet is reluctant to use so-called "enterprise" applications due to complexity and cost.</p>
<p>Octabox now enters a private beta phase to weed out issues and to figure out which features are in demand so they can be integrated into the public release. Hence, the "soft" release - the marketing effort has not begun yet (as can be seen by missing content on the site). We are handing out beta invitations if you register through our waiting list on the site. Blog readers get special treatment, so contact me through the contact form to secure your invitation <img src='http://www.techfounder.net/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> .</p>
<p>Octabox, by the way, runs on PHP 5.2.6 and MySQL 5.1, with Zend Framework 1.6 as the abstraction layer. I have much to say about building a high-availability enterprise web application using those tools, so stay tuned, as I already have a couple of article drafts in the pipeline on framework performance, MySQL query optimizations, server setup and much more.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/10/10/octabox-launched-and-im-back-to-blogging/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Almost useful IE replacement</title>
		<link>http://www.techfounder.net/2008/06/08/almost-useful-ie-replacement/</link>
		<comments>http://www.techfounder.net/2008/06/08/almost-useful-ie-replacement/#comments</comments>
		<pubDate>Mon, 09 Jun 2008 01:19:37 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
				<category><![CDATA[CSS]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[techfounder]]></category>
		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=60</guid>
		<description><![CDATA[I stumbled upon a tool by the name of IETester today that is supposed to render websites in different IE engines, from version 5.5 to 8 beta. It appears to be working quite well, letting you open multiple tabs of different IE versions. Unfortunately its JavaScript support is too limited to be of real use [...]]]></description>
			<content:encoded><![CDATA[<p>I stumbled upon a tool by the name of IETester today that is supposed to render websites in different IE engines, from version 5.5 to 8 beta. It appears to be working quite well, letting you open multiple tabs of different IE versions. Unfortunately its JavaScript support is too limited to be of real use for serious application development.</p>
<p>Still, it is a nice tool for web designers wishing to test their HTML and CSS layouts against several generations of IE, without having to resort to hacking together multiple installations of different versions (such as <a href="http://tredosoft.com/Multiple_IE">multipleIE</a>).</p>
<p><a href="http://www.my-debugbar.com/wiki/IETester/HomePage">IETester</a> [via <a href="http://lifehacker.com/395353/ietester-renders-sites-like-internet-explorer-55-through-8">LifeHacker</a>]</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/06/08/almost-useful-ie-replacement/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

