<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>techfounder</title>
	<atom:link href="http://www.techfounder.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.techfounder.net</link>
	<description>Blog about web development and Internet entrepreneurship</description>
	<pubDate>Tue, 18 Nov 2008 05:12:35 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
	<language>en</language>
			<item>
		<title>OO PHP templating</title>
		<link>http://www.techfounder.net/2008/11/18/oo-php-templating/</link>
		<comments>http://www.techfounder.net/2008/11/18/oo-php-templating/#comments</comments>
		<pubDate>Tue, 18 Nov 2008 04:00:05 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web development]]></category>

		<category><![CDATA[templating]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=135</guid>
		<description><![CDATA[Templating is a common technique for separation of concerns in applications - separating presentational logic from domain (or business) logic. This kind of separation promotes higher maintainability and a better chance to reuse presentational code (by encapsulating it in templates), the kind of traits we would all love to have in our code base.

PHP as [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Wikipedia - templating" href="http://en.wikipedia.org/wiki/Web_template_system" target="_blank">Templating</a> is a common technique for <a title="Wikipedia - separation of concerns" href="http://en.wikipedia.org/wiki/Separation_of_concerns" target="_blank">separation of concerns</a> in applications - separating presentational logic from domain (or business) logic. This kind of separation promotes higher maintainability and a better chance to reuse presentational code (by encapsulating it in templates), the kind of traits we would all love to have in our code base.<br />
<span id="more-135"></span><br />
PHP as a language can itself be considered a templating system, since at its root it was meant to generate HTML pages dynamically. The need for more structured templating systems arose as PHP applications grew more and more complex, giving birth to much more specialized and focused solutions.</p>
<h2>Search-and-replace Vs. Include</h2>
<p>As the title suggests, there are two prevalent forms of PHP templating, each with its own pros and cons:</p>
<ol>
<li>Search-and-replace systems (to which Smarty belongs) involve running string replacement filters on a template, replacing placeholders with dynamic content. Search-and-replace systems suffer from parsing overhead, which can be significant depending on the amount of content being parsed. More advanced systems in this category attempt to offset the overhead by caching parse results (Smarty, for example, relies heavily on caching for its performance), but this introduces the concern of maintaining the cache (determining when it becomes stale).
<p>Search-and-replace systems are also better suited for serving mostly static content, as regenerating content dynamically incurs the parsing overhead. Hence they are more suited to content web-sites than to highly dynamic web-applications, in which every user sees a different page than the others, limiting the effectiveness of caching.</p></li>
<li>Include systems involve including PHP scripts which serve as templates, and passing variables directly to those. WordPress, for example, uses a basic implementation of such a system for its templating mechanism. Include systems suffer much less overhead for parsing templates than search-and-replace schemes, simply because no such parsing takes place.
<p>On the other hand, include systems are prone to scoping problems. If templates are included in the normal flow of the application, then they share and pollute the scope they are in - meaning they could affect data in other parts of the application, which could lead to weird behavior and decrease the reusability of templates.</p></li>
</ol>
<h2>Templating with Objects</h2>
<p>Ideally, we would like to combine the pros and solve the cons of both systems. We need a way to pass an arbitrary number of variables into a template without polluting the general scope of the application, while avoiding the parsing overhead of search-and-replace routines.</p>
<p>Fortunately PHP presents us with such a construct - Classes. Class methods are different from regular functions in that they have a shared local scope always available - the object instance ($this). Recent PHP templating systems use this property of classes to their advantage, creating a separate scope for templates thereby allowing them to process without affecting the rest of the application.</p>
<p><a title="Savant" href="http://phpsavant.com/" target="_blank">Savant</a> is a good example of such a system, and so is the <a href="http://framework.zend.com/manual/en/zend.view.html" target="_blank">View</a> component of the Zend Framework. I&#8217;ll be borrowing concepts from both to show how a simple implementation of such a system can be built.</p>
<h2>In Code</h2>
<p>In the most basic form, a templating system built with the principles I&#8217;ve mentioned above will look like this:</p>
<pre class="php">class View {
     public function render($script) {
        ob_start();
        $this-&gt;_include($script);
        return ob_get_clean();
    }

    public function __get($key) {
    	return (isset($this -&gt; $key) ? $this -&gt; $key : null);
    }

    protected function _include() {
        include func_get_arg(0);
    }
}</pre>
<p>Usage is pretty simple - load some vars into a View class instance and pass a template into the render method. For a template that looks like:</p>
<pre class="php">&lt;h2&gt;&lt;?php echo $this -&gt; name; ?&gt;&lt;/h2&gt;
&lt;p&gt;Current position: &lt;?php echo $this -&gt; job; ?&gt;&lt;/p&gt;</pre>
<p>Passing parameters into a view instance:</p>
<pre class="php">$view = new View();

$view -&gt; name = 'Obama';
$view -&gt; job = 'President of the USA';

echo $view -&gt; render('/path/to/template.phtml');</pre>
<p>Will result in:</p>
<pre class="php">&lt;h2&gt;Obama&lt;/h2&gt;
&lt;p&gt;Current position: President of the USA&lt;/p&gt;</pre>
<p>We can also invoke the render() method inside a template to chain several templates together:</p>
<pre class="php">
&lt;h2&gt;&lt;?php echo $this -&gt; name; ?&gt;&lt;/h2&gt;
&lt;p&gt;Current position: &lt;?php echo $this -&gt; job; ?&gt;&lt;/p&gt;

&lt;?php echo $this -&gt; render('footer.phtml'); ?&gt;
</pre>
<p>&#8216;footer.phtml&#8217; being another separate template file.</p>
<h2>In Review</h2>
<p>Let&#8217;s review what the class does:</p>
<pre class="php">     public function render($script) {
        ob_start();
        $this-&gt;_include($script);
        return ob_get_clean();
    }</pre>
<p>The render() method receives a $script parameter which is the path to a template script. It passes that parameter to a separate method called _include() and captures the output using output buffering.</p>
<p>Capturing the output into a string instead of outputting to the screen at once is very useful if we want to perform this as a part of a complete process, such as preparing headers and starting sessions, which have to happen before any output is sent to the browser. It is also useful in case we want to perform additional transformations on the output before sending it to the browser (such as parsing and filtering).</p>
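<p>The capture-then-transform idea can be sketched on its own, outside the View class. This is only an illustration: capture_output() is a hypothetical helper name, and the whitespace filter is an arbitrary example of a transformation.</p>

```php
<?php
// Hypothetical helper: capture whatever a callback prints and return it as a string.
function capture_output(callable $producer): string
{
    ob_start();                 // start buffering; nothing reaches the browser yet
    try {
        $producer();
    } catch (Throwable $e) {
        ob_end_clean();         // discard the partial output before re-throwing
        throw $e;
    }
    return ob_get_clean();      // return the buffer contents and stop buffering
}

$html = capture_output(function () {
    echo "<p>  Hello,   world  </p>\n";
});

// Because the output is now a plain string, it can be filtered before it is sent.
$filtered = preg_replace('/\s+/', ' ', trim($html));

echo $filtered, "\n";
```

<p>Since nothing is echoed until the last line, headers could still be set or a session started between the capture and the final echo.</p>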
<pre class="php">    protected function _include() {
        include func_get_arg(0);
    }</pre>
<p>The _include() method is the core of this templating approach. By including the template inside a class method with no arguments (calling func_get_arg() instead), the included script has only the object instance ($this) as the available scope to get information from and pass information to. Any variables we declare inside this scope have no bearing on the outside and so collisions and scoping issues are avoided.</p>
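<p>A small self-contained check of that isolation, assuming a throwaway template written to the system temp directory. The class name ScopedView and the $title variable are made up for this sketch; the attribute line is only there to keep PHP 8.2 and later from emitting deprecation notices for dynamic properties, and older versions read it as a comment.</p>

```php
<?php
#[\AllowDynamicProperties] // PHP 8.2+ only; parsed as a comment on older versions
class ScopedView
{
    public function render($script)
    {
        ob_start();
        $this->_include($script);
        return ob_get_clean();
    }

    protected function _include()
    {
        // No named parameters, so the template sees no local variables except $this.
        include func_get_arg(0);
    }
}

// Throwaway template: reads $this->name and declares its own local $title.
$template = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents(
    $template,
    '<?php $title = "from template"; ?><h2><?php echo $this->name; ?></h2>'
);

$title = 'from application';    // same variable name, in the application scope

$view = new ScopedView();
$view->name = 'Obama';
$output = $view->render($template);
unlink($template);

echo $output, "\n";             // <h2>Obama</h2>
echo $title, "\n";              // from application
```

<p>The $title assigned inside the template lives and dies in the _include() call, so the application&#8217;s own $title is left untouched.</p>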
<pre class="php">     public function __get($key) {
    	return (isset($this -&gt; $key) ? $this -&gt; $key : null);
    }</pre>
<p>This is really a utility method to prevent PHP from throwing an &#8216;undefined property&#8217; notice in case the template attempts to access a non-existent parameter.</p>
<p>Complete solutions like Zend_View and Savant offer many more features, such as plug-ins, filters, base directory configuration and more, but the principle is the same. While the same structure could be mimicked to some degree using procedural functions only, the simple and convenient interface for passing parameters to templates and retrieving them inside is unique to this approach - making it the clear winner in my opinion.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/11/18/oo-php-templating/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Making web-pages go faster using PHP</title>
		<link>http://www.techfounder.net/2008/11/16/making-web-pages-go-faster-using-php/</link>
		<comments>http://www.techfounder.net/2008/11/16/making-web-pages-go-faster-using-php/#comments</comments>
		<pubDate>Sun, 16 Nov 2008 04:52:06 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[Javascript]]></category>

		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=121</guid>
		<description><![CDATA[UI responsiveness is one of the basics of a good user experience. In the web environment, this often translates to the loading time of pages.
As it might be expected, there are several techniques to optimize the delivery of web pages. The Exceptional Performance guide by Yahoo is a great resource for a multitude of optimizations [...]]]></description>
			<content:encoded><![CDATA[<p>UI responsiveness is <a href="http://www.useit.com/papers/responsetime.html" target="_blank">one of the basics</a> of a good user experience. In the web environment, this often translates to the loading time of pages.</p>
<p>As might be expected, there are several techniques to optimize the delivery of web pages. The <a title="Yahoo Exceptional Performance" href="http://developer.yahoo.com/performance/" target="_blank">Exceptional Performance</a> guide by Yahoo is a great resource for a multitude of optimization practices, including two techniques which I will address in this article - script <a title="Yahoo on minifying javascript" href="http://developer.yahoo.net/blog/archives/2007/07/high_performanc_8.html" target="_blank">minification</a> and <a title="The Yahoo blog on combining scripts" href="http://yuiblog.com/blog/2008/07/21/performance-research-part-6/" target="_blank">concatenation</a>.<br />
<span id="more-121"></span></p>
<h2>Reducing script size and the number of requests</h2>
<p>Modern web sites and web applications serve increasingly heavier javascript and CSS files. The price for the increase in script payload is felt both on the client side, where loading times increase, making pages feel sluggish, and on the server side, where bandwidth and requests are at a premium.</p>
<p>Javascript files especially block the browser while downloading (no more than 2 components can be downloaded simultaneously from the same server, as per the <a title="HTML Specifications HTTP/1.1" href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1.4" target="_blank">HTTP/1.1 specification</a>), and are the type of scripts seeing the steepest increase in size in the last several years.</p>
<h2>Minification and concatenation</h2>
<p>Minifying is the process of removing all unnecessary characters from the source code without changing its functionality (removing whitespace, line-breaks and comments, shortening variable names and more). Concatenation is the process of joining several strings, here whole script files, into one. Applying both to our scripts, we reduce the total script size and the number of requests.</p>
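<p>To make the two terms concrete, here is a deliberately naive sketch for CSS. It is not the algorithm used by any real minifier: naive_minify_css() is a made-up function, and it assumes simple stylesheets with no string literals or selectors that depend on spaces around punctuation.</p>

```php
<?php
// Naive illustration only: real minifiers tokenize the source so that they do
// not damage string literals or whitespace-sensitive selectors.
function naive_minify_css(string $css): string
{
    $css = preg_replace('~/\*.*?\*/~s', '', $css);        // drop /* ... */ comments
    $css = preg_replace('/\s+/', ' ', $css);              // collapse runs of whitespace
    $css = preg_replace('/\s*([{};:,])\s*/', '$1', $css); // no spaces around punctuation
    return trim($css);
}

// Concatenation: several sources joined into one payload (one request instead of two).
$sources = [
    "/* layout */\nbody {\n    margin: 0;\n}\n",
    "/* type */\nh1 {\n    color: red;\n}\n",
];

$combined = naive_minify_css(implode("\n", $sources));

echo $combined, "\n"; // body{margin:0;}h1{color:red;}
```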
<p>There are several public algorithms for minifying Javascript, and Dean Edwards gives <a title="Dean Edward on Javascript compression" href="http://dean.edwards.name/weblog/2007/08/js-compression/" target="_blank">a nice overview of those</a> on his blog.</p>
<p>For PHP we have the handy <a href="http://code.google.com/p/minify/" target="_blank">Minify</a> library, which handles both the minification and concatenation of scripts. Minify is quite robust, and comes with <a href="http://code.google.com/p/minify/wiki/ComponentClasses" target="_blank">a set of standalone classes</a> that can be used in other contexts as well. It also sets ETags and far-future headers for better client script caching.</p>
<h2>Deploying Minify</h2>
<p>Downloading and extracting the <a href="http://minify.googlecode.com/files/release_2.1.1.zip">package available</a> on Google Code creates several text files and two sub-directories. The directory that is of interest to us is &#8216;/min&#8217;, which contains the actual source code, and we&#8217;ll place it somewhere on our server under our document root (so it can be accessed by a regular http request).</p>
<p>The basic workflow of using Minify is to replace our &lt;script&gt; and &lt;link&gt; tags which load our javascript and CSS respectively with calls to the minify script. We need to tell the Minify script which scripts to process, and there are two main ways to accomplish that:</p>
<ol>
<li>Passing the scripts as parameters in the call to the minify script. We pass the path to the scripts separated by commas, translating the following:
<pre class="php">
&lt;script type="text/javascript" src="/path/to/script_one.js"&gt;&lt;/script&gt;
&lt;script type="text/javascript" src="/path/to/script_two.js"&gt;&lt;/script&gt;
</pre>
<p>Into:</p>
<pre class="php">
&lt;script type="text/javascript" src="/path/to/min/?f=/path/to/script_one.js,/path/to/script_two.js"&gt;&lt;/script&gt;
</pre>
<p>There is actually a shorter way of writing this, by using a common base path for both scripts:</p>
<pre class="php">
&lt;script type="text/javascript" src="/path/to/min/?b=path/to&#038;f=script_one.js,script_two.js"&gt;&lt;/script&gt;
</pre>
<p>Provided that all the paths are correct, we have successfully concatenated two scripts into one minified script. This can be repeated for as many scripts as we like, though there is an artificial limit (due to performance and memory limitations, I assume) of 10 scripts per Minify request. This limit can be adjusted in the configuration file which exists at the base path of the library (config.php).</p>
</li>
<li>Using pre-defined script arrays. This method is less flexible as it requires hardcoding the script names into an array, but is slightly more performant. Minify calls those arrays &#8216;groups&#8217;, which look something like:
<pre class="php">
return array(
    'js' => array('//path/to/script_one.js', '//path/to/script_two.js')
);
</pre>
<p>You can find an example of those in the installation folder of Minify (groupsConfig.php). The request to the script is then made by specifying the group name:</p>
<pre class="php">
&lt;script type="text/javascript" src="/path/to/min/?g=js"&gt;&lt;/script&gt;
</pre>
<p>There are several options when using groups, and you can read about them on the library&#8217;s <a href="http://code.google.com/p/minify/w/list">wiki</a>.</p></li>
</ol>
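<p>When the list of scripts is built in PHP anyway, the URL for the first method can be assembled by a small helper. minify_url() is a hypothetical function, not part of Minify; it only reproduces the b (base path) and f (file list) parameters shown above.</p>

```php
<?php
// Hypothetical helper (not part of Minify): builds the src value from the
// "b" (base path) and "f" (comma-separated file list) query parameters.
function minify_url(string $minPath, string $base, array $files): string
{
    $query = 'b=' . trim($base, '/') . '&f=' . implode(',', $files);
    return rtrim($minPath, '/') . '/?' . $query;
}

$src = minify_url('/path/to/min', '/path/to', ['script_one.js', 'script_two.js']);

// Escape the ampersand when writing the URL into an HTML attribute.
echo '<script type="text/javascript" src="' . htmlspecialchars($src) . '"></script>', "\n";
```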
<p>It&#8217;s important to set up the cache folder for Minify. Minifying the scripts can actually take a couple of seconds, which is a relatively long delay - however, once generated, they will be cached until the scripts change (provided the cache folder is set up properly). </p>
<p>To declare the cache folder, simply define a variable named $min_cachePath in the configuration file or in the main app file before the main application starts.</p>
<pre class="php">
$min_cachePath = '/path/to/cache';
</pre>
<p>Make sure the folder exists and is writable by PHP.</p>
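<p>Both conditions can be checked from PHP before relying on the cache. The sketch below uses a made-up function name, cache_dir_problem(), and the system temp directory as a stand-in for the real cache path.</p>

```php
<?php
// Hypothetical pre-flight check; returns a message describing the problem, or null.
function cache_dir_problem(string $path): ?string
{
    if (!is_dir($path)) {
        return "Cache folder does not exist: $path";
    }
    if (!is_writable($path)) {
        return "Cache folder is not writable by PHP: $path";
    }
    return null; // no problem found
}

$min_cachePath = sys_get_temp_dir(); // stand-in for '/path/to/cache'
$problem = cache_dir_problem($min_cachePath);

echo $problem === null ? "Cache folder looks usable\n" : $problem . "\n";
```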
<h2>Real world use</h2>
<p>I&#8217;ve implemented Minify in all of my recent projects to great effect. An extreme case would be script deployment in my current startup, <a href="http://www.octabox.com">Octabox web platform</a>. Being a heavy-duty web-application, it consumes much more javascript and styles than your average marketing site.</p>
<h3>Without Minify</h3>
<p>Javascript requests:<br />
<img src="http://www.techfounder.net/wp-content/uploads/2008/11/js.gif" alt="Javascript" style="float:none;" /><br />
CSS requests:<br />
<img src="http://www.techfounder.net/wp-content/uploads/2008/11/css.gif" alt="CSS" style="float:none;" /></p>
<h3>With Minify</h3>
<p>Javascript requests:<br />
<img src="http://www.techfounder.net/wp-content/uploads/2008/11/minifiedjs.gif" alt="Minified Javascript" style="float:none;" /><br />
CSS requests:<br />
<img src="http://www.techfounder.net/wp-content/uploads/2008/11/minifiedcss.gif" alt="Minified CSS" style="float:none;" /></p>
<p>51 total requests weighing 540kb reduced to 5 requests weighing 116kb.</p>
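<p>Expressed as percentages, using only the figures quoted above:</p>

```php
<?php
// Figures quoted in the article: before and after deploying Minify.
$before = ['requests' => 51, 'kb' => 540];
$after  = ['requests' => 5,  'kb' => 116];

foreach ($before as $metric => $old) {
    $saved = ($old - $after[$metric]) / $old * 100;
    printf("%s: %d -> %d (%.1f%% less)\n", $metric, $old, $after[$metric], $saved);
}
// requests: 51 -> 5 (90.2% less)
// kb: 540 -> 116 (78.5% less)
```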
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/11/16/making-web-pages-go-faster-using-php/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Optimizing OR (union) operations in MySQL</title>
		<link>http://www.techfounder.net/2008/10/15/optimizing-or-union-operations-in-mysql/</link>
		<comments>http://www.techfounder.net/2008/10/15/optimizing-or-union-operations-in-mysql/#comments</comments>
		<pubDate>Wed, 15 Oct 2008 06:12:32 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[MySQL]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=126</guid>
		<description><![CDATA[In my last post on database optimization, I focused on improving query performance by optimizing schema - exploring indexing strategies by reading the execution plan. In this post I&#8217;ll show how different query structures can also have a major impact on performance.

Dealing with OR operators in a WHERE clause of an SQL statement in MySQL [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/" title="Profiling MySQL queries with Zend_Db, optimizing by hand">last post</a> on database optimization, I focused on improving query performance by optimizing schema - exploring indexing strategies by reading the execution plan. In this post I&#8217;ll show how different query structures can also have a major impact on performance.<br />
<span id="more-126"></span><br />
Dealing with OR operators in a WHERE clause of an SQL statement in MySQL can be tricky. Up until recently, MySQL could only use one index per table referenced in a query. A multi-column index can be used for filtering conditions with an AND operator (which is more restrictive by nature), but a condition added by OR must use a separate index because of the logical nature of the operator (a <a href="http://en.wikipedia.org/wiki/Union_(set_theory)" target="_blank">union</a>, as opposed to an <a href="http://en.wikipedia.org/wiki/Intersection_(set_theory)">intersection</a> that the AND represents).</p>
<p>MySQL 5.0 added the <a href="http://dev.mysql.com/doc/refman/5.0/en/index-merge-optimization.html" target="_blank">index_merge</a> access method, which allows the query optimizer to possibly select several indexes from a single table and merge them to improve query performance. I say possibly, since leaving such decisions to the optimizer is risky at best. In fact, as I will show next, you are sometimes left with no indexes selected out of several possible options, resulting in a full table scan.</p>
<p>Continuing from my last post, I&#8217;ll use a real-world example to show the different paths the query optimizer can take when preparing an execution plan for our queries.</p>
<p>I&#8217;ll actually be working with the same query I profiled last time, with a minor change. The relevant table structure is as follows:</p>
<pre class="sql">--
-- Table structure for table `tasks`
--

CREATE TABLE IF NOT EXISTS `tasks` (
  `id` int(13) NOT NULL auto_increment,
  `list_id` int(13) NOT NULL,
  `user_id` int(13) NOT NULL,
  `task` varchar(255) collate utf8_unicode_ci NOT NULL,
  `due` timestamp NULL default NULL,
  `created` timestamp NOT NULL default CURRENT_TIMESTAMP,
  `checked` timestamp NULL default NULL,
  `assigned` int(13) default NULL,
  `done` enum('1') collate utf8_unicode_ci default NULL,
  PRIMARY KEY  (`id`),
  KEY `user_id` (`user_id`,`due`),
  KEY `assigned` (`assigned`,`due`),
  KEY `due` (`due`)
) ENGINE=InnoDB</pre>
<p>And the query itself:</p>
<pre class="sql">SELECT `tasks`.`id`,
           `tasks`.`task`,
           UNIX_TIMESTAMP(due) AS `time`,
           `lists`.`name` AS `list_name`,
           `lists`.`id` AS `list_id`
FROM `tasks`
INNER JOIN `lists` ON lists.id=tasks.list_id
WHERE (tasks.user_id='1' OR assigned='1')
     AND (tasks.done IS NULL)
     AND (tasks.due IS NOT NULL)
ORDER BY `tasks`.`due` ASC</pre>
<p>This query is almost the same as the one I worked with last time. The only difference being the first statement in the WHERE clause, involving an OR operator:</p>
<blockquote><p>(tasks.user_id=&#8217;1&#8242; OR assigned=&#8217;1&#8242;)</p></blockquote>
<p>Just to clarify what I&#8217;m doing here - the query selects from the tasks table with several filtering criteria. The OR condition conveys that the selected tasks either belong to a specific user (i.e. were created by that user) or are assigned to that user (the two possibly coincide).</p>
<p>For testing purposes I will be running the queries against a testing database I set up with plenty of mock data. The database is around 1GB in total size, with the tasks table at about 1.8 million rows. It&#8217;s not very large, but enough for significant data to be obtained while still allowing reasonably quick changes to the schema (modifying keys takes <em>only</em> around 4 minutes to complete).</p>
<p>Running in original form computes as (average of 10 queries):</p>
<blockquote><p>(304 total, Query took 6.84 sec)</p></blockquote>
<p>6.84 sec is way too long for running a query in a typical web application. If you recall, I last got this query running at <strong>0.0008 sec</strong> without the additional OR condition, meaning it is now running roughly 8,500 times slower (yikes). In such a case you would suspect the indexes are not selective enough, and sure enough an EXPLAIN reveals:</p>
<table class="explain_data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>index</td>
<td>user_id,assigned,due</td>
<td>due</td>
<td>5</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1875432</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>MySQL is performing a regular index scan and not an index_merge as we&#8217;d like, and in doing so it selects the worst possible index - the only one that doesn&#8217;t filter the result set. Sure enough, all 1.8M rows are scanned and the query is underperforming badly. Trying to force the issue, I first add an IGNORE INDEX(due) to remove it from the equation:</p>
<blockquote><p>(304 total, Query took 1.41 sec)</p></blockquote>
<table class="explain_data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>ALL</td>
<td>user_id,assigned</td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1870838</td>
<td>Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>The query optimizer has decided not to use an index at all, despite several good candidates. Query performance still improves, since the table is now scanned directly rather than through an index. It&#8217;s still way too slow, and we&#8217;d like it to use a filtering index so it can avoid a full table scan. Trying to force the other indexes doesn&#8217;t work, so we&#8217;re left with no choice but to try alternative query structures.</p>
<p>Our first candidate is replacing our OR condition with two UNION&#8217;ed select statements (the inspiration is from a <a href="http://www.mysqlperformanceblog.com/2007/09/18/possible-optimization-for-sort_merge-and-union-order-by-limit/" target="_blank">couple</a> of <a href="http://www.mysqlperformanceblog.com/2007/10/05/union-vs-union-all-performance/" target="_blank">posts</a> over at the MySQL performance blog). Breaking the original query into a UNION form results in:</p>
<pre class="sql">(SELECT `tasks`.`id`,
            `tasks`.`task`,
            UNIX_TIMESTAMP(due) AS `time`,
            `lists`.`name` AS `list_name`,
            `lists`.`id` AS `list_id`
FROM `tasks`
INNER JOIN `lists` ON lists.id=tasks.list_id
WHERE (tasks.done IS NULL)
    AND (tasks.due IS NOT NULL)
    AND (tasks.user_id='1')
) UNION ALL (
 SELECT `tasks`.`id`,
            `tasks`.`task`,
            UNIX_TIMESTAMP(due) AS `time`,
            `lists`.`name` AS `list_name`,
            `lists`.`id` AS `list_id`
FROM `tasks`
INNER JOIN `lists` ON lists.id=tasks.list_id
WHERE (tasks.done IS NULL)
    AND (tasks.due IS NOT NULL)
    AND (assigned='1' AND tasks.user_id!='1')
) ORDER BY `time` ASC
</pre>
<p>This unsightly looking query looks much more complex, but actually gets the job done:</p>
<blockquote><p>(304 total, Query took 0.0021 sec)</p></blockquote>
<p>A big improvement to say the least (roughly 3,250 times faster than the original query).<br />
The explain shows why:</p>
<table class="explain_data" border="0">
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>tasks</td>
<td>range</td>
<td>user_id,due</td>
<td>user_id</td>
<td>9</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">304</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
<tr class="odd">
<td class="nowrap" align="right">2</td>
<td>UNION</td>
<td>tasks</td>
<td>range</td>
<td>user_id,assigned,due</td>
<td>assigned</td>
<td>10</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">2</td>
<td>UNION</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
<tr class="odd">
<td align="right"><em>NULL</em></td>
<td>UNION RESULT</td>
<td>&lt;union1,2&gt;</td>
<td>ALL</td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td align="right"><em>NULL</em></td>
<td>Using filesort</td>
</tr>
</tbody>
</table>
<p>Only 304 rows are scanned (as opposed to the entire 1.8M table before). Despite the filesort, we are in the area of usability for this query.</p>
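<p>The extra tasks.user_id!=&#8217;1&#8242; condition in the second branch is what keeps the UNION ALL equivalent to the original OR. A small PHP sketch with made-up rows (not the real test data) shows the effect on a task that is both owned by and assigned to user 1:</p>

```php
<?php
// Made-up rows standing in for the tasks table.
$tasks = [
    ['id' => 1, 'user_id' => 1, 'assigned' => null],
    ['id' => 2, 'user_id' => 1, 'assigned' => 1],   // matches both conditions
    ['id' => 3, 'user_id' => 2, 'assigned' => 1],
    ['id' => 4, 'user_id' => 2, 'assigned' => 3],
];

// Original form: user_id = 1 OR assigned = 1
$or = array_filter($tasks, function ($t) {
    return $t['user_id'] === 1 || $t['assigned'] === 1;
});

// UNION ALL form: the second branch skips rows the first branch already returned.
$first  = array_filter($tasks, function ($t) {
    return $t['user_id'] === 1;
});
$second = array_filter($tasks, function ($t) {
    return $t['assigned'] === 1 && $t['user_id'] !== 1;
});
$union = array_merge($first, $second);

// Without the guard, task 2 comes back twice.
$unguarded = array_merge($first, array_filter($tasks, function ($t) {
    return $t['assigned'] === 1;
}));

echo implode(',', array_column($or, 'id')), "\n";        // 1,2,3
echo implode(',', array_column($union, 'id')), "\n";     // 1,2,3
echo implode(',', array_column($unguarded, 'id')), "\n"; // 1,2,2,3
```

<p>A plain UNION would also remove the duplicate, but it has to de-duplicate the combined result, which is why the guarded UNION ALL is used here.</p>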
<p>Another approach would be to determine why index_merge isn&#8217;t taking place and how we can enforce it. By reducing the WHERE clause to include only the columns covered by an index, index_merge kicks in, resulting in performance similar to the union, albeit with less selective filtering:</p>
<blockquote><p>(368 total, Query took 0.0018 sec)</p></blockquote>
<p>We could filter our result set in the application, but I&#8217;d like to keep it in the SQL where it belongs. In order to integrate the rest of the filtering statements, I use an IN condition on another index I have available - the primary key. This results in:</p>
<pre class="sql">SELECT `tasks`.`id`,
           `tasks`.`task`,
           UNIX_TIMESTAMP(due) AS `time`,
           `lists`.`name` AS `list_name`,
           `lists`.`id` AS `list_id`
FROM `tasks`
INNER JOIN `lists` ON lists.id=tasks.list_id
WHERE (tasks.user_id='1' OR assigned='1')
   AND tasks.id IN
     (SELECT id
      FROM tasks
      WHERE (tasks.done IS NULL)
        AND (tasks.due IS NOT NULL)
     )
ORDER BY `tasks`.`due` ASC
</pre>
<p>Which brings us back to the performance levels of the UNION form:</p>
<blockquote><p>(304 total, Query took 0.0021 sec)</p></blockquote>
<p>And we can see in the EXPLAIN that it is indeed an index_merge operation:</p>
<table class="explain_data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>tasks</td>
<td>index_merge</td>
<td>user_id,assigned</td>
<td>user_id,assigned</td>
<td>4,5</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">369</td>
<td>Using sort_union(user_id,assigned); Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>PRIMARY</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
<tr class="odd">
<td class="nowrap" align="right">2</td>
<td>DEPENDENT SUBQUERY</td>
<td>tasks</td>
<td>unique_subquery</td>
<td>PRIMARY,due</td>
<td>PRIMARY</td>
<td>4</td>
<td>func</td>
<td class="nowrap" align="right">1</td>
<td>Using where</td>
</tr>
</tbody>
</table>
<p>So there you have it. From 6.84 seconds to 0.0021 seconds with some tweaking to the query structure. I am still not completely satisfied with the fact that it&#8217;s using filesort, but I couldn&#8217;t get it to use another index for the operation. Without the sorting the query is twice as fast:</p>
<blockquote><p>(304 total, Query took 0.0010 sec)</p></blockquote>
<p>So if it ever becomes an issue I could move sorting to the application code. Hopefully by then index_merge is better implemented in MySQL.</p>
<p>Regarding the query structure - personally I use the subquery format since it is more compact and maintainable. It is always good, however, to be familiar with all the alternatives.</p>
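<p>Sorting in the application could look roughly like the sketch below. The rows are made up; the only thing taken from the query is the `time` alias, which holds UNIX_TIMESTAMP(due).</p>

```php
<?php
// Made-up rows as they might come back from the query without ORDER BY.
$rows = [
    ['id' => 7, 'task' => 'Write report', 'time' => 1226966400],
    ['id' => 3, 'task' => 'Call client',  'time' => 1226793600],
    ['id' => 9, 'task' => 'Book flights', 'time' => 1226880000],
];

// Equivalent of ORDER BY `tasks`.`due` ASC, done in PHP instead of MySQL.
usort($rows, function (array $a, array $b): int {
    return $a['time'] <=> $b['time'];
});

echo implode(', ', array_column($rows, 'id')), "\n"; // 3, 9, 7
```

<p>For a few hundred rows this is cheap, but it does move work onto the web server, so it only makes sense if the filesort actually shows up as a bottleneck.</p>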
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/10/15/optimizing-or-union-operations-in-mysql/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Profiling queries with Zend_Db and optimizing them by hand</title>
		<link>http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/</link>
		<comments>http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/#comments</comments>
		<pubDate>Sun, 12 Oct 2008 05:42:55 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[MySQL]]></category>

		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=125</guid>
		<description><![CDATA[Database performance is one of the major bottlenecks for most web applications. Most web developers are not database experts (and I&#8217;m no exception); there are, however, several basic methods to analyze and optimize database performance without resorting to expert consultants (such as those, whose founders&#8217; blogs are an invaluable source of MySQL knowledge).

The Performance Equation
Database [...]]]></description>
			<content:encoded><![CDATA[<p>Database performance is one of the major bottlenecks for most web applications. Most web developers are not database experts (and I&#8217;m no exception); there are, however, several basic methods to analyze and optimize database performance without resorting to expert consultants (such as <a href="http://www.percona.com/">those</a>, whose founders&#8217; blogs are an <a href="http://www.mysqlperformanceblog.com/">invaluable</a> <a href="http://www.xaprb.com/blog/">source</a> of MySQL knowledge).<br />
<span id="more-125"></span></p>
<h3>The Performance Equation</h3>
<p>Database performance is affected by many different variables - the running machine specs, OS, database engine and configuration, table schema and the queries running against it. Since I&#8217;m dealing mostly (only?) with MySQL, this article covers it mainly (though it is probably relevant to a large degree for other engines). As for OS and machine specs, I&#8217;ll take them out of the equation as I&#8217;m interested in optimizing <strong>relative</strong> performance on the same machine.</p>
<p>Basically I&#8217;m interested in optimizing:</p>
<ul>
<li>The structure of my database tables (schema)</li>
<li>The structure of my application-level queries (SELECT, UPDATE, INSERT, DELETE)</li>
</ul>
<h3>Profiling Database Performance</h3>
<p>Blindly optimizing queries and database schema is counter-productive.  First we should <em>know</em> what we should be optimizing, and for that we need data. The act of gathering data for optimization is called profiling or <a href="http://en.wikipedia.org/wiki/Performance_analysis">performance analysis</a>.</p>
<p>The first step I take when profiling database performance for a web app is to measure the running time of all the queries running in it. Absolute run time of a query is not necessarily a good measure of how optimized / performant it is, since some queries are naturally more complex or pull more data. It will, however, give me a good idea of where to start improving the response time of the application I&#8217;m optimizing. My main goal is not specific query performance, but overall system performance.</p>
<p>Measuring the run time of a query can be done with simple timers using <a title="PHP: microtime()" href="http://www.php.net/microtime">microtime()</a> calls (check out <a href="http://www.coderholic.com/php-profile-class/">this post</a> for an abstraction), by running it in a tool that automatically provides such statistics (phpMyAdmin, for example) or by using an integrated profiler.</p>
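<p>As a rough illustration of the timer approach, the sketch below wraps any callback in a pair of microtime(true) calls. The helper name time_call and the sample callback are hypothetical, chosen only for this example; in practice the callback would be whatever function executes your query.</p>
<pre class="php">// Hypothetical helper: runs a callback and reports how long it took.
function time_call($callback, array $args = array())
{
    $start   = microtime(true); // current time as a float, in seconds
    $result  = call_user_func_array($callback, $args);
    $elapsed = microtime(true) - $start;

    return array($result, $elapsed);
}

// Stand-in workload; replace with the function that runs your query
list($result, $elapsed) = time_call('str_repeat', array('x', 3));
echo 'Time: ' . $elapsed . "\n";</pre>
<p>Returning the result together with the elapsed time keeps the timing code out of the function being measured.</p>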
<p>I am using the <a title="Zend_Db_Profiler" href="http://framework.zend.com/manual/en/zend.db.profiler.html">Zend_Db_Profiler</a>, which is convenient for me since I&#8217;m using the Zend Framework and all database access converges to a Zend_Db_Adapter connection. The profiler basically uses the microtime() approach but integrates it transparently into all of the queries without me having to wrap them one by one.</p>
<p>Usage is pretty simple. First you need to pass an extra parameter to your Zend_Db_Adapter instance to activate profiling:</p>
<pre class="php">$params = array(
    'host'     =&gt; 'localhost',
    'username' =&gt; 'dbusername',
    'password' =&gt; 'dbpassword',
    'dbname'   =&gt; 'dbname',
    'profiler' =&gt; true  // turn on profiler, disabled by default
);

$db = Zend_Db::factory('PDO_MYSQL', $params);</pre>
<p>The Zend Framework manual shows some advanced usage (such as profiling directly into Firebug, which is very convenient); for our purposes, however, we simply want to dump the queries and their run times.</p>
<pre class="php">$profiler = $db -&gt; getProfiler();

// Collect each profiled query together with its elapsed time
$profile = '';
foreach($profiler -&gt; getQueryProfiles() as $query) {
	$profile .= $query -&gt; getQuery() . "\n"
	     . 'Time: ' . $query -&gt; getElapsedSecs() . "\n\n"; // blank line between entries
}

echo $profile;</pre>
<p>This snippet goes of course after the view has been rendered and all queries executed.</p>
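<p>Since the goal is overall response time, it can be handy to pull out only the slowest query rather than read the whole dump. A minimal sketch, assuming the profile objects expose getQuery() and getElapsedSecs() as in the snippet above; find_slowest_query is a hypothetical helper name, not part of the Zend Framework:</p>
<pre class="php">// Hypothetical helper: returns the slowest profile object, or null when
// there is nothing to inspect (e.g. no queries were profiled).
function find_slowest_query($profiles)
{
    $slowest = null;
    if (!is_array($profiles)) {
        return $slowest;
    }

    foreach ($profiles as $query) {
        if ($slowest === null
            || $query -&gt; getElapsedSecs() &gt; $slowest -&gt; getElapsedSecs()) {
            $slowest = $query;
        }
    }

    return $slowest;
}</pre>
<p>It would be called with the profiler output, for example find_slowest_query($profiler -&gt; getQueryProfiles()), and the returned object can be queried with getQuery() and getElapsedSecs() just like in the loop above.</p>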
<h3>Using a relevant dataset</h3>
<p>It&#8217;s important to know whether the data you are profiling against is relevant for when you believe you will start hitting performance issues. Granted, you can never really know when that will be, however it is always preferable to work with a dataset that resembles your (projected) production environment.</p>
<p>I will walk through a specific use-case I recently went through while preparing for the beta release of my startup, <a title="Octabox web platform" href="http://www.octabox.com" target="_blank">business platform Octabox</a>.</p>
<p>Running the profiler on one of the views in the application produced the following output on my development machine:</p>
<blockquote><p>SELECT `history`.`id`, `history`.`type`, `history`.`action`, `history`.`value`, UNIX_TIMESTAMP(history.logged_on) AS `time`, `history`.`user_id`, `history`.`user_name`, `history`.`parent_id`, `history`.`parent_type`, `history`.`parent_name`, `history`.`list_id`, `history`.`list_type`, `history`.`list_name`, `history`.`logging_user_id`, `tags`.`color`, `tags`.`id` AS `tag_id` FROM `history` LEFT JOIN `tags` ON tags.id=history.tag_id AND tags.user_id='1' WHERE (history.user_id=1) AND (history.logged_on &gt;= '2008-09-27 00:00:00') ORDER BY `history`.`logged_on` DESC<br />
<strong>Time: 0.0026991367340088</strong></p>
<p>SELECT `tasks`.`id`, `tasks`.`task`, UNIX_TIMESTAMP(due) AS `time`, `lists`.`name` AS `list_name`, `lists`.`id` AS `list_id` FROM `tasks` INNER JOIN `lists` ON lists.id=tasks.list_id WHERE (tasks.done IS NULL) AND (tasks.due IS NOT NULL) AND (tasks.user_id='1') ORDER BY `tasks`.`due` ASC<br />
<strong>Time: 0.00059294700622559</strong></p>
<p>SELECT COUNT(*) AS `count` FROM `dispatch` WHERE (dispatch.user_id='1') AND (dispatch.status='0')<br />
<strong>Time: 0.00044488906860352</strong></p>
<p>SELECT COUNT(*) AS `count` FROM `conversations_to_users` WHERE (user_id='1' AND status='0')<br />
<strong>Time: 0.00038599967956543</strong></p></blockquote>
<p>At first sight it would appear that the first query would be our immediate suspect for optimization, though the gains would not be great (it completes in just over 0.002 seconds). Switching to a database I&#8217;ve prepared beforehand with plenty of dummy data (~1.1GB, some tables over 5M rows) the output looks very different:</p>
<blockquote><p>&nbsp;</p></blockquote>
<p>That&#8217;s right, there was no output. My development machine was hanging on this particular view when switched to the dummy database (I confirmed this was database related by running the queries independently in the MySQL command line). I switched over to our staging server to see if it can handle rendering this view against the dummy database. Luckily, it could and the suspect query immediately showed up in the output:</p>
<blockquote><p>SELECT `history`.`id`, `history`.`type`, `history`.`action`, `history`.`value`, UNIX_TIMESTAMP(history.logged_on) AS `time`, `history`.`user_id`, `history`.`user_name`, `history`.`parent_id`, `history`.`parent_type`, `history`.`parent_name`, `history`.`list_id`, `history`.`list_type`, `history`.`list_name`, `history`.`logging_user_id`, `tags`.`color`, `tags`.`id` AS `tag_id` FROM `history`<br />
LEFT JOIN `tags` ON tags.id=history.tag_id AND tags.user_id='1' WHERE (history.user_id=1) AND (history.logged_on &gt;= '2008-09-27 00:00:00') ORDER BY `history`.`logged_on` DESC<br />
<strong>Time: 0.00042605400085449</strong></p>
<p>SELECT `tasks`.`id`, `tasks`.`task`, UNIX_TIMESTAMP(due) AS `time`, `lists`.`name` AS `list_name`, `lists`.`id` AS `list_id` FROM `tasks`<br />
INNER JOIN `lists` ON lists.id=tasks.list_id WHERE (tasks.done IS NULL) AND (tasks.due IS NOT NULL) AND (tasks.user_id='1') ORDER BY `tasks`.`due` ASC<br />
<strong>Time: 58.160161972046</strong> <em>//OMFG! what gives&#8230;</em></p>
<p>SELECT COUNT(*) AS `count` FROM `dispatch` WHERE (dispatch.user_id='1') AND (dispatch.status='0')<br />
<strong>Time: 0.00032281875610352</strong></p>
<p>SELECT COUNT(*) AS `count` FROM `conversations_to_users` WHERE (user_id='1' AND status='0')<br />
<strong>Time: 0.00026297569274902</strong></p></blockquote>
<p>A query that took just over 0.0005 seconds on a small database, took almost a minute to complete on a bigger one. Surprisingly enough the other queries remained blazingly fast (even faster than on my development machine with a small database, which is a credit to our server&#8217;s power).</p>
<h3>Optimizing The Errant Query</h3>
<p>So I&#8217;ve found a very problematic query to say the least. Taking 58.1 seconds to complete is obviously not acceptable for any real time application. Running EXPLAIN against the query revealed the following execution plan:</p>
<table class="explain_data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>index</td>
<td>user_id</td>
<td>user_id</td>
<td>9</td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1868081</td>
<td>Using where</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>The EXPLAIN results show the most immediate problem with the query - it was performing a join against every row in a 1.8M row table. </p>
<p>Examining the details of the query plan, it appears both selects are using an index, the first one limited by a WHERE (on the surface at least - since all rows were scanned). Since the entire table was scanned very slowly despite using an index, I tried to see how the query would respond if I ignored the index. Using IGNORE INDEX(user_id), I re-ran the query (this time in the MySQL command line):</p>
<blockquote><p>304 rows in set (<strong>1.53 sec</strong>)</p></blockquote>
<p>Wow, that&#8217;s a big difference <strong>for not using</strong> an index. Running EXPLAIN on this reveals:</p>
<table class="explain_data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>ALL</td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td><em>NULL</em></td>
<td class="nowrap" align="right">1866244</td>
<td>Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>Despite using filesort to sort the query, it was running about <strong>38 times</strong> faster than before (1.53 seconds instead of 58.16). By forcing it not to use an index which was not selective enough (i.e. not at all), the query ran a full table scan instead and completed much faster due to a better execution plan.</p>
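<p>For reference, an index hint goes directly after the table name in the FROM clause. The sketch below only builds the hinted query and its EXPLAIN variant as strings, using the tasks query from the profiler output; the variable names are arbitrary and exist just for this example.</p>
<pre class="php">// Index hint placed right after the table it applies to
$hinted = 'SELECT `tasks`.`id`, `tasks`.`task`, UNIX_TIMESTAMP(due) AS `time`, '
        . '`lists`.`name` AS `list_name`, `lists`.`id` AS `list_id` '
        . 'FROM `tasks` IGNORE INDEX (user_id) '
        . 'INNER JOIN `lists` ON lists.id=tasks.list_id '
        . "WHERE (tasks.done IS NULL) AND (tasks.due IS NOT NULL) AND (tasks.user_id='1') "
        . 'ORDER BY `tasks`.`due` ASC';

// Prefixing the statement with EXPLAIN returns the execution plan instead of rows
$explain = 'EXPLAIN ' . $hinted;</pre>
<p>Either string can then be passed to the adapter, for example $db -&gt; fetchAll($explain), or pasted into the MySQL command line.</p>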
<p>So we&#8217;ve improved greatly, but still not enough for my needs. Having one query taking over a second to complete when all the others are completing in sub 0.01 second times means it will be a bottleneck as the database grows (remember, the same query completed in just over 0.0005 seconds on a much smaller database).</p>
<p>Examining the structure of the table in question (tasks) revealed something interesting - there is <strong>no index</strong> on the user_id column. So what happened? It appears there used to be an index on the user_id column, but it was removed just before all the dummy data was inserted into the database. Running ANALYZE TABLE tasks repaired the key information to the current state (no index on user_id).</p>
<p>I wanted to try the original query, this time with an actual index on the filtering column (user_id). After adding the index, I re-ran the query:</p>
<blockquote><p>(304 total, Query took <strong>0.0021 sec</strong>)</p></blockquote>
<p>Much better! But I was not done. Running EXPLAIN yet again reveals the following execution plan:</p>
<table class="explain_data" border="0">
<thead>
<tr>
<th>id</th>
<th>select_type</th>
<th>table</th>
<th>type</th>
<th>possible_keys</th>
<th>key</th>
<th>key_len</th>
<th>ref</th>
<th>rows</th>
<th>Extra</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>tasks</td>
<td>ref</td>
<td>user_id</td>
<td>user_id</td>
<td>4</td>
<td>const</td>
<td class="nowrap" align="right">368</td>
<td>Using where; Using filesort</td>
</tr>
<tr class="even">
<td class="nowrap" align="right">1</td>
<td>SIMPLE</td>
<td>lists</td>
<td>eq_ref</td>
<td>PRIMARY</td>
<td>PRIMARY</td>
<td>4</td>
<td>tasks.list_id</td>
<td class="nowrap" align="right">1</td>
<td></td>
</tr>
</tbody>
</table>
<p>So indeed, using <strong>an actual index</strong> was a big help, filtering the set to only 368 rows before processing the rest of the query. However, you&#8217;ll notice it is running a filesort which we&#8217;d like to avoid. Ideally, I could set up a single index that both filters and sorts.</p>
<p>Creating an index composed of both the &#8216;user_id&#8217; and &#8216;due&#8217; columns hit the nail on the head, leading to:</p>
<blockquote><p>(304 total, Query took <strong>0.0008 sec</strong>)</p></blockquote>
<p>Not as dramatic as previous improvements, but still roughly 2.6 times faster than the last iteration.</p>
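<p>The two index changes described above could be expressed roughly as follows. This is only a sketch: the index names are placeholders, and the statements are kept as strings so they can be reviewed before being run against a large table.</p>
<pre class="php">$statements = array(
    // Single-column index used for filtering by user
    'ALTER TABLE `tasks` ADD INDEX `user_id` (`user_id`)',
    // Composite index: filter on user_id, then read rows already ordered by due
    'ALTER TABLE `tasks` ADD INDEX `user_id_due` (`user_id`, `due`)',
);</pre>
<p>Each statement can be executed with $db -&gt; query($statement). Column order matters in the composite index: user_id comes first because it is matched by equality, which lets MySQL use the due part for ordering.</p>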
<h3>Final Words</h3>
<p>The entire process took several hours of head scratching to arrive at all the results I&#8217;ve shown here. I managed to take a query that would cripple our server (58 seconds for one completion!) to an extremely fast one (0.0008 seconds against a 1.8M row table), through the use of some basic profiling and examining the execution plan.</p>
<p>Although this was an edge case (a non-existent index used to filter against a 1.8M row table&#8230;), I gained valuable experience on potential pitfalls and on reading between the lines in the execution plan.</p>
<p>Some technical gibberish:</p>
<p>All queries were run with SQL_NO_CACHE to prevent caching from tampering with the results. I ran each query at least 5 times to make sure I was getting the right readings.</p>
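<p>A minimal sketch of that measuring routine, assuming a callback that executes the query: it repeats the run a fixed number of times and keeps every reading. The names add_no_cache and benchmark_runs are hypothetical helpers for this example, and add_no_cache only handles statements that begin with SELECT.</p>
<pre class="php">// Hypothetical helper: inserts SQL_NO_CACHE after the leading SELECT keyword
function add_no_cache($sql)
{
    return preg_replace('/^\s*SELECT\b/i', 'SELECT SQL_NO_CACHE', $sql, 1);
}

// Hypothetical helper: runs the callback $runs times and returns all readings
function benchmark_runs($callback, $runs = 5)
{
    $times = array();
    for ($i = 0; $i &lt; $runs; $i++) {
        $start = microtime(true);
        call_user_func($callback);
        $times[] = microtime(true) - $start;
    }

    return $times;
}</pre>
<p>The readings can then be summarized with min($times) or array_sum($times) / count($times), whichever suits the comparison.</p>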
<p>The specs of the server machine that ran the queries are as follows:</p>
<ul>
<li>Intel \ 2.4 GHz 1066FSB - Conroe \ Xeon 3060 (Dual Core)</li>
<li>2 x Generic \ 1024 MB \ DDR2 667 ECC</li>
<li>2 x Maxtor \ 146GB:SAS:10K RPM \ Atlas 10K - SAS</li>
<li>CentOS Enterprise Linux - x86_64 - OS ES 5.0</li>
<li>Running PHP 5.2.6 with MySQL 5.0.45 on Apache 2.2.4</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/10/12/profiling-queries-with-zend_db-and-optimizing-them-by-hand/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Octabox launched and I&#8217;m back to blogging</title>
		<link>http://www.techfounder.net/2008/10/10/octabox-launched-and-im-back-to-blogging/</link>
		<comments>http://www.techfounder.net/2008/10/10/octabox-launched-and-im-back-to-blogging/#comments</comments>
		<pubDate>Fri, 10 Oct 2008 05:42:26 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[MySQL]]></category>

		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Web development]]></category>

		<category><![CDATA[techfounder]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=123</guid>
		<description><![CDATA[I&#8217;ve been super-busy the last couple of months - I&#8217;ve come across a stupendous amount of work that I couldn&#8217;t refuse in addition to the effort towards the release of my own startup, business platform Octabox. Things are finally calming down, and I&#8217;ll be getting back to blogging, writing about plenty of things I&#8217;ve learned [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been super-busy the last couple of months - I&#8217;ve come across a stupendous amount of work that I couldn&#8217;t refuse in addition to the effort towards the release of my own startup, <a href="http://www.octabox.com" title="Octabox Business Platform">business platform Octabox</a>. Things are finally calming down, and I&#8217;ll be getting back to blogging, writing about plenty of things I&#8217;ve learned / implemented / experimented with over the last two months.</p>
<p><span id="more-123"></span></p>
<p>First I&#8217;d like to expand on the (soft) release of Octabox - Octabox is a web platform for professional individuals and small businesses, providing a complete information management and collaboration environment. The concept was born in the mind of my good friend and Octabox co-founder, <a href="http://adambe.com">Adam Benayoun</a>, as the result of accumulated experience in building information management systems for many small businesses over the years.</p>
<p>Octabox is a cross between the vast functionality offered by application platform <a title="Salesforce" href="http://www.salesforce.com/">Salesforce</a> and the simple and usable approach of <a title="Basecamp project management" href="http://www.basecamphq.com/">Basecamp</a> (of <a title="37Signals" href="http://www.37signals.com/">37signals</a>). It is an application platform in the sense that it offers many on-demand applications for managing information (such as task management, contact relationship management, whiteboard/brainstorming and much more), but the focus is on keeping everything as simple and usable as possible to accommodate the needs of small business operations (all the way down to single individuals - freelancers, consultants and so forth). It is our belief that this sector is feeling the need for more powerful tools, yet is reluctant to use so called &#8220;enterprise&#8221; applications due to complexity and cost.</p>
<p>Octabox now enters a private beta phase to weed out issues and to figure out which features are in demand so they can be integrated in the public release. Hence, the &#8220;soft&#8221; release - the marketing effort has not begun yet (as can be seen by missing content on the site). We are handing out beta invitations if you register through our waiting list on the site. Blog readers get special treatment, so contact me through the contact form to secure your invitation ;-).</p>
<p>Octabox by the way runs on PHP 5.2.6 with MySQL 5.1 with Zend Framework 1.6 as the abstraction layer. I have much to say about building a high-availability enterprise web-application using those tools, so stay tuned as I already have a couple of article drafts in the pipeline on framework performance, MySQL query optimizations, server setup and much more.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/10/10/octabox-launched-and-im-back-to-blogging/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Distilling Usability</title>
		<link>http://www.techfounder.net/2008/09/23/distiliing-usability/</link>
		<comments>http://www.techfounder.net/2008/09/23/distiliing-usability/#comments</comments>
		<pubDate>Mon, 22 Sep 2008 23:46:59 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[UI]]></category>

		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=122</guid>
		<description><![CDATA[Brilliant.
http://stuffthathappens.com/blog/2008/03/05/simplicity/
 ]]></description>
			<content:encoded><![CDATA[<p>Brilliant.</p>
<p><a href="http://stuffthathappens.com/blog/2008/03/05/simplicity/">http://stuffthathappens.com/blog/2008/03/05/simplicity/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/09/23/distiliing-usability/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Chrome is out. Google has my vote</title>
		<link>http://www.techfounder.net/2008/09/04/chrome-is-out-google-has-my-vote/</link>
		<comments>http://www.techfounder.net/2008/09/04/chrome-is-out-google-has-my-vote/#comments</comments>
		<pubDate>Thu, 04 Sep 2008 21:36:24 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[Open Source]]></category>

		<category><![CDATA[The Webs]]></category>

		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=120</guid>
		<description><![CDATA[So Google Chrome was unleashed on the unsuspecting public yesterday with very little preceding hype. It enters a market that has thus far had only two major players - Mozilla and Microsoft. Backed by marketing power that is unrivaled in the online world, it is strongly positioned to take both on (and especially Microsoft).
A [...]]]></description>
			<content:encoded><![CDATA[<p>So Google Chrome was unleashed on the unsuspecting public yesterday with very little preceding hype. It enters a market that has thus far had only two major players - Mozilla and Microsoft. Backed by marketing power that is unrivaled in the online world, it is strongly positioned to take both on (and especially Microsoft).</p>
<p>A web browser built on the WebKit engine (same as Safari), Chrome offers a simple UI and extensive support for web technologies. Having used it for a couple of days now, it is striking to me how obvious it is that Google is a web company - in bold contrast to another software giant currently pushing for their next-gen browser.</p>
<p><span id="more-120"></span></p>
<p>There is a lot to like about Chrome: <strong>It is open-source</strong>. The tab-oriented UI is a great innovation. The simple UI allows for an impressively large work area in the browser. It comes with a built-in DOM inspector and JavaScript debugger (almost rivaling Firebug for functionality). The tab/process manager is an awesome feature (though <a title="John Resig on Chrome process manager" href="http://ejohn.org/blog/google-chrome-process-manager" target="_blank">a possible source of contention</a>). And most importantly - aside from minor inconsistencies from the standards (shared by Safari), all web sites render perfectly under Chrome.</p>
<p>I was somewhat surprised, though, that it only reaches a score of 79 on the Acid3 test, since it is based on an engine that already scored a perfect 100 (WebKit). For comparison, Firefox 3.0.1 achieves a score of 71 (though with fewer graphical glitches), IE7 achieves a measly score of 13 in about triple the time, and IE8 beta 2 just barely beats that with a very low 21.</p>
<p>It is my personal hope that Chrome steals enough market share from Microsoft to help push out older IE versions, thus catapulting the web forward. This has been a fantastic move by Google, and the online world is watching to see how the market will respond. Is a Google OS next on the agenda?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/09/04/chrome-is-out-google-has-my-vote/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A web 2.0 business model can work, and work well</title>
		<link>http://www.techfounder.net/2008/08/23/a-web-20-business-model-can-work-and-work-well/</link>
		<comments>http://www.techfounder.net/2008/08/23/a-web-20-business-model-can-work-and-work-well/#comments</comments>
		<pubDate>Sat, 23 Aug 2008 04:36:43 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[Business Development]]></category>

		<category><![CDATA[Interesting]]></category>

		<category><![CDATA[The Webs]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=114</guid>
		<description><![CDATA[The term web 2.0 has been frequently misused and misunderstood, however it is more than a buzzword - it defines a very real phenomenon in which user generated content can be the driving force behind an online site / service.
Some very well known and successful online entities can be considered as such - [...]]]></description>
			<content:encoded><![CDATA[<p>The term <a href="http://en.wikipedia.org/wiki/Web_2.0" target="_blank" title="Wikipedia: Web 2.0">web 2.0</a> has been frequently misused and misunderstood, however it is more than a buzzword - it defines a very real phenomenon in which user generated content can be the driving force behind an online site / service.</p>
<p>Some very well known and successful online entities can be considered as such - <a href="http://www.wikipedia.org" target="_blank" title="Wikipedia">Wikipedia</a> in which users contribute knowledge, <a href="http://www.digg.com" target="_blank" title="Digg">Digg</a> in which users help others find interesting articles by voting and <a href="http://www.facebook.com" target="_blank" title="facebook">facebook</a> which is the current gold standard for social networking (in which user generated content - <a href="http://en.wikipedia.org/wiki/User-generated_content" target="_blank" title="Wikipedia: User Generated Content">UGC</a> - is a given).<br />
<span id="more-114"></span><br />
However, those sites are not thought of as having strong business models. Wikipedia, of course, is free. Digg and facebook rely on advertising, which is the fallback business model on the web - conversion is relatively low and you can only count on decent revenue when you reach the size of the aforementioned sites.</p>
<p>As can be guessed from the title of this article, I would like to discuss a couple of sites / services that use UGC as the basis for a viable business model:</p>
<h2>Case 1: Threadless</h2>
<p><img src="http://www.techfounder.net/wp-content/uploads/2008/08/threadless1.jpg" alt="" /></p>
<p><a href="http://www.threadless.com/" target="_blank" title="Threadless">Threadless</a> is an online t-shirt retailer with a twist - the concepts for the t-shirts are submitted by users, voted on by the community and finally hand picked by the staff. A winning design will be printed and sold through the site, and its creator wins a nice sum of $2,500. That&#8217;s the basic premise - there are several variations such as contests and reprints, but at its base - it&#8217;s a marketplace for ideas.</p>
<p>Threadless is already an established online brand, existing since 2000. It proves that you can crowdsource creativity for fresh product ideas in a way that is beneficial for both the product distributor and the creative contributor. Threadless builds on this premise and succeeds because of their excellent blend of branding, community integration and good service. I have personally purchased at Threadless multiple times and I&#8217;ve had nothing but satisfaction and enjoyment from the service.</p>
<p>Threadless currently has revenue in excess of $30 million, and income of $10 million annually. Not bad for a startup funded with a $1000 seed (earned in an online t-shirt contest, no less).</p>
<h2>Case 2: crowdSPRING</h2>
<p><img src="http://www.techfounder.net/wp-content/uploads/2008/08/crowdspring.jpg" alt="" style="float:none; margin:auto;" /></p>
<p>I recently discovered <a href="http://www.crowdspring.com/" title="crowdSPRING" target="_blank">crowdSPRING</a>, an online service that matches graphic design talent with design-related projects. The concept is to improve the way traditional design projects proceed for both sides:</p>
<p> - Project requesting parties (called buyers) are guaranteed to get at least 25 different concepts for their needs (be it a logo, website design, print and others), with a money-back guarantee.</p>
<p> - Graphical designers (called creatives) are given a global stage to show their work and generate income. Also, since buyers pay in advance, the winning piece is guaranteed to receive payment - there is no scenario in which a buyer can say he doesn&#8217;t like anything and walk away.</p>
<p> - crowdSPRING itself takes a small commission on top of the project award money. This is the main business model for the site.</p>
<p>What drives the service is the interaction between buyers and creatives. Creating a project and watching the concepts improve as both sides learn more about the requirements through iteration and communication is a very interesting experience.</p>
<p>We&#8217;ve recently created a <a href="http://www.crowdspring.com/projects/graphic_design/logo/logo_needed_for_a_web_platform" target="_blank" title="Logo needed for a web platform">logo design project</a> on crowdSPRING, and the reaction has been phenomenal. Though we have a couple of designers on board at <a href="http://www.octabox.com" target="_blank" title="Octabox web platform">Octabox</a>, we felt we needed a fresh approach as we have been too deeply involved for a long time now. There are still 9 days (out of 14) till the project ends, and we already have an incredible amount of entries (over 300).</p>
<p>The process itself was worth the price of admission - through the interaction with the many contributors, we achieved some insights on what we want in a logo and a design direction for our new website.</p>
<p>crowdSPRING is not yet an established entity like Threadless, but it is well on its way to becoming one. It is another example of how to crowdsource creativity in a win-win situation for all involved.</p>
<h2>Web 2.0 as a viable business model</h2>
<p>Those two sites are just a small sample of many successful sites / services built on UGC as the driving force. I believe this market is still mainly untapped - there are plenty of possibilities to be explored. Despite that, the basic premise is always the same:</p>
<p> - Create a community around a product / service concept<br />
 - Allow / encourage the talent within the community to offer their skills to the rest of the community<br />
 - Facilitate the interaction between the talent and the community, while trying to interfere as little as possible</p>
<p>What other successful web 2.0 business models have you seen?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/08/23/a-web-20-business-model-can-work-and-work-well/feed/</wfw:commentRss>
		</item>
		<item>
		<title>IE8 Wants You! but only if you&#8217;re awesome</title>
		<link>http://www.techfounder.net/2008/08/09/ie8-wants-you-but-only-if-youre-awesome/</link>
		<comments>http://www.techfounder.net/2008/08/09/ie8-wants-you-but-only-if-youre-awesome/#comments</comments>
		<pubDate>Fri, 08 Aug 2008 23:11:40 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[Web development]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=112</guid>
		<description><![CDATA[Internet Explorer development is moving along and as part of an effort to improve the quality of their next browser, Microsoft has posted a request for bug-testers on the IEBlog. 
 Beta 2 is right around the corner and we are expanding our reach!  If you wish to be a part of making [...]]]></description>
			<content:encoded><![CDATA[<p>Internet Explorer development is moving along and as part of an effort to improve the quality of their next browser, Microsoft has posted <a href="http://blogs.msdn.com/ie/archive/2008/07/30/wanted-ie8-beta-testers.aspx" target="_blank">a request for bug-testers</a> on the IEBlog. </p>
<blockquote><p> Beta 2 is right around the corner and we are expanding our reach!  If you wish to be a part of making IE better by contributing great bug reports then please email us at IESO@microsoft.com and tell us a little about yourself including why you’d be a great beta tester.</p></blockquote>
<p>Reading the blog responses and <a href="http://whyiesucks.blogspot.com/2008/07/request-to-join-ie8-technical-beta.html" target="_blank">this post</a> on the subtly named &#8216;Why IE sucks&#8217; blog, it seems the reactions are mostly negative. Aside from some unfortunate phrasing, I have to wonder why.<br />
<span id="more-112"></span><br />
Most commenters compare it to open-source development and wonder out loud why they should contribute their precious time to a commercial product. <b>The reason is simple</b>: as long as Windows remains the dominant OS, Internet Explorer will remain the dominant browser. All of us developers will have to support IE8 in the near future - it is in our best interest that it be as bug-free and standards-compliant as it can be.</p>
<p>Comparing IE development to open-source projects is a mistake. A more appropriate comparison is to the public testing periods some computer games go through. Those testing releases are by invitation only most of the time, and the end product is almost never available for free (this is the business model of computer games, at least for now). Beta testers are motivated to participate since they intend to purchase the final product, and they want it to be the best it can be for their own personal benefit.</p>
<p>IE8 development is similar - prospective users and developers are expected to contribute since it will be for their benefit. The question at the end of the day is - can you come to terms with Microsoft and do it their way? If history is any indication, complaining about it is pretty much ineffective. To quote the Flight of the Conchords - <b>Be more constructive with your feedback, please</b>.</p>
<p>I have submitted my tester-status request. What about you?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/08/09/ie8-wants-you-but-only-if-youre-awesome/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The lost art of user experience</title>
		<link>http://www.techfounder.net/2008/07/26/the-lost-art-of-user-experience/</link>
		<comments>http://www.techfounder.net/2008/07/26/the-lost-art-of-user-experience/#comments</comments>
		<pubDate>Sat, 26 Jul 2008 05:16:03 +0000</pubDate>
		<dc:creator>Eran Galperin</dc:creator>
		
		<category><![CDATA[UI]]></category>

		<guid isPermaLink="false">http://www.techfounder.net/?p=55</guid>
		<description><![CDATA[User interface design is my favorite part of the development process. The problems it poses are the most interesting, and thinking up solutions is a form of creative expression. Users consume our applications through the interface - one chance to either deliver a satisfying experience or fail miserably.
It is a topic I have very strong [...]]]></description>
			<content:encoded><![CDATA[<p>User interface design is my favorite part of the development process. The problems it poses are the most interesting, and thinking up solutions is a form of creative expression. Users consume our applications through the interface - one chance to either deliver a satisfying experience or fail miserably.</p>
<p>It is a topic I have very strong and passionate opinions about, and inspired by <a href="http://jonoscript.wordpress.com/2008/07/17/these-things-i-believe/" target="_blank">this beautiful prose</a> by Jono over at Not the User&#8217;s Fault, these are my guidelines for user interaction design:<br />
<span id="more-55"></span></p>
<h2>Know your users</h2>
<p>The first step in interaction design is to know who it will be interacting with. Users can be profiled on many criteria, such as age, technical orientation, vocation, cultural background and more. The user profiles created from segmentation of those criteria are called <a href="http://www.cooper.com/journal/2001/08/perfecting_your_personas.html" target="_blank">Personas</a>.</p>
<p>While defining Personas is a common practice for designing user interactions, it might not be possible to engage in all the steps required to fully understand the needs and tendencies of the users they represent - such as interviews, surveys, focus groups etc. </p>
<p>This is especially true on the web, where projects have limited funds and are very quick from inception to implementation. In this case, experience and common sense rule the day - but it is still important to define the base Personas for which the interaction under design applies. Going through the process brings out some considerations that can influence design decisions.</p>
<p>Watching actual users go through an interaction is the best way to learn how effective it is, and the best learning experience in interaction design.</p>
<h2>Know yourself</h2>
<p>Knowing your users is the most basic step in interaction design. Yet for each user type and set of interaction requirements there are as many possible implementations as there are interaction designers. At this point the interaction designer has to make choices for his users based on his experience, attitude and style. </p>
<p>It is sometimes hard to avoid designing an interaction for yourself rather than for your users. It is a natural tendency to try to solve interaction problems in a way that seems most natural <em>to you</em>; however, that might not always be in the user&#8217;s best interest. Ideally, the interaction designer is a part of the target audience. When that is not the case, observing prospective users is very important to understanding their needs and deciding on the approach to solve their problems.</p>
<p>I believe that good UI design is more intuition than science, and in that respect it is not so different from graphic design. However, the two should never be confused - as I&#8217;ve argued in my post on <a href="http://www.techfounder.net/2008/07/20/common-misconceptions-in-web-application-development/">common misconceptions in web development</a>.</p>
<h2>Keep it as simple as possible</h2>
<p>This <a href="http://en.wikipedia.org/wiki/KISS_principle" target="_blank">old mantra</a> is very much an integral part of a successful user interface. By keeping interactions as simple as possible you will:</p>
<ul>
<li>Have fewer opportunities to fail your users</li>
<li>Give your users less to think about, allowing them to make easier decisions</li>
<li>Reward your users sooner (at the completion of the interaction)</li>
<li>Increase the chance that the interaction will get completed at all</li>
</ul>
<p>A large part of the success of web-based services can be attributed to the simpler interfaces they provide compared to desktop solutions. Some of this is a result of technological limitations on the delivery software (i.e., web browsers), but it&#8217;s hard to argue with the results.</p>
<p>There are several common ways to simplify interactions:</p>
<ul>
<li>Use intelligent defaults</li>
<li>Hide optional paths (or form fields) by default (progressive disclosure)</li>
<li>Remove unnecessary steps from the interaction (and do so aggressively)</li>
<li>Reduce mouse clicks. Make each click do more</li>
</ul>
<p>This is obviously a very partial list, but it&#8217;s a good start.</p>
<p>For me as a developer, the KISS principle is deeply ingrained in my thought process. Translating it into user interface design took some getting used to, but once it happened it became second nature. </p>
<p>In fact, many parallels can be drawn between UI design and software architecture design, almost to the point where you wonder why most developers aren&#8217;t interested in designing interactions (actually I know why - most developers resent users for constantly breaking their code. I know this because a couple of years back I had the same mindset). </p>
<h2>Don&#8217;t break conventions - And if you do, make it obvious</h2>
<p>The tools available to us developers have evolved much in recent years, allowing us to create richer interfaces and interactions. With power comes responsibility - we need to apply discretion when using advanced techniques and tools, so as not to confuse users. Breaking interface conventions by using new technologies where they are not needed <strong><a href="http://billhiggins.us/weblog/2007/04/20/the-value-of-ui-consistency/" target="_blank">is a mistake</a></strong>. </p>
<p>Conventions should only be broken when they result in a bad user experience or when the alternative is significantly better. The latter is very uncommon when the former does not apply, so be advised. </p>
<p>If you do design a unique interaction (or at least, one that isn&#8217;t in common use) - make it as obvious as possible for the user. A user can only begin to understand your new interaction when he realizes that something is different. Disguising buttons as links, hiding drop-down menus in small target zones, making background changes to the document without notifying the user - all result in user confusion and a bad user experience. </p>
<p>The more you need to educate the user on how to complete an interaction, the less likely he is to bother. Good interactions are self-explanatory.</p>
<h2>Interactions should be fun</h2>
<p>Users interact with your application because they want to achieve a goal. That goal might be to complete an item purchase, to indulge a curiosity, to gather information, or any number of other things. There are several factors that affect the user&#8217;s motivation to complete an interaction:</p>
<ul>
<li>How important is the interaction to achieving the user&#8217;s goal</li>
<li>How unique is your application (ie, how easy would it be for the user to find a better place to achieve his goal)</li>
<li>How hard is it for the user to complete the interaction</li>
</ul>
<p>Negative factors can be offset by a fourth one:</p>
<ul>
<li>How <strong>fun</strong> is it to progress through the interaction</li>
</ul>
<p>The fun factor in interactions is often ignored as they are considered strictly functional. It&#8217;s no coincidence the word functional begins with <strong>fun</strong> <img src='http://www.techfounder.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . The fun factor of an interaction increases motivation for completing it - it makes your application more unique and it increases the user&#8217;s tolerance for the interaction.</p>
<p>There are several ways to increase the fun factor of an interaction:</p>
<h3>Increasing aesthetics</h3>
<p>Making your interface prettier will have a positive effect on your users&#8217; perception of it. It&#8217;s no secret that most of the illusion of Apple&#8217;s superior OS interface rests on aesthetics. </p>
<h3>Make it game-like</h3>
<p>Computer games have some of the best interfaces of any computer software, and it&#8217;s no coincidence. You can make your interactions more game-like by adding feedback, rewarding the user for completing steps, and conveying the feeling that the interaction is a part of the user&#8217;s journey towards his goal, rather than a functional requirement that he must take care of.</p>
<h3>Make the interaction do more with less</h3>
<p>The user&#8217;s time and patience are limited. Advance the interaction as much as you can with each user input. Make the user feel the interaction is smart and that it is working with him towards completion.</p>
<h2>Learn from others</h2>
<p>Those are my thoughts on user interaction design, born out of my experience in web development and of my introspection as a long-time user. If you are interested in user interface design, the <a href="http://library.gnome.org/devel/hig-book/stable/" target="_blank">GNOME Human Interface guidelines</a> are as good a reference as you will find. My favorite web authors on the subject include Jono at <a href="http://jonoscript.wordpress.com/" target="_blank">Not the User&#8217;s Fault</a>, Bill Scott at <a href="http://looksgoodworkswell.blogspot.com/" target="_blank">Looks Good Works Well</a> and Aza at <a href="http://azarask.in/blog/" target="_blank">Aza&#8217;s Thoughts</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.techfounder.net/2008/07/26/the-lost-art-of-user-experience/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
