Models In The Zend Framework – Part 2

Continuing with this series, we’ll get right to business.

Filtering and validation

Filtering and validation are the basics of dealing with user input on the web. We filter data before passing it to the database to guard against attacks (such as SQL injection), and we validate user input to make sure it is what we expect it to be (hence, valid).

I will be using the Zend_Filter and Zend_Validate components for those two tasks, and combining them via the Zend_Filter_Input component. Zend_Filter and Zend_Validate are self-explanatory, but I will cover Zend_Filter_Input briefly.

Taken from the Zend Framework documentation:

Zend_Filter_Input provides a declarative interface to associate multiple filters and validators, apply them to collections of data, and to retrieve input values after they have been processed by the filters and validators. Values are returned in escaped format by default for safe HTML output.

Basically, Zend_Filter_Input (ZFI from now on, I’m getting tired) lets you assign multiple filters and validators in one declarative statement and perform all the filtering / validation on an array of data in one go. This is very convenient: user input from HTTP requests is automatically available as arrays in PHP (via the superglobals $_POST and $_GET), and database operations in the Zend_Db_Table component take array arguments as well.

Putting it all together

To recap, we want:

  1. A generic way to define filters and validators for a database based model
  2. Use those filters and validators to prepare data for use in insert / update queries
  3. Provide users with feedback in case validation has failed

For this purpose we extend the Zend_Db_Table class:

class Techfounder_Db_Model extends Zend_Db_Table {
	/**
	 * Filter array for insert/update methods
	 * @var array
	 */
	protected $_filters = null;

	/**
	 * Validator array for insert/update methods
	 * @var array
	 */
	protected $_validators = null;

	/**
	 * Validates and filters array data
	 *
	 * @param array $data Input data; replaced by the escaped values on success
	 * @param array $options Options passed on to Zend_Filter_Input
	 * @return boolean / array Validation success / Error messages
	 */
	protected function isValid(array &$data,$options = null) {
		$input = new Zend_Filter_Input($this -> _filters,$this -> _validators,$data,$options);
		if ($input->hasInvalid()){
			return $input -> getInvalid();
		} else if($input -> hasMissing()){
			return $input -> getMissing();
		} else {
			$data = $input -> getEscaped();
			return true;
		}
	}
}

The isValid() method is the core of the process. It loads the class members $_filters and $_validators into a Zend_Filter_Input instance and then validates and filters the data. $_filters and $_validators hold arrays of filter / validator chains written in the ZFI syntax (read the documentation for more on that). Concrete models extending this class have to define these arrays for anything meaningful to happen in this method.

The method returns true if the data is valid or an array of errors corresponding to the failed validation fields (as per the ZFI method of operation). The data array is passed by reference – in addition to telling us whether validation passed or failed, the isValid() method performs the filtering (on success). Note that even if no filters are defined, ZFI uses htmlentities as a default escape filter (which is pretty good practice).
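To make that calling contract concrete, here is a minimal sketch that does not use Zend_Filter_Input at all. The function isValidSketch() and its $required argument are hypothetical stand-ins, and the htmlentities() call with ENT_QUOTES and UTF-8 is an assumption about how the default escaping is configured. It only illustrates the pattern described above: errors are returned on failure, while on success the array passed by reference is replaced with escaped values.

```php
<?php
/**
 * Stand-in for isValid(): same calling contract, but without Zend_Filter_Input.
 * The "required fields" check and the escaping call are simplifications.
 *
 * @param array $data     Input data, passed by reference
 * @param array $required Names of fields that must be present and non-empty
 * @return boolean / array True on success / Error messages per field
 */
function isValidSketch(array &$data, array $required)
{
    $errors = array();
    foreach ($required as $field) {
        if (!isset($data[$field]) || trim($data[$field]) === '') {
            $errors[$field] = array('Value is required');
        }
    }

    if (!empty($errors)) {
        // On failure the caller's array is left untouched
        return $errors;
    }

    // On success the caller's array is overwritten with escaped values
    foreach ($data as $field => $value) {
        $data[$field] = htmlentities($value, ENT_QUOTES, 'UTF-8');
    }
    return true;
}

$row = array('name' => 'Tom & Jerry <b>fans</b>');
$result = isValidSketch($row, array('name'));

var_dump($result);       // bool(true)
echo $row['name'], "\n"; // Tom &amp; Jerry &lt;b&gt;fans&lt;/b&gt;
```

Note that the value that ends up in $row (and would later be inserted) is the HTML-escaped string, not the original input. That is worth keeping in mind if the stored data will also be used outside of HTML.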

We are not there yet. We now define our API for performing database insert / update operations:

class Techfounder_Db_Model extends Zend_Db_Table
{
...
	/**
	 * Database valid insert method
	 *
	 * @param array $data User input
	 * @return mixed LastInsertId / Error messages
	 */
	public function insertValid(array $data) {
		$valid = $this -> isValid($data,array('presence' => 'required'));
		if($valid === true){
			return parent::insert($data);
		} else {
			return $valid;
		}
	}

	/**
	 * Database valid update method
	 *
	 * @param array $data User input
	 * @param mixed $where
	 * @return mixed Number of rows updated / Error messages
	 */
	public function updateValid(array $data,$where) {
		$valid = $this -> isValid($data);
		if(empty($data)){
			return 0;
		}

		if($valid === true && !empty($where)){
			return parent::update($data,$where);
		} else {
			return $valid;
		}
	}
}

The insertValid() and updateValid() methods accept the same arguments as insert() and update() and use isValid() to sanitize the data, returning either the result of insert() / update() or an array of validation failures.
I could have overridden the insert() and update() methods, but since I would still like to be able to use them directly (a use case that comes up every now and then), I gave the new methods their own names.

One thing to note here is the ‘presence’ => ‘required’ option I pass to ZFI in insertValid(). In most tables, most fields are required, therefore I would like ZFI to fail validation for missing fields (which is what this option does). For optional fields I pass the ‘presence’ => ‘optional’ option in that field’s validator rule (as shown in the following example).

Let’s go over an example use case for this generic class:

Suppose we are working on a registration process. We created a user table in our database and we’ll want a users model to handle user-related operations. The table schema is:

Users
– id (primary, auto-increment)
– name
– email
– password
– role
– age

We extend the Techfounder_Db_Model:

class Users extends Techfounder_Db_Model
{
	/**
	 * Table name
	 * @var string
	 */
	protected $_name = 'users';

	/**
	 * Primary key
	 * @var string
	 */
	protected $_primary = 'id';

	/**
	 * Table validators
	 * @var array
	 */
	protected $_validators = array(
		'name' => 'NotEmpty',
		'email' => 'EmailAddress',
		'password' => array(
			'StringEquals',
			'fields' => array('password', 'confirm_password')
		),
		'role' => 'NotEmpty',
		'age' => array('Digits', 'presence' => 'optional')
	);
}

The validation rules are very simple in this case: name and role can’t be empty, email must be a valid email address, age must consist of digits (and is optional) and password expects two password fields to be equal (named ‘password’ and ‘confirm_password’ in a surprising turn of events).

We quickly set up a controller to use this class as part of a registration action:

class IndexController extends Zend_Controller_Action {
	/**
	 * Registration Action
	 *
	 * If the request is a POST, attempt to insert the data:
	 *   - On success, redirect to successAction
	 *   - On failure, load the error array into the view
	 *
	 * Show registration form
	 */
	public function registerAction() {
		// isPost() is a method of the request object, not of the controller
		if($this -> getRequest() -> isPost()) {
			$users = new Users();
			$result = $users -> insertValid($_POST);
			if(is_numeric($result)) {
				$this -> _redirect('/index/success');
			} else {
				$this -> initView() -> errors = $result;
			}
		}
		echo $this -> render();
	}

	/**
	 * Registration success action
	 *
	 * Display success view
	 */
	public function successAction() {
		echo $this -> render();
	}
}

The register action shows the registration form. If data has been posted into it, it will attempt to insert, redirecting to a success page on successful insert or loading the error messages into the view for user feedback about required data. I usually have a view helper prepared to render the error array in a pleasing manner in the view (as a styled unordered list).
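A helper along those lines could look roughly like the sketch below. The class name Techfounder_View_Helper_ErrorList, the errorList() method and the $cssClass parameter are hypothetical, and the sample messages are made up. The sketch assumes the error array maps each field (or rule) name to a list of message strings, and it is written as plain PHP so it can be tried without the framework.

```php
<?php
/**
 * Hypothetical view helper: renders a validation error array
 * as an unordered list.
 *
 * Assumes $errors maps field / rule names to arrays of message strings.
 */
class Techfounder_View_Helper_ErrorList
{
    /**
     * @param array  $errors   Error messages keyed by field name
     * @param string $cssClass Class attribute for the <ul> element
     * @return string HTML list, or an empty string when there are no errors
     */
    public function errorList(array $errors, $cssClass = 'errors')
    {
        if (empty($errors)) {
            return '';
        }

        $html = '<ul class="' . htmlspecialchars($cssClass, ENT_QUOTES, 'UTF-8') . '">';
        foreach ($errors as $field => $messages) {
            // Cast so that a single string message is handled as well
            foreach ((array) $messages as $message) {
                $html .= '<li>'
                       . htmlspecialchars($field . ': ' . $message, ENT_QUOTES, 'UTF-8')
                       . '</li>';
            }
        }
        return $html . '</ul>';
    }
}

// Made-up error array in the assumed shape
$helper = new Techfounder_View_Helper_ErrorList();
echo $helper->errorList(array(
    'email' => array('Not a valid address'),
    'name'  => array('Value is empty'),
)), "\n";
```

The messages are escaped on output because they may echo back parts of the user input. In a real project the class would still have to be made known to the view as a helper; how that is wired up is left out here.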

Before I end this part, I want to talk about Zend_Form. The Zend_Form component provides a similar interface for filtering / validating in a form creation context. The reason I put the validation / filtering rules in the model is that it makes them reusable: I can call insert and update operations for that model from controllers and other models, and be certain that it will take care of its own filtering and validation.
As an aside, I’m not a big fan of Zend_Form, as it makes it more difficult to implement different form layouts (from a design point of view).

This concludes part 2 of this series. Next part will be dealing with retrieving model data which is split over several tables (due to normalization).

* Part 1 can be found here

  • matthijs

    Looks very good. Exactly the approach we’ve been looking at in our framework development.

    The one thing I don’t like is the use of htmlentities filtering before putting the data in the db. In my opinion you should put data in its original state in the db. Or at least only filter out stuff you never, ever want. And use html escaping only when that data is output to html. The thing is, htmlentities will convert some characters into html-ready stuff. But what if you want to use the data from your db in another context? Then you have to convert those html-converted characters again to something else. But I guess it’s ok if it’s a conscious choice and you are aware of the consequences up front.

  • Eran Galperin

    htmlentities is for security reasons obviously. I’m not a security expert, so I have to use brute-force. If you have a better idea you are welcome to suggest :)

  • matthijs

    I understand that htmlentities is for “security” reasons. But the thing is, escaping output is context dependent. So say an ampersand &. That doesn’t have to be escaped if it’s inserted in a database, because in that context it does no harm. However, when it is outputted to HTML, it needs to be escaped with htmlentities to keep your HTML valid (no real security risk there, but that’s just this example).

    The basic principle is that you want to keep your data in its original form. Using htmlentities before inserting it in the db changes the data for no reason, because htmlentities does nothing in the context of inserting data in a db. Only when outputting data to HTML is htmlentities of use. When you insert data in a db, it needs to be escaped using a function that is useful in that context. That’s mysql_real_escape_string for mysql.

  • Eran Galperin

    To my knowledge the mysql_real_escape function is not bullet-proof, and htmlentities provides better protection. Of course I would like to preserve the original data in its form, but better escape than suffer an attack. Again, I’m no security expert but as far as I know htmlentities is the recommended practice under PHP.

  • matthijs

    I’m no security expert either, but I’m pretty sure that’s not true. Htmlentities is not the function to be used to protect against sql injection. If you want more protection against that you can also use prepared statements (using PDO for example). I’m also pretty sure that mysql_real_escape_string is bullet-proof (although in security terms you are not allowed to speak about 100% security).

    Htmlentities is the recommended practice to escape html to prevent XSS, but that’s a whole different security problem which has nothing to do with the db.

    Both Chris Shiflett and Ilia Alshanetsky have both books and sites with some pretty good info on these subjects.

  • Eran Galperin

    I’ve been reading about SQL injection attacks recently and it appears that you might be right. It’s better to use the Zend_Db intrinsic quoting functions that use the real_escape_string methods of the different adapters.

  • matthijs

    It is a pretty difficult subject (also for me). I just remembered this article http://www.webappsec.org/projects/articles/091007.shtml
    which explains the sql injection issue in more detail. An excellent read. And together with the examples you can test locally, it makes it obvious that it’s not as simple as use function A to protect for B or C for D, but there’s a lot more to think about for every query one writes. So me saying mysql_real_escape_string() is pretty bullet-proof is not entirely true .. I must also stand corrected.

  • Doug

    Hey, this was exactly what I was looking for. Coming from a Rails background I had trouble wrapping my head around how to put validation in the models. ZF’s approach is so much more decoupled than Rails’ model/validation mechanism. Pretty interesting. One question: The first code example is with a class named Techfounder_Db_Model, while the second class, which adds the update and insert methods, is named Octabox_Db_Model. That should be Techfounder_Db_Model too, correct?

  • http://www.techfounder.net Eran Galperin

    Yes, you are absolutely correct – thanks for pointing that out :)

    As can be inferred, I used example code from my start-up Octabox in which I employ this solution extensively.
