Thursday, 11 April 2013

How to handle undefined variables in PHP


I have dominant frustration in non strictly OO PHP projects. It's usually full of complex variables, where you need to dig deep in order to access the values you're interested in. Which is all right, that's the way our Lord created the language and in general the code. But those unexpected undefined variables ...


In Drupal a very typical example is forms, entities and configurations. These variables contain tens or hundreds of multi-nested variables in various data structures. So you have expressions like:

$form_status['values']['#node']->field_image[LANGUAGE_NONE][0]['item']['fid']...;


And then suddenly your syslog will be full of:

MESSAGE Notice: Trying to get property of non-object in ...


Or:

MESSAGE Notice: Undefined index: ...


So what you do? Check the last item if it exist. But that's not enough. If you want to play nice you need to check all. It looks like the following when you use the aforementioned code:

if (isset($form_status)) {
  if (isset($form_status['values'])) {
    if (isset($form_status['values']['#node'])) {
      if (isset($form_status['values']['#node']->field_image)) {
        if (isset($form_status['values']['#node']->field_image[LANGUAGE_NONE])) {
          if (isset($form_status['values']['#node']->field_image[LANGUAGE_NONE][0])) {
            if (isset($form_status['values']['#node']->field_image[LANGUAGE_NONE][0]['item'])) {
              if (isset($form_status['values']['#node']->field_image[LANGUAGE_NONE][0]['item']['fid'])) {
                // do Something with $form_status['values']['#node']->field_image[LANGUAGE_NONE][0]['item']['fid'];
              }
            }
          }
        }
      }
    }
  }
}


I hate it. Couple of weeks ago I tried to approach this problem in a post. But it wasn't an elegant solution. However now I have something better. By using instance chains through a validator class we can easily workaround the problem. Even more!

So again the problem is - too many checks are required to make sure that a nested value exist, and at the end we have to get the value. But we just cannot pass the full expression to anything, because it's already interpreted by the binary and will fire a PHP notice/warning in case of problems.

The new approach is using a class, let's call it VarCheck

class VarCheck {
}


We keep the value internally in order to be able to do examinations and evaluation:

  private $value;


Using the constructor would add some unnecessary characters to the applied code, so let's add a quick accessor:

  public function __construct($value) {
    $this->value = $value;
  }

  public static function instance($value) {
    return new VarCheck($value);
  }


Calling

VarCheck::instance(...);


will do the initialization better. As said the 2 most important functionality is to check the existence of the variable and the value itself:

  public function exist() {
    return isset($this->value);
  }

  public function value() {
    return $this->value;
  }


Now comes the magic. How we apply the nesting route without expressing a potential unexisting tag? Simply with the VarCheck instance by narrowing down the variable whenever is possible. There are 3 different basic data structures in PHP: value, array and object. Value is already covered by the ->value() and ->exist() function. Let's care about arrays. Arrays contain keys, so we have to make sure in each step that a key is exist:

  public function key($key) {
    if (isset($this->value) && is_array($this->value) && isset($this->value[$key])) {
      $this->value = $this->value[$key];
    }
    else {
      unset($this->value);
    }
    return $this;
  }


By passing the key we check if we can narrow down the value by using the key. Let's unset our internal variable if not. This is safe internally, and we can communicate outside the status of the check. Last is the object type. Objects have attributes, similar to keys it will like:

  public function attr($attr) {
    if (isset($this->value) && is_object($this->value) && isset($this->value->{$attr})) {
      $this->value = $this->value->{$attr};
    }
    else {
      unset($this->value);
    }
    return $this;
  }


Look at the last lines - we always return with the instance. Now that it's finished, let's see how it works on the example we used on the top. That 16 line of code can be done in 1 line:

VarCheck::instance($form_status)->key('values')->key('#node')->attr('field_image')->key(LANGUAGE_NONE)->key(0)->key('item')->key('fid')->exist();


We also can extract the value if there is any:

if ($value = VarCheck::instance($form_status)->key('values')->key('#node')->attr('field_image')->key(LANGUAGE_NONE)->key(0)->key('item')->key('fid')->value()) {
  // Use $value;
}


It can be even more concise by shortening ket to k, attr to a and the factory method to VarCheck::i().

The good thing is you can extend the class with any kind of validator you want. I added an universal validator:

  public function validateWith(callable $callback) {
    return $callback($this->value);
  }


So you can do things like:

VarCheck::instance($myVar)->key(3)->attr('title')->validateWith(function($v) {
  return $v > 10;
});


Of course this is a critical functionality so requires unit testing. Added some tests for the most crucial parts.

The whole code can be found on GitHub. Let me know if you have idea how to improve.

---

Peter

4 comments:

  1. This looks like overkill - you shouldn't need to go to that much trouble in PHP to find data in nested arrays. Have a look at this example, which runs with no warnings:

    https://gist.github.com/rupertj/5370574

    ReplyDelete
    Replies
    1. Hi there,
      Depends what you want to do. It's overkill if you don't care about php notices. Notice is a good indicator of revealing coding problems - unchecked variables, non-existing values, etc. If you don't have the highest level of error reporting then you miss those errors. If you have it then a lot of 'fake' report will be generated.
      But yes, it's sortof an overkill if you only have to check 1 level for sure.
      You can even use @ I think.

      Delete
  2. isset() and empty() already shortcut without emitting errors - as rupertj points out.

    Also, entity_metadata_wrapper()

    ReplyDelete
    Replies
    1. Hi there,
      Yes, however as mentioned it's mostly php notices. It's not essentially bad having those, but a good indicator of not being careful.
      entity_metadata_wrapper is a unit, yeah. If only everything else would be that abstracted :)

      Delete