Part of the Well Formed Errors Initiative

This meta ticket serves a couple purposes. First, an explanation of what the PHP assert statement does, what it does not, and why it should be used over Exception throwing in the situations it is suited for. Second is a roadmap to the incorporation of asserts throughout the code base and some examples of the things that can be done with assertions taken from the two tickets I'm currently working on that add assertions to the code.

This issue is marked as critical because I feel the proper use of the assert statement is critical for the health of the code and there is no moment too soon to add them. In and of itself assert is a tool that already exists - the Drupal community just isn't using it. Both Simpletest and PHPUnit force Assert failures to be treated as warnings and log them accordingly during the execution of automated tests. They'll do this now, no changes to any code necessary. The only price to adding them is discovering errors no one knew about that cannot be found any other way. But first, let's look at them.

What Are Assertions? Why should Drupal use them?

Short answer:
You use Exceptions to test for conditions which should be true.
You use assert to test for conditions which MUST be true.

Ok, so what do I mean by that. Well, some specific basic examples of both.

  • Interaction with an external API, such as PDO to a database - Exception.
  • Verification that a function argument is of the expected type - Assert. Note that Type hints can be used to object and array inputs, but assert must be used to check for Boolean, string, integer, float and any other primitive I'm forgetting
  • Checking for a file that may not be present and if it isn't we do something else - such as settings.php. Exception
  • Checking for that same file later in the program flow where we expect it to always exist because we should not have reached the function we are in at all under any circumstances with it missing. Assert.
  • Verifying a function indeed returns a value of a specific type or range of types. Assert.

As can be seen from the above list, Assertions are not control structures of any sort. They aren't meant to be. You can't catch them with a try/catch block. Unlike errors, the developer can determine how the PHP interpreter deals with them:

  1. They can be turned off entirely. In this state, they are nothing but comment text the interpreter ignores, but see a bit of PHP sadness below.
  2. They can cause the program to halt immediately, like a Fatal error.
  3. They can cause the program to raise E_WARNING. This is how PHPUnit and Simpletest deal with them
  4. They can be handled by a callback and from there logged, ignored, whatever.

That first point is why they should never be confused with exception handling - they can be turned off. You would never want to do that to a control structure. Why do it at all? Well, here's a physical analogy. If you've ever watched a building go up you've seen scaffolding. With more complex buildings there will be temporary support structures. Consider an arch - until the keystone is placed the stones cannot support themselves - instead a temporary structure must be in place to carry the weight. When the arch is completed the temporary structure is removed.

Assert has much the same role in programming. The statements set and document constraints on program behavior and ability. Which is worse, leaving a function open to taking any argument when it expects an integer, or asserting that a received argument is an integer? In the latter case the developer of the outside code is forced to fix their code before proceeding, where in the former a problem might not crop up until many months down the line.

That's why this issue is critical to me. Assert helps to stop regression and all manner of other behavior problems. Also since they can be turned off their performance cost is not an issue. For example, I'm currently writing a lot of validation assertions into the attributes class, things like this in \Drupal\Core\Template\AttributeValueBase::__construct.

    // Assert legal attribute name pursuant to W3C standard
    // http://www.w3.org/TR/2000/REC-xml-20001006#NT-Name
    assert('preg_match(\'/^([\\w_:][\\w\\d\\-_:\\.]*)*$/\', $name)', 'invalidAttributeName');

(Courtesy request - if you see a problem with the regex above please notify me privately, it's off topic and I'll admit up front regex is my weakest area of programming)

Preg_match evaluation is computationally expensive. Given enough of them the performance cost is very real, and in a low level library like Attribute it can be quickly felt, especially on a server under load. In development we can be patient and usually our dev servers only have to deal with one user at a time, ourselves. But by being assertions that processing cost becomes immaterial because we can get rid of it.

True, the assert statements remain behind in the code, but that isn't a bad thing. Once you understand how they help to understand the code. The above assert makes it rather clear that this class isn't expected to validate user input - if it was we'd be using a conditional to check the name followed by an exception throw. Asserts are used to validate developer input, most importantly our own. PHP critically lacks the any other ability to type hint function returns, at least until 7. Assertions let us do that today. I've already discovered such a bug.

During the development of the Template Assertion patch I added this assertion to \Drupal\Core\Template\Attribute::createAttributeValue

assert('$value instanceof \\Drupal\\Core\\Template\\AttributeValueBase', 'inconvertableInput');

I added that assertion because the PHPDoc comments for the function said that it would always return that object. It wasn't though, and when I submitted the patch two completely unrelated tests to the Attribute system failed.

Drupal\action\Tests\ConfigurationTest
Drupal\search\Tests\SearchConfigSettingsFormTest

It's important to note that these two tests also aren't related to each other in any way. Why did they suddenly fail? Well, after some investigation I found that if you passed a number of any kind or an object into Attribute, you'd get that value back unaltered. Those two tests both pass a Translatable object in. The reason they didn't crash before was that object implements __toString. Still, it is very likely that, uncorrected, this unexpected return type could create a difficult to track down and correct bug at some point in the future. I'm willing to bet some of the reappearing regression errors have this sort of thing behind them.

So, that's the nutshell version of what Assertions are. Further reading check the Wikipedia article and especially the references of that article. And lest anyone thing assert isn't object oriented programming - it was in Java from day 1.

Drupal Core and Assert

Drupal is getting very close to releasing 8.0 stable, so I understand the hesitation on adding anything that could upset the apple cart. Much of the drupal community isn't familiar with assert, so when they start seeing assert statements show up in the code there might be some confusion. Writing unit tests with assertions specifically in mind in PHPUnit is also a little tricky, though I'll be adding code for that in the AssertionHandler main ticket as soon as I can - I pretty much have to or otherwise I can't run its tests.

That said, the assert statement and my two pet projects are entirely separate entities. In my opinion assertions should go in starting today, starting with all patches under development for any reason anywhere. SimpleTest and PHPUnit already catch them, and they can be eye opening. At the very least they verify what the PHPdocs say is true, particularly in the case of function returns.

You don't need to add the assertion handler or modify the main code to activate them. You don't need to change anything - PHP's default configuration is to have them behave as E_USER_WARNING errors. If your local configuration is not default the assert options can be changed from .htaccess

php_value assert.warning 1
php_value assert.bail 1
php_value assert.active 1

Active turns assertions off - if we adopt them the only immediate change the core code needs is to add "php_value assert.active 0" to the default .htaccess file to make sure they stay turned off in production. Warning converts assertions to E_USER_WARNING and Bail turns them into failures.

Assert, by itself, also provides a means for the core team to communicate expectations to the community and themselves. During the day we all write many function calls but how often do we look up the parameter list and check for valid types? In the case of object arguments we are suddenly and unceremoniously reminded of them when PHP throws the familiar CATCHABLE_FATAL_ERROR from a failed type hint, but when the argument is a primitive like a string then what? Usually PHP muddles its way through as a loose typed language, and much of the time that muddling not only works but it's appropriate, but assert gives us a way to be stricter when necessary.

Assert can also stop bad code from ever being written. For example, my work with Drupal/Core/Template has led me to believe the AttributeValueBase family of classes are for the exclusive use of Attribute insofar as only the Attribute class should ever create an instance of them. This is an example of the concept of protected classes.

Protected Classes

Other languages have the concept of protected or even private classes - the entire class - within a namespace. In this setup only other classes within the namespace can create objects of a type. Even more restrictive is binding a class' creation to another class such that only that class can create instances - PHP already does this internally with the PDOStatement class, you must use PDO to get an instance of it.

Assert gives us the means to enforce this - here's an early prototype of that enforcement.

assert('call_user_func(function(){
      foreach(debug_backtrace() as $frame) {
        if (isset($frame[\'class\']) && $frame[\'class\'] === \'Drupal\\Core\\Template\\Attribute\') {
          return true;
        }
      }
      return false;
    });', 'invalidCaller');

This works, but it's already getting out of hand, that's too long for an inline assert. I plan to replace it with.

assert('\\Drupal\\Core\\Error\\Assertion::validCaller(\'Drupal\\Core\\Template\\Attribute\');', 'invalidCaller');

Ultimately all computer code design patterns are about reducing the number of interactions possible among the components of a codebase. If a value in a class can't be arbitrarily changed the effects of such a change don't have to be tested. Similarly, if only one class can make instances of a class, only that the calling class needs to have it's calls to those class' constructors checked.

Call stack tracing like the above mid-program is a very large waste of CPU cycles in production. But in development it is justifiable by preventing the writing of code that will be, in the future, hard to debug.

And further, assertion gives us a means to stop developers (even twig designers in a few cases) from doing stuff they shouldn't. I've patched the Attribute class to use assert to check CSS class names for case collision - it won't let you put in two classes with the same name but in different cases anymore. The reason not to do that - the HTML spec is case sensitive, and the CSS spec is not, as a result to borrow a line from Mr. T. I pity the fool that tries to do that - they're in for a very long night of js debug hell someday.

Best practices can also be enforced.

All this without the handler. Though I'm moving as quickly as I can to get the assertion handler ready in time for release, if it isn't ready, it isn't ready. The usage of assert doesn't require it - assert is already on, though if we start using it to keep production sites from suddenly showing them we need to add "php_value assert.active 0" to the .htaccess file with a note for developers to change it over to 1 when needed.

From then on the asserts will be handled like any other PHP Warning. The AssertionHandler and TemplateAssertion tickets are just embellishments on this very critical core concept and are entirely optional - though highly useful and definitely quickly portable into beta 8 now because the fact drupal has no assertions historically means there is no existing assert aware code to worry about being compatible with.

Drupal Community and Assert

Assert isn't just for the core team. The community will also benefit in two major ways.

  1. The expectations of core code will be clearer, both in API calls and config file composition
  2. Assertions to trap mysterious crash states prompted by YAML config files will take the pain out of both module and theme development

In time the community should also this mechanic to improve their own code, following the example core will set.

Drupal Userland and Assert

The administrators and normal users of drupal systems need never be aware the code started using assert. They will begin to notice its effects though with tighter and better code reaching them.

Roadmap to Assert Use starting immediately

This is how we do this.

Step 1: Protect Production Sites

Admittedly there shouldn't be any, but just in case this is only adopted subsequent to release Candidate phase we need to make sure the assertions don't suddenly show up unexpected on anyone. This is solved by (the somewhat tongue in cheek) patch attached to this file, which adds the php_value statement to the .htaccess file and turns assertions off by default, along with instructions to developers on turning them back on. But even before this patch is included developers can use assert today since they are on by default.

Step 2: Establish the Ground Rules

Assert is uniquely a clean slate for drupal, so its a chance to get things right from go. So establishing rules for their use needs to be hashed out. This is what I propose.

  1. String Arguments in single quotes only! Assert allows you to give it a string to eval (as per the eval statement) or you pass it a boolean expression or a function that returns a truthy or falsy value. Anything other than a string argument, in single quotes no less, will be executed by PHP regardless of the assert_options value. I consider this a bug, but it's been around since the beginning. Further, the AssertHandler, once completed, needs the argument passed this way to tell the user what the assertion was in the first place.
  2. Decide on min PHP 5.4.8 or not The assert statement picks up a message parameter at that version which is heavily used by the asserts I've written for the handler and which I expect to see used in drupal. While the Handler won't crash if an earlier PHP 5.4.x is used, it will be unable to give long form error explanations. Also, changing the min to 5.4.8 relieves me of needing to make a version check in the handler (selfish reason I know, but there's my reasons). Finally UnitTests that verify assertions require this version of PHP to do the verification. They'll perform their other tests though.
  3. Use Error Message CodesThe Assertion Handler will expect to receive an encoded name from the assert call. It maps this back to the errors.yml files and looks up the long form explanation of the assertion. These long form messages are usually HTML and the more detailed ones contain references to tutorials and even sample yml files for configuration mistakes. Inadvertedly, the translation team will be able to compose localization files. The main reason for encoding is PHP has a 1024 byte limit on the message length sent by assert. The codes will be discussed in more detail with the handler.
  4. Only One Assertion of a given name per method This is critically important for UnitTesting assertion throwing classes, as the UnitTester will use the name to distinguish to the assertions.

Code Format for Assertions.
This is a proposed extension to existing drupal code standards.

Assertion Comments
In order to more quickly spot them, Comments related to assertions will use the Perl inline style ( # ) There is never to be a gap between the comment and its assertion. Unless the assertion is at function start there should be an empty line before it unless following another assertion (hence assertions should block together)

Start Assertions
The function start assertions (assertions concerning the opening state of the function, such as whether a parameter expected to be a string is indeed a string), if any, will follow the parameter line with no space between that line and the first assertion. There will be a 1 line break between the last assertion of this block and the first executable line of the function. The assertions that do not require commentary are to come first.

  public function __construct($name, $value) {
    assert('is_string($name)', 'notString');
    assert('strlen(trim($name)) > 0', 'minStringLength: 1'), 
    assert('\\Drupal\\Core\\Error\\Assertion::validCaller(\'Drupal\\Core\\Template\\Attribute');', 'invalidReturn');
    # Objects passed as values must be able to express themselves as strings
    assert('!is_object($value) || method_exists($value, \'__toString\';', 'unStringableObject');
    # Assert legal attribute name pursuant to W3C standard
    # http://www.w3.org/TR/2000/REC-xml-20001006#NT-Name
    assert('preg_match(\'/^([\\w_:][\\w\\d\\-_:\\.]*)*$/\', $name)', 'invalidAttributeName');

    $this->name = $name;
    $this->setValue($value);
}

Comment what is asserted
Unless immediately obvious from the expression (say, 'is_string($name)' ) what is being asserted should appear in comments.

Return Assertions
1. There should be no empty lines between a return assertion and the associated return statement. There should be a one line break before it or it's comment.

    assert('$value instanceof \\Drupal\\Core\\Template\\AttributeValueBase', 'inconvertableInput');
    return $value;

2. Functions returning objects must always assert with 2 exceptions allowed:
A. The function is passing along a value sourced from another method in the same class.
B. The new operator is on the return line (e.g. return new AttributeString() )

3. Only one asserted return per function and only one return in such a function.
This will require refactoring, but as a rule of thumb once you get above 3 return statements in the same function it's as bad as using goto statements as far as tracking logic flow - don't do it.

4. All PHP Doc @return type statements must be backed up, subject to the exceptions above.

Other Asserts
Mid-function asserts or assert blocks should be preceded and followed by an empty line. They should be rare - if you find yourself in need of one consider breaking the method down further to increase polymorphism.

Step 3. Finalize the above

Once that is done apply to all code moving forward.

Step 4. Add the AssertionHandler to beta

While the team can get by with just adding assertion statements, the assertion handler is what will truly make them shine. In brief it:

  1. Maps error codes to error messages as mentioned above.
  2. Allows modules to document their own concerns with assertions, particularly where hooks are involved.
  3. Performs a lookup of the Drupal API to the relevant class and method entry.
  4. Provides a stack dump if symfony/var-dump is present (must be installed separately, it's too late to add this library to the project unless an exception is made. That is highly recommended but a separate issue.
  5. Adds a Trait to allow UnitTests to be written to verify their assertions are working.
  6. Prepares the way for a more comprehensive Error system rewrite that is to be postponed for 8.1

Conclusion

Please voice any questions, comments or concerns. The adoption of assert is a profound change to how the community codes, but it solves, can solve or makes easier to solve a whole host of issues. I guarantee that within 3 months of their inclusion everyone will wonder why it took so long to start using them. They're very useful.

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Task - adding assert statements is part of code creation. All this ticket has to do codewise is add a line to .htaccess to make sure they are turned off in prod.
Issue priority Critical
As noted above, proper use of assert pays enormous dividends in development efficiency and should be introduced as soon as possible.
Unfrozen changes Unfrozen because it only changes the .htaccess file.
Prioritized changes usability/accessibility/security/performance/removing previously deprecated code/Migrate.
Assertions aid in all of these.
Disruption None.
CommentFileSizeAuthor
#1 2451793-2.diff1.69 KBAki Tendo
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Aki Tendo’s picture

Aki Tendo’s picture

Status: Active » Needs review
Aki Tendo’s picture

Issue summary: View changes
Aki Tendo’s picture

Issue summary: View changes
marcingy’s picture

Priority: Critical » Major

This isn't a release blocker

Aki Tendo’s picture

Status: Needs review » Closed (duplicate)

Scope changes to #2408013: Adding Assertions to Drupal - Test Tools. have made this ticket redundant to it.