Summary: The proposed new version of drupal_static provides several benefits:

  • It is faster.
  • It is smaller.
  • It is simpler.
  • It corrects erroneous handling of one edge case calling scenario.

Discussion: The speed of the drupal_static function in bootstrap.inc can be improved slightly (approx 4-6%) by simply rearranging the logic flow in the code to optimize the response for the most common calling scenario. Currently, the very first test in the function is to (unnecessarily) check: isset($name), and if true, reset all the static variables to their default values. However calling drupal_static(NULL), (which triggers this code block) is rare and not going to happen very often. A much more likely calling scenario is when drupal_static is first called once to initialize the static variable, and then called many times to simply fetch the reference to the static variable. And with this most-frequently-called-case, an array_key_exists($name, $data) conditional test is made, and there is no need to first check if $name is set. By placing the code for this most common calling scenario at the very top of the function, a small, but measurable speed improvement can be gained. (I measured a 4% improvement on a Linux 2.6.24-24-xen/PHP 5.2.11 box and a 6% improvement on Win32XP-SP2/PHP 5.2.13 box.)

The attached patch rearranges and streamlines the logic components of the function to achieve the performance gain while at the same time reducing the code size. (See the attached png image which shows the (rough hand drawn) logic matrix used to analyze and describe all the possible calling scenarios for drupal_static.) Although it may look quite different, the code in this patch duplicates the logic of the original function - with one exception: The current/old drupal_static function erroneously handles one edge case; when it is called for the first time to initialize a new static var with a non-NULL $name, and $reset set to TRUE like so:

  // ERROR! With $reset=TRUE, drupal_static fails to create and initialize a new
  // static variable and erroneously returns a reference to the base $data array.
  $my_new_static =& drupal_static(__FUNCTION__, 'default', TRUE); // !ERROR

However, this calling scenario is unusual and unlikely and is thus, not a critical issue. (Although this situation does (harmlessly) occur when drupal_static_reset is called with the name of a static variable that has not previously been initialized.) Nonetheless, the patch presented here does correct this erroneous behavior for this unlikely edge case, by creating and initializing a new static var even when $reset is TRUE. (This case is identified in note 3 in the attached image.)

Here is the improved drupal_static:

function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
  static $data = array(), $default = array();
  // First check if dealing with a previously defined static variable.
  if (array_key_exists($name, $data)) {
    // Non-NULL $name and both $data[$name] and $default[$name] statics exist.
    if ($reset) {
      // Reset pre-existing static variable to its default value.
      $data[$name] = $default[$name];
    }
    return $data[$name];
  }
  // Neither $data[$name] nor $default[$name] static variables exist.
  if (isset($name)) {
    // First call with new non-NULL $name. Initialize a new static variable.
    $default[$name] = $data[$name] = $default_value;
    return $data[$name];
  }
  // Reset all: ($name == NULL). This needs to be done one at a time so that
  // references returned by earlier invocations of drupal_static() also get
  // reset.
  foreach ($default as $name => $value) {
    $data[$name] = $value;
  }
  // As the function returns a reference, the return should always be a
  // variable.
  return $data;
}
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Interesting approach.

One additional thing to consider: array_key_exists($name, $data) is known to be slow. Might replacing it with isset($data[$name]) || array_key_exists($name, $data) help gain some more cycles out of that?

Status: Needs review » Needs work

The last submitted patch, drupal_static_improved_01.patch, failed testing.

ridgerunner’s picture

Well, it appears that handling the one edge case in a different manner breaks a bunch of Simpletest tests. Here is a re-rolled version of drupal_static that handles all cases identically to the original and reaps the same performance gain:

function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
  static $data = array(), $default = array();
  // First check if dealing with a previously defined static variable.
  if (array_key_exists($name, $data)) {
    // Non-NULL $name and both $data[$name] and $default[$name] statics exist.
    if ($reset) {
      // Reset pre-existing static variable to its default value.
      $data[$name] = $default[$name];
    }
    return $data[$name];
  }
  // Neither $data[$name] nor $default[$name] static variables exist.
  if (isset($name)) {
    if ($reset) {
      // Reset was called before a default is set and yet a variable must be
      // returned.
      return $data;
    }
    // First call with new non-NULL $name. Initialize a new static variable.
    $default[$name] = $data[$name] = $default_value;
    return $data[$name];
  }
  // Reset all: ($name == NULL). This needs to be done one at a time so that
  // references returned by earlier invocations of drupal_static() also get
  // reset.
  foreach ($default as $name => $value) {
    $data[$name] = $value;
  }
  // As the function returns a reference, the return should always be a
  // variable.
  return $data;
}

Edit 2010-07-16: I can see now that the "erroneous edge case" described in post 1 above, is actually a very useful calling scenario which allows an external function to obtain a reference to the entire $data array which would otherwise be private and inaccessible.

ridgerunner’s picture

Status: Needs work » Needs review

Note to Damien: I tested the isset($data[$name]) || array_key_exists($name, $data) construct as suggested, but it significantly reduced the performance gain. Go figure.

Edit 2010-07-16: It turns out that this method does significantly speed things up (but the static variable must contain a non-NULL value - which will be most of the time). This change alone speeds up the function by another 20-30% (see posts 10 and 12 below). Good call Damien!

ridgerunner’s picture

Status: Needs work » Needs review
Issue tags: +Performance
ridgerunner’s picture

It appears that the tag should be: "Performance" and not "+Performance"...

ridgerunner’s picture

Here is a re-roll of the patch from post 3. The function remains the same as previously listed in post 3.

This change provides a measurable performance gain with zero down side (other than the time required to review, test and implement the patch). This new drupal_static() version is functionally identical to the original - the speed-up comes from simply rearranging the conditional tests.

Not sure why no one has chimed in on this. (too busy squashing bugs I guess...?)

catch’s picture

subscribing.

ridgerunner’s picture

It turns out that using isset($data[$name]) || array_key_exists($name, $data) as suggested by Damien Tournoud in comment 1 above does *indeed* speed things up. Significantly. The benchmarking I was doing previously (which showed no improvement), was storing NULL in all the static variables which caused the isset() clause to always test FALSE (which is not realistic). I just retested this using static variables having much more realistic non-NULL values, and the speed improvements were quite significant. Here is the improved drupal_static() function:

function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
  static $data = array(), $default = array();
  // First check if dealing with a previously defined static variable.
  if (isset($data[$name]) || array_key_exists($name, $data)) {
    // Non-NULL $name and both $data[$name] and $default[$name] statics exist.
    if ($reset) {
      // Reset pre-existing static variable to its default value.
      $data[$name] = $default[$name];
    }
    return $data[$name];
  }
  // Neither $data[$name] nor $default[$name] static variables exist.
  if (isset($name)) {
    if ($reset) {
      // Reset was called before a default is set and yet a variable must be
      // returned.
      return $data;
    }
    // First call with new non-NULL $name. Initialize a new static variable.
    $default[$name] = $data[$name] = $default_value;
    return $data[$name];
  }
  // Reset all: ($name == NULL). This needs to be done one at a time so that
  // references returned by earlier invocations of drupal_static() also get
  // reset.
  foreach ($default as $name => $value) {
    $data[$name] = $value;
  }
  // As the function returns a reference, the return should always be a
  // variable.
  return $data;
}

My benchmark tests (which measure drupal_static() in isolation), indicate a 25% speed improvement on a Linux 2.6.24-24-xen/PHP 5.2.11 box and a 35% improvement on Win32XP-SP2/PHP 5.2.13 box. The improvement may actually be even better when the static $data[] array has lots of entries. (My benchmark test only stores one variable). I'm very curious to see how much overall improvement will be seen in an actual Drupal run time environment.

Edit 2010-07-16: It turns out that the performance benefits increase a *LOT* as more variables are added. (See posts 20 & 21 below.)

ridgerunner’s picture

Here is some more explanation on the performance enhancement...

The speed of drupal_static() is improved by removing one call to array_key_exists() for the most common calling case - when retrieving a previously allocated static variable. When creating a new static variable, the speed of the new version is slightly slower than the old. The following listings show the actual lines of code which are executed for each of these two calling scenarios:

Code executed when creating a new static variable:

New way:
  static $data = array(), $default = array();
  if (isset($data[$name]) || array_key_exists($name, $data)) {...}	// both FALSE
  if (isset($name)) {                           // isset($name) is TRUE
    if ($reset) {...}                           // $reset is FALSE
    $default[$name] = $data[$name] = $default_value; // set new static var
    return $data[$name];

Old way:
  static $data = array(), $default = array();
  if (!isset($name)) {...}                      // !isset($name) is FALSE
  if ($reset) {...}                             // $reset is FALSE
  elseif (!array_key_exists($name, $data)) {    // !array_key_exists is TRUE
    $default[$name] = $data[$name] = $default_value; // set new static var
  return $data[$name];

For this case, the same number of conditional tests are made but the new version makes one additional call to isset(). The new version is slightly slower than the old when creating a new static variable.

Code executed when retrieving an existing static variable (having non-NULL contents):

New way:
  static $data = array(), $default = array();
  if (isset($data[$name]) || ... ) {    		// isset($data[$name]) is TRUE
    if ($reset) {...}                   		// $reset is FALSE
    return $data[$name];

Old way:
  static $data = array(), $default = array();
  if (!isset($name)) {...}              		// !isset($name) is FALSE
  if ($reset) {...}                     		// $reset is FALSE
  elseif (!array_key_exists($name, $data)) {...}  // !array_key_exists is FALSE
  return $data[$name];

For this case (which is probably the most frequent way this function is called), the new version logic eliminates one conditional test and one call to the (rather slow) array_key_exists() function. The new version is significantly faster than the old when retrieving an existing static variable (having non-NULL contents).

(Note that in the above code listings, execution paths not taken are indicated with: '...')

ridgerunner’s picture

Here is another option: This extremely simple patch achieves a 20-30% speed improvement (the lion's share of that achieved in post 10 above). It simply adds the following four lines to the existing drupal_static() function:

  elseif (isset($data[$name])) {
    // If variable is set and non-NULL, return it (avoiding array_key_exists).
    return $data[$name];
  }

This additional conditional test eliminates one call to array_key_exists() and adds one call to the much faster isset(). Although this version is not quite as fast as the patch in post 10 (which is about 5% faster), the simplicity of this one should be much easier to digest.

Status: Needs review » Needs work
Issue tags: -Performance

The last submitted patch, drupal_static_improved_768090_12.patch, failed testing.

ridgerunner’s picture

Status: Needs work » Needs review
Issue tags: +Performance
Damien Tournoud’s picture

Status: Needs review » Needs work

I like #10 better.

This said, the behavior seems weird in two places:

- before $data[$name] is set, $reset should have no effect
- when $name is unset, $reset = FALSE should simply return $data, while $reset = TRUE should reset $data and return it

Using the new code flow, it will be *much* easier to fix those issues.

ridgerunner’s picture

Status: Needs review » Needs work

For some unknown reason, the first automated run of SimpleTest on patch #12 failed (only one test out of 22,114 failed: "RDF mapping definition functionality", which does not even call drupal_static() directly. Weird.) Anyway, I ran the same RDF test on my local box and it passed so I decided to go ahead and run it again here - and this time it passed. Go figure. Evidently, the automated SimpleTest system is not 100% reliable. Is this common knowledge?

@Damien. I also prefer the patch from post 10. The resulting drupal_static() function is smaller, faster and the logic flow is cleaner (IMHO). Note, however that all these versions (original, 3, 8 (same as 3), 10 and 12) are functionally identical. The PNG file attachment in post 1 illustrates all the logic states and how each is handled.

Update: I modified my benchmarking code so that it works with more than one static variable (a much more realistic condition). The results show that with 30 static variables in the $data array, the speed improvement is better than 50%. In other words, with patch 10 applied, recalling a non-NULL static variable from drupal_static() is more than twice as fast!

ridgerunner’s picture

Status: Needs work » Needs review

Changing status back to needs review...

catch’s picture

Issue tags: +Performance

With the drupal_static_fast pattern there are quite a lot of functions where the variable is assigned but then never retrieved via a drupal_static() call again, how much of a hit is it in those cases?

ridgerunner’s picture

For the case when creating a new static variable, the new version is slightly slower than the original (there is one additional call to isset() as described in post 11 above). However, the performance effectiveness of the drupal_static_fast technique is unaffected thereafter. Thus, the net cost is a one-time-only extra call to isset() for each function that uses the drupal_static_fast method.

Last night I was curious to find out how many items are actually being stored by drupal_static() in its static $data array and also how many times the function is being called to recall these variables when Drupal renders a typical page, so I fired up my php debugger and found out. Starting with a fresh install of Drupal 7.0-alpha6, choosing the default "Standard" profile, here are the counts I came up with for the initial front page (having no content and not logged in):

Number of drupal_static calls to create a new variable: 50
Number of drupal_static calls to retrieve an existing variable: 123

For this basic plain vanilla default page case, it looks like the variables, on average, are being retrieved 2 times or more after being created. Therefore, this patch should result in a performance gain. However, I'm sure that some of you folks have access to much more sophisticated benchmarking methods. The best way to find out what the performance benefits of a change like this one are is to measure it. My guess is that both patch 10 and 12 above will result in a measurable speed improvement. I'm pretty new around here, but it seems like the automated SimpleTest machines would be an ideal platform to measure relative performances of various patches like this one.

ridgerunner’s picture

It turns out that when retrieving previously created variables, the speed of the old drupal_static() is a function of the number of static variables in its $data array. The array_key_exists() function really slows things down when the number of variables in the array goes up. Attached is a command line PHP script which measures and compares the execution speed of the old and new versions when retrieving a static variable having non-null contents. Here are some script output results from some benchmark tests I just performed on my local Windows XP box and a remote Linux account on Slicehost.

On remote Slicehost Linux 2.6.24-24-xen/PHP 5.2.11 box:
 1: nVars=   1 nCalls=100000 t_old=  2.4us t_new=  1.9us gain=21.3% speed= 1.27x
 2: nVars=   5 nCalls=100000 t_old=  2.6us t_new=  1.7us gain=34.4% speed= 1.52x
 3: nVars=  10 nCalls=100000 t_old=  2.9us t_new=  1.6us gain=43.8% speed= 1.78x
 4: nVars=  50 nCalls=100000 t_old=  6.4us t_new=  1.6us gain=74.4% speed= 3.90x
 5: nVars= 100 nCalls=100000 t_old= 11.5us t_new=  1.7us gain=85.1% speed= 6.70x
 6: nVars= 500 nCalls=100000 t_old= 65.0us t_new=  1.8us gain=97.2% speed=36.13x
 7: nVars=1000 nCalls=100000 t_old=147.0us t_new=  1.9us gain=98.7% speed=78.21x

On local Win32XP-SP2/PHP 5.2.13 box:
 1: nVars=   1 nCalls=100000 t_old=  9.9us t_new=  6.6us gain=33.5% speed= 1.50x
 2: nVars=   5 nCalls=100000 t_old=  9.4us t_new=  6.0us gain=36.1% speed= 1.57x
 3: nVars=  10 nCalls=100000 t_old=  9.8us t_new=  5.8us gain=40.4% speed= 1.68x
 4: nVars=  50 nCalls=100000 t_old= 14.3us t_new=  5.9us gain=58.8% speed= 2.43x
 5: nVars= 100 nCalls=100000 t_old= 20.9us t_new=  5.9us gain=71.9% speed= 3.56x
 6: nVars= 500 nCalls=100000 t_old= 87.3us t_new=  5.9us gain=93.2% speed=14.72x
 7: nVars=1000 nCalls=100000 t_old=192.2us t_new=  6.0us gain=96.9% speed=32.30x

Here are the formulas used:
gain = 100.0 * (t_old - t_new)/t_old;
speed = t_old / t_new

Note that the execution time of the new version is not affected by the number of static variables stored. Note also that these execution times are for retrieving existing variables and do NOT indicate the time required to create the variables. It looks like the performance gains will be truly significant when there are a lot of variables. (78 times faster? Yipes!)

ridgerunner’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
12.37 KB

Graphical plot of drupal_static() execution times.
Curious to visualize the functional relationship between execution time and number of stored variables, I created a plot. See attached PNG image. The relationship appears to go linear with more than 100 stored variables. Note that the execution times will certainly vary from box to box (and from run to run on one box), however the shape of the curve (linear) should be repeatable across various platforms.

Here is the benchmark data used to generate the plot. (Data taken from my local Win32XP-SP2/PHP 5.2.13 box using script from the previous post.)

 1: nVars=   1 nCalls=100000 t_old= 10.0us t_new=  6.7us gain=32.6% speed= 1.48x
 2: nVars=   2 nCalls=100000 t_old=  9.5us t_new=  6.2us gain=34.9% speed= 1.54x
 3: nVars=   3 nCalls= 99999 t_old=  9.4us t_new=  6.2us gain=34.5% speed= 1.53x
 4: nVars=   4 nCalls=100000 t_old=  9.4us t_new=  6.0us gain=36.1% speed= 1.56x
 5: nVars=   5 nCalls=100000 t_old=  9.4us t_new=  6.1us gain=35.8% speed= 1.56x
 6: nVars=   6 nCalls=100002 t_old=  9.6us t_new=  5.9us gain=38.6% speed= 1.63x
 7: nVars=   7 nCalls=100002 t_old=  9.7us t_new=  6.0us gain=37.6% speed= 1.60x
 8: nVars=   8 nCalls=100000 t_old=  9.8us t_new=  5.9us gain=39.6% speed= 1.66x
 9: nVars=   9 nCalls= 99999 t_old= 10.0us t_new=  6.0us gain=39.3% speed= 1.65x
10: nVars=  10 nCalls=100000 t_old= 10.0us t_new=  6.0us gain=39.4% speed= 1.65x
11: nVars=  20 nCalls=100000 t_old= 11.1us t_new=  5.9us gain=46.7% speed= 1.88x
12: nVars=  30 nCalls= 99990 t_old= 12.3us t_new=  5.9us gain=52.4% speed= 2.10x
13: nVars=  40 nCalls=100000 t_old= 13.2us t_new=  5.9us gain=55.1% speed= 2.23x
14: nVars=  50 nCalls=100000 t_old= 14.5us t_new=  5.9us gain=59.1% speed= 2.44x
15: nVars=  60 nCalls=100020 t_old= 16.0us t_new=  5.9us gain=63.3% speed= 2.72x
16: nVars=  70 nCalls=100030 t_old= 17.1us t_new=  6.0us gain=65.0% speed= 2.86x
17: nVars=  80 nCalls=100000 t_old= 18.2us t_new=  6.0us gain=67.3% speed= 3.06x
18: nVars=  90 nCalls= 99990 t_old= 19.6us t_new=  5.9us gain=69.8% speed= 3.31x
19: nVars= 100 nCalls=100000 t_old= 21.3us t_new=  5.9us gain=72.3% speed= 3.61x
20: nVars= 200 nCalls=100000 t_old= 36.7us t_new=  5.9us gain=83.8% speed= 6.19x
21: nVars= 300 nCalls= 99900 t_old= 52.3us t_new=  6.0us gain=88.6% speed= 8.78x
22: nVars= 400 nCalls=100000 t_old= 70.4us t_new=  6.1us gain=91.3% speed=11.53x
23: nVars= 500 nCalls=100000 t_old= 87.9us t_new=  6.0us gain=93.1% speed=14.59x
24: nVars= 600 nCalls=100200 t_old=106.8us t_new=  6.0us gain=94.4% speed=17.94x
25: nVars= 700 nCalls=100100 t_old=125.4us t_new=  6.0us gain=95.2% speed=21.01x
26: nVars= 800 nCalls=100000 t_old=142.1us t_new=  6.0us gain=95.8% speed=23.81x
27: nVars= 900 nCalls= 99900 t_old=159.5us t_new=  6.1us gain=96.2% speed=26.09x
28: nVars=1000 nCalls=100000 t_old=176.9us t_new=  5.9us gain=96.6% speed=29.80x
Damien Tournoud’s picture

Status: Needs review » Reviewed & tested by the community

Let's get #10 in.

Dries’s picture

Status: Needs review » Fixed

Nice improvement. I committed the patch in #10 to CVS HEAD. Thanks!

ridgerunner’s picture

You're welcome. But Damien deserves much of the credit. I am still curious to find out just how much this speeds things up.

On another note, I was looking through the Drupal code base for all the drupal_static() function calls, and I came across two places where it is being called erroneously without the &= assignment by reference syntax. One place is in the modules\simpletest\tests\module.test testModuleImplements() function where it is apparently being used for nothing at all. But the second erroneous call is in modules\file\file.module file_get_file_references(), where it is likely causing quite a bit of mischief. I did a search for any open issues related to this function and found two that might be related to this bug: #846296: file_file_download() only implements access checks for nodes and users and #845838: File/Original image attached to private message does not display when using private files.. I posted a patch to each of these threads but interestingly enough neither one got queued up for testing. ??? I'll attach another patch here to try again...

ridgerunner’s picture

Ok I figured out how to trigger the auto testing. The status needs to be "Needs Review". I did just that over on the #845838: File/Original image attached to private message does not display when using private files. thread and the bot is now running. I suppose I should set the status of this issue to closed since I am the one who opened it, but I would like to get some performance measurements feedback. Not sure of the proper procedure/protocol here...?

andypost’s picture

Status: Fixed » Needs review

Let the bot review #24

ridgerunner’s picture

It turns out that neither of the drupal_static() calling errors fixed by patch 24 above had any real ill effects. As per Berdir's comment #846296-10: file_file_download() only implements access checks for nodes and users, the only effect of the bad call in file_get_file_references() is that references are re-computed over and over rather than being statically cached. However, both of these erroneous calls should probably be fixed regardless so I'll leave the status set to "needs review".

Edit:2010-07-21: Fixed typo.

aspilicious’s picture

Status: Needs review » Reviewed & tested by the community

Berdir says this is a valid fix: #846296: file_file_download() only implements access checks for nodes and users
Bot is green ==> rtbc

ridgerunner’s picture

Status: Reviewed & tested by the community » Needs review

Actually, to my knowledge, no person has actually tested this patch. I know I haven't. Bendir's comment did not indicate that he has tested this patch. He simply said that the error that it fixes does not cause any problems. I looked at the file_get_file_references() function but was unable to easily determine if it will behave properly once the drupal_static() call is fixed. (This code is subtly complex. Someone familiar/involved with the file module needs to give this a look-see.)

catch’s picture

Title: drupal_static performance improvement » file_get_file_references() incorrect use of drupal_static()
Category: task » bug

Re-titling.

ridgerunner’s picture

Assigned: ridgerunner » Unassigned

Removing assignment from myself, because the patch currently under review is in an area I am not familiar with (the file module). Hopefully someone else will test this patch.

ridgerunner’s picture

Title: file_get_file_references() incorrect use of drupal_static() » drupal_static performance improvement
Status: Needs review » Closed (fixed)

I'm renaming this thread back to its original title (which better reflects the majority of the content here), and marking it as closed.

The 2 instances of erroneous usage of drupal_static() have been identified in other threads in case anyone cares to ever fix them.

Cheers!