Summary: The proposed new version of drupal_static
provides several benefits:
- It is faster.
- It is smaller.
- It is simpler.
- It corrects erroneous handling of one edge case calling scenario.
Discussion: The speed of the drupal_static
function in bootstrap.inc can be improved slightly (approx 4-6%) by simply rearranging the logic flow in the code to optimize the response for the most common calling scenario. Currently, the very first test in the function is to (unnecessarily) check: isset($name)
, and if true, reset all the static variables to their default values. However calling drupal_static(NULL)
, (which triggers this code block) is rare and not going to happen very often. A much more likely calling scenario is when drupal_static
is first called once to initialize the static variable, and then called many times to simply fetch the reference to the static variable. And with this most-frequently-called-case, an array_key_exists($name, $data)
conditional test is made, and there is no need to first check if $name
is set. By placing the code for this most common calling scenario at the very top of the function, a small, but measurable speed improvement can be gained. (I measured a 4% improvement on a Linux 2.6.24-24-xen/PHP 5.2.11 box and a 6% improvement on Win32XP-SP2/PHP 5.2.13 box.)
The attached patch rearranges and streamlines the logic components of the function to achieve the performance gain while at the same time reducing the code size. (See the attached png image which shows the (rough hand drawn) logic matrix used to analyze and describe all the possible calling scenarios for drupal_static
.) Although it may look quite different, the code in this patch duplicates the logic of the original function - with one exception: The current/old drupal_static
function erroneously handles one edge case; when it is called for the first time to initialize a new static var with a non-NULL $name
, and $reset
set to TRUE like so:
// ERROR! With $reset=TRUE, drupal_static fails to create and initialize a new
// static variable and erroneously returns a reference to the base $data array.
$my_new_static =& drupal_static(__FUNCTION__, 'default', TRUE); // !ERROR
However, this calling scenario is unusual and unlikely and is thus, not a critical issue. (Although this situation does (harmlessly) occur when drupal_static_reset
is called with the name of a static variable that has not previously been initialized.) Nonetheless, the patch presented here does correct this erroneous behavior for this unlikely edge case, by creating and initializing a new static var even when $reset
is TRUE. (This case is identified in note 3 in the attached image.)
Here is the improved drupal_static
:
function &drupal_static($name, $default_value = NULL, $reset = FALSE) {
static $data = array(), $default = array();
// First check if dealing with a previously defined static variable.
if (array_key_exists($name, $data)) {
// Non-NULL $name and both $data[$name] and $default[$name] statics exist.
if ($reset) {
// Reset pre-existing static variable to its default value.
$data[$name] = $default[$name];
}
return $data[$name];
}
// Neither $data[$name] nor $default[$name] static variables exist.
if (isset($name)) {
// First call with new non-NULL $name. Initialize a new static variable.
$default[$name] = $data[$name] = $default_value;
return $data[$name];
}
// Reset all: ($name == NULL). This needs to be done one at a time so that
// references returned by earlier invocations of drupal_static() also get
// reset.
foreach ($default as $name => $value) {
$data[$name] = $value;
}
// As the function returns a reference, the return should always be a
// variable.
return $data;
}
Comment | File | Size | Author |
---|---|---|---|
#24 | fix_bad_drupal_static_calls_768090_24.patch | 1.58 KB | ridgerunner |
#10 | drupal_static_improved_768090_10_cvs.patch | 2.28 KB | ridgerunner |
#21 | drupal_static_executions_times.png | 12.37 KB | ridgerunner |
#20 | benchmark_drupal_static_20100716.php_.txt | 5.15 KB | ridgerunner |
#3 | drupal_static_improved_768090_03.patch | 2.26 KB | ridgerunner |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedInteresting approach.
One additional thing to consider:
array_key_exists($name, $data)
is known to be slow. Might replacing it withisset($data[$name]) || array_key_exists($name, $data)
help gain some more cycles out of that?Comment #3
ridgerunner CreditAttribution: ridgerunner commentedWell, it appears that handling the one edge case in a different manner breaks a bunch of Simpletest tests. Here is a re-rolled version of
drupal_static
that handles all cases identically to the original and reaps the same performance gain:Edit 2010-07-16: I can see now that the "erroneous edge case" described in post 1 above, is actually a very useful calling scenario which allows an external function to obtain a reference to the entire
$data
array which would otherwise be private and inaccessible.Comment #4
ridgerunner CreditAttribution: ridgerunner commentedNote to Damien: I tested the
isset($data[$name]) || array_key_exists($name, $data)
construct as suggested, but it significantly reduced the performance gain. Go figure.Edit 2010-07-16: It turns out that this method does significantly speed things up (but the static variable must contain a non-NULL value - which will be most of the time). This change alone speeds up the function by another 20-30% (see posts 10 and 12 below). Good call Damien!
Comment #6
ridgerunner CreditAttribution: ridgerunner commentedComment #7
ridgerunner CreditAttribution: ridgerunner commentedIt appears that the tag should be: "Performance" and not "+Performance"...
Comment #8
ridgerunner CreditAttribution: ridgerunner commentedHere is a re-roll of the patch from post 3. The function remains the same as previously listed in post 3.
This change provides a measurable performance gain with zero down side (other than the time required to review, test and implement the patch). This new
drupal_static()
version is functionally identical to the original - the speed-up comes from simply rearranging the conditional tests.Not sure why no one has chimed in on this. (too busy squashing bugs I guess...?)
Comment #9
catchsubscribing.
Comment #10
ridgerunner CreditAttribution: ridgerunner commentedIt turns out that using
isset($data[$name]) || array_key_exists($name, $data)
as suggested by Damien Tournoud in comment 1 above does *indeed* speed things up. Significantly. The benchmarking I was doing previously (which showed no improvement), was storing NULL in all the static variables which caused theisset()
clause to always test FALSE (which is not realistic). I just retested this using static variables having much more realistic non-NULL values, and the speed improvements were quite significant. Here is the improveddrupal_static()
function:My benchmark tests (which measure
drupal_static()
in isolation), indicate a 25% speed improvement on a Linux 2.6.24-24-xen/PHP 5.2.11 box and a 35% improvement on Win32XP-SP2/PHP 5.2.13 box. The improvement may actually be even better when the static $data[] array has lots of entries. (My benchmark test only stores one variable). I'm very curious to see how much overall improvement will be seen in an actual Drupal run time environment.Edit 2010-07-16: It turns out that the performance benefits increase a *LOT* as more variables are added. (See posts 20 & 21 below.)
Comment #11
ridgerunner CreditAttribution: ridgerunner commentedHere is some more explanation on the performance enhancement...
The speed of
drupal_static()
is improved by removing one call toarray_key_exists()
for the most common calling case - when retrieving a previously allocated static variable. When creating a new static variable, the speed of the new version is slightly slower than the old. The following listings show the actual lines of code which are executed for each of these two calling scenarios:Code executed when creating a new static variable:
For this case, the same number of conditional tests are made but the new version makes one additional call to
isset()
. The new version is slightly slower than the old when creating a new static variable.Code executed when retrieving an existing static variable (having non-NULL contents):
For this case (which is probably the most frequent way this function is called), the new version logic eliminates one conditional test and one call to the (rather slow)
array_key_exists()
function. The new version is significantly faster than the old when retrieving an existing static variable (having non-NULL contents).(Note that in the above code listings, execution paths not taken are indicated with: '...')
Comment #12
ridgerunner CreditAttribution: ridgerunner commentedHere is another option: This extremely simple patch achieves a 20-30% speed improvement (the lion's share of that achieved in post 10 above). It simply adds the following four lines to the existing
drupal_static()
function:This additional conditional test eliminates one call to
array_key_exists()
and adds one call to the much fasterisset()
. Although this version is not quite as fast as the patch in post 10 (which is about 5% faster), the simplicity of this one should be much easier to digest.Comment #14
ridgerunner CreditAttribution: ridgerunner commented#12: drupal_static_improved_768090_12.patch queued for re-testing.
Comment #15
Damien Tournoud CreditAttribution: Damien Tournoud commentedI like #10 better.
This said, the behavior seems weird in two places:
- before $data[$name] is set, $reset should have no effect
- when $name is unset, $reset = FALSE should simply return $data, while $reset = TRUE should reset $data and return it
Using the new code flow, it will be *much* easier to fix those issues.
Comment #16
ridgerunner CreditAttribution: ridgerunner commentedFor some unknown reason, the first automated run of SimpleTest on patch #12 failed (only one test out of 22,114 failed: "RDF mapping definition functionality", which does not even call
drupal_static()
directly. Weird.) Anyway, I ran the same RDF test on my local box and it passed so I decided to go ahead and run it again here - and this time it passed. Go figure. Evidently, the automated SimpleTest system is not 100% reliable. Is this common knowledge?@Damien. I also prefer the patch from post 10. The resulting
drupal_static()
function is smaller, faster and the logic flow is cleaner (IMHO). Note, however that all these versions (original, 3, 8 (same as 3), 10 and 12) are functionally identical. The PNG file attachment in post 1 illustrates all the logic states and how each is handled.Update: I modified my benchmarking code so that it works with more than one static variable (a much more realistic condition). The results show that with 30 static variables in the
$data
array, the speed improvement is better than 50%. In other words, with patch 10 applied, recalling a non-NULL static variable fromdrupal_static()
is more than twice as fast!Comment #17
ridgerunner CreditAttribution: ridgerunner commentedChanging status back to needs review...
Comment #18
catchWith the drupal_static_fast pattern there are quite a lot of functions where the variable is assigned but then never retrieved via a drupal_static() call again, how much of a hit is it in those cases?
Comment #19
ridgerunner CreditAttribution: ridgerunner commentedFor the case when creating a new static variable, the new version is slightly slower than the original (there is one additional call to
isset()
as described in post 11 above). However, the performance effectiveness of thedrupal_static_fast
technique is unaffected thereafter. Thus, the net cost is a one-time-only extra call toisset()
for each function that uses thedrupal_static_fast
method.Last night I was curious to find out how many items are actually being stored by
drupal_static()
in its static$data
array and also how many times the function is being called to recall these variables when Drupal renders a typical page, so I fired up my php debugger and found out. Starting with a fresh install of Drupal 7.0-alpha6, choosing the default "Standard" profile, here are the counts I came up with for the initial front page (having no content and not logged in):For this basic plain vanilla default page case, it looks like the variables, on average, are being retrieved 2 times or more after being created. Therefore, this patch should result in a performance gain. However, I'm sure that some of you folks have access to much more sophisticated benchmarking methods. The best way to find out what the performance benefits of a change like this one are is to measure it. My guess is that both patch 10 and 12 above will result in a measurable speed improvement. I'm pretty new around here, but it seems like the automated SimpleTest machines would be an ideal platform to measure relative performances of various patches like this one.
Comment #20
ridgerunner CreditAttribution: ridgerunner commentedIt turns out that when retrieving previously created variables, the speed of the old
drupal_static()
is a function of the number of static variables in its$data
array. Thearray_key_exists()
function really slows things down when the number of variables in the array goes up. Attached is a command line PHP script which measures and compares the execution speed of the old and new versions when retrieving a static variable having non-null contents. Here are some script output results from some benchmark tests I just performed on my local Windows XP box and a remote Linux account on Slicehost.Note that the execution time of the new version is not affected by the number of static variables stored. Note also that these execution times are for retrieving existing variables and do NOT indicate the time required to create the variables. It looks like the performance gains will be truly significant when there are a lot of variables. (78 times faster? Yipes!)
Comment #21
ridgerunner CreditAttribution: ridgerunner commentedGraphical plot of
drupal_static()
execution times.Curious to visualize the functional relationship between execution time and number of stored variables, I created a plot. See attached PNG image. The relationship appears to go linear with more than 100 stored variables. Note that the execution times will certainly vary from box to box (and from run to run on one box), however the shape of the curve (linear) should be repeatable across various platforms.
Here is the benchmark data used to generate the plot. (Data taken from my local Win32XP-SP2/PHP 5.2.13 box using script from the previous post.)
Comment #22
Damien Tournoud CreditAttribution: Damien Tournoud commentedLet's get #10 in.
Comment #23
Dries CreditAttribution: Dries commentedNice improvement. I committed the patch in #10 to CVS HEAD. Thanks!
Comment #24
ridgerunner CreditAttribution: ridgerunner commentedYou're welcome. But Damien deserves much of the credit. I am still curious to find out just how much this speeds things up.
On another note, I was looking through the Drupal code base for all the
drupal_static()
function calls, and I came across two places where it is being called erroneously without the &= assignment by reference syntax. One place is in themodules\simpletest\tests\module.test testModuleImplements()
function where it is apparently being used for nothing at all. But the second erroneous call is inmodules\file\file.module file_get_file_references()
, where it is likely causing quite a bit of mischief. I did a search for any open issues related to this function and found two that might be related to this bug: #846296: file_file_download() only implements access checks for nodes and users and #845838: File/Original image attached to private message does not display when using private files.. I posted a patch to each of these threads but interestingly enough neither one got queued up for testing. ??? I'll attach another patch here to try again...Comment #25
ridgerunner CreditAttribution: ridgerunner commentedOk I figured out how to trigger the auto testing. The status needs to be "Needs Review". I did just that over on the #845838: File/Original image attached to private message does not display when using private files. thread and the bot is now running. I suppose I should set the status of this issue to closed since I am the one who opened it, but I would like to get some performance measurements feedback. Not sure of the proper procedure/protocol here...?
Comment #26
andypostLet the bot review #24
Comment #27
ridgerunner CreditAttribution: ridgerunner commentedIt turns out that neither of the
drupal_static()
calling errors fixed by patch 24 above had any real ill effects. As per Berdir's comment #846296-10: file_file_download() only implements access checks for nodes and users, the only effect of the bad call infile_get_file_references()
is that references are re-computed over and over rather than being statically cached. However, both of these erroneous calls should probably be fixed regardless so I'll leave the status set to "needs review".Edit:2010-07-21: Fixed typo.
Comment #28
aspilicious CreditAttribution: aspilicious commentedBerdir says this is a valid fix: #846296: file_file_download() only implements access checks for nodes and users
Bot is green ==> rtbc
Comment #29
ridgerunner CreditAttribution: ridgerunner commentedActually, to my knowledge, no person has actually tested this patch. I know I haven't. Bendir's comment did not indicate that he has tested this patch. He simply said that the error that it fixes does not cause any problems. I looked at the
file_get_file_references()
function but was unable to easily determine if it will behave properly once thedrupal_static()
call is fixed. (This code is subtly complex. Someone familiar/involved with the file module needs to give this a look-see.)Comment #30
catchRe-titling.
Comment #31
ridgerunner CreditAttribution: ridgerunner commentedRemoving assignment from myself, because the patch currently under review is in an area I am not familiar with (the file module). Hopefully someone else will test this patch.
Comment #32
ridgerunner CreditAttribution: ridgerunner commentedI'm renaming this thread back to its original title (which better reflects the majority of the content here), and marking it as closed.
The 2 instances of erroneous usage of drupal_static() have been identified in other threads in case anyone cares to ever fix them.
Cheers!