Early Bird Registration for DrupalCon Portland 2024 is open! Register by 23:59 UTC on 18 March 2024, to get $100 off your ticket.
Isset is about 2.5 times faster than array_key_exists:
InstaBench 0.1 (PHP 5.2.13-pl0-gentoo)
Benchmarking results (200000 iterations)
========================================
array_key_exists: 1155ms (baseline)
isset: 852ms (1.4x faster)
Also, in many cases where array_key_exists , array_exists is also called upon – which isn't really necessary. This combination is used rather frequently throughout the current codebase.
There is one difference between these – which needs to be taken into consideration while creating a patch – they treat NULL differently. In some cases, like includes/database/mysql/schema.inc this is vital.
Comment | File | Size | Author |
---|---|---|---|
#74 | 839556-74_isset_regressions_and_tests.patch | 13.74 KB | jrchamp |
#72 | 839556-72_isset_regressions_and_tests.patch | 13.58 KB | jrchamp |
#62 | 839556-62_isset_regressions_and_tests.patch | 11.98 KB | David_Rothstein |
#55 | 839556-55_isset_regressions_and_tests.patch | 11.96 KB | dalin |
#53 | 839556-53_isset_regressions_and_tests.patch | 12 KB | dalin |
Comments
Comment #1
Anonymous (not verified) CreditAttribution: Anonymous commentedEhm, pardon my french. The initial tests i did with isset compared to array_key_exists were replacing code in drupal and profiling that, while a pure benchmark yields a more humble result. Still, the number of calls will add up.
Comment #2
BerdirI'm pretty sure that in most cases where array_key_exists() is used, it is used on purpose and isset() is not what we want. If you want to do an in-depth analysis which array_key_exists() can be replaced with isset(), then go ahead. But right now, this issue isn't of much use imho.
Also, there is no array_exists function? Did you mean is_array? That is required too, because array_key_exists only accepts an array.
Also, we don't micro-optimize unless it is in a function that is called very often and it has an actual impact. As your benchmarks shows, a call to array_key_exists() takes 0,00575ms compared to 0,00426ms when using isset(). Nothing to worry about imho.
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedThere are two known calls in drupal_static() that are on the critical patch.
We generally try to avoid
array_key_exists()
, but useisset() || array_key_exists()
when performance matter.Anyone up for a patch (at least for the drupal_static())?
Comment #4
dmitrig01 CreditAttribution: dmitrig01 commentedComment #5
Damien Tournoud CreditAttribution: Damien Tournoud commentedThis one should be &&:
!(a || b) = !a && !b
.Comment #7
catchI'm pretty sure effulgentsia already benchmarked the difference here, and found that !isset() || !array_key_exists() was slower than just the array_key_exists() - possibly the !isset() returned false enough times to make it not worth the extra comparison.
Either way this needs careful benchmarking before anything gets committed. We'd also likely get a bigger gain from applying the $drupal_static_fast pattern to more places.
Comment #8
jrchamp CreditAttribution: jrchamp commentedThe drupal_static() portion of this issue was resolved in comment #23 by the patch on comment #10 of #768090: drupal_static performance improvement.
As of right now, here are the instances of array_key_exists in core that are not in unit tests or simpletest itself:
./includes/actions.inc:281: if (array_key_exists($callback, $actions_in_db)) {
./includes/bootstrap.inc:2910: if (isset($data[$name]) || array_key_exists($name, $data)) {
./includes/database/mysql/schema.inc:155: if (array_key_exists('default', $spec)) {
./includes/database/pgsql/schema.inc:461: if (!array_key_exists('size', $spec)) {
./includes/form.inc:1721: if (!isset($element['#value']) && !array_key_exists('#value', $element)) {
./includes/form.inc:1729: if (is_array($input) && array_key_exists($parent, $input)) {
./includes/form.inc:2241: // array_key_exists() accommodates the rare event where $element['#value'] is NULL.
./includes/form.inc:2243: $value_valid = isset($element['#value']) || array_key_exists('#value', $element);
./includes/tablesort.inc:280: if (is_array($header) && array_key_exists('sort', $header)) {
./includes/tablesort.inc:83: if (is_array($header) && array_key_exists('sort', $header)) {
./includes/theme.inc:396: if (!array_key_exists($key, $info) && isset($cache[$hook][$key])) {
./modules/block/block.module:609: if ($block['status'] && !array_key_exists($block['region'], $regions)) {
./modules/comment/comment.module:2253: $authenticated_post_comments = array_key_exists(DRUPAL_AUTHENTICATED_RID, user_roles(TRUE, 'post comments') + user_roles(TRUE, 'post comments without approval'));
./modules/field/modules/list/list.module:258: if (count($allowed_values) && !array_key_exists($item['value'], $allowed_values)) {
./modules/field_ui/field_ui.admin.inc:57: '%widget_type' => array_key_exists($instance['widget']['type'], $widget_types) ? $widget_types[$instance['widget']['type']]['label'] : $instance['widget']['type'],
./modules/filter/filter.module:1245: if (array_key_exists($tag, $tips)) {
./modules/openid/openid.inc:504: if (array_key_exists($rbytes, $duplicate_cache)) {
./modules/openid/openid.module:936: if (!array_key_exists($name, $_GET) || $_GET[$name] != $value) {
./modules/profile/profile.admin.inc:118: if (array_key_exists('category', $form[$key])) {
./modules/system/image.gd.inc:343: $extension = array_key_exists($data[2], $extensions) ? $extensions[$data[2]] : '';
./modules/system/system.api.php:2413: if (array_key_exists('extension', $pathinfo) && in_array($pathinfo['extension'], $cdn_extensions)) {
Quite a few of these are clearly bad. Any positive situation where the keyed value will not be NULL should simply use isset() instead. Any negative situation where the keyed value can not validly be NULL, FALSE, array(), "", 0 or "0" should simply use empty().
This is the full list. You can probably ignore bootstrap.inc and form.inc.
Comment #9
sunsubscribing
Comment #10
jrchamp CreditAttribution: jrchamp commentedHere's the array_key_exists cleanup for the non-test parts of core (excluding the parts that "need" it). I went through and checked all of them, and almost none needed the NULL property of array_key_exists.
Comment #11
jrchamp CreditAttribution: jrchamp commentedReroll against head for comment.module change.
Comment #12
dalinPatch looks good to me. Managed to get rid of some
count()
as well. This probably won't have a significant impact on performance, but it's good to be doing things The Right Way™.Comment #13
Dries CreditAttribution: Dries commentedDoes this have a measurable impact on page load times? Would be good to get some page-level benchmarks.
That said, I like this patch because it is a bit of a code-cleanup in my mind.
Comment #14
catchMost of these don't appear on pages that are accessed often, and when they do it looks like the array_key_exists() would only run a few times.
However, reviewing the patch, there are no cases here where we're replacing array_key_exists() with an exactly equivalent pattern, it's all for straight !isset() or !empty(), and there's no reason to use array_key_exists() anywhere unless you really, really need that specific behaviour of checking for an array key with NULL value. This has been an established pattern in core for a while now, but obviously not enforced, so it makes complete sense to clean that up - if you do have code in the critical path, or looping over a lot of arrays using array_key_exists(), that can be a real performance hit since it's a few times slower than isset() or !empty(), so we should set a good example.
Comment #15
Dries CreditAttribution: Dries commentedMakes sense. Committed to CVS HEAD. Thanks all.
Comment #16
effulgentsia CreditAttribution: effulgentsia commentedI agree with the spirit of this issue, and with most of the specifics that were committed. I'm not so sure about the theme.inc one. Here's a rollback of just that one. I really don't have a use-case in mind, but #11 is technically a regression, and we don't need to micro-optimize the building of the theme registry. Since I don't have a use-case in mind, I added a @todo in the patch.
Comment #17
jrchamp CreditAttribution: jrchamp commentedThe theme.inc change did make it so that NULL values in the hook would be overridden. However, instead of returning to the array_key_exists methodology, I have attached a different solution which I feel simplifies the code. Please let me know what you think.
Comment #18
effulgentsia CreditAttribution: effulgentsia commentedGood idea. How about this? It's not quite as performant, because it defines $keys inside the loop rather than outside, but that's pretty micro, and again, this is code that only runs when building the registry, which happens rarely, and the bulk of the time is spent on a giant file system scan and large db write anyway, so I think any performance difference here is negligible.
Meanwhile, Dries has expressed frustration many times that the whole theme registry building process is nearly impossible to understand, so I like your direction in trimming some of the code. This builds on that, and does a little shuffling and what I hope is a small improvement to the code comments.
Comment #19
dalinNice improvement. I actually understand what the code is trying to do now. We probably shouldn't be calling theme functions "hooks" because they're not. But that's probably something for another issue.
Comment #20
sunThanks! The movement of $function, the added comments and also the improved line breaks between function body chapters definitely help in understanding the logic.
However, the added comments can be heavily shortened, as they are quite "wordy". Start by removing subordinate clauses and using more direct speech. Remove commas and every other word that is technically not necessary.
While I can't provide evidence that it would be faster, at the very least, a simple array_flip() would be much more clear than this array_fill_keys($foo, TRUE).
Powered by Dreditor.
Comment #21
effulgentsia CreditAttribution: effulgentsia commentedLike this?
Comment #22
sunThanks! But...
I'm not able to make any sense of the comment now. Studying the code, it seems to try to tell me:
"Take over default theme hook properties for hook_theme() implementations in themes, which may override a certain property only."
?
Powered by Dreditor.
Comment #23
effulgentsia CreditAttribution: effulgentsia commentedLet's start from the beginning. Here's what the entire body for the
foreach ($result as $hook => $info) {
code block does:1. Some keys are forced:
- type
- theme path
2. Some keys are defaulted with no dependence on what's already in $cache (i.e., what was defined by an earlier hook_theme() implementation such as the base theme, theme engine, or module):
- function
3. Some keys are processed with no dependence on what's already in $cache
- template
4. Some keys are defaulted directly from what's already in $cache with no other processing:
- variables
- render element
- pattern
- base hook
5. Some keys are a mix of using what's already in $cache, as well as doing some (or a lot of) processing:
- includes
- preprocess functions
- process functions
Ok, so how do we explain that in a way that someone can understand? Maybe we should just go back to the code comment that's in HEAD for item 4:
But that comment makes no sense to me, and I thought #18 and #21 were improvements, but perhaps not.
I have no idea what that means. What does "default" mean in this context? Why are we only doing this for 4 keys only and not simply doing
$result[$hook] += $cache[$hook]
? Is it possible to answer that question without providing some background to what the whole process is trying to do?Comment #24
jrchamp CreditAttribution: jrchamp commentedI'm not a fan of defining (and redefining) the keys array inside of the loop or repeating the array_fill_keys/array_flip. While I'm still for defining the array without the need for modification similar to the $variable_process_phases in the same function, I would be willing to consider a single array_flip if the TRUEs are a readability or code size issue.
Also, the array_merge that populates $cache near the bottom of the function seems like a more confusing version of simply saying
$cache = $result + $cache;
For the comment perhaps the following?
Comment #25
David_Rothstein CreditAttribution: David_Rothstein commentedWhy were these is_array() checks removed (in the original patch)?
The headers in question can be either an array or a string, so removing that doesn't seem like a good idea. I think we should add back the is_array() as part of the followup path.
Comment #26
dalin@David_Rothstein The is_array() is removed because it is superfluous:
Comment #27
David_Rothstein CreditAttribution: David_Rothstein commentedNo, that's not correct - although you might reasonably think that's the way PHP should work, it doesn't :)
In the above example, isset($header['sort']) evaluates to TRUE, and $header['sort'] itself evaluates to 'f'.
Comment #28
dalinHot damn, I stand corrected.
This patch reverts that mistake with table sorting committed in #15, patch in #11. Not sure if protocol says that it should be rolled into the patch in #21 or what.
Comment #29
jrchamp CreditAttribution: jrchamp commentedSorry about that, here's both lines of the the tablesort semi-revert.
Comment #30
dalinI'm resurrecting this issue. To bring us back up to speed the patch committed in #15 contained two regressions.
- One was a problem with tablesorting with a simple fix to revert to the original, see comments #25-29.
- Secondly was an issue with the theme registry. Rather than simply revert we decided to rewrite things a bit to increase clarity. See comments #16-24.
This patch deals with both. The first has pretty much reached consensus. However the registry part of this patch needs to be reviewed to see if I've correctly distilled the ideas from comments #16-24.
Comment #31
jrchamp CreditAttribution: jrchamp commentedEven though I'd prefer to not array_flip and use #17 instead for the increase in clarity, I would be surprised if it poses a significant performance loss. Therefore, I'm fine with #30.
Comment #32
dalin@jrchamp I aggree, that is a bit clearer. However the patch in #17 called those keys $key_overrides which to me is opposite to what they are: the keys that can't be overridden. I've rerolled calling them $non_overridable_keys.
Comment #33
sun1) Let's use a simple list and array_flip(array(...)) here.
2) Trailing white-space.
3) The variable name sounds a bit odd. While trying to understand its actual meaning, I had hard times to understand what this code is actually doing.
- We take the module-defined hook info,
- remove all keys but those statically defined,
- and only add them to the theme hook override if they are not defined already.
Hm. I would have expected
a) either array_diff_key() (or actually no munging at all) along with the current +=
b) or array_intersect_key() along with an array_merge() -- that seems to be the only way to get to the (perhaps?) desired behavior of "keys that cannot be overridden"...?
Powered by Dreditor.
Comment #34
jrchamp CreditAttribution: jrchamp commented@sun, the variable might make more sense as "$default_keys", it is the list of keys which we want to keep from the cache if they are not overridden.
The logic is:
* if the key is in the $cache[$hook]
* use it as the defaults for $result[$hook] if those keys are not specified
Because it is *not* "keys that cannot be overridden", += offers the correct replacement order (which is the reverse of array_merge).
Comment #35
sunThanks, that explains the confusion. "non_overridable_keys" has a completely different meaning than "hook_defaults".
Thus, another re-roll to carefully clean this up would get us back to RTBC, I think. When in doubt, add some short inline comments to make sure that the logic and meaning of the code is clear.
Comment #36
jrchamp CreditAttribution: jrchamp commented@sun, Here's my interpretation.
@dalin, I returned the order the way it was, because I'm not sure what kinds of side effects could be occurring in the file that gets included when the 'file' key is set. Also, I reworded the defaults comment, but feel free to modify. I tried to capture the essence of your work.
Comment #38
dalinI think the comments are the clearest we've come up with yet. But honestly my head has imploded on this one. We'll need @sun to give a final review.
I'm queueing for retesting because I think that's an unrelated error that I also hit in another patch on another issue yesterday.
Comment #39
dalin#36: 839556-36_isset_regressions.patch queued for re-testing.
Comment #40
sunThanks, looks much better now!
Since various forms/tables using tablesort are currently broken, bumping to major.
Comment #41
webchickIf this is fixing a regression, why are there no .test hunks?
Comment #42
dalinI did a bit of research into where and how a regression could exist. There are two hunks in the latest patch that we fix a regression.
In one case it is within the TableSort object and the data is only temporarily broken within the object's protected methods, but cleaned up before the data gets out of the object due to
$sort = in_array($sort, array('ASC', 'DESC')) ? $sort : '';
Therefore I don't think there's any way to easily test the regression (short of overriding the class and testing that).
The second case is within tablesort_get_sort() which can cause something to actually break and is testable, but I ran out of time to write a test.
I'll try to pick this up tomorrow unless someone else gets to it first.
Comment #43
dalinUghhh. I started putting together what I thought were some fairly straightforward tests only to discover several bugs in tablesort.inc. Further work forthcoming.
Comment #44
dalinOk, it turned out to be just one new bug that the tests exposed. Where multiple headers are defined to be sorted.
The first patch is just the tests, the second is tests + the patch in #36 + fix for the above.
Comment #46
dalinThis patch should fix those PHP notices.
Comment #47
jrchamp CreditAttribution: jrchamp commented@dalin, Nice work on this! I took a look at the patch and was wondering if line 83 of TableSort::getSort() needed the same changes as the tablesort_get_sort() function. Honestly, I'm not 100% sure why TableSort::getSort()'s function contents are anything more than:
Of course, if that replacement is correct, it might also make sense to do the same thing with TableSort::order() and tablesort_get_order():
Now that I think about it, the behavior of those two functions is slightly different. While it seems wasteful to use array_values, it's likely to "fix" a bug that still exists in the tablesort_get_order() function on line 256. Instead of using current(), it should be using reset() if it wants the first header after the array has been foreach'd instead of returning false.
Comment #48
dalinDone. Let's see if testbot likes it.
Agreed, there's a lot of duplicated code in this file. However, this issue has ballooned in scope. Nay exploded might be a better term. I've created a new issue: #985836: tablesort.inc contains a load of duplicated code.
My brain agrees with you. But it worked so perhaps this is another PHPWTF. I've changed it to reset().
Comment #49
dalinErr, the patch.
Comment #50
jrchamp CreditAttribution: jrchamp commented@dalin, Very nice! Good idea on splitting the issue. This patch looks great!
Comment #51
jrchamp CreditAttribution: jrchamp commentedI'm marking this RTBC given that the only reason it's not RTBC was to add test hunks which I dalin has done and which succeed on the testbot. This patch is the more complicated of the two, and I'd prefer to reroll #985836: tablesort.inc contains a load of duplicated code because that's just a matter of deleting a function's content and replacing it with the only + line in the current diff.
Comment #52
sunCan we add a short inline comment explaining this advanced condition, please?
(and elsewhere) Usage of filter_xss_admin() seems to be completely off here... the exported variable is passed through check_plain() (%-placeholder) anyway... That said, a @-placeholder would likely make more sense.
Lastly, we can use strtr() instead of t().
Due to not really clear comments, I'm not entirely sure, but it doesn't look like there's a test that verifies tablesort's correct behavior with arbitrary other (unrelated) GET parameters...?
Powered by Dreditor.
Comment #53
dalinThis patch addresses @sun's concerns above. While expanding the tests I found yet another bug:
Sort can be overridden by sending $_GET['sort'] of 'asc', 'ASC', or 'desc', but not 'DESC'. This patch also fixes this.
We now have 162 lines of tests for testing 150 lines of code. I wonder if it gets 100% coverage ;-)
Comment #54
sunIncomplete multi-line comment syntax -- however, within a function body, we use
// Single-line comments only,
// even if they need to wrap at 80 chars.
It looks like this comment actually belongs to the code in tearDown().
Should still use check_plain(), not filter_xss_admin().
Trailing white-space.
Powered by Dreditor.
Comment #55
dalinThe purpose of
$_GET = array('q' => 'jahwohl');
is to prevent the get params from simpletest and batch API throwing off $expected_ts['query']. So I renamed the comment toAh. Usually I have to use filter_xss_admin when I use this technique with watchdog(), but check_plain does seem to work here.
Darn IDE is supposed to "intelligently" remove trailing whitespace only from lines that I edit.
Comment #56
sunThanks!
Comment #57
jrchamp CreditAttribution: jrchamp commentedThanks dalin and sun for finalizing this. It's nice to see progress when I get back from a long weekend.
Comment #58
webchickWow, this is changing tons of stuff deep in the innards of the theme system. :( We really need effulgentsia's sign-off on this, IMO. Should we also do some benchmarks?
Comment #59
webchickComment #60
sunBenchmarks, no. effulgentsia did the initial patch for the theme system changes and those have been changed marginally only. But I guess it can't hurt to have him double-check and confirm the final changes once more.
Comment #61
jrchamp CreditAttribution: jrchamp commentedSo we just need confirmation from effulgentsia?
Comment #62
David_Rothstein CreditAttribution: David_Rothstein commentedRerolled. The patch no longer applied due to #993026: Default sort on admin/content incorrect (which fixed part of this).
Comment #63
sun@effulgentsia: Can we get a final review/confirmation from you?
Comment #64
miro_dietikerReferring to related issue
#295430: Default sort using tablesort_get_order() is broken.
See Comment 11:
http://drupal.org/node/295430#comment-3846516
Note that this patch here corrects some sort handling issue. But it still doesn't apply the first default: It orders by the last one.
Please also review the other patch correct me if this is intended behaviour.
Comment #65
jrchamp CreditAttribution: jrchamp commentedOther than documentation and performance improvements, this doesn't change the theme behavior. The result of a foreach conditional set and a right hand array union is the same. array_merge for non-integer indexes is the same as a reversed array union.
Comment #66
miro_dietikerjrchamp: No, this is not correct. The following lines change behaviour
Comment #67
dalin@miro_dietiker We were only needing confirmation that the changes to theme.inc did not change behaviour. The line that you point out is intentional. The
is_array($header)
part fixes the regression committed in #15. Fixing these regressions got us down a deep rabbit hole. We wrote some tests for tablesort.inc which exposed several bugs. The&& $header['data'] == $ts['name']
fixes one of those bugs. #43-#56 is where the work and review on that was done.Great to see that this is finally RTBC. But since the patch was from ~20 days ago I'll retest to make sure that it doesn't need a re-roll.
Comment #68
dalin#62: 839556-62_isset_regressions_and_tests.patch queued for re-testing.
Comment #69
jrchamp CreditAttribution: jrchamp commentedTest completed successfully. Hopefully we can get this and #295430: Default sort using tablesort_get_order() is broken. in soon.
Comment #70
effulgentsia CreditAttribution: effulgentsia commentedSorry for being MIA on this. Thanks to all who kept driving it forward. I haven't reviewed the tablesort changes, as others have already done so thoroughly. The code changes to _theme_process_registry() look good, so RTBC+1.
I'm still not fully satisfied with the comments in _theme_process_registry(). Seems like we should be able to explain this stuff better. But there's no reason to hold up this bug fix on that. When someone has inspiration, perhaps we can take another stab at demystifying this function. I don't think the patch makes things any less clear than what's in HEAD.
The important bug fix to this function in this patch is:
This fixes #11's regression, ensuring for example, that if $result[$hook]['base hook'] is explictly NULL, that it stays NULL, and doesn't inherit the prior value. As per #16, I'm not sure if there's a real use-case for this, but I think there might be, and even if it's edge case, I don't think #11 intended to change this behavior, it just got accidentally rolled up with the other cleanup.
I agree with #60 that there's no need for benchmarks here. This only happens during registry building, so even if there is a microsecond difference between a
foreach
and a+= array_interect_key()
, it's not within a performance-critical part of the code for us to worry about.Comment #71
jrchamp CreditAttribution: jrchamp commentedThanks effulgentsia, sounds like everyone is on board with the patch on comment #62. As soon as this is committed, I'll be able to work on the other two tablesort issues.
Comment #72
jrchamp CreditAttribution: jrchamp commentedRerolled to git.
Comment #74
jrchamp CreditAttribution: jrchamp commentedRerolled to git with the correct test case.
Comment #75
catchComment #76
David_Rothstein CreditAttribution: David_Rothstein commentedLooks like this was committed to Drupal 8 today via these two commits:
http://drupalcode.org/project/drupal.git/commitdiff/7becd8a
http://drupalcode.org/project/drupal.git/commitdiff/7b40d0a
Comment #77
webchickPresumably this was meant to go into D7 as well. I reviewed this patch and issue once again since it'd been awhile. Nice job on those tests, dalin! And thanks sun and eff (and others) for the detailed reviews. I'm still a bit cagey about this one, but probably best to get it in sooner than later so we have some time to test before 7.1.
Committed to 7.x. Thanks! (Interestingly, git apply wasn't able to deal with the added test file. I had to patch -p1 it. Odd. Best someone check to make sure it made it through on the other side okay. :))