Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Issue status
The issue was split into two follow-up issues:
- #2720849: D8 improve theme registry build performance
- #2720853: D7 Improve theme registry build performance by 85%
Problem/Motivation
The theme registry build is slow. On one site even takes 9s. It is cached, but we can do better:
Use a lookup table instead of preg_grep on _all_ user defined functions.
This is 91% faster, down from 4000 ms to 300 ms.
Proposed resolution
- We have a huge search space of user defined functions for the grep.
- We only use a fraction of that by grouping by the first prefix (denoted by the first underscore).
- By dividing that search space by first '_' prefix we get the perfect compromise of divided function space and memory used.
Remaining tasks
User interface changes
API changes
Beta phase evaluation
Issue category | Task because optimizes registry rebuild process |
---|---|
Issue priority | Major because it's a significant slowdown during registry rebuild |
Prioritized changes | The main goal of this issue is performance |
Comments
Comment #1
Fabianx CreditAttribution: Fabianx commentedComment #2
Fabianx CreditAttribution: Fabianx commentedComment #3
Fabianx CreditAttribution: Fabianx commentedNew patch for A
Comment #4
Fabianx CreditAttribution: Fabianx commentedComment #5
Fabianx CreditAttribution: Fabianx commentedComment #6
Fabianx CreditAttribution: Fabianx commentedComment #8
BerdirInteresting, make sure you're also testing the performance difference without xhprof, that always adds considerable overhead. You also have CPU time enabled, which adds more overhead.
I'm sure it's an improvement, but I would expect it to be considerably less if you just time it with two microseconds() before/after.
Comment #9
vijaycs85Totally agree with @Fabianx as get_defined_functions() already return valid/existing functions, we can avoid function_exist() calls.
Comment #10
Fabianx CreditAttribution: Fabianx commented@Berdir: TRUE, but it was slow without xhprof as well.
Still 3s versus 300ms, so ...
( timed in browser )
Comment #11
Fabianx CreditAttribution: Fabianx commentedThe B) patch failed testing, we should probably split it up in A) and B) parts.
A) is safe to apply and should always be faster.
Comment #12
anavarreComment #13
catchThis is valid for 8.x too no?
Comment #14
Fabianx CreditAttribution: Fabianx commentedYes, theme registry building has not changed much, yet - besides being moved to OOP classes.
Comment #15
Fabianx CreditAttribution: Fabianx commentedComment #16
Wim Leers:O
Fabianx++
Comment #17
catchComment #18
amateescu CreditAttribution: amateescu commentedThe A patch applies even to D8 if you manually put 'core/' in the header.
Comment #19
dawehnerWe should certainly document why we are doing it: fetching all defined functions once is faster ... but are we sure this kind of static cache does not have to be cleared? There could be a theme registry rebuild going on when you enable a module, so the next enabled module might result in a functions cache, with not enough entries.
Comment #20
Fabianx CreditAttribution: Fabianx commentedThe A patch is idempotent, because functions cannot vanish, but adding functions is taken care of dynamically ...
So this is not a static cache, but a prefix list that is updated on the fly for newly defined functions.
Comment #21
joelpittetThis patch in #3 for D7 on a commerce site with about 350 active modules and Omega 4 base theme.
The first run with a flush all caches was way slower (not exactly sure why), but the next one right after was better like above.
The memory usage bumped up a bunch. See the xhprof diff: (may be garbage I had xdebug running too)
Also had these results (xdebug turned off this time).
Before Patch
After Patch
So superficially it looks like a trade, time for memory. About 15MB
This diff is between the last 3 page results with xdebug off.
Comment #22
Fabianx CreditAttribution: Fabianx commentedYeah, might need to check for valid prefixes only - chx had some ideas how to store less in the lookup table.
Comment #23
Fabianx CreditAttribution: Fabianx commentedIt seems the difference is that with your setup this is only called 1 times, while it was called 4 times for my case.
Also IIRC the overall global memory usage and peak mem usage did not change much.
So while 15 MB (!) might seem much, actually it is much less in comparison to the peak.
Though as its static, that is probably not a right assumption.
Comment #24
Fabianx CreditAttribution: Fabianx commentedThanks for testing this out in a different scenario.
Kind of a YMMV case.
Comment #25
joelpittet@Fabianx re #23 Yours looks like it only calls once as well. It's drupal_find_theme_functions that is called 4 times no?
If I look at that function I get that 85% you were talking about in the OP.
In my case I only have 2 calls but the % is nearly the same.
Comment #26
Fabianx CreditAttribution: Fabianx commentedThanks, that makes me more happy.
Still with some better logic we should be able to reduce the 15 MB to maybe 3 or 4 MB.
Comment #27
lauriiiThis has been forgotten for a while. Is there IRC discussion etc that has happened where the direction where this should be taken would have been addressed?
Comment #28
Fabianx CreditAttribution: Fabianx for Acquia commentedTagging
Comment #29
joelpittet@lauriii and I worked on trying to push this forward yesterday. We tried a number of things to reduce the memory and discussed our options with @msonnabaum
We decided to go without the static because this function doesn't get called much (2 times in my case) and it makes it tricky(or impossible?) to garbage collect those variables.
We also decided to remove the isset/function_exists change because it doesn't give that much speed benefits because you need to build the array(or array_flip the user defined functions) to get it to an isset(able) state.
Scenario:
Results:
Before:
After:
You can see we are saving 90% saving on time or 10x faster and memory is 9M more memory used.
Method:
Wrapped this around the inside of the the
drupal_find_theme_functions()
function.Then we got the average with a separate little script:
It would be nice if we could get the memory down but if possible?
Comment #30
joelpittetSorry on that, restore last input from dreditor decided to bust thing...
Comment #31
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commentedI've been thinking of this issue and playing around with ideas. It might well be that I'm missing something but why not just use prefixes up to the first '_'? To me it looks like that already restricts the search very effectively while using much less memory. I attached a basic patch (based on the older version) that demonstrates this.
I don't unfortunately have a good site with lots of functions to test on but on a fresh Drupal 8 install with most modules and Bartik it's just as fast as a full prefix table while consuming about 500k compared to 1.5M of the full prefix table. Those numbers seem consistent with what to expect when just looking at the number of functions on the sites that were mentioned earlier.
Thinking of #939462: Specific preprocess functions for theme hook suggestions are not invoked in advance, it would seem like this will get used more times for that (once per each module) so the static variable would maybe have to be reintroduced anyway.
As an aside, it looks to me like drupal_find_theme_functions should rather be a part of the theme registry. I guess that's something for a separate issue though.
Comment #32
lauriiiI ran the test for ajsalminen's version of the patch and the results are pretty good:
Comment #33
joelpittetSame, @lauriii is going to up-date comments.
In comparison to #29 these results were way better on memory and time!
Thanks for explaining in irc @Antti J. Salminen!
Comment #34
lauriii------
Time mean: 0.10247950315475 sec
Memory mean: 0.61116162490845MB
------
Time mean: 0.0026018602848053 sec
Memory mean: 0.3856792678833MB
Comment #36
Fabianx CreditAttribution: Fabianx as a volunteer commentedRTBC, but we need some comments / documentation here, why the first prefix works.
Here is how I understand it:
- We have a huge search space of functions defined.
- We only use a fraction of that.
- By dividing that search space by first '_' prefix we get the perfect compromise of divided function space and memory used.
Genius!
This also works for mytheme_has_really_many_underscores, because mytheme_has_really_many_underscores_preprocess_node would be grouped below 'mytheme' and even if there is another theme called mytheme, that does not matter, because before we were searching all functions, which also had both ...
In my benchmarks the function was originally called 6 times (due to some theme doing something), but as the algorithm is now really fast and the time both to generate and to search is really fast, too, the static would not matter anymore.
Thanks for all the work here, guys!
Fantastic!
Comment #37
lauriiiComment #38
Wim LeersAwesome!
Comment #39
Fabianx CreditAttribution: Fabianx as a volunteer commentedThese comments also need work as it is not all prefixes anymore.
Comment #40
lauriiiComment #41
joelpittetI doubled the word functions there...
before "the" first
Comment #42
lauriiiComment #43
joelpittetAdding Beta evaluation
Comment #44
joseph.olstadD7 version of #42. Run testbot and then switch it back to 8.0.x-dev
Comment #47
joseph.olstadTry this one
[] is not supported in php 5.3 (D7) says testbot
changed from:
to:
see what testbot says.
Comment #48
joseph.olstadtrigger testbot
Comment #50
joseph.olstadSorry, missed one " = [];"
Now try this one.
Comment #51
lauriiiCould we leave the Drupal 7 stuff for after the Drupal 8 version has been fixed?
Comment #52
joseph.olstadSorry lauriii , I saw your D8 patch passed testing so I wanted to see if it would pass D7 testing. So happy it passed D7 testing as well with only very minor modifications. Keep up the great work!
Comment #53
lauriii@joseph.olstad no problem at all and thanks for testing it out!
Comment #54
jcnventura CreditAttribution: jcnventura commented@laurii the patch seems pretty good (I haven't tested the performance. But I hate the cryptic name drupal_collect_prefix_grouped_functions().
Some more reasonable suggestions
Comment #55
Fabianx CreditAttribution: Fabianx as a volunteer commentedYes, lets rename to one of #54 suggestions, then RTBC
Comment #56
jcnventura CreditAttribution: jcnventura commentedDecided myself on the 1st option. Also, I passed this through drupalcs, as I had the feeling that the first comment must be in a single line, and that in turn suggested some enhancements to the function doc.
Comment #57
Fabianx CreditAttribution: Fabianx as a volunteer commentedRTBC, looks great now!
Comment #58
joelpittetThanks for making that more clear and fixing the docblock @jcnventura:)
Comment #59
GuyPaddock CreditAttribution: GuyPaddock commentedNot to throw a wrench in the approach, but wouldn't something like a suffix tree or ternary search tree be even faster than regexes over arrays? http://igoro.com/archive/efficient-auto-complete-with-a-ternary-search-t...
Comment #60
jcnventura CreditAttribution: jcnventura commented@GuyPaddock, a tree to search/access certainly it would be faster. It would probably be slower to create (would need to profile to make sure). Since we build this structure on every call to drupal_find_theme_functions(), I don't think that the performance gains would be noticeable. For sure, the memory usage would be significant (building a tree with all the function names...).
In any case if this function is called often, it might be useful to cache the results somewhere. But then again, the theme registry is already cached, so it probably doesn't add anything new.
Comment #62
catchCommitted/pushed to 8.0.x, thanks!
Moving to 7.x for backport.
Comment #63
joseph.olstadRefactored latest changes for D7 with comments and simplified function name.
For compatibility with D7 syntax tests that the testbot runs (php 5.3 compatibility) the following code snippets had to be changed from:
= [];
from the D8 patch was changed to:
= array();
in the D7 patch
Comment #64
jcnventura CreditAttribution: jcnventura commentedThere's still two unmerged changes:
Needs to be changed to array, instead of $array
and
should be
Comment #65
joseph.olstadHere's the requested changes.
Triggering testbot
Comment #66
joseph.olstadPatch #65 shaved off 5 seconds off of drush cache-clear all using the distribution of Drupal 7 I tested with
(my test box: php 5.4 , apache 2.4, mariadb 5.6, centos 7.x)
Great work guys.
Comment #67
joseph.olstadChanging this from Category "Task" to Category "Bug report"
I'd classify this as a bug fix, in my case this patch saves between 4000 and 5000 milliseconds.
That is a lot of CPU time indeed.
RTBC+ , would like someone else to review
Comment #68
Fabianx CreditAttribution: Fabianx as a volunteer commentedLets get this in, thanks @all!
And thanks so much for improving my algorithm :-D (#31!!!).
#66: I happen to know that distro you use, its the one that prompted this patch - as a side fun fact :-p.
Comment #69
joelpittetRTBC++
Comment #70
sylus CreditAttribution: sylus commentedThanks a bunch all really appreciate this patch!
Comment #71
joseph.olstadPlease commit to 7.x
Comment #72
btopro CreditAttribution: btopro at Pennsylvania State University commentedTested before and after cache clear as well as theme reg rebuild on a pretty complicated site (154 modules, decent caching mechanisms in place, php / mysql 5.5 / opcache). This is running on a Local CentOS 6.5 Vagrant instance.
Pre Patch
Theme rebuild:
CC all:
Post Patch
Theme rebuild:
CC all:
I am also seeing slightly more memory used (.25 megs for me) but this combined with https://www.drupal.org/node/2263365 and https://www.drupal.org/node/1443308 just beat the pants off of previously devastating CC ALL hits!
RTBC
Comment #73
quicksketchThis current patch is great and definitely helps a lot. It would be great if #65 were committed as-is.
However, I found that the current approach still isn't super-efficient. We make a registry of all functions by their prefixes. Why not just make a registry of only the functions that start with the prefixes that we care about? The current patch could easily be amended to pass in $prefixes to
drupal_group_functions_by_prefix()
, but we could simplify this by just directly filtering the list of functions down directly.Attached a patch that gives even greater performance improvements, and takes less memory (and is a lot simpler). I'm marking this needs review for testbot.
Sorry to take this down from RTBC, but as it's been 3 weeks since the last status update, it seems like we've got time to improve before taking up committer time.
Comment #74
quicksketchXHProf on the site I'm working on for patch in #73: from 7 seconds down to 30ms.
Comment #75
quicksketchHere's an XHProf run on #65 vs #73. Almost a 9x improvement over the previous approach.
All of this is a "huge improvement" vs. "even huger improvement". The current patch is about a 25x improvement on our site, but new patch is even greater with about 230x improvement.
Comment #76
sylus CreditAttribution: sylus commentedThanks so much @quicksketch I switched out to this newer patch and my automated builds did get a noticeable speed bump across the board. :)
Comment #77
dboulet CreditAttribution: dboulet commentedComment #78
joseph.olstadGreat work @quicksketch, Sylus has an excellent build procedure with Travis CI and many behat tests, the previous patch worked wonders for us and now its great to see even more performance improvement than the already amazing previous patch. Great work once again!
Comment #79
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commented#75: That's definitely a nice improvement. I'd be slightly worried about getting close to the compiled PCRE pattern length limit when preg_greping with all theme hooks though.
I don't know the implementation details but a quick test indicates that the 65536 byte limit might be already hit with only 1000-3000 entries in the registry depending on what the average string length for hooks is (tried with 10, 20, 30). That feels like it may be too close for comfort.
Comment #80
dboulet CreditAttribution: dboulet commentedCan you explain this in more detail? In the patch from #73, the first call to
preg_grep()
uses a pattern derived from concatenating all prefixes, which consist of theme names and not hook names. The pattern length will only grow with an increase in the number of installed themes, which at the limit should number in the dozens or hundreds, but not in the thousands. Have I misunderstood?Comment #81
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commented#80: No, you're right. I was mixing up things and what I said in #79 is not a problem at all here. Sorry for the noise.
quicksketch's patch looks great!
Comment #82
RainbowArrayCan we also use quicksketch's performance improvements for d8 theme registry rebuilds as well? Sounds like a great boost that will really help minimize registry rebuild time even more.
Comment #83
markhalliwellIn the interest of clarity (what has already been committed to 8.x), I would have preferred the patch in #73 were created as an entirely separate issue since it involves restricting which functions are registered and might have potential logic/bc breaks and it will need it's own testing.
Also, that patch will also need to be committed to 8.x first and then also backported to 7.x.
So, I'm going to set this issue back to RTBC so we can commit #65 to 7.x as the appropriate backport for this issue.
Edit: I really like the extra performance improvements btw! I'm just trying to keep this issue on point is all so we can get what was already committed to 8.x into 7.x.
Comment #84
Fabianx CreditAttribution: Fabianx as a volunteer commented#73: This is great! Can you open another issue?
We unfortunately need to get that into D8 first ...
Comment #85
mikeytown2 CreditAttribution: mikeytown2 commentedAlso using #73 in prod with no issues; RTBC!
Comment #86
quicksketchThe only thing #73 does is restrict the list of functions to match $prefixes all at once. Rather than checking each combination of $prefix . '_' . $function against the entire list of functions. That is, the list of matched functions is identical to what it was before, it just removes from the list of functions everything we know isn't a theme function, because they don't start with either "theme_" or the name of a theme/base theme like "bartik_". Functionally, it should be identical to the current approach.
Comment #87
dboulet CreditAttribution: dboulet commentedThe patch in #73 accomplishes the same thing as the previous patch, but does so with much fewer lines of code, and is more efficient. I would vote instead for reverting what has been committed so far, and applying the changes from #73 to 8.x and 7.x.
Comment #88
catchI'm going to move this back to 8.x to make the change in #73 there. If we can do that quickly that seems easier than tracking two issues. Also 8.x has additional test coverage.
Normally after commit we'd ask for a followup to retain the git history, however given this hasn't been committed to 7.x there is a multiple branch component to whether it's been committed yet.
Comment #89
hussainwebI am attempting the change in 8.0.x now.
The function which was committed in this issue for D8 (
drupal_group_functions_by_prefix
) was since moved totheme.registry
service. It doesn't seem to be a function registered on the interface (which was a bit risky). However, I am not sure of how to remove it as it being used\Drupal\Core\Theme\Registry::postProcessExtension()
method. I think it can be removed from there as well and will try it. The patch just (mostly) reverts what was done in #56 and applies #73 in D8.Comment #90
hussainwebComment #91
joseph.olstadHi @hussainweb, I also had a look at postProcessExtension , looks like drupal_group_functions_by_prefix should be removed from there too however there looks to be like a whole lot of changes and the postProcessExtension that was added recently might not be required at all due to the vastly simplified and improved logic that @quicksketch revealed to us.
Similar logic USED to be applied to D8 before postProcessExtension was added which seemed to have been inspired by patch 56 however now that patch 73 has simplified things it would be nice to just get rid of drupal_group_functions_by_prefix altogether but we can't until postProcessExtension is revamped or reverted.
Comment #92
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commented#91: Registry::postProcessExtension() should still be needed even if it was shifted to something else from using the grouping function. You might be able to improve it by going with something similar to what's done with drupal_find_theme_functions() but it doesn't look completely clear to me. In that method the list of prefixes you need to have in the preg_grep is actually much larger as it does include every module in the system and I believe in this case the order they are processed in also matters.
Comment #93
joelpittetI didn't have as good results but still awesome results between #65 and #73 thank you @quicksketch
Comment #94
joseph.olstadPatch 89 looks to be an improvement for the 8.x branch, lets prove it now that patch 73 for 7.x is proven.
What we need now is some xhprof profiling to prove that we're gaining in performance on 8.x as well.
Also it would be interesting to see from the xhprof results if Registry::postProcessExtension() is expensive or not, if it's not expensive then we should just leave it as is.
Lets focus on is the improvements that patch 89 in it's existing form will give and push forward.
This will help make sure that the whole of the drupal community is able to take advantage of the great work that @quicksketch and everyone else has provided for us as joelpittet has already proved is the case for 7.x.
Comment #95
Wim LeersGo @quicksketch!
Comment #97
joseph.olstad#96 still says "queued" but it actually was tested and passed all the new tests.
So it should say "passed", this looks to be like an issue with the new jenkins config.
The D8 patch is good.
Comment #99
MustangGB CreditAttribution: MustangGB commentedPatch made it into D8 (8.1.x rather than 8.0.x actually), should be ready for backport now.
Comment #101
catchThat was the previous change committed here, latest patch hasn't been committed anywhere.
Comment #102
jcnventura CreditAttribution: jcnventura commentedComment #103
joelpittetUsing the technique we did in #29. But against a stock D8 site.
Got 1.25 ms saved without any contrib projects installed.
And after installing a bunch of contrib modules and all the core modules.
So it looks like a trade, speed for memory. @catch maybe you want to make the call?
The D7 patch seems to give a better bang for our collective buck.
Comment #104
vensires CreditAttribution: vensires commentedComment #105
catchI like the preg_grep() approach.
On the patch itself:
Shouldn't getPrefixGroupedUserFunctions() be deprecated? That's dead code at this point.
Comment #106
joelpittetgetPrefixGroupedUserFunctions should be deprecated yes.
Comment #107
donquixote CreditAttribution: donquixote as a volunteer commentedI was trying to optimize the D7 version of this, and then found this issue..
Even if this issue is still focusing on D8, I want to post this D7 patch. As a starting point for later backport, and for people who want to use it in their projects.
Benchmark result on a rich real-world D7 site:
This is the accumulated time for all calls to drupal_find_theme_functions() when clearing the theme registry.
Measured with timer_start() and timer_stop(). (this was more convenient than xhprof)
without optimization:
1456.24
1558.8600000000001
1476.1399999999999
1461.79
1463.8599999999999
with optimization:
38.359999999999999
40.619999999999997
26.110000000000003
26.220000000000002
26.41
27.84
30.82
-> optimized
factor 1 / 55.54 (fastest unoptimized / fastest optimized)
factor 1 / 35.85 (fastest unoptimized / slowest optimized)
factor 1 / 59.70 (slowest unoptimized / fastest optimized)
I think the fastest / fastest is the most relevant, because faster runs usually have less noise from e.g. other processes.
Either way, it is a damn heavy improvement.
And yes, the overall page time (devel page timer) was also reduced. Divided by factor 1 / 3, I would say, for a request that only did flush the theme registry.
Memory was a bit unclear, but nothing relevant.
(sidetrack)
My original version calculated a separate pre-filtered list per prefix. This means the regex would have no () brackets, which *might* be faster in the usual case of only
count($prefixes) === 1
. But it turns out this is not the case, as we see below.with optimization, but without the brackets. That is
'/^' . $prefix . '_/'
instead of'/^(' . $prefix . ')_/'
.26.48
30.93
35.060000000000002
27
26.640000000000001
25.98
25.600000000000001
-> So, no significant difference.
(/sidetrack)
Comment #109
joseph.olstadtest log says:
donquixote, any ideas?
Here's the 7.x source code for field_sql_storage.test
Comment #110
donquixote CreditAttribution: donquixote as a volunteer commentedI cannot reproduce this locally, so I queued a re-test.
The message is not very helpful either.
Comment #111
joseph.olstadThe queued re-test passed! Nice.
Comment #112
joseph.olstadDonquixote, just fyi, patch 73 is identical to your patch. We have been using patch 73 for about six months now.
Comment #113
donquixote CreditAttribution: donquixote as a volunteer commentedDuh. I feel stupid now.
Back to "needs review".
Comment #114
catchHere's a re-roll of the 8.x patch with the deprecation notice.
Assuming it's green, I think it's ready to go.
Comment #115
andypostAnother downside of that approach is supposing all modules are loaded and no way for dynamic loading, but numbers impressive
Comment #116
xjmComment #117
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commentedRegistry::getPrefixGroupedUserFunctions() is still used in Registry::postProcessExtension() in 8.x. I don't think it can be deprecated?
Comment #118
catchWell spotted, that was added after the original patch here landed.
Here's a patch that updates Registry::postProcessExtension() to use the same approach.
Comment #119
alexpottDouble assignment.
Other than that this looks great.
Comment #120
catchDoh that's crufty. Just removing that double assignment.
Comment #121
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commentedDoesn't the removal of the usage of getPrefixGroupedUserFunctions() in postProcessExtension() make it much, much slower? To me it looks like the replacement ends up grepping almost the whole array of defined functions for as many times as there are enabled modules and avoiding that was the reason we even came up with the method.
Comment #122
catch@Antti J. Salminen yes it needs profiling again. The numbers above show that the new preg_grep() is faster than building the map once by up to ten times, so need to see whether it's slower again performed per module/theme.
Comment #123
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commented@catch: Fair enough. These usages just seem very different to me. In drupal_find_theme_functions $prefixes has just a single element, in postProcessExtension it has one for every module and won't scale well at all when the number of modules goes up as far as I can tell.
Comment #124
donquixote CreditAttribution: donquixote as a volunteer commentedThis approach should be even faster.
Comment #125
alexpott@donquixote - "should" looks like an assumption - is it backed up with real data? We need real performance data on all the patches here otherwise we don't really know if we're making an improvement of not.
Comment #126
donquixote CreditAttribution: donquixote as a volunteer commented@alexpott: I should have taken more time to comment, sorry.
I tested the patch, and the previous one, on a relatively complex D7 site, where drupal_find_theme_functions() runs 4x on a normal cache clear.
With the old D7 patch, individual calls to drupal_find_theme_functions() took sth like 5ms up to 12ms.
With the new patch, individual calls took all around 4ms.
This was with xhprof disabled.
With xhprof enabled, it can be seen that there is a minimum time to run: The time for get_defined_functions()['user'] + the initial preg_grep(). This may vary by system, but we cannot do much about it in this issue. For me it was usually >2ms for get_defined_functions(), and a bit less for the first preg_grep.
With the new patch, there is not much left besides the initial get_defined_functions() + preg_grep().
The loops are cut down to a minimum, and there are no unnecessary calls to preg_grep().
This might all depend on the site. So others should run their own benchmarks.
And, I have not done any benchmarks yet on D8.
Comment #127
donquixote CreditAttribution: donquixote as a volunteer commentedBtw, in previous patches, including my own:
The implode glue
')|('
really does not make sense.The resulting regular expression would be sth like
'/^(prefix0)|(prefix1)|(prefix2)_/'
, which is not what we want.What we really want is
'/^(prefix0|prefix1|prefix2)_/'
. So the glue needs to be'|'
.This being said: In all the cases I have seen,
$prefixes
contained exactly one item. So in the usual case, this distinction will have no consequence.----
I am working on a new version of the patch in #124, to squeeze out some extra microseconds, where the correct glue is used.
Comment #128
donquixote CreditAttribution: donquixote as a volunteer commentedNew version is simpler than #124, because there is (almost) no more case distinction for multiple prefixes vs one prefix.
In theory it should be even faster. But in practice it is hard to measure.
Profiling result (in D7):
Same site as in #124 / #126.
Every call to drupal_find_theme_functions() is less than 4ms.
Most of these microseconds are for:
- A single call to get_defined_functions()
- The initial
preg_grep('/^' . $prefix . '_/', $functions)
.- Additional preg_grep() for hooks with real non-default regular expression patterns like "webform_form_[0-9]+".
- The initial
$weights = array_flip(array_keys($cache));
.This means that all the time is spent on stuff that we really cannot avoid. So it does not get much faster.
With the older patch (#73), a single call to drupal_find_theme_functions() costs 3ms - 10ms, average 7.5ms.
This variation is not random, but based on how the function is called. E.g. with many hooks or with few.
With NO patch: single call between 7ms and 562ms, average 400ms. So, a lot slower.
Peak memory on cache clear:
- no patch: 76.25 MB
- old patch (#73): 76.25 MB
- new patch (#128): 76.5 MB
Nothing interesting happening on the memory front.
So: The older patch (#73 === #107) is fine for D7, performance-wise. Except the ')|(', which looks wrong.
This new patch is faster, but the difference is small compared to overall cache clear time.
For D8, the same applies.
The other part of the D8 patch in #120 (@catch) focuses on postProcessExtension(), which seems to be independent of drupal_find_theme_functions(). It could go into a separate issue.. but I don't care much.
Comment #130
donquixote CreditAttribution: donquixote as a volunteer commentedWe should add test cases where drupal_find_theme_functions() is called with multiple prefixes.
Comment #131
donquixote CreditAttribution: donquixote as a volunteer commentedComment #132
donquixote CreditAttribution: donquixote as a volunteer commented@joelpittet, @catch
I'm a bit confused by the profiling numbers that were posted before.
E.g. #103 talks about 2ms - 5ms, which makes sense for a single function call (to whichever function), but it does not tell us how often this function was called, and which function it is.. Is this drupal_find_theme_functions()?
Where do I find numbers for postProcessExtension()? Has anyone profiled this?
Looking at the patch in #120, I don't think this is the right approach for postProcessExtension(), or the most promising.
The challenge here is that the cost in time is proportional to the number of theme hooks times the number of modules. The existing patch, as far as I can see it, does not change this.
An approach I would recommend is sth like this:
This way, we avoid iterating over all modules and prefixes, and instead only iterate over existing functions.
I did this for D7 already in #519940-54: Performance: optimize building theme registry by using get_defined_functions() instead of function_exists().
function _drupal_theme_registry_add_module_owned_processors().
I would therefore suggest to deal with this problem in a separate issue. Either #519940: Performance: optimize building theme registry by using get_defined_functions() instead of function_exists(), or open a new one.
Comment #133
Antti J. Salminen CreditAttribution: Antti J. Salminen as a volunteer commented@donquixote: The overall approach of the latest patch looks great and I can see why it's faster! I'm just wondering what is the reason for adding the new hooks in the same order as their base hooks are defined in to the registry?
I agree postProcessExtension would better be handled in a different issue and the current implementation isn't terrible for that purpose. I don't doubt it can get even better. Some discussion on it's performance is located in: #939462: Specific preprocess functions for theme hook suggestions are not invoked
Reviewing a bit:
Short array syntax would be preferable for D8.
I think the comments in general need some work. An explanation similar to the one this patch removes would be good as the things this function implements to aren't all that well documented to begin with.
This naming doesn't seem as clear as it could be to me. Maybe something like full_prefix_length?
How about "Gather defined functions keyed by the theme hook name they are implementing."
Could this be split to be under 80 characters?
I'd rather use something like a "This is a derived theme hook". I'd be inclined to call reimplementations of existing base hooks "overrides" rather than suggestion implementations etc.
This comment pretty much just seems to spell out what you can read from the code. How about something like "Find overrides for registered theme hooks and set them in the registry."
This needs a comment explaining the rationale for the preserving the theme function weights for the suggestion hooks.
Comment #134
donquixote CreditAttribution: donquixote as a volunteer commentedI am simply trying to reproduce the same order as the original implementation.
I don't know if this has any effect we need to worry about.
One thing is, I have a local test script which compares a dump of the theme registry. Obviously, this will look like a mismatch if the order is changed. I could change this script to dump a ksort-sorted version of the theme registry.
Comment #135
joelpittetThe core committers (@xjm, @alexpott, @catch, @effulgentsia, @Cottser) and theme maintainers (@lauriii, @joelpittet) discussed this issue April 27th and agreed on a path forward for this issue. We decided to split this issue into to follow-ups. https://www.drupal.org/core/backport-policy#d7
One for improving the already committed patch for D8 in #98 and another for the D7 work that @quicksketch did and was tested in #73 and let @donquixote continue his improvements on that there as well.
I'm closing this as fixed as it was already committed. The D7 changes being in a new issue is part of the new backport policy here #2462101: [policy, no patch] Update backport policy to account for 8.x minor versions and 7.x divergence and
https://www.drupal.org/core/backport-policy#d7
D8 #2720849: D8 improve theme registry build performance
D7 #2720853: D7 Improve theme registry build performance by 85%
Can I ask the respective authors @donquixote, @catch and @quicksketch et al to post their respective patches to those issues. If you don't get around to it I'll move them myself but would really like to capture your contributions to this issue on those respective follow-ups. So at the very least please comment over there for commit credit.
Thank you all for this, I use this patch on all my D7 sites!
Comment #136
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedFixing version for posterity.
Comment #140
donquixote CreditAttribution: donquixote as a volunteer commentedUpdating issue summary with links to follow-up issues.
This can protect people from wasting their time reading this one.