Problem/Motivation
BlockRepository::getVisibleBlocksPerRegion() does an uncached config entity query on every page.
Proposed resolution
Cache the query per-theme.
Remaining tasks
Fix the patch - just did this quickly to profile, needs cache tags etc.
Also determine whether the blocks are loaded elsewhere in the request - we may want to cache just the query and do a getMultiple on the config entities so they don't get loaded twice.
Comment | File | Size | Author |
---|---|---|---|
#107 | with-patch.png | 146.69 KB | andyg5000 |
#107 | without-patch.png | 139.36 KB | andyg5000 |
#83 | block-repository-caching-2479459-83.patch | 10.07 KB | snehi |
#83 | interdiff.txt | 3.19 KB | snehi |
#77 | block-repository-caching-2479459-77-interdiff.txt | 4.91 KB | Berdir |
Comments
Comment #1
catchComment #2
anavarreComment #4
dawehnerI'm curious, how much of the actual request do we talk about? I see 25ms gain, which is a LOT, but especially a lot from the measured 100ms we had ...
Comment #5
catchNo the loadByProperties() call is showing up as ~3% of the request for me.
Also the 25ms gain I think is a result of different profiling runs (unless somehow the access check also gets cheaper...) - this is more like 8-12ms (still not bad, and could be a larger saving with more blocks configured).
Comment #6
catchRelated: #2479529: \Drupal\user\ContextProvider\CurrentUserContext::onBlockActiveContext() loads the anonymous user.
Comment #7
dawehnerYeah I guess we would need a block_list tag and invalidate that on save time.
Comment #9
alexpottAdding some related issues. I think we should explore whether or not we can make config entity retrieval faster.
Comment #10
BerdirWe have a config:block_list cache tag, that's the list cache tag of block config entities.
Yes, I've seen this too in profiling, obviously, this will get slower the more blocks you have in the system, as we need to load them all. I still think it was a bad idea to remove the forced theme ID prefix like fields have for example from blocks. We have optimizations in place for config entity queries on (partial) ID matches, that would be much faster.
Comment #11
BerdirThis adds the list cache tag, at least one test was fixed by this, let's see about others.
Comment #13
tim.plunkettFixing the unit test and adding coverage for the change.
Comment #14
tim.plunkettWorking on the \Drupal\block\Tests\BlockSystemBrandingTest fails.
Comment #15
tim.plunkettOooh, bad test, modifying config directly like that...
Nice to know that using the correct API fixes the problem :)
Comment #17
BerdirEntities are usually in the entity bin, is there a reason that data was used here?
Comment #18
dawehnerIMHO this code is looking great, especially because the approach is simple, understandable etc. especially compared to the other suggestions out there.
Comment #20
catchI put it in @cache.data because:
1. It's caching based on the query result - not entity IDs or anything.
2. A cache item with lots of entities in it is a bit different from a 1-1 entity cache
That felt like data more than entity to me, but it's definitely borderline.
Comment #21
BerdirKnowing you thought about it and decided for cache.data works for me.
I just added the cache tag, it has been reviewed by others already. so I think it's OK if I RTBC this.
Agreed with @dawehner: A nice performance improvement that is very simple in itself, thanks to cache tags :)
Comment #22
Wim LeersPatch looks great; doing a quick round of benchmarking.
Comment #23
Wim LeersMy benchmarking (as UID 2, frontpage) shows that this doesn't actually gain us much. The IS has nice before vs. after XHProf screenshots, but they only look at
BlockRepository::getVisibleBlocksPerRegion()
, not the overall request. My benchmark results suggest that the majority of the cost was actually in initializing some things, which are now just being initialized later/elsewhere.Comment #24
Wim LeersBack to NR per #23.
Comment #25
Wim LeersLooking at it with XHProf:
\Drupal\block\BlockRepository::getVisibleBlocksPerRegion()
remains around 8-9 ms, because unlike catch, the loading of the entities only took 1.5-2.5 ms on my machine, the cache get takes 0.75 ms.That, combined with some code being loaded later in the request, is probably why I'm not seeing a performance improvement.
This may still be worth it for the reduction in number of function calls, but in practice, the gain is very small in any case.
Comment #26
Wim LeersNote: I tested with the Standard install profile, catch tested with Minimal.
Comment #27
catchSo doing an xhprof diff of the whole request, I see 7ms less from that function but only 3.5ms less from the request overall in wall time, and -397 function calls. I'm testing minimal profile logged out on the front page and I think Wim is testing standard which partly explains the function call difference.
One thing that might offset the improvement from the caching here is that we're moving from APCu/chainedfast for cache.config to database for cache.data - putting this in a chained backend would probably make more sense.
Comment #28
BerdirTry it with 20, 30, 40 blocks and enable a few different themes.
I didn't expect a huge gain in profiling, and certainly not with a default installation.
Instead of execution time, try to compare the amount of database queries. I'd expect we save at least the query on the config table there, and we have a single cache get instead of a cache get multiple. But as @catch said, the downside is that those cache gets are currently in APC, while this would be the database.
Comment #29
Wim Leers+1
Comment #30
catchSo do we think cache.config would be acceptable here? It is more or less the reverse of the argument I had for cache.data.
Comment #31
BerdirWorks for me.
In fact, I think we should consider that config entity caching should use the cache_config table as well instead of cache_entity, for the exact same reason. We do use that for roles for example...
Even if this doesn't have a big impact on performance, I think we should still go ahead with this. We are definitely saving queries now, and that is worth it IMHO. The impact of this will likely be more visible on sites under load. queries are always very fast when testing with ab, because then the query caches are warm, that's not always the case.
Comment #32
BerdirOh, I'm confused, roles/config entities only use static cache. Forget what I said :)
Comment #33
BerdirPretty sure this needs to be profiled again after #2430219: Implement a key value store to optimise config entity lookups. Should be quite a bit faster now in HEAD, especially if you have many blocks.
Comment #36
star-szr(In theory) Easy rebase, git auto merged it :)
Comment #37
catchProfiled again. Still looks like an improvement here - on my machine with xhprof it comes out as ~16ms (just diffing two requests before/after).
Also while the KeyValue lookup is faster than what we used to do, it's still a database query compared to what is an APCu get with the patch.
Comment #38
catchAnd profling again, I'm only seeing this as a 3ms improvement (see attached diff). The overall request comes out as 11ms faster, but the specific code we're looking at I see 3.6ms.
However, that's a database query on every request on every page, and 3ms overall time is still a good improvement on a warm cache.
Comment #39
dawehnerShould we document that this gets rid of a keyvalue store and all the overhead of config queries, and so provide a even better way to serve Drupal without a database on a warm cache?
Comment #40
catchRebased and added those docs.
Comment #41
dawehnerThank you
Comment #42
alexpottI'm really uncertain about putting something that's not config in the config cache. Especially just because it is quicker.
Comment #43
BerdirIt kind of is config :)
We can also use a different cache bin that uses apcu by default. Both discovery and bootstrap would be fine with me.
Comment #44
catchRight it's a list of config entities that are the result of a config entity query. Whether that makes it suitable for @cache.config is definitely up for discussion though.
Not sure about discovery since that is more info hook/plugins, but bootstrap is definitely fine.
Here's a patch with just that change, I'm fine with either config or bootstrap.
Comment #45
Wim LeersHow about adding a
@cache.config_query
service, to make that distinction clear? Wouldn't that address Alex' concerns, and point D8 contrib in the right direction as well?Comment #46
alexpott#45 makes me think - should we make caching expensive queries part of ConfigEntityStorage and introduce a special version of loadByProperties to do... something like CachedLoadByProperties()
Comment #47
Wim Leers#46: But if we make it too easy, won't that fill up APCu too quickly? We want it to be a very very very conscious choice to put things in APCu, I think?
Comment #48
alexpottRe #47 thats why I was suggesting a new method.
Comment #49
Wim LeersRight, but I'm wondering if that's not still "too easy". But I guess docs of such a new method could give sufficient warning :) So +1 to
cachedLoadByProperties()
.Comment #50
catchSo the only concern about that I have compared to the current patch, is this is currently the only config entity query for basic requests (and hopefully it'll stay the only one). Doing the cache where it is now also avoids initializing the config entity query service (which otherwise we could mark lazy).
I do think it's worth exploring though, but possibly in a separate issue which we could then convert the caching added here to?
Comment #51
alexpottNot too mad about hard coding 'config:block_list'] here.
Here's a patch that adds cachedLoadByProperties to ConfigEntityStorage and we can still mark the query service as lazy. I'm happy if we choose to split this into a separate issue - if everyone else thinks that is the way to go.
Comment #52
Wim LeersI think the patch looks great.
s/Load/Loads/
s/entry/item/
s/entries in APC/data in APCu, which has a very limited amount of available storage space/
Comment #53
alexpottThanks for the review @Wim Leers.
Comment #54
Wim LeersI wanted to RTBC, but then found my own profiling in #29 that shows the gain is not noticeable. But looks like the much more recent #37 and #38 show a different story.
Do we want more profiling before committing this?
Comment #55
catch@Wim #38 I think is pretty accurate (on my machine at least), 16ms was with opcache disabled. Even if there's not a measurable difference profiling, we're swapping a database query for APCu which is good for scalability.
Worth a sanity check profiling.
Comment #56
catchJust double checked and still seeing approx 3ms improvement. #53 is a really nice approach we can use elsewhere. So RTBC for me.
Comment #57
Wim LeersGreat :)
Comment #58
effulgentsia CreditAttribution: effulgentsia at Acquia commentedThis patch looks great. Very nice implementation!
We don't use the "_storage" suffix for other cache bins, why do so here? Will this bin be intended for caching individual config entities too (if we ever choose to cache those)? If so, perhaps just "config_entity"? If not, perhaps "config_entity_query" or similar?
Since this will break every contrib module that implements a config entity type with its own storage class that has its own constructor, this could use a change record. Also, a beta evaluation listing that as a (acceptable) disruption. And, an @param doc should be added here and in all the subclasses.
s/This/This call/?
Comment #59
effulgentsia CreditAttribution: effulgentsia at Acquia commentedJust brainstorming, feel free to reject, but if we want to make it less easy/accidental, and also avoid the BC break of #58.2, could we put cachedLoadByProperties() (along with a setCacheBackend()) in a trait, and let individual storages opt-in by using the trait (and calling the setter from within their createInstance() method)?
Comment #60
effulgentsia CreditAttribution: effulgentsia at Acquia commentedClearly I wasn't thinking straight. There's no reason for the trait to require a setter. It can still be constructor injected. So really, the question is whether it makes sense for cachedLoadByProperties() to be its own interface/trait rather than requiring every subclass to have it and the corresponding $cache_backend dependency. Problem with that kind of opt-in though is that maybe it's the users of a config entity type that should decide if/when to benefit from cachedLoadByProperties() rather than it being the entity type's decision to make. So alternatively, would it make sense to have a separate service (e.g., ConfigEntityQueryCache) rather than baked into the storage classes?
Comment #61
catchSo I'd be OK with either leaving the method as is, or a new service adding the method.
We want this easy enough to use when it makes sense, but require at least some thought as to when it's used to avoid cache bin explosion. For me either a method or a service does that, it's just what the emphasis is.
Comment #62
effulgentsia CreditAttribution: effulgentsia at Acquia commentedI'm also OK with either. So this issue is Needs Work on either the CR, beta eval, and @param docs mentioned in #58.2, or on changing the approach to use a service that would make the CR and @param docs unnecessary. Leaving the decision to whoever wants to roll the next patch. If the next re-roll can also improve the name of the cache bin per #58.1 (I think
config_entity
makes most sense, since I think it'll make sense to use that for individual config entity caching as well, but am open to arguments against that), that would also be great, but not a blocker.Comment #63
Fabianx CreditAttribution: Fabianx for Drupal Association commentedComment #64
BerdirDoing some profiling on a site.
Caching the loadByProperties() is nice and what not, but in my case, that's just 2ms. However, in the same method, I'm seeing 9ms spent in $block->access() (If you don't see those numbers, try adding a few custom blocks to your site, those are extremely expensive).
The thing is, we actually have the cacheability for those access checks now. What if we just cache the whole method, using that cacheability metadata? In total, that's 11.5ms or 8% on a node page in my case.
Comment #65
BerdirAh, I guess we'd need cache redirects for the contexts to work, which we currently only have for render caching.
Comment #66
effulgentsia CreditAttribution: effulgentsia at Acquia commentedDoes SmartCache fix #64 / #65, or is there still a common scenario even with SmartCache where 5-10% of time is spent on $block->access()?
Comment #67
BerdirYes, smartcache will cache that as well.
The question I guess is how many variations there are. If you have blocks that vary by path then block access needs to be checked by path, just like smartcache and then we don't gain much. Except smartcache might also end up caching per user on many pages.
I've discussed a completely different solution with Wim in IRC. The basic idea is to move the access check to #pre_render. Then we can cache access as part of the normal block render array. There are a few problems with that, though:
a) It would mean to remove access checking from BlockRepository, which is an API change and a tricky one because it could expose security issues in contrib/custom code which used it and isn't updated. We could solve that with a new optional argument to disable access checking? But that would also affect how the whole cacheability stuff works.
b) We need to solve the problem of regions with blocks but no blocks is displayed on a specific page, so they end up empty, that would then break if (region) checks in templates, add empty wrappers and so on.
c) And possibly the most problematic one (only just thought of this again). It would mean to vary the block result by the access cacheablity. If you have a custom block and only show that on a single page, then block access varies by url. And we'd have t' re-check access on every URL and also cache the built block for every URL even though the block output itself never changes.
So... I don't think that's a good solution, actually :) (Writing down things helps)
Comment #68
Wim Leers#67
a) Could be solved by keeping the existing API, and just adding a new API.
b) That's #953034: [meta] Themes improperly check renderable arrays when determining visibility. Region checks in templates are broken by design; they're fundamentally incompatible with lazy rendering.
c) This is true. But many blocks aren't shown on a single page, of course. So I'm not sure how big of a problem that'd actually be.
Comment #69
BerdirNot sure what you mean with your response to c). It is actually very common to have blocks that are only shown on certain pages through the path visibility condition. I can totally imagine sites with dozens of those blocks that are shown on special pages (e.g. a contact form or a certain node) and we have to be able to handle that in an efficient way. And node type visibility condition is per route as well.
Comment #70
BerdirHere's a pretty ugly but working implementation for this. This could also easily live in a wrapper of the block repository if we decide to not have it in core for now.
Comment #71
Wim LeersI think it's only ugly because it reimplements cache redirection (also incompletely, but I'm sure you know that, you said it's a PoC). Once we'd have #2551419: Abstract RenderCache into a separate service that is capable of cache redirects in a non-render array-specific way, this would be able to just reuse that.
But the principle looks sound to me. This is exactly what I've been saying we should/hoping we would have for a while :)
Comment #72
BerdirComment #74
BerdirFix the serialization problem and actually using the cache now. the failing aggregator test passed now.
Comment #77
BerdirFixing those test fails.
Comment #78
Wim LeersRegarding #71: Berdir convinced me that we shouldn't do #2551419: Abstract RenderCache into a separate service that is capable of cache redirects in a non-render array-specific way first; that it's better to first have multiple implementations and then do #2551419.
Comment #79
Wim LeersThis looks very close! :)
s/cacheable_metadata/cacheability/
s/cacheablity/cacheability/
Perhaps: s/overall/page/ is slightly better?
s/cacheable_metadata/cacheability/
s/cacheable_metadata_region/region_cacheability/
Comment #82
snehi CreditAttribution: snehi as a volunteer and at Publicis Sapient for Publicis Sapient commentedComment #83
snehi CreditAttribution: snehi as a volunteer and at Publicis Sapient for Publicis Sapient commentedI changed as mentioned by wim leers.
Comment #86
Wim Leershttps://evolvingweb.ca/blog/speed-up-drupal-8-block-rendering-blackfire-io
Comment #87
snehi CreditAttribution: snehi as a volunteer and at Publicis Sapient for Publicis Sapient commentedComment #88
EclipseGc CreditAttribution: EclipseGc commentedIt's worth pointing out that a TON of the work that this stuff is doing would be dramatically reduced if we had an index of blocks that appear on pages, in other words, PageManager. The blog's example of 140+ blocks on a single site is TOTALLY reasonable, and I've been making this point for literally years now. That's not to say that there's not an incremental improvement to be had by fixing the existing approach, but this is not the final solution. We need to build pages of blocks and execute and cache those pages. Just my opinion.
Eclipse
Comment #89
krlucas CreditAttribution: krlucas commentedAdding D7 Bean.module performance as a related issue:
https://www.drupal.org/node/2734693
Comment #90
Elijah Lynnhttps://github.com/vasi/block_access_records
block_access_records
--------------------
Blocks are super powerful in D8. The page title can be a block, messages are a block, contact forms can be a block… This means your site may have dozens or even hundreds of blocks!
Unfortunately, checking which blocks should be visible on a page gets really slow when you have a lot of blocks. Drupal insists on loading all the blocks, then checking each and every one individually. I've seen it take up to 130 ms!
This module implements a fast way of checking, inspired by the node_access system. It generates a single query that does most of the work, and then only loads the blocks that are likely to be visible. On the site mentioned above, it cut over 90 ms off the time taken.
It currently works with all the visibility conditions in Drupal core: path, language, user role, and node type. It's also extensible, so you can add plugins that handler other types of conditions.
Caveats:
* If you have custom conditions, you'll have to extend this module. This module does at least ensure that you don't add conditions that aren't supported.
* Entity or block access hooks are only invoked on blocks whose conditions seem to pass, since we don't want to load other blocks. There's a new hook_block_visibility_alter() so you can add blocks whose conditions failed.
* This is not well tested, so beware of bugs.
There is a core issue about the slowness of core's block visibility checking: https://www.drupal.org/node/2479459 . I'm not sure the solutions there are ideal, because blocks may vary by path, which changes all the time. So this is my attempt at a better solution.
Comment #91
tim.plunkettThat... that's this issue :)
Comment #93
joseph.olstadThere's an updated patch in the D7 version of bean.module , this might help.
see: #2734693: Add persistent caching to bean_block_info() to avoid loading all beans at run-time when using Panels / CTools content type block plugin
Comment #95
BerdirSince I'm not sure that caching this wouldn't just end up with a per-user/per-route cache context on all sites seventually, I'm not sure this will actually ever happen.
So I started a new approach to address the main performance issue that I'm seeing with block access checking: #2878673: Cache loadEntityByUuid() calls for block_content
Comment #97
Wim Leers#2878673: Cache loadEntityByUuid() calls for block_content is in now. Does that mean this is no longer needed? I think this at least needs a new round of profiling, because #2878673 will cause the numbers to change.
Comment #98
BerdirYes, I'm not sure if this is still needed. Possibly keep it open but change to a normal issue until we have numbers that proof that it is still useful?
The main problem of this is IMHO that quite likely, the cache would end up being per-user per-url and then at that point fairly useless for a lot of cases, especially now with dynamic page cache.
Comment #99
Wim Leers👍 Done!
Comment #107
andyg5000Hi everyone, I found this issue after noticing a very long running `getVisibleBlocksPerRegion` call in my Blackfire trace. After profiling with and without this patch, I think there's still a need for this. I've included the 2 screenshots of `Blackfire's BlockPageVariant::build`.
This is with Drupal v8.9.16
Comment #108
joseph.olstadpatch #83 needs a complete reroll!
Bumping the priority up - In comment #93 I mentioned that this was fixed in the D7 bean module several years ago.
Comment #109
BerdirAnd #98/99 agreed to downgrade it to normal because we did other improvements, the config entity query is using an index, the content blocks are using a cache for the uuid lookup.
Unless someone shows numbers that makes this worth it, I don't think this as major. And it's a service, so a contrib module could override it and experiment with caching.
This is very different to the D7 bean module, and as pointed out, the cache would likely be basically end up with the same contexts as dynamic page cache.
Comment #110
mglamanJust found this again due to #3249710: hook_theme_suggestions_alter needs to statically cache results of visible blocks. Maybe it's a wrong approach, but their code is calling getVisibleBlocksPerRegion about 41x times and costing up to 2s.