Problem/Motivation
Based on discussions in #3548293: Use #lazy_builder / placeholdering for entity rendering, #3281020: referencedEntities: Use loadMultipleRevisions instead of loadRevision, performance tests and other issues, I had the idea to expand the preloading logic.
We currently load all references of a single field at once. That is then repeated for each group of nested references. Those often reference nodes and media entities (which in turn reference file entities), and many of these are then loaded one by one.
Steps to reproduce
Proposed resolution
Loop through each loaded entity, collect the references and then load them. The tricky part is not loading too much. For example, a referenced node might have a media reference too but might already be render cached, same for a file reference in a media. For nested ones, a combination with a lazy builder formatter might be more useful.
Remaining tasks
User interface changes
API changes
Data model changes
Issue fork entity_reference_revisions-3558370
Comments
Comment #3
berdir
The code is a bit complex and I need to refactor it to be more readable, but it's starting to do what I expect it to.
It's optimized specifically for paragraphs. Imagine the following paragraph structure:
The current preload only works on one level and on one reference field. In theory prepareView() supports multiple entities, but buildEntities() is almost never used and is broken, so I didn't even bother to optimize for that in #3281020: referencedEntities: Use loadMultipleRevisions instead of loadRevision.
That means currently in HEAD, we have 3 loadMultiple() calls for paragraphs. First one for Container 1 and 2. Then one for the paragraphs of Container 1, Card 1 and 2. And then another for Card 3 and 4. And then the 4 medias are loaded one by one. In total, 7 separate load calls for the visible entities.
This introduces essentially two additional passes on top of the initial preload for Container 1 and 2. It's an optional setting that you're meant to enable on the top level structure only.
1. Pass "primary entity type": This attempts to load all references in ERR fields that target the same entity type (typically paragraphs). That means it will find Card 1, 2, 3 and 4 and load them together. In this simple example, that means 2 instead of 3 loadMultiple() calls. That's not a big difference, but it's the most basic example; long landing pages might have many of those containers. It's also recursive: if there's a third level of paragraphs, it will collect them again across all newly identified children.
2. Pass "secondary entity types": Then it will loop over all loaded primary entities, so both containers and cards in this case, and prepare all referenced media entities, nodes and terms. In this example, it will then load Media 1, 2, 3 and 4 together. For this example, that's now 3 entity loads instead of 7.
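The load counts above can be checked with a small toy model. This is a language-neutral sketch in Python, not the module's PHP code: `Entity`, `CountingStorage`, `load_per_field` and `load_with_preload` are hypothetical names, and the structure (two containers with two cards each, one media per card) is an assumption reconstructed from the description. `target_type()` stands in for reading a field's target type from its settings, which costs no load.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    type: str
    refs: list = field(default_factory=list)  # IDs of referenced entities


def build_entities():
    """Assumed example: 2 containers, 2 cards each, 1 media per card."""
    entities = {}
    layout = {"container1": ["card1", "card2"], "container2": ["card3", "card4"]}
    for container_id, card_ids in layout.items():
        entities[container_id] = Entity(container_id, "paragraph", list(card_ids))
    for n in range(1, 5):
        entities[f"card{n}"] = Entity(f"card{n}", "paragraph", [f"media{n}"])
        entities[f"media{n}"] = Entity(f"media{n}", "media")
    return entities


class CountingStorage:
    """Stand-in for entity storage that counts load_multiple() calls."""

    def __init__(self, entities):
        self.entities = entities
        self.calls = 0

    def load_multiple(self, ids):
        self.calls += 1
        return [self.entities[i] for i in ids]

    def target_type(self, entity_id):
        # Not counted: models reading the target type from field settings.
        return self.entities[entity_id].type


def load_per_field(storage, ids):
    """HEAD-like behaviour: each reference field loads only its own targets."""
    for entity in storage.load_multiple(ids):
        if entity.refs:
            load_per_field(storage, entity.refs)


def load_with_preload(storage, ids, primary_type="paragraph"):
    """Initial preload, then the two additional passes described above."""
    frontier = storage.load_multiple(ids)
    primary = list(frontier)

    # Pass 1: collect same-type references level by level (recursive case).
    while True:
        child_ids = [r for e in frontier for r in e.refs
                     if storage.target_type(r) == primary_type]
        if not child_ids:
            break
        frontier = storage.load_multiple(list(dict.fromkeys(child_ids)))
        primary.extend(frontier)

    # Pass 2: group all other references by target type, one load per type.
    by_type = {}
    for entity in primary:
        for ref in entity.refs:
            ref_type = storage.target_type(ref)
            if ref_type != primary_type:
                by_type.setdefault(ref_type, []).append(ref)
    for ref_ids in by_type.values():
        storage.load_multiple(list(dict.fromkeys(ref_ids)))


if __name__ == "__main__":
    top_level = ["container1", "container2"]
    head = CountingStorage(build_entities())
    load_per_field(head, top_level)
    preload = CountingStorage(build_entities())
    load_with_preload(preload, top_level)
    print(head.calls, preload.calls)
```

The model ignores view display configuration and render caching, both of which affect what should actually be loaded.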
In the example page that I'm using for my performance test, I have 14 media entities, which are now all loaded in a single loadMultiple() instead of 14 separate load calls. We have a bunch of fields on media entities, so that cuts about 130 queries. Both steps consider the view display configuration, which means they don't load references that aren't displayed. But picking the right view mode needs some optimization.
At first I thought I might go even deeper. Those 14 media entities also reference 14 file entities, which are still loaded one by one. But that's only one query each (no revisions, no fields), and those media entities, unlike paragraphs, might also be render cached already, so we might load things we don't need. Instead I think I'll pass the ball back to #3548293: Use #lazy_builder / placeholdering for entity rendering. I think we can have a lazy builder entity view formatter that provides the parent info so that we can load and set that. Then those 14 media entities could use that, and the fiber bulk loading could take care of those 14 files.
Comment #4
heddn
I was doing some performance baseline testing with @catch today and yesterday and stumbled across an issue that seems at least partially related to this. Below are my changes. Without them, we prematurely load all ERR referenced entities when ::load() or ::loadMultiple() is called. In our case, we're loading several hundred entities with base fields in a JSON:API feed. What we'd rather have happen is to delay loading the ERR entities until later, at which point we'll load them all in a single ::loadMultiple(). We've got things tuned so that one ::loadMultiple() loads up 500 entities at a time in 1 query. Because of this erring code, that results in 1+500 additional queries for each of the ERR fields, whereas if the loading were delayed until later, we'd only have 2 queries.
I think this is related here, because without it all of this work would be null and void.
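The difference in query counts can be illustrated with a toy Python model. This is not Drupal code; `QueryCounter`, `load_eager` and `load_deferred` are hypothetical names, and the model assumes one ERR field per entity with a single reference each and one query per load call.

```python
class QueryCounter:
    """Stand-in for storage where every load_multiple() call costs one query."""

    def __init__(self):
        self.queries = 0

    def load_multiple(self, ids):
        self.queries += 1
        return {i: {"id": i} for i in ids}


def load_eager(parent_ids, ref_of):
    """Each parent's reference is resolved on its own while parents are loaded."""
    db = QueryCounter()
    parents = db.load_multiple(parent_ids)
    for parent_id in parents:
        db.load_multiple([ref_of[parent_id]])  # one extra query per parent
    return db.queries


def load_deferred(parent_ids, ref_of):
    """References are only collected at first, then loaded in one batch."""
    db = QueryCounter()
    parents = db.load_multiple(parent_ids)
    pending = [ref_of[parent_id] for parent_id in parents]
    db.load_multiple(pending)  # a single query for all references
    return db.queries


if __name__ == "__main__":
    parent_ids = list(range(500))
    ref_of = {pid: f"revision-{pid}" for pid in parent_ids}
    print(load_eager(parent_ids, ref_of), load_deferred(parent_ids, ref_of))
```

With more than one ERR field per entity, the eager cost in this model grows per field, while the deferred variant adds one batch per field.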
Comment #5
heddn
I've opened #3572357: EntityReferenceRevisionsItem prematurely loads referenced entities to track #4.