Problem/Motivation

Optimize entity reference autocomplete widget should load referencable entities as little as possible, instead of each time form element is rendered.

Currently \Drupal\Core\Field\Plugin\Field\FieldWidget\EntityReferenceAutocompleteWidget::formElement is called for each field item, but calls $items->referencedEntities() each time. This method loads referencable entities for all field items.

In cases for example where Entity Reference Revisions are used, referencedEntities actually loads revisions, which do not have a cache! (See #2620980: Add static and persistent caching to ContentEntityStorageBase::loadRevision()), so page load time and database queries increase exponentially.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Issue fork drupal-3095329

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

dpi created an issue. See original summary.

dpi’s picture

StatusFileSize
new42.78 KB
new249.98 KB
new2.07 KB

Attached a patch. I'm not sure this is the right fix, as formElement can be called without calling formMultipleElements beforehand. Making this change shouldnt affect overridden implementations.

Profiling

I've attached a Blackfire sample of node type with a body field, and an ERR field with 128x items. The referenced entity is a different out of the box node type with only body field. Machine is a 13" Macbook, Xdebug and Blackfire enabled, in Docker:

Before takes 57.9seconds to load. After is a 94% reduction in load time, including 58,357 (yes!) less queries:

blackfire

blackfire queries

dpi’s picture

Assigned: dpi » Unassigned
Status: Active » Needs review
dpi’s picture

andypost’s picture

Issue tags: +Performance
larowlan’s picture

Issue tags: +DrupalSouth 2019
nigelcunningham’s picture

Reviewing @DS.

nigelcunningham’s picture

I think a better solution would be to use a drupal_static variable to cache the results of an initial call, ala the patch I'm attaching. You'll also want to drupal_static_reset from places where the referenced list of entities can be saved, so that a form rebuild will display changes made, but I'll defer to more experienced minds before I get carried away implementing a full path. (I'll ping Kim and/or Lee on the Drupal Slack channel to confirm / correct).

ndobromirov’s picture

#2:

Caching entities in a private / protected property could cause cache validity problems, some times even during the same request.
A better alternative is to make it through a dedicated cache service.

If you are aiming for static-like cache - consider the MemoryCache class in a custom service to inject and do the caching there.

You can take examples from the JSON:API module. Note that this is for Drupal 8.6+.


#8:

@NigelCunningham drupal_static should have already been deprecated in Drupal 8, so I think the patch in #8 is a no-go.


Based on the above I will be putting the issue back into needs work.

ndobromirov’s picture

Status: Needs review » Needs work

This is marked as a bug. Do we need tests on this?

dpi’s picture

I should clarify, and my original patch could be reworked to unsetting the value of the property before formMultipleElements exits. The widget shouldn’t be responsible for caching at all. Caching should be handled by mechanisms called by referencedEntities

nigelcunningham’s picture

Thanks @ndobromirov.

I wasn't aware that drupal_static was deprecated; good to know.

@dpi, I certainly see the validity of saying that Entity Reference Revisions should itself cache results at a lower level, but I don't think that in itself means it's invalid to cache the results at the widget level because the code in referencedEntities isn't trivial. Even if the cost of the loadMultiple was constant, without caching you've got n calls to referencedEntities, each involving 2 O(n) loops (unless the fields have no values at the time) and a ksort - still an O(n^2) algorithm.

On the other hand, putting static storage in the form doesn't seem to me to be a good pattern either - I think it makes most sense to put caching in referencedEntities itself, and use the MemoryCache class suggested above.

hchonov’s picture

Status: Needs work » Closed (won't fix)

In cases for example where Entity Reference Revisions are used, referencedEntities actually loads revisions, which do not have a cache! (See #2620980: Add static and persistent caching to ContentEntityStorageBase::loadRevision()), so page load time and database queries increase

If this is going to be solved by #2620980: Add static and persistent caching to ContentEntityStorageBase::loadRevision() then I don't see any reason for the issue. There we have only 3 fails left. Please help us out there in order to make the revision loading more efficient.

Based on that I am closing the issue as "won't fix". Feel free to reopen in case I've missed something or the referenced issue is not sufficient in some way.

dpi’s picture

To be honest I don’t think it matters whether there is caching deeper in the stack. Even with core ER field, there is excessive processing happening. We should fundamentally avoid calling this method more than once if widgets are used in a multivalue context.

hchonov’s picture

Status: Closed (won't fix) » Needs work

We should fundamentally avoid calling this method more than once if widgets are used in a multivalue context.

That is right. I am reopening it for that. So let's change

'#default_value' => isset($referenced_entities[$delta]) ? $referenced_entities[$delta] : NULL,

to

'#default_value' => $items->offsetExists($delta) ? $items->get($delta)->entity : NULL,

nigelcunningham’s picture

Turn @hchonov's suggestion in comment 15 into a patch; thanks Hristo!

dpi’s picture

'#default_value' => $items->offsetExists($delta) ? $items->get($delta)->entity : NULL,

Making this change means the result of referencedEntities is no longer used. There is potentially backwards compatibility issues if we are directly accessing the magic ->entity prop.

Also, looking at the body of referencedEntities, there is other logic happening in there than simply calling ->entity. There are potential behaviour changes here.

hchonov’s picture

There is no contract how the referenced entities will be retrieved. I don't see any BC issues. One might even argue that the ::formElement method should be explicitly working with the item at the corresponding delta. There is no other example where all values of all items are loaded at once like this. I would even dare to say that this method wasn't introduced for form processing.

Also calling ::referencedEntities multiple times without static entity cache is very inefficient. And it will be called multiple times for cardinality >1.

berdir’s picture

> Also calling ::referencedEntities multiple times without static entity cache is very inefficient. And it will be called multiple times for cardinality >1.

It's a little bit more complicated than that. Calling referencedEntities() means the field type has a chance to do a more efficient bulk load, which is indeed more efficient *if* it is statically cached. An assumption that's not true for ERR of course, but as I wrote above, we might be able to make it so without changing core. So using ->entity would be faster than using referencedEntities() now, but I think we can make referencedEntities() perform better than calling ->entity N times.

As for BC, it would in theory be possible to have a field type that implements referencedEntities() but does not have an ->entity property. ER and ERR both do, but some other field type might not.

hchonov’s picture

As for BC, it would in theory be possible to have a field type that implements referencedEntities() but does not have an ->entity property. ER and ERR both do, but some other field type might not.

In theory this is possible. But here we're talking about \Drupal\Core\Field\Plugin\Field\FieldWidget\EntityReferenceAutocompleteWidget, which supports only the core own entity_reference field type. I don't see any contract in our BC policy about the methods that should be used in the implementation when retrieving the referenced entities.

Also if I as a developer expect that every :: formElement works with the delta item, just because every other field widget does that except this one, then I might expect that any alteration I do on the field item entity will be reflected on the widget, but this might not be the case if ::referencedEntities is used.

Yes, one can argue that there might be people extending from \Drupal\Core\Field\Plugin\Field\FieldWidget\EntityReferenceAutocompleteWidget and using entity reference fields without an entity property, but this is not something we should care about. This widget is bound to a specific field type in core and if one is extending from it, then they should be extending from the core own entity_reference field type as well.

Consistency is very important and if we can ensure it without performance boosts then it is better to provide it.

berdir’s picture

> But here we're talking about \Drupal\Core\Field\Plugin\Field\FieldWidget\EntityReferenceAutocompleteWidget, which supports only the core own entity_reference field type

If that were true, then this issue wouldn't exist. This is about the problems with the ERR field type, which has a slow referencedEntities() implementation. For core, using referencedEntities() might even be faster for multiple items because we can pre-load them all at once and then rely on the static cache.

You can define your own field type and make other, existing (core) widgets support it. Yes, it's kinda unlikely for my use case to exist, but still, if you did that in the past and updated core then your field type would be broken.

I'm not saying that's an absolute no-go to do this, but it's something that we need to consider and I still think that an improved referencedEntities() implementation in ERR would solve this too, improve performance in other cases that use that method (even if just called once, we could at least use loadMultiple() on the revisions, possibly do the same try-the-default-revision-first optimization as ->entity uses and avoid loading them twice).

adam3278’s picture

StatusFileSize
new1.54 KB

Mistake

berdir’s picture

#3098924: referencedEntities() causes data loss is what I mean, that should reduce the overhead to a minimum as we'll just access the existing property.

adam3278’s picture

StatusFileSize
new1.54 KB

I've created interdiff for last 2 patches. May it will help for next fixing. (GCI task).

craigmc’s picture

I'm not clear why the widget and form need to do an entity load at all? I understand the autocomplete itself is doing an entity load to pull up ID and title based on the searched phrase, but when rendering the list, or saving the form, loading the entity doesn't seem to add any value and adds a tremendous amount of overhead, when dealing with a larger set of referenced entities.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

quietone’s picture

Category: Bug report » Task
Issue summary: View changes
Issue tags: +Bug Smash Initiative

This was a BugSmash bug bingo. It is an optimization and not a bug. I am changing to a task.

solideogloria’s picture

Status: Needs work » Needs review

The status never got changed to Needs Review after the patch in #16.

(Though it needs to be added as a Merge Request, the changes should still be reviewed and tested.)

smustgrave’s picture

Status: Needs review » Needs work
Issue tags: +Needs issue summary update

Can we turn to an MR and see what the pipeline says :)

Also could use a summary update as the proposed section is empty, what's being done to optimize

meeni_dhobale made their first commit to this issue’s fork.

mayurgajar’s picture

Assigned: Unassigned » mayurgajar

mayurgajar’s picture

Assigned: mayurgajar » Unassigned
Status: Needs work » Needs review
smustgrave’s picture

Status: Needs review » Needs work

Have not reviewed but summary still needs to be updated.

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.