Problem/Motivation

Some of the entities we use with our website have paragraph fields containing a lot of data with some child paragraphs as well. getParagraphWebformsRecursive tries to go through all of them, causing extreme slow down (it can take up to a minute to load if the entity has a lot of data). This entity does not have any webforms in any paragraphs.

Steps to reproduce

Have entities with paragraph fields with a lot of data and child paragraphs that have their own paragraph fields. View the page. On Drupal 8.9.13 and uses Webform 8.x-5.25 and Paragraphs 8.x-1.12.

Proposed resolution

- Add a hook in order that the entity can skip getParagraphWebformsRecursive
- Try to check if the entity has any paragraphs with webforms before doing a recursive search
- Add UI to select the content types that uses getParagraphWebformsRecursive

Issue fork webform-3204456

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

heinvdb.ntt created an issue. See original summary.

heinvdb’s picture

Issue summary: View changes
heinvdb’s picture

Issue summary: View changes
jrockowitz’s picture

We would have to rework \Drupal\webform\WebformEntityReferenceManager::getParagraphWebformsRecursive.

I think adding some caching and better checking could help a lot.

My thoughts about the possible solutions

- Add a hook in order that the entity can skip getParagraphWebformsRecursive

A patch might be the MVP solution, but people will lose functionality.

- Try to check if the entity has any paragraphs with webforms before doing a recursive search

This is really hard to do because paragraphs can contain paragraphs.

- Add UI to select the content types that uses getParagraphWebformsRecursive

We might calculate this information and cache which content types with paragraphs could contain a webform. Once again, this is really hard to do because paragraphs can contain paragraphs.

heinvdb’s picture

"A patch might be the MVP solution, but people will lose functionality"

What if the default is that all the entities use getParagraphWebformsRecursive, unless you specify in the hook that it can be skipped?

heinvdb’s picture

Issue summary: View changes
jrockowitz’s picture

Once you add a hook, you can never remove it.

This a performance issue that should be solvable via caching.

jrockowitz’s picture

Version: 8.x-5.25 » 6.1.x-dev

The attached patch adds caching to \Drupal\webform\WebformEntityReferenceManager::getWebforms.

When paragraphs are being used there is a significant performance boost because the source entity's list of webforms is only calculated once and then the cached data is only cleared when the source entity (or it's paragraphs) are updated.

jrockowitz’s picture

Status: Active » Needs review
StatusFileSize
new3.87 KB

Status: Needs review » Needs work

The last submitted patch, 9: 3204456-8.patch, failed testing. View results

jrockowitz’s picture

Status: Needs work » Needs review
StatusFileSize
new4.81 KB

Status: Needs review » Needs work

The last submitted patch, 11: 3204456-11.patch, failed testing. View results

jrockowitz’s picture

Status: Needs work » Needs review
StatusFileSize
new5.06 KB
jrockowitz’s picture

StatusFileSize
new5.06 KB

angrytoast’s picture

Thanks for the patch. We have a site that uses paragraphs heavily and are seeing performance issues in this area.

I looked over the changes and it seems like there are a couple of improvements that can be made, how do these sound?

1. In ::getFieldNames, instead of caching by the {{entity_type}}-{{entity_id}}, cache by {{entity_type}}-{{bundle}}. Especially for paragraphs, this accounts for multiple instances of a paragraph bundle on a page which is more likely given the reuse nature. As-is, the class cache var $fieldNames ends up looking like:

node-1111 => fields
paragraph-1001 => fields
paragraph-2002 => fields
paragraph-3003 => fields
... etc

Instead it can be:

node-bundle => fields
paragraph-bundle1 => fields
paragraph-bundle2 => fields
... etc

The latter will likely have a better hit ratio for paragraphs.

2. Cache field names returned in ::getParagraphFieldNames by entity type.
This one gets called often with lots of paragraphs. We're actually seeing a lot of time spent here in the field_storage_config look up for paragraph field names. Since the field names are fixed per entity type and unlikely to change, thinking this is useful to cache both at the request level but also in Drupal cache for future requests since these aren't likely to change, and skipping the field_storage_config load helps across multiple page requests.

angrytoast’s picture

Add the patch changes in #16 to the existing MR with a minor change. Modified the field_storage_config entity query lookup in ::getParagraphFieldNames to target paragraph fields and get fields for all entity types at once instead of doing it gradually per type. This caches everything at once in one persistent cache entry and on a per request basis.

https://git.drupalcode.org/project/webform/-/merge_requests/122/diffs?co...

FWIW, we are using this patch on a site that uses paragraphs and webform heavily; we saw uncached page requests with paragraphs+webform drop from average of 1.2s to ~900ms render time. It is a good amount of reduction in time spent in ::getParagraphWebformsRecursive.

berdir’s picture

Status: Needs review » Needs work

Just stumbled over this as well in profiling, makes up about 10% of the time on the one example page I looked at.

The cache as it is does not use cache tags, so it doesn't account for field changes. One option is the use the entity_field_info cache tag to handle that.

That said, what my plan was before I found this issue is to use the field storage definitions from the entity field manager and then check the type and target type setting in PHP, I think that should have minimal overhead as those definitions are very likely then already loaded at that point and a loop over that should be pretty fast.

I'm uncertain if the extra cache on getWebforms() is worth it, do we have any data on that? _Maybe_ if the fields aren't initialized yet, but cache request aren't free.

The constructor BC fallback is also not using the same bin as the service definition.

berdir’s picture

Status: Needs work » Needs review

Pushed an alternative suggestion as a new MR.

We already have the entity object and its definitions, we can loop over them, just like getFieldNames(), there is no need to query/load field definitions separately.

I also did include the static cache change for field names, but not the persistent cache, I'm pretty certain that a cache get is more expensive than a loop over field definition objects. I'm not even certain that the static method is worth it, quite possibly not.

Locally in profiling, this method and its whole call chain no longer appears in blackfire which this change, but it's also worth pointing out that it was _way_ faster locally than the blackfire reports from production in the first place.

I also changed the method to a generator but that could easily be changed back if that's deemed too much of a change.

jrockowitz’s picture

Version: 6.1.x-dev » 6.2.x-dev
jrockowitz’s picture

@Berdir I am very open to your approach. I hope other people can weigh in.

angrytoast’s picture

+1 for @Berdir's approach in #19. Since the entity field definitions are already statically and persistently cached, it's a much simpler way of getting at the necessary field names and removing the repeated + slow field_storage_config loadByProperties call.

With this, the static cache in WebformEntityReferenceManager::getFieldNames doesn't matter as much. Basic local benchmark (our own site, not a standard Drupal install) on a larger page consisting of 68 paragraph references, 5 unique bundles, shows on average ~0.0008s faster with the static cache. I think it's not a meaningful difference and can be removed to reduce code.

Thoughts?

berdir’s picture

Thanks for the review. Fixed the check and removed the static cache.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Just found this profiling a site too - was taking about 600ms, the change reduces it down to a couple of milliseconds so looks great to me.

edit: also saw it taking several seconds on one request, but then couldn't reproduce this easily - maybe when all caches were cold.

Also opened #3537200: Node test form access check runs for anonymous users which will have much less impact if this was committed, but was also the only caller of this method for anonymous users for me and can't see they'd ever need to access that route.

liam morland’s picture

Version: 6.2.x-dev » 6.3.x-dev
Status: Reviewed & tested by the community » Needs work

We will first need this on 6.3.x.

berdir’s picture

Status: Needs work » Reviewed & tested by the community

Retargeted and rebased through the UI.

berdir changed the visibility of the branch 3204456-paragraphs-slow to hidden.

jrockowitz’s picture

This is a very impressive performance enhancement. Thank you for everyone's review and contribution

  • jrockowitz committed 254c265f on 6.3.x authored by berdir
    Issue #3204456 by jrockowitz, angrytoast, berdir, liam morland, heinvdb...
jrockowitz’s picture

Status: Reviewed & tested by the community » Fixed

  • jrockowitz committed 254c265f on 6.x authored by berdir
    Issue #3204456 by jrockowitz, angrytoast, berdir, liam morland, heinvdb...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.