Problem/Motivation

With #3442009: OOP hooks using attributes and event dispatcher and #3396165: [meta] Convert all core plugin types to attribute discovery (and potentially others in the future like autowiring) we're going to be doing a lot of scanning for attributes in core.

In general, each thing-that-looks-for-attributes implements iterating over a load of files, getting the attributes via reflection, then pulling out whatever information it needs, either into the symfony container or into a cache entry.

We won't have reliable profiling numbers on this until lots of conversions are done and possibly not until someone profiles a 150+ module site in the wild, but it's likely to be expensive.

Steps to reproduce

Proposed resolution

I think we can possibly use filecache to mitigate most or all performance overhead from scanning attributes.

In this issue, we're just adding FileCache around hook discovery, possibly later on we could try to make it generic:

We can implement something that:
- given a file
- collects all the attributes from that file at once
- puts them into a structure that we can store in filecache (which is based on the mtime of the file so persists across cache clears)
- allows a single attribute value to be returned from a method, possibly all of them.

If we use this in all/most of the places we are parsing attributes, then files that potentially use more than one type of attribute (like a service that provides a route using route attributes, whenever that lands, as well as a hook implementation), would only be parsed once instead of for each service that needs to check.

This should be easy when Drupal is doing the attribute parsing, it may be more complicated for the container itself, but maybe we can swap that out too.

I don't yet know what the API or data structure might look like, only have the very rough outline above so far.

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

Issue fork drupal-3478621

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

catch created an issue. See original summary.

nicxvan’s picture

Thinking through this a bit, wouldn't the ideal solution be a single file with the cached results rather than still needing to check files?

The cache would just return the attributes with mappings back to the classes / methods needed?

I mean if we need it per file we could cache both maps.

So you might end up with two caches, one with the attributes in a given file, one with files and methods for a given attribute.

catch’s picture

It would be good to be able to cache the file scanning more but I'm not sure we'll be able to do that more than we already do in the object cache. For example we'd have to invalidate it when modules are installed/uninstalled because the potential directories to scan will change.

nicxvan’s picture

So this is more about scanning a given file once for all attributes to prevent scanning a file more than once.

catch’s picture

Re #4 yes but also the individual file scans are able to persist across drush cr - drush cr might change which files we scan, but not the contents of those files. The mtime checking and deployment_identifier keys in filecache should handle file content/logic changes.

catch’s picture

Took a quick look at this in HookCollectorPass only, not trying to do something generic yet (and it might not be possible, or at least it wouldn't help the procedural bc layer).

I got it to the point where a drush cr doesn't explode, not sure if it actually 'works' yet.

Something that is seriously going to affect the utility of this is that the apcu cache is not available to/shared with cli requests.

I think we need a Drupal\Core\FileCache implementation that takes configuration from settings.php and can use a cache backend (but one that is not tagged for cache invalidation), or something like that. Might need a spin-off issue for that, but pushing where I got to.

catch’s picture

catch’s picture

Status: Active » Needs review

Tests are passing.

This might make a difference with the UI installer, but I think we need #3486503: Add a persistent cache for file-based discovery based on FileCache to get the real benefit of this, (and not just for this issue).

nicxvan’s picture

Status: Needs review » Needs work

This looks pretty straightforward. Does this need dedicated test coverage to ensure it's actually using the file cache and purging the file cache properly on module install and uninstall?

Unless I am missing something this is missing the invalidation layer.
I do know this intentionally longer lived beyond drush cr, but there are still situations it needs rebuilding.
I don't see how this file gets deleted when it needs to be deleted:

As far as I can see:

  • during development
  • after running composer install or update ( an update or dev version might have new attributes to find)
  • after install
  • after uninstall
catch’s picture

It shouldn't be necessary to purge on module install and uninstall - the module list determines discovery so any information cached for an uninstalled module will be ignored. We need add the deployment identifier to the cache key though to handle core updates.

During development the file mtime will change and that makes the cache invalid each time anyway.

nicxvan’s picture

Ah yes, forgot about mtime.

During development the file mtime will change and that makes the cache invalid each time anyway.

That's true!

However, I don't follow why we need to add a deployment identifier then?

Also, good news, looks like this won't conflict with #3484754: Ordering of hooks needs some attention since it's all within the collectModuleHookImplementations method.

catch’s picture

However, I don't follow why we need to add a deployment identifier then?

I was thinking if the logic for collecting attributes itself changes. Maybe we don't need it though and could change the prefix directly in that case? It would make the cache work a lot better for sites that use $deployment_identifier without relying on the core version fallback.

nicxvan’s picture

Ah that makes sense! I'll leave it in needs work then for now.

catch’s picture

Status: Needs work » Needs review

Added in the deployment_identifier to the cache key.

nicxvan’s picture

PHPCS failed due to $filename not being used.

I don't want to update it so I can preserve RTBC

nicxvan’s picture

Status: Needs review » Needs work
nicxvan’s picture

We can't use the Settings::get('deployment_identifier') for the file cache since it relies on the actual file to get the mtime.

catch’s picture

Status: Needs work » Needs review

Feels obvious now that you've pointed it out - the file cache depends on the filename and can't use arbitrary cids.

However, when I went to remove it, I realised we can probably add it to the namespace instead - seems like this should work but let's see what the pipeline says.

nicxvan’s picture

Status: Needs review » Reviewed & tested by the community

That is pretty clever!

Tagging this RTBC with the assumption we don't need to test individual cache sets and gets.

If we need that let me know.

This does have a lot of implicit coverage, if the cache wasn't storing it properly everything would break.

catch’s picture

Title: Add a filecache-backed attribute parser » Add filecache to OOP hook attribute parsing
Issue summary: View changes

The original issue title was overly optimistic, maybe we can come up with a generic filecache-backed attribute parser but I think we should do a couple of one-off issues like this before that to understand the requirements better.

longwave’s picture

Status: Reviewed & tested by the community » Needs work

The changes make sense, there appears to be a further optimisation we can do but otherwise this looks good.

catch’s picture

Status: Needs work » Reviewed & tested by the community

Good point - pushed a commit for this, since it's a trivial change, moving back to RTBC.

longwave’s picture

Fixed PHPCS, only rearranging lines and deleting some blank ones.

longwave’s picture

The "validatable config" job is failing on drush cr, unsure if it's directly related to this but looks relevant.

https://git.drupalcode.org/project/drupal/-/jobs/3366605

$ vendor/bin/drush cr --quiet
Error: Call to undefined function \ckeditor5_config_schema_info_alter() in /builds/project/drupal/core/lib/Drupal/Core/Extension/ModuleHandler.php on line 459 #0 /builds/project/drupal/core/lib/Drupal/Core/Plugin/DefaultPluginManager.php(386): Drupal\Core\Extension\ModuleHandler->alter('config_schema_i...', Array)
#1 /builds/project/drupal/core/lib/Drupal/Core/Config/TypedConfigManager.php(406): Drupal\Core\Plugin\DefaultPluginManager->alterDefinitions(Array)
nicxvan’s picture

Oh that looks like we missed a direct call to a hook, I can take a look in a bit.

It could also be in drush or another vendor package.

catch’s picture

I think that is specific to this MR and that job. The 11.x branch tip here was from before OOP hooks so it was getting very confused. I've pushed a rebase of the 11.x branch here and will re-run the job.

catch’s picture

OK it's not that, might be to do with config_inspector but I can't see any direct hook invocations, which leaves 'something weird' or drush.

nicxvan’s picture

I would just require drush locally and search for ckeditor5_config_schema_info_alter. If that doesn't find it I'll investigate further when I'm at my desk in about an hour

catch’s picture

No it is that, I just failed to push the correct branch. Pipeline is green again.

nicxvan’s picture

Oh great! Looks good to me still and you addressed @longwave's comment on the mr.

Rtbc from me again

  • longwave committed f6f705f8 on 11.1.x
    Issue #3478621 by catch, longwave, nicxvan: Add filecache to OOP hook...

  • longwave committed 6aca2acf on 11.x
    Issue #3478621 by catch, longwave, nicxvan: Add filecache to OOP hook...
longwave’s picture

Version: 11.x-dev » 11.1.x-dev
Status: Reviewed & tested by the community » Fixed

Let's squeeze this into 11.1.0-beta1.

Committed and pushed 6aca2acfd19 to 11.x and f6f705f8a29 to 11.1.x. Thanks!

  • longwave committed bcee82d6 on 11.1.x
    Revert "Issue #3478621 by catch, longwave, nicxvan: Add filecache to OOP...

  • longwave committed dafc1fb9 on 11.x
    Revert "Issue #3478621 by catch, longwave, nicxvan: Add filecache to OOP...
longwave’s picture

Status: Fixed » Needs work

Reverted, this broke Experience Builder and will break many other modules that implement hooks on behalf of others.

Experience Builder implements two hooks on behalf of core modules: datetime_range_storage_prop_shape_alter and media_library_storage_prop_shape_alter - this is so the changes only take effect if those modules are installed.

During a test, we first of all install without media_library or datetime_range enabled. The hook discovery runs and these two hook implementations are not cached, because they aren't in the modules list and therefore aren't included in the function name regex. Later in the test, we install these modules, but the hook implementation cache is not cleared; the code believes it's already discovered all the hooks for experience_builder.module, the cached result is used, and the hooks are never found or executed at runtime.

This will presumably affect any module that implements hooks on behalf of other modules, the incomplete set of hooks may be cached and then never cleared. Not sure what the fix is yet, maybe we need to hash the regex as part of the cache key?

nicxvan’s picture

That might work, but can we also just dump it on module install?

We won't see as much benefit in that case as we'd want, but that should solve this case for now right?

catch changed the visibility of the branch 3478621-add-a-filecache-backed to active.

catch’s picture

Not sure what the fix is yet, maybe we need to hash the regex as part of the cache key?

The concept of implementing a hook on behalf of another module doesn't exist with OOP hooks (you can check moduleExists() inside a hook implementation of course still), so we only need to cover this with procedural hooks, so I tried implementing it only for them - seems simple enough but see what the pipeline says. The advantage if we do it this way is we only take the hit on the bc layer and only when the module list changes, so regular drush crs still get the benefit as do converted modules.

nicxvan’s picture

Just a quick note, you can implement on behalf of other modules using #[Hook('toolbar_alter', module: 'module_on_behalf_of')]

You can't implement on behalf of a theme though.

catch’s picture

Ahh #43 is a good point, however I think we're OK here because those implementations will still be discovered by the OOP attribute discovery (just not invoked when hooks are invoked), whereas the problem we have with experience_builder is the procedural implementation not being discovered at all until the module is installed. There's no way to tell which module foo_bar_foo_bar_foo_bar() is implemented on behalf of, until foo, foo_bar or foo_bar_foo module is installed, but the attribute is explicit.

nicxvan’s picture

These changes look good to me actually, can we test this against XB before going through the commit process again?

nicxvan’s picture

I created #3487815: Test Filecache against XB we can apply this there.

catch’s picture

Status: Needs work » Needs review
nicxvan’s picture

longwave’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Given we broke contrib let's add a test to ensure we don't regress again in the future

nicxvan’s picture

I'm taking a crack at this.

nicxvan’s picture

Status: Needs work » Needs review

Ok this is ready for review.

I ran the test only, but that won't fail cause the current code and previous code both work for the situation, but it will prevent the regression. That forced reverting this.

nicxvan’s picture

I added a procedural test too.

catch’s picture

Version: 11.1.x-dev » 11.x-dev
Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs tests +beta target

The new test coverage looks good, marked threads on the MR resolved because I agree with the answers given.

Tagging beta target because this should significantly help mitigate the increased memory usage due to the procedural hook bc layer.

catch’s picture

Did some manual testing of this + #3486503: Add a persistent cache for file-based discovery based on FileCache together.

This is what the file_cache table looks like after running the UI installer with the standard profile:

MariaDB [db]> SELECT COUNT(*) FROM file_cache WHERE cid LIKE '%\:hook_implementations%';
+----------+
| COUNT(*) |
+----------+
|       62 |
+----------+
1 row in set (0.001 sec)
MariaDB [db]> SELECT COUNT(*) FROM file_cache WHERE cid LIKE '%:procedural_hook_implementations%';
+----------+
| COUNT(*) |
+----------+
|     1591 |
+----------+
1 row in set (0.002 sec)

As you can see there are far fewer entries for the non-procedural implementations, this is because we don't have to use the module regexp in the cache key so they persist across module installs, whereas procedural implementations are discovered each time to allow for implementing hooks from other modules.

I pushed one additional commit:

1. Gave the procedural hook implementation items their own namespace, this made it easier to run the queries above to differentiate the prevalence of the two. Due to the module preg sharing the namespace was fine, but nice to see what things look like.

2. Removed the deployment identifier from the cache key - it's already in apcu_prefix so this would mean it's in the cid twice.

Leaving RTBC because these are very small changes.

nicxvan’s picture

Status: Reviewed & tested by the community » Needs review

Keeping an eye on tests.

alexpott’s picture

Status: Needs review » Reviewed & tested by the community

My review was minor code-style nits - no reason to not leave rtbc.

  • alexpott committed 6d78e0cf on 11.1.x
    Issue #3478621 by catch, nicxvan, longwave: Add filecache to OOP hook...

  • alexpott committed 59598e57 on 11.x
    Issue #3478621 by catch, nicxvan, longwave: Add filecache to OOP hook...
alexpott’s picture

Version: 11.x-dev » 11.1.x-dev
Status: Reviewed & tested by the community » Fixed

Committed and pushed 59598e57ab3 to 11.x and 6d78e0cf06e to 11.1.x. Thanks!

  • alexpott committed 89c81430 on 11.1.x
    Revert "Issue #3478621 by catch, nicxvan, longwave: Add filecache to OOP...

  • alexpott committed 28c7686a on 11.x
    Revert "Issue #3478621 by catch, nicxvan, longwave: Add filecache to OOP...
alexpott’s picture

Status: Fixed » Needs work

This broke core/tests/Drupal/FunctionalTests/Installer/DistributionProfileExistingSettingsTest.php https://git.drupalcode.org/project/drupal/-/pipelines/351677/test_report...

alexpott’s picture

This is because combined with #3488176: Convert system_theme to OOP and handle install time call we break core/tests/Drupal/FunctionalTests/Installer/DistributionProfileExistingSettingsTest.php

I've rebased the MR on top of 11.x head so we'll see the failure here.

alexpott’s picture

For any follow-up that tries to fix this - the hook collector pass that runs before system classes are autoloadable is triggered by install_begin_request() from

  // Override the module list with a minimal set of modules.
  $module_handler = \Drupal::moduleHandler();
  if (!$module_handler->moduleExists('system')) {
    $module_handler->addModule('system', 'core/modules/system');
  }
catch’s picture

Status: Needs work » Needs review
nicxvan’s picture

Status: Needs review » Reviewed & tested by the community

Looks good now.

andypost’s picture

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed c79a622227e to 11.x and e01542fd879 to 11.1.x. Thanks!

  • alexpott committed e01542fd on 11.1.x
    Issue #3478621 by catch, nicxvan, longwave, alexpott: Add filecache to...

  • alexpott committed c79a6222 on 11.x
    Issue #3478621 by catch, nicxvan, longwave, alexpott: Add filecache to...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

bkosborne’s picture

This causes a huge increase in APCu usage for multi-site setups that use lots of modules. I created #3566156: APCu FileCache can consume a ton of memory due to caching procedural hook data since Drupal 11.1 to address it.