Problem/Motivation

Reflecting every class and method defined in a Drupal install is both a performance and memory nightmare.

Steps to reproduce

Proposed resolution

There are two solutions:

  1. Move nikic/php-parser to runtime, use it on classes, it has a constant expression evaluate feature and attributes are using such. This is going to a lot of fun especially cross class constants. For performance, this solution would use #3486503: Add a persistent cache for file-based discovery based on FileCache. The drawback is an awful lot of effort would need to go into this.
  2. Swallow the bitter pill and ship core/modules with attributes dumped. The drawback and it's a major one is development becomes harder as something needs to re-dump attributes on file change, manually or automatically. Of course many projects have went down this route, most famously perhaps sass --watch and cargo watch and they lived to tell the tale. https://github.com/olvlvl/composer-attribute-collector/ is a good start but it likely would need to be changed to fork for every class -- or project at least -- otherwise the memory issue is back. Or perhaps composer using all the memory in the world is fine?

Remaining tasks

Decide whether to do #1 or #2 and how.

User interface changes

API changes

Data model changes

Release notes snippet

Comments

duadua created an issue. See original summary.

catch’s picture

We (me, @andypost, @donquixote - apologies if I missed anyone several discussions have merged into one in my memory). had a discussion about DrupalCon Lille about this.

One idea was to create an 'attribute registry' - i.e. to essentially parse out all attributes from a file at once and store them somewhere in a format.

That way if a single file contains (or might contain) multiple attributes types that all need to be discovered, we would only check that file once instead of once-per attribute type.

However it would be complex to implement so we thought better to see how we're doing after conversions.

Ideas like checking for #( worth doing first too if they're effective.

mstrelan’s picture

ghost of drupal past’s picture

Move nikic/php-parser up from dev dependency to normal and run something like https://gist.github.com/chx/9f289ffd541701228ba98243dd7eaff7

ghost of drupal past’s picture

@mstrelan it uses reflection https://github.com/olvlvl/composer-attribute-collector/blob/a3d45a2e9bab... it just does it when the file gets installed by composer, it would be a very different development workflow compared to what we have now.

ghost of drupal past’s picture

Issue summary: View changes
ghost of drupal past’s picture

Issue summary: View changes
ghost of drupal past’s picture

Issue summary: View changes
ghost of drupal past’s picture

Issue summary: View changes
catch’s picture

This might be less necessary with #3478621: Add filecache to OOP hook attribute parsing which will persist attribute parsing between full cache clears (but not yet between deployments, but see related issues).

godotislate’s picture

One thing to consider besides performance issues, a problem that's come up in #3421014: Convert MigrateSource plugin discovery to attributes and #3458177: Changing plugins from annotations to attributes in contrib leads to error if plugin extends from a missing dependency is that attributes can't be read via Reflection on classes that implement missing interfaces, extend missing classes, or use missing traits, in the case that the missing dependency comes from an uninstalled module. While Reflection will throw an exception for a missing interface or class, PHP will fatal on a missing trait. The exception thrown during plugin discovery was handled in #3458177: Changing plugins from annotations to attributes in contrib leads to error if plugin extends from a missing dependency, but a solution for handling missing traits gracefully is still outstanding.

Using something like nikic/php-parser to retrieve attribute values without might be a viable way forward to retrieve at least some attribute values in the case of missing dependencies.

catch’s picture

#12 is interesting. And a possible counterpoint to #11 is that the caching rather than cancelling this issue out, gives us more options to potentially adopt a slower but more memory-efficient attribute parser.

godotislate’s picture

Created #3490322: Use nickic/php-parser for attribute-based plugin discovery to avoid errors to explore using nikic/php-parser to help with plugin discovery per #12 and put up an MR.
Hat tip to @ghost of drupal past for the library suggestion and gist.

godotislate’s picture

Surfacing a comment I made in #3490322: Use nickic/php-parser for attribute-based plugin discovery to avoid errors about missing dependencies that I think warrants mention here outside the specific context of plugin discovery:

That being said, I think the scope of the missing dependencies issue will expand as attribute-based discovery expands to routes (#3311365: Use PHP attributes for route discovery), and services (#3422359: Directory based automatic service creation, #3376163: Support #[AsEventListener] attribute on classes and methods, #3481903: Support hooks (Hook attribute) in any registered service), etc. It might actually make sense to provide some sort of missing-dependency-safe wrapper for Reflection to address all these use cases.

ghost of drupal past’s picture

Note if either autowire and autoconfigure is enabled for a services.yml then every class defined as a service will be loaded and reflected during rebuild by Symfony. core.services.yml already has autoconfigure. I think this issue is completely moot in light of that.

For hooks, a performance improvement would be to pass down the container as far as getHookAttributesInClass and use $container->getReflectionClass() to save on reflecting the classes twice -- ContainerBuilder keeps the reflections in a property.

Btw the same is true for TwigExtensionPass it's just even simpler I will include it because it shows what I am talking about:

      $class_name = $container->getDefinition($service_id)->getClass();
      $reflection = new \ReflectionClass($class_name);

change to

      $class_name = $container->getDefinition($service_id)->getClass();
      $reflection = $container->getReflectionClass($class_name);
godotislate’s picture

Created #3510574: Use container builder to track and reuse reflection classes in compiler passes to reuse reflection classes in the container builder per #16. Changes were small and straightforward, so MR is ready.

donquixote’s picture

I suspect that the main cost of reflection is loading and parsing the class file. And of course scanning the filesystem.
After that, I suspect we can call "new ReflectionClass" and "$reflector->getAttributes()" repeatedly without much extra cost.
So we should be really careful about the cost/benefit.

I tried to see what kind of caching symfony does.
I see a runtime cache (in a variable), I don't see anything persistent.
Besides, that symfony code looks quite scary to me, like it does a lot of stuff we don't need.

Maybe the benefit depends on how many times we want attributes from the same class for different purposes.

Regarding a parser instead of php:
In the past I created https://github.com/donquixote/quick-attributes-parser/
It is definitely faster than nikic php-parser, at least it was at the time.
But I still would not recommend this for performance gains. I only did this because I wanted to support attributes in older php versions.

EDIT: I corrected the url, thanks.

nicxvan’s picture

Here is the link to the repo: https://github.com/donquixote/quick-attributes-parser/ the one in 18 404s for me.

catch’s picture

I tried to see what kind of caching symfony does.
I see a runtime cache (in a variable), I don't see anything persistent.

Yes, but Symfony by default compiles the container to PHP in a build step, which is then pushed to production, whereas because of modules installation vs uninstallation, Drupal has to build its container dynamically on production. This has been the disconnect between Drupal and the Symfony container since it was adopted.

mstrelan’s picture

Re #20: Since we have package manger in core doing composer stuff I'm wondering if there is a path forward where we have a dev mode that uses package manager to dump the container and prod uses a read only container that folks would have compiled in a build step. Not sure the logistics of how this would work.

catch’s picture

The biggest blocker is the disconnect between the compiled container before or after a module install/uninstall when that happens as part of a config deployment or update.

meanderix’s picture

This is definitely a major performance issue with regards to our build/install procedures.

Here are some numbers (we're installing 216 modules):

  • Drupal 10 installation: 7m 50s
  • Drupal 11 installation: 20m 29s
  • Drupal 11 installation, with gc_collect_cycles() commented out in AttributeClassDiscovery::getDefinitions(): 13m 1s

It must be possible to make some improvements here. As far as I can tell, each module install causes multiple invocations through DefaultPlugin::getDefinitions(), where the same kind of operations are repeated over and over again (i.e. scanning for attributes in all module folders).

nicxvan’s picture

@meanderix are you testing in 11.0, 11.2, or 11.x?

There were huge improvements to the module install process for 11.2

catch’s picture

meanderix’s picture

@nicxvan I was testing with 11.1.8, but today I will test with 11.2.x (I'll add that to the report).

meanderix’s picture

@nicxvan With 11.2.4 the installation time drops to about 5 minutes. I did not expect that. This is even quicker than Drupal 10. Thank you so much for all of your work.

catch’s picture

Version: 11.x-dev » main

Drupal core is now using the main branch as the primary development branch. New developments and disruptive changes should now be targeted to the main branch.

Read more in the announcement.