Problem/Motivation

By @pwolanin:

In the block manager and other places in core we have to use truly horrible code like this to get the human-readable name of a module:


function getModuleName($module) {
    // Gather module data.
    if (!isset($this->moduleData)) {
      $this->moduleData = system_get_info('module');
    }
    // If the module exists, return its human-readable name.
    if (isset($this->moduleData[$module])) {
      return $this->t($this->moduleData[$module]['name']);
    }
    // Otherwise, return the machine name.
    return $module;
  }

Why is that so horrible? Well it requires calling out to a procedural function AND system_get_info() calls system_rebuild_module_data() and none of this is cached even statically.

This was also pretty bad in Drupal 7, but in the worst case you could get the module info out of the database. In 8, none of that seems to be available.

By @Berdir:

system_get_info() calls right into system_rebuild_module_data(), which parses all .info.yml files. And that's slow, even with the apcu file cache that I have applied already. Without it, it's much worse.

We used to have some pretty complicated and arcane caching there, but we removed it, as we no longer needed that function during bootstrap.

However, we still call it on some places, like /admin/people, to group the permission list by module. On a default D8 installation, 50%/200ms is spent in that function on that page.

Proposed resolution

Cache the module data into the cache backend as well as statically in system_get_info() and clean the cache when we do a system_rebuild_module_data(). Stop loading system_rebuild_module_data()/system_get_info() if we can use the ModuleHandler::getName() method instead.

Remaining tasks

N/A

User interface changes

N/A

API changes

Not really an API change but in the current version of the patch ModuleHandler::getName() now returns the module name as it had been input in case it can't find the actual name, instead of giving an error.

Beta phase evaluation

Reference: https://www.drupal.org/core/beta-changes
Issue category Bug because significant slowdown on certain pages
Issue priority Major because significant slowdown
Prioritized changes The main goal of this issue is performance
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

YesCT’s picture

linking to the homestretch as the parent.

dawehner’s picture

To be honest this does not really belongs to the module handler ... the module handler not cares about the meta information
about modules ... on the other hand I don't know a really good service this belongs to

Related issue where similar issues got discussed: #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList, #2228093: Modernize theme initialization

pwolanin’s picture

stil todo I guess

Wim Leers’s picture

stefan.r’s picture

Assigned: Unassigned » stefan.r

I'll try to give this a go over the next days :)

pwolanin’s picture

In the dup we had - the following suggestion, but I'm not sure that's sufficient? Or is the human-readable name (& description?) the only thing we are looking for often (vs. e.g. requirements, dependencies, etc)

Proposed resolution

We already have the module list with filename and pathname in the container. Won't hurt much to put the module name in there as well, and then we can offer a method on ModuleHandler that returns it?

Berdir’s picture

Category: Task » Bug report
Priority: Normal » Major
Issue tags: +Performance

Fine with closing the other as duplicate but then let's use the same category/priority/tag as that one had, a single function taking 50% of the wall time is a major performance bug IMHO

dawehner’s picture

I honestly think that we should keep the container as slim as possible and adding the module name for all of them for a really minor usecase is not a good thing IMHO.

Instead we should expose those meta information about all extensions as its own thing, if you ask me, see #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList

pwolanin’s picture

Talking to dawehner, there are 2 places roughly where we cache overlapping info

core/lib/Drupal/Core/DrupalKernel.php line 952 calls:

$container->setParameter('container.modules', $this->getModulesParameter());

Where is where the %container.modules% data comes form that's set on the module handler. Sadly, in that location it seems not to be parsing the .info file, so we don't have the human-readable name.

In addition, in core/modules/system/system.module line 1004:

\Drupal::state()->set('system.module.files', $files);

core/lib/Drupal/Core/Extension/ThemeHandler.php line 422:

$this->state->set('system.theme.data', $this->list);

The use of state for these seems a little weird. Possibly that should be replaced or wrapped by implementing the service described in #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList

Berdir’s picture

Having the extension handler would be great, but I think I would prefer a smaller fix that is easier to achieve as a first step.

I just noticed that ModuleHandler *already has* the API to get the module name: ModuleHandlerInterface::getName(). And it calls out to system_rebuild_module_data().

Just finding a better way to implement this and then switch to use that wherever we just need the name would already improve a lot of cases where system_get_info() is used.

Berdir’s picture

Discussed this with @dawehner, and managed to convince him that this is a useful first step. This takes 1.8s on my production site every time it is called, IMHO, that more than qualifies some sub-optimal small and local performance improvements instead of waiting for a full refactoring of the whole extension discovery/info system.

So..

- Find a way to make getName() faster. Whether we call out to state(), or pass in the module name as part of the existing data structure, I don't care so much. A few extra calls to state() are fine with me if we want to avoid having to dump this in the container.
- Check all calls to system_get_info(), update them to use this API when possible (when it just needs the name)
- Add a @todo for #1744302: [meta] Resolve known performance regressions in Drupal 8.

Also switching the meta issue.

Berdir’s picture

stefan.r’s picture

Status: Active » Needs review
FileSize
19.78 KB

This patch simply saves the module info into state with key system.module.data while we're rebuilding module info anyway in system_rebuild_module_data().

It also adds a getInfo($module) method on the ModuleHandler which does exactly what system_get_info('module', $module) used to do, except it gets the module info from state. This method is now used by getName() as well.

It also adds a fallback to the machine name of the module in ModuleHandler::getName() in case the name can't be found, as @Berdir had already implemented in MenuLinkDefaultForm::getModuleName().

It seems there are already calls to system_rebuild_module_data() everywhere where state would need to be reset.

Status: Needs review » Needs work

The last submitted patch, 13: 2281989-13.patch, failed testing.

dawehner’s picture

Given that this is just a temporary solution until #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList is done, it seems more sane to not add getInfo() itself to the interface.

stefan.r’s picture

Status: Needs work » Needs review
FileSize
5.23 KB

Status: Needs review » Needs work

The last submitted patch, 16: 2281989-16.patch, failed testing.

stefan.r’s picture

Title: Add a fast and simple way to get module name and other info from the module handler » Add a fast and simple way to get module name from the module handler
Status: Needs work » Needs review
FileSize
2.33 KB

@daweher I think we can change the issue title then? :)

stefan.r’s picture

FileSize
5.23 KB

Disregard #18, that was a patch for another issue...

stefan.r’s picture

FileSize
14.62 KB
8.8 KB

This replaces another few calls to system_get_info() by calls to getName()

Status: Needs review » Needs work

The last submitted patch, 20: 2281989-20.patch, failed testing.

stefan.r’s picture

Status: Needs work » Needs review
FileSize
14.62 KB
Berdir’s picture

+++ b/core/lib/Drupal/Core/Extension/ModuleHandler.php
@@ -711,7 +711,7 @@ public function getModuleDirectories() {
   public function getName($module) {
     $module_data = system_rebuild_module_data();
-    return $module_data[$module]->info['name'];
+    return isset($module_data[$module]->info['name']) ? $module_data[$module]->info['name'] : $module;

This still calls directly into system_rebuild_module_data(), so that will still trigger the rebuild every time?

Status: Needs review » Needs work

The last submitted patch, 22: 2281989-21.patch, failed testing.

stefan.r’s picture

Yes that should be calling system_get_info().

stefan.r’s picture

Status: Needs work » Needs review
FileSize
2.92 KB
16.17 KB

Can we get rid of system_get_module_admin_tasks() still? It is only being called from AdminController and HelpController, it needs an info file in the second argument only to take the name out.

Berdir’s picture

Not sure. We could offer a new version, for example as a trait that both controllers can use and deprecate it for removal in 9.0. And if the core committers are OK with removing it, then we can still do that too, would be easy then.

stefan.r’s picture

Cool, we do so in this issue or in a followup?

dawehner’s picture

Thank you for keeping the amount of changes limited.

Can we get rid of system_get_module_admin_tasks() still? It is only being called from AdminController and HelpController, it needs an info file in the second argument only to take the name out.

That sounds certainly like follow up work, but looking at that method, extracting that into somehow its own thing seems doable.

stefan.r’s picture

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Looks ready for me. I hope we can get done with proper reorganization of all that various mess around.

stefan.r’s picture

Issue summary: View changes

Updated issue summary

stefan.r’s picture

Assigned: stefan.r » Unassigned
catch’s picture

Status: Reviewed & tested by the community » Needs work

Why state and not just a cache entry?

State should really be used for information that can't be rebuilt if it goes missing, which isn't the case here.

Berdir’s picture

Issue tags: +needs profiling

Sorry, but this doesn't quite work yet :)

Preparing a longer response, just want to make sure this is not committed yet.

Ah, @catch already commented.

Berdir’s picture

admin/people is now actually slower than before. This because we call PermissionHandler::getPermissions(), which calls sortPermissions(), which calls ::getModuleNames(), which calls systemRebuildModuleData(), which just a wrapper for system_rebuild_module_data().

So that means we do both now. We rebuild and we load and unserialize the data from state. And it's a lot of data to unserialize.

Which shows the flaw the current approach. cache get/set must happen in the same function. Otherwise direct calls to the other function, which we still have a lot, constantly re-write that data, and that's not cheap.

Doing it in system_get_info() means we can optimize it to only cache enabled modules. That can make quite a difference.

I was even thinking about doing the cache/state for just the module names and not the whole info. There's hardly a case where we need anything but the name for any page were performance matters? Then we'd do that directly in ModuleHandler imho, load from cache on the first call and then return from there. And don't touch system_get_info().

So, before we can RTBC this again, we definitely need to do profiling to make sure were actually improving the situation :)

stefan.r’s picture

getModuleNames should actually be calling system_get_info(). No need to rebuild.

stefan.r’s picture

We actually don't have that much direct calls to system_rebuild_module_data, it's only the one you pointed out in #36 that seems misplaced, the other ones actually need to trigger a cache reset (on update, on module install/uninstall etc)

I could also see the point of having other data than just the name available (ie. configure page, description), but when we get the patch right we can profile whether there's a difference between storing the name vs the full processed info.

stefan.r’s picture

Status: Needs work » Needs review
FileSize
16.86 KB
1.81 KB

This should solve the gist of the performance issue, anything beyond this we can micro-optimize if needed. The info files by themselves are not that big/complicated so I can't imagine the unserialization to be that expensive. But we can profile whether making it smaller makes a difference.

@Berdir when you say "Otherwise direct calls to the other function, which we still have a lot, constantly re-write that data, and that's not cheap.", you are referring to calls to system_rebuild_module_data()? I really don't see that many of them (where we should be using cached data).

Status: Needs review » Needs work

The last submitted patch, 39: 2281989-39.patch, failed testing.

stefan.r’s picture

Issue summary: View changes
Status: Needs work » Needs review
FileSize
591 bytes
16.91 KB
stefan.r’s picture

It's a 150ms improvement here but there seems to be some room for shaving off a few ms, it's still spending too much time on unserializing/cache_gets when it only needs a couple of names.

stefan.r’s picture

FileSize
17.47 KB
1.81 KB

stefan.r’s picture

FileSize
17.39 KB

Can we profile this to see how much better this is and whether we need to optimize this further?

The last submitted patch, 43: 2281989-43.patch, failed testing.

stefan.r’s picture

FileSize
17.45 KB
2.66 KB
stefan.r’s picture

FileSize
17.57 KB
2.78 KB

stefan.r’s picture

FileSize
1.43 KB
17.53 KB

The last submitted patch, 47: 2281989-47.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 48: 2281989-48.patch, failed testing.

stefan.r’s picture

Status: Needs work » Needs review
FileSize
2.91 KB
17.45 KB

So I have looked at this in xhprof and on admin/people the execution time for system_get_info() goes from taking up 21% of the load time to down to about a millisecond.

The last submitted patch, 51: interdiff-26-51.patch, failed testing.

dawehner’s picture

I really like the patch in general because it outlines where we really want to rebuild the module data and where we just use it to
gain some information which is out in the system.

#2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList will introduce some proper listing service to be able to retrieve the information

  1. +++ b/core/modules/system/system.module
    @@ -854,10 +854,27 @@ function system_check_directory($form_element, FormStateInterface $form_state) {
    +    static $drupal_static_fast;
    +    if (!isset($drupal_static_fast)) {
    +      $drupal_static_fast = &drupal_static(__FUNCTION__);
    +    }
    

    I'm curious, do we call system_get_info() so often that its worth to use the faster drupal_static pattern?

  2. +++ b/core/modules/user/src/Plugin/views/filter/Permissions.php
    @@ -30,6 +30,13 @@ class Permissions extends ManyToOne {
    +   *
    +   * @var \Drupal\user\ModuleHandlerInterface
    +   */
    

    Nitpick: The ModuleHandler has a different namespace

stefan.r’s picture

FileSize
455 bytes
17.46 KB

On admin/people we get the full list of modules several times and then get the name of each one separately from system_get_info(), so we end up doing >100 calls to that function. In any case I assume that static will be turned into a protected property anyway as soon as we implement that listing service.

stefan.r’s picture

Issue summary: View changes
stefan.r’s picture

Issue summary: View changes
Issue tags: -needs profiling
FileSize
103.78 KB
51.15 KB
92.01 KB

Berdir’s picture

Status: Needs review » Reviewed & tested by the community

Strange, what page did you test this on, admin/people? Do you have a lot of users there? What is taking a lot of time? Make sure that you don't have xdebug and xhprof/uprofiler enabled at the same time. I can see that you're not using the CPU flag, that's also a big overhead.

I'm asking because the difference for system_rebuild_module_data() is about the same for me... but the total wall time is 367ms before and 199ms after, making this a 45% improvement and 108k instead of 260k function calls, a 60% improvement. I don't have any users, though. but if that would be the reason, I'd expect to see more function calls in your run.

Anyway, this looks good to me. Great to see this resolved.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/modules/system/system.module
    @@ -854,10 +854,27 @@ function system_check_directory($form_element, FormStateInterface $form_state) {
    +        // Store the module info in cache. We rely on consecutive calls to
    +        // system_rebuild_module_data() to reset this key, such as when
    +        // listing modules, (un)installating modules, importing configuration,
    +        // updating the site and when flushing all caches.
    

    I don't understand why we need the word consecutive. Could this comment be:

    Store the module information in cache. This cache is cleared by calling system_rebuild_module_data(), for example, when listing modules, (un)installing modules, importing configuration, updating the site and when flushing all the caches.
    
  2. +++ b/core/modules/system/system.module
    @@ -999,9 +1016,12 @@ function system_rebuild_module_data() {
    +    // Store filenames to allow drupal_get_filename() to
         // retrieve them without having to rebuild or scan the filesystem.
    

    Comment needs re-flowing.

stefan.r’s picture

Status: Needs work » Needs review
FileSize
17.53 KB
1.42 KB

Yes the revised comment makes more sense. "consecutive" was from when that comment was elsewhere in a previous version of this patch.

pwolanin’s picture

Do we need the extra code complexity of using this?

+    $info = &$drupal_static_fast;
stefan.r’s picture

You mean do we need to use the drupal_static_fast pattern at all? Or that the way it's implemented is unclear?

It prevents over 100 calls to drupal_static on select pages but it's not thousands either, so it's just shaving off a couple hundred microseconds -- if we rather have clearer code I am happy to take it out?

pwolanin’s picture

Yes, I mean do we need to use the drupal_static_fast pattern at all?

In general, I'd look to use that only in things used heavily every page. I think the speedup here for admin pages will already be significant, so simpler code is better.

stefan.r’s picture

Issue tags: +drupaldevdays
FileSize
877 bytes
17.62 KB

Yes, we actually don't really need it, it's likely not even a millisecond difference...

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +D8 Accelerate Dev Days

Back to RTBC since it was RTBC and fixed only the recent nitpicks :)

pwolanin’s picture

Are we worried about adding a new/heavy dependency to Permissions, FieldPluginBase, etc?

  • effulgentsia committed cbc0316 on 8.0.x
    Issue #2281989 by stefan.r: Add a fast and simple way to get module name...
effulgentsia’s picture

Status: Reviewed & tested by the community » Fixed

This looks like a step in the right direction, so committed and pushed to 8.0.x. Thanks!

Some thoughts for optional follow-ups if anyone's inspired to pursue them:

  1. +++ b/core/modules/system/system.module
    @@ -852,16 +852,29 @@ function system_check_directory($form_element, FormStateInterface $form_state) {
    +      if ($cache = \Drupal::cache()->get('system.module.info')) {
    

    I think this is a good candidate for the 'discovery' bin rather than the default bin. That would let it benefit from APCu.

  2. +++ b/core/modules/system/system.module
    @@ -852,16 +852,29 @@ function system_check_directory($form_element, FormStateInterface $form_state) {
    +        foreach (\Drupal::moduleHandler()->getModuleList() as $module => $filename) {
    

    Sad that there's a little circularity here: the module handler calls this function, which calls back to the module handler. Maybe that can be improved in #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList.

  3. +++ b/core/modules/help/src/Controller/HelpController.php
    @@ -131,7 +129,7 @@ public function helpPage($name) {
    -      $admin_tasks = system_get_module_admin_tasks($name, $info[$name]);
    +      $admin_tasks = system_get_module_admin_tasks($name, system_get_info('module', $name));
    

    Shame that we need a system_get_info() here, but #2466933: Change $info array argument to system_get_module_admin_tasks() to $name can help with that.

  4. Are we worried about adding a new/heavy dependency to Permissions, FieldPluginBase, etc?

    I don't think it's a problem for the plugins changed by this patch. I think it would be a problem if we needed it on FieldPluginBase, but I don't see why we do.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.