Follow-up to: #1808200: Unit test performance significantly decreased since system list config conversion

@@ -5212,7 +5219,14 @@ function drupal_system_listing($mask, $directory, $key = 'name', $min_depth = 1)
+    $files_to_add = file_scan_directory($dir, $mask, array(
+      'key' => $key,
+      'min_depth' => $min_depth,
+      // Do not recurse into ./lib directories; they cannot contain extensions.
+      // We also skip templates, css, and js directories.
+      // @todo Find a way to skip ./config directories (but not modules/config).
+      'nomask' => '/^(CVS|lib|templates|css|js)$/',
+    ));

We probably need a funky look-behind PCRE here that matches $name but not $name/$name.

The solution could actually be applied to all of the new nomask values; i.e., lib, templates, css, js. That would allow a "js" module to still exist.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Crell’s picture

Easier answer: mv modules/config modules/configuration. kthxbye.

Berdir’s picture

That sounds easy enough and fine to me.

As asked in the other issue, do we need to document this limitation somewhere? As mentioned before, http://drupal.org/project/js, http://drupal.org/project/css and http://drupal.org/project/templates actually exist, although eauch of them only with 6.x or older versions.

sun’s picture

I was mistaken, a PCRE with a look-behind is not possible, because file_scan_directory() applies 'nomask' to the filename within a directory only. Thus, the PCRE cannot match the path.

I took the chance to investigate whether a RecursiveDirectoryIterator with a custom recursive iterator filter would be of any help here, but unfortunately, only mirroring the current behavior of file_scan_directory() for this special drupal_system_listing() case (without fix for this issue) lead to significantly lower numbers in only 10 iterations already:

$ php bench.drupal.system-listing.php
ref: refs/heads/8.x
Peak memory before test: 2,926.88 KB
Iterations: 10

nothing:                                  0 seconds
function no_op():                    0.0001 seconds
drupal_system_listing():             2.5156 seconds
SystemListRecursiveFilterIterator:   2.8038 seconds

Peak memory after test: 3,167.72 KB
Memory difference: +240.84 KB

Why oh why is it that everything OO is significantly slower than procedural code in PHP.

Again, not even covering the fix for this issue. Just a bare replacement of the needed file_scan_directory() options with a custom SystemListRecursiveFilterIterator extends RecursiveFilterIterator filter class applied to a RecursiveDirectoryIterator.

(Took me some hours to figure out how those recursive iterators work, and how to code them in a performant way — apparently, Symfony's Finder component does not even use recursive iterators, which explains why its performance sucks badly; I've posted numbers to the corresponding Finder issue.)

That said, after figuring that much out, I realized that there might not be a fix that can be applied here.

I originally assumed that we only want to match $filename if it's not $filename/$filename. But that is completely wrong, since we would skip a bare $filename with that already.

So in the concrete example of 'config', it happens to be that Config module currently does not have a ./config directory itself. And that's actually irrelevant. :)

We always want to include 'config' (because of 'core/modules/config'), but we cannot differentiate that from 'config'. ;)


Oy. Which leads me to the idea that we only want to skip those known standard subdirectories, if we are in a directory that contains a filename, which matched.

sun’s picture

we only want to skip those known standard subdirectories, if we are in a directory that contains a filename extension, which matched.

So I managed to implement this, and this screenshot shows what we're after. :)

system-list-skip-subdirs.png

Unfortunately, I implemented it in said SystemListRecursiveFilterIterator, which is 2-4 times slower than the current file_scan_directory() (somehow my numbers varied a lot in later tests and I didn't get the smaller difference from #3 anymore), and I'm afraid, drupal_system_listing() is called too often (especially when running tests) to accept any kind of slowdown there — regardless of how nice classes are.

However, we can certainly re-implement this in private helper function _drupal_system_listing().

That said, it might make sense to merge this issue into #1831688: Simplify drupal_system_listing() parameters

sun’s picture

If anyone wants to confirm my numbers, here's my bench script — plug it into your Drupal root + execute it:

php bench.drupal.system-listing.php
yched’s picture

Renaming config.module to configuration.module, as proposed in #1, also seemed to me like the easy way out ?
Technically speaking, config_ui.module would even be more accurate. breakpoint.module was briefly shipped with a dependency on config.module in its ifo file (got fixed later), so I guess it's likely that a similar confusion happens in contrib...

Crell’s picture

config_ui.module sounds fine to me. And avoids any funny stuff with regex.

sun’s picture

Just to make sure that you guys have read #2:

http://drupal.org/project/js
http://drupal.org/project/css
http://drupal.org/project/templates

These projects actually exist. And it's only by coincidence that I named Libraries module 'libraries' and not http://drupal.org/project/lib

I also think that this list will only increase in the future — e.g., consider ./layouts, ./vendor, etc.pp.

Furthermore:

Meanwhile I figured that the selective approach for skipping certain directories within an extension directory would have a very very interesting aspect we could advance to later on:

As you can already see from the limited fragment in the screenshot, we are also scanning all ./tests directories.

Now:

If this would turn into reality: #1537198: Add a Production/Development Toggle To Core

...then we could additionally skip all of those ./tests directories in a production environment.

Doing so would have a significant impact on the total filesystem scan time. And I don't see any reason for why anyone would have any reason for seeing or enabling a test module in a production environment. That entire information can be completely hidden from the system in a production environment (but only there).

In terms of overall system performance, that sounds like an extremely good idea to me.

sun’s picture

Status: Active » Needs review
FileSize
6.81 KB
  1. Added Drupal\Core\System\SystemListRecursiveFilterIterator.
  2. Converted drupal_system_listing() to RecursiveDirectoryIterator with optional $procedural flag for profiling.
sun’s picture

Forgot to mention — with this patch, the installer becomes twice or thrice times slower for me. (FWIW, I'm on Windows.)

As mentioned earlier in here, as well as in #1451320-9: Evaluate Symfony2's Finder Component to simplify file handling, this slowdown is solely caused by using the RecursiveDirectoryIterator instead of the current, manual opendir(), readdir(), closedir() flow of file_scan_directory(). I've extensively debugged the iterator code and verified that the iterator-based scan recurses into the exact same directories, so the only difference between the two is iterator vs. plain filesystem handling functions.

sun’s picture

Testbot confirmed a slowdown of 3 minutes for #9 relative to the regular time on the testbot the patch was tested on.

sun’s picture

Issue summary: View changes

Updated issue summary.

sun’s picture

Issue summary: View changes
Status: Needs review » Closed (duplicate)