Problem/Motivation

Drupal recursively scans the front end vendor directories: node_modules and bower_components.

This causes a noticeable and significantly slower response time when front end developers do common tasks such as clearing the cache (#59) or when rebuilding the theme registry: drupal_find_theme_templates():

drush cc all with node_modules in Drupal theme's root folder: 45 secs
drush cc all without node_modules: 10 secs

Also see statistics from XHProf before and after the proposed fixes in comments #119 (D8) and #105 (D7). In Drupal 8, in certain scenarios, there's over a 2x performance improvement for expensive operations like cache clears, and in Drupal 7, over 3x.

These front end developer tools are very common and highly used (even outside of Drupal).

NPM Stats (https://www.npmjs.com):

175,409 total packages
29,619,131 downloads in the last day
490,004,267 downloads in the last week
1,939,777,453 downloads in the last month

Bower Stats via NPM (https://www.npmjs.com/package/bower):

23,318 downloads in the last day
360,264 downloads in the last week
1,474,986 downloads in the last month
38,599 packages (http://bower.io/stats/)

Grunt Stats (https://www.npmjs.com/package/grunt)

19,102 downloads in the last day
318,143 downloads in the last week
1,290,279 downloads in the last month

All these tools are NodeJS oriented and use the node_modules folder, with the addition of the bower_components folder for Bower.

Due to the combined and shared nature of the node_modules folder, the average size of node_modules is roughly ~5,000-10,000 files. This can, on occasion, easily skyrocket to upwards of ~30,000-50,000 files depending on the size of the project and the node modules used.

Suffice it to say, this isn't some "small deal" Drupal can afford to simply ignore.

Can't people just use a different directory other than node_modules?
No. Node has already made it quite clear that they will never let this happen. Also, this would still be considered a "Drupalism" because front end developers new to Drupal wouldn't know about this issue.

Proposed resolution

Add the node_modules and bower_components folders as a default for the "nomask" option in file_scan_directory().

Remaining tasks

  • Update patch to reflect current issue summary

User interface changes

None

API changes

Technically this is considered an API change, however the event of this happening is very highly unlikely. However:

If you installed Drupal themes or modules via npm, you will be required to move them out of npm's default node_modules folder.
If you grouped sub-modules under the directory name "node_modules", you will be required to rename it.

Adding an update hook as mentioned in #57/#58 to warn users about possible existing directories may be the best way to do this?

CommentFileSizeAuthor
#209 2329453-171-reroll.patch9.5 KBjaperry
#204 2329453-198.patch9.59 KBalexpott
#203 interdiff-2329453-198-202.txt1.52 KBmirie
#202 ignore_front_end_vendor-2329453-202.patch7.5 KBmirie
#198 2329453-198.patch9.59 KBsteveoliver
#198 195-198-interdiff.txt1.34 KBsteveoliver
#195 2329453-195.patch9.59 KBalexpott
#195 193-195-interdiff.txt2.19 KBalexpott
#193 2329453-193.patch9.61 KBalexpott
#193 179-191-interdiff.txt1.74 KBalexpott
#179 2329453-178.patch9.68 KBalexpott
#178 2329453-178.patch9.68 KBalexpott
#178 171-178-interdiff.txt3.17 KBalexpott
#171 2329453-171.patch8.61 KBalexpott
#171 165-171-interdiff.txt2.44 KBalexpott
#165 2329453-165.patch6.83 KBalexpott
#165 159-165-interdiff.txt1.76 KBalexpott
#159 2329453-159.patch6.54 KBalexpott
#159 158-159-interdiff.txt1.69 KBalexpott
#158 2329453-158.patch4.85 KBalexpott
#158 148-158-interdiff.txt5.13 KBalexpott
#148 ignore_front_end_vendor-2329453-148.patch2.74 KBjoelpittet
#148 interdiff.txt626 bytesjoelpittet
#144 interdiff-2329453-142-144.txt726 bytesandriyun
#144 ignore_front_end_vendor-2329453-144.patch2.69 KBandriyun
#142 interdiff-2329453-136-142.txt661 bytesandriyun
#142 ignore_front_end_vendor-2329453-142.patch2.7 KBandriyun
#136 interdiff.txt2.23 KBjoelpittet
#136 ignore_front_end_vendor-2329453-136.patch1.97 KBjoelpittet
#131 performance-do-not-test.patch913 bytesjoelpittet
#127 ignore_front_end_vendor-2329453-127.patch4.43 KBhussainweb
#127 interdiff-121-127.txt2.75 KBhussainweb
#121 ignore_front_end_vendor-2329453-121.patch2.55 KBjoelpittet
#121 interdiff.txt2.83 KBjoelpittet
#117 ignore_front_end_vendor-2329453-117.patch1.36 KBhussainweb
#117 interdiff-116-117.txt771 byteshussainweb
#116 ignore_front_end_vendor-2329453-116.patch626 byteshussainweb
#111 ignore_front_end_vendor-2329453-111.patch3.57 KBgeerlingguy
#108 interdiff.txt1.55 KBjoelpittet
#108 ignore_front_end_vendor-2329453-108.patch3.52 KBjoelpittet
#107 interdiff.txt2.19 KBjoelpittet
#107 ignore_front_end_vendor-2329453-107.patch3.53 KBjoelpittet
#101 ignore_frontend_folders-2329453-101-7.x.patch3.42 KBkaidjohnson
#99 ignore_frontend_folders-2329453-99-7.x.patch3.44 KBpablo.guerino
#89 optimize_scan-2329453-89-d7-do-not-test.patch2.73 KBkaidjohnson
#88 optimize_scan-2329453-88-d7-do-not-test.patch2.73 KBkaidjohnson
#84 optimize_scan-2329453-84-do-not-test.patch2.85 KBjenlampton
#76 optimize_scan-2329453.patch2.1 KBquicksketch
#27 2329453-skip-node-modules.patch689 bytesJohnAlbin
#26 2329453-test-broken-info.patch47.19 KBJohnAlbin
#19 2329453.patch47.14 KBRobLoach
#8 2329453-8-ignore_node_modules_directory.patch463 bytesSebCorbin
#4 node_modules_cleanup.sh_.txt137 bytesafoster
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

afoster’s picture

Issue summary: View changes
afoster’s picture

Issue summary: View changes
iamcarrico’s picture

As a note, as users of npm packages we do not have control over which files some npm packages are sent with. We also have no way of changing the node_modules folder name. In this instance... it might not be a terrible idea to prevent file_scan_directory from scanning inside a directory named "node_modules". This project name has yet to be taken on D.o, and there may be a way of safely doing so. That being said, this is an alteration of behavior and for D7 it might be better to just create a patch for those of us that are affected, and put in a full issue for D8.

(For anybody stumbling on this, I *think* if you alter line 2125 of includes/file.inc from

if (!preg_match($options['nomask'], $filename) && $filename[0] != '.') {

to

if (!preg_match($options['nomask'], $filename) && $filename[0] != '.' && $filename[0] !== 'node_modules) {

It should work. But I say that without any proper testing or full checking of the code. Just as a "you could try this and let me know if it works".

afoster’s picture

Issue summary: View changes
FileSize
137 bytes

Here's a work-around that takes a different approach. Rather then have drupal handle the node_module .info files, I run this bash script (in OSX mavericks) each time after I run npm update or npm install. It finds and renames all *.info files in node_modules to *.inf0 (that's a zero on the end)

I haven't noticed anything bad happen to the node modules themselves as the .info files seem to have something to do with test coverage.

After $npm install or $npm update...

1. Copy file to your theme folder. (Rename attached file or copy code to new file) - node_modules_cleanup.sh

2. In terminal from the root of your theme containing node_modules
$bash node_modules_cleanup.sh

This runs the following bash script.

find -L ./node_modules -type f -name "*.info" -print0 | while IFS= read -r -d '' FNAME; do
    mv -- "$FNAME" "${FNAME%.info}.inf0"
done
iamcarrico’s picture

@afoster, I did a similar thing (noted here:https://www.drupal.org/node/2309023 ), but I added it to my package.json so it runs automatically. I also like deleting them, because I like to live life dangerously.

"scripts": {
    "postinstall": "find node_modules/ -name '*.info' -type f -delete"
  }

Edited: Quotes issue.

gregori.goossens’s picture

@iamcarrico, the only way for me, and maybe not so bad is like your #3 prevent scanning node_modules folder, but need a core update in include/file.inc
i change the nomask option in file_scan_directory function like this.

	'nomask' => '/(\.\.?|CVS|node_modules)$/',
  
justb3a’s picture

I ran into the same issue. Here is my solution:

1. I created a folder in sites/all directory.
2. Then I moved all relevant files into that folder (in my case: .jshintrc, config.json, gulpfile.js, js, node_modules, package.json, sass).
3. Now you can execute `npm install`. Make sure that your minified files will be placed into site/all/theme/mysubtheme/css and /js (configure in gulpfile or in my case in config.json).

That's all.

SebCorbin’s picture

Component: theme system » file system
Status: Active » Needs review
FileSize
463 bytes

#6 is a great way to solve this. I've attached the corresponding patch.

stevesmename’s picture

I have a hard time saying that #6 is a "great way" to solve this but it does solve the issue. This issue is really only for development purposes; as node_modules shouldn't be pushed to production. I think that npm should fix their issue, https://github.com/npm/npm/issues/775.

To commit this into Drupal core (7.x) would be the lazy way to solve this. There are other methods to get this done and developers should consider doing the alternatives until open source communties work together. + Drupal 8 won't have this issue as .info files are not used.

Related links (outside of Drupal):
http://stackoverflow.com/questions/16738177/grunt-possible-to-relocate-n...
http://stackoverflow.com/questions/15521655/can-i-put-the-npm-node-modul...
http://stackoverflow.com/questions/18974436/change-node-modules-location

LewisNyman’s picture

To commit this into Drupal core (7.x) would be the lazy way to solve this

I would be a shame if Drupal required extra steps or extra knowledge just to support a workflow that frontend developers can use without a problem in other frameworks.

LewisNyman’s picture

I'm also getting a few info files in my gem files by the way.

 find ./ -type f -name '*.info'
.//.vendor/bundle/ruby/2.0.0/build_info/breakpoint-2.4.6.info
.//.vendor/bundle/ruby/2.0.0/build_info/chunky_png-1.3.1.info
.//.vendor/bundle/ruby/2.0.0/build_info/color-schemer-0.2.8.info
.//.vendor/bundle/ruby/2.0.0/build_info/compass-1.0.1.info
.//.vendor/bundle/ruby/2.0.0/build_info/compass-blend-modes-0.0.2.info
.//.vendor/bundle/ruby/2.0.0/build_info/compass-core-1.0.1.info
.//.vendor/bundle/ruby/2.0.0/build_info/compass-import-once-1.0.5.info
.//.vendor/bundle/ruby/2.0.0/build_info/ffi-1.9.5.info
.//.vendor/bundle/ruby/2.0.0/build_info/modular-scale-2.0.5.info
.//.vendor/bundle/ruby/2.0.0/build_info/multi_json-1.10.1.info
.//.vendor/bundle/ruby/2.0.0/build_info/rb-fsevent-0.9.4.info
.//.vendor/bundle/ruby/2.0.0/build_info/rb-inotify-0.9.5.info
.//.vendor/bundle/ruby/2.0.0/build_info/sass-3.3.14.info
.//.vendor/bundle/ruby/2.0.0/build_info/sassy-maps-0.4.0.info
.//.vendor/bundle/ruby/2.0.0/build_info/singularitygs-1.2.3.info
.//.vendor/bundle/ruby/2.0.0/build_info/toolkit-2.0.0.info
.//.vendor/bundle/ruby/2.0.0/gems/ffi-1.9.5/ext/ffi_c/libffi/doc/libffi.info
Danny Englander’s picture

I just ran up against this issue myself (got the drush error and admin areas were WSOD) with a grunt / LibSass setup for my contrib theme. It seems like a no brainer to have this added to core but in the mean time I split the difference between #4 and #5. I liked the code better in #4 so I simply added that to run in my package.json file for post install.

"scripts": {
    "postinstall": "sh npm_clean.sh"
  },

... where npm_clean.sh contains the code from #4. One thing to note, For some reason I had to add a .npmrc file that has the code unsafe-perm = true, otherwise, I was getting nasty errors. That was even running npm install as sudo npm install.

After all this, everything works great, no more drupal admin WSOD, drush works and grunt works.

reikiman’s picture

First - isn't it a bug in Drush or Drupal Core that it is crashing on finding what it thinks is a malformed .info file? Rather than crash it should print an error and abort whatever it was trying to do.

Second - I can't think of a valid reason for a npm module to include something like test results in the packaged distribution. That's a sloppy package maintainer not adding things to their .npmignore file. The package maintainer should be receiving a bug on this.

Third - local modifications to a package.json will just be overwritten the next time the package is updated. Therefore the suggestions y'all made along that line will be ephemeral and break at some random time in the future.

Fourth - just because you're finding lcov.info files in this case doesn't mean some other package maintainer won't have a legitimately named a file with a .info extension.

Fifth - the idea of changing the name of the node_modules directory is a non-starter. That directory name is baked into Node core, and is the methodology it uses to find Node modules. See http://nodejs.org/api/modules.html Npm then builds on the package.json defined in the Node API to support the additional features supplied by npm.

The patch in #8 looks good as far as it goes. Seems to me there's a similar risk from bower packages containing unexpected files.

What's needed is for Drupal to (for certain purposes) simply ignore stuff inside directory structures managed by other tools like npm or bower.

ksenzee’s picture

#13.3: The package.json in question is at the root of the theme folder - it's not going to be overwritten by a package update. Adding a postinstall script to it is a quick, easy fix that doesn't require patching core. The only problem I see with it is your #4: someone, somewhere, someday, might add a .info file to their npm package that is actually needed.

In theory we could rewrite the code that parses .info files such that it doesn't segfault, but the fix would only apply to D7; D8 uses YAML files instead.

Danny Englander’s picture

Yeah, I agree with @ksenzee, #13.3 won't be an issue. I've done a lot of testing around this over the past few days and codified it in a blog post here: http://dannyenglander.com/blog/drupal-drush-segmentation-fault-11-error-...

I tested this by deleting my node_modules folder and then running sudo npm install and then immediately running drush cc all. No more error. The nice thing about this method is that it's very portable across workflows and version control.

reikiman’s picture

I must have missed the location of the package.json. That does render my point# 3 moot.

BTW - when using npm for a local install it's not necessary or desirable to use sudo. npm used to warn you when ran with sudo.

LewisNyman’s picture

I like the idea of Drupal catching the error instead of crashing

RobLoach’s picture

Looking through the tests for drupal_parse_info_format(), it seems like its huge regex is not tested against malformed Drupal .info files.

RobLoach’s picture

FileSize
47.14 KB

This demonstrates the Segmentation Fault that we get with drupal_parse_info_format(). Expected to fail and kill PHP....

Pull Request and branch up at https://github.com/RobLoach/drupal/pull/3

Can be fixed in other packages to not have .info files too:

Handlebars
Will be fixed in 2.0.1: https://github.com/RobLoach/handlebars.js/commit/c5fe2527b393a1cc83948bd...
Gulp-yui
https://github.com/jsBoot/gulp-yuidoc/pull/15
lcov-parse
https://github.com/davglass/lcov-parse/pull/8
json3
https://github.com/bestiejs/json3/pull/76

Status: Needs review » Needs work

The last submitted patch, 19: 2329453.patch, failed testing.

albertski’s picture

I had the same problem but I only seen it when creating and updating features locally. I was able to track it down to the .info file (node_modules/grunt-eslint/node_modules/eslint/node_modules/doctrine/coverage/lcov.info) and I came across this post :). #8 fixed my problem.

marcvangend’s picture

Re #9 "Drupal 8 won't have this issue as .info files are not used.":
Can we be really sure? Is there a standard which says that node modules must never contain a .info.yml file? It would be a shame if have to fix the same problem again for a different file extension.

JohnAlbin’s picture

Title: Themes should not crash when .info file appears inside node_modules » Drupal crashes when node_modules folder contains a .info file
Category: Feature request » Bug report
Priority: Normal » Major

So _this_ explains why my Drupal 7 dev site was seg faulting for the last week. *sigh*

I was installing eslint and that ended up adding this file to my theme: "node_modules/eslint/node_modules/doctrine/coverage/lcov.info" And the file is zero bytes, btw.

This feels more like a major bug than a feature request. If I install the eslint tool (which Drupal 8 is using) into my theme, then Drupal seg faults for all pages.

So the next version of Zen for Drupal 7 will have people run npm install in their sub-themes. So this is a huge blocker for me.

Re #9 "Drupal 8 won't have this issue as .info files are not used.":
Can we be really sure? Is there a standard which says that node modules must never contain a .info.yml file?

We should definitely test that theory. But a random .info.yml file will still be in Yaml format, so it might get confused, but its unlikely it would segfault.

JohnAlbin’s picture

Issue summary: View changes
JohnAlbin’s picture

Issue summary: View changes
JohnAlbin’s picture

Version: 7.x-dev » 8.0.x-dev
Assigned: Unassigned » JohnAlbin
Status: Needs work » Needs review
FileSize
47.19 KB

In D8, we already have tests to check for a broken .info.yml file (InfoParserUnitTest::testInfoParser), but here's a quick patch to see if we can take Rob's .info file and try the same test on it. That should verify that D8 is immune to broken .info files.

After the test bots have their way, I've got an actually-useful patch that prevents Drupal 8 from searching node_modules and bower_components folders.

JohnAlbin’s picture

FileSize
689 bytes

I think #26 proves we don't have an issue in D8. So #2 on the proposed solution is already fixed in D8.

Here's the patch the prevents node_modules and bower_components from being traverse when looking for special Drupal files. (#1 on the proposed solution.)

JohnAlbin’s picture

Issue summary: View changes

In D8 this is a performance improvement (if you have node_modules in your Drupal site).

It's definitely not a major bug in D8, but it blocks a major bug in D7.

So… I'm not sure how to categorize/prioritize this. Do we leave this as "major bug"? or fork this issue to create a separate D8 task report?

marcvangend’s picture

Thanks for the patch, John!

To be honest I fail to see why the D8 patch blocks the D7 fix. The bug in D7 does not occur in D8 (even though it's great if we can do a related performance improvement while we're at it) so I'd say the "fix it in dev first" rule does not really apply here.

Since a chain is as strong as its weakest link, I think this issue should either remain a major bug blocked by the D8 patch, or the D8 patch should be split out into its own issue.

malcomio’s picture

Regarding Drupal 7, there is already a separate issue for this - #619542: Malformed theme .info files break menu_router generation.

I'd be inclined to add the patch to that issue to cover D7, and leave this issue open for D8.

RobLoach’s picture

I was installing eslint and that ended up adding this file to my theme: "node_modules/eslint/node_modules/doctrine/coverage/lcov.info" And the file is zero bytes, btw.

This PR fixes it so that coverage tests arn't deployed in that package: https://github.com/Constellation/doctrine/pull/104

JohnAlbin’s picture

Title: Drupal crashes when node_modules folder contains a .info file » Ignore vendor folders to improve directory search performance
Category: Bug report » Task
Priority: Major » Normal
Issue tags: +Performance

I'd be inclined to add the patch to that issue to cover D7, and leave this issue open for D8.

That works for me. I've RTBCed that issue.

Tweaking the title and category since this issue isn't about fixing a WSOD on D8.

bserem’s picture

Status: Needs review » Reviewed & tested by the community

Patch in #8 works fine for me in the last two months.

Shall I tag it as RTBC and against D7 so that this makes into D7 sometime soon?

alexpott’s picture

Status: Reviewed & tested by the community » Needs review

@bserem there's a Drupal 7 issue open already #619542: Malformed theme .info files break menu_router generation

An rtbc should be for the latest patch and Drupal version the patch was tested against. I'm completely unsure of what you've rtbc'd here.

The last submitted patch, 26: 2329453-test-broken-info.patch, failed testing.

RobLoach’s picture

Status: Needs review » Reviewed & tested by the community

@alexpott, #27 is for Drupal 8, which is RTBC. It's not critical in Drupal 8, but it does speed up looking for .info files.

The Drupal 7 one you pointed to is also RTBC. They're being handled in separate issues. While the problems are similar, their solutions are different.

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs change record

Well the D8 part needs a CR because if you have a module and have a sub directory called node_modules which contains a set of modules that provide node related functionality this change would break that.

+++ b/core/lib/Drupal/Core/Extension/Discovery/RecursiveExtensionFilterIterator.php
@@ -65,11 +65,13 @@ class RecursiveExtensionFilterIterator extends \RecursiveFilterIterator {
+    'node_modules',

Could we come up with a string that conflicts more with existing Drupal terminology? Is this a standard in node.js? Same goes for bower_components?

alexpott’s picture

Also we should probably open a followup to document a module directory structure and the reserved names we currently have.

marcvangend’s picture

Re #40: according to http://stackoverflow.com/a/21818724, the directory name "node_modules" cannot be changed.

David_Rothstein’s picture

Issue tags: +Needs backport to D7

Since this is a bugfix for Drupal 7 I would commit it there before Drupal 8, but the unfortunate "node_modules" naming conflict gives me pause too. We could really break someone's production site with this, in order to fix a bug that (mostly?) affects developers.

Also, although the approach here is the quickest way to fix the bug, it's not necessarily the only or best one. For example, I tried these patches:
#2349457-2: preg_match_all segmentation fault on drupal_parse_info_format
#619542-68: Malformed theme .info files break menu_router generation

And with both of those applied, it fixes the problem for the .info file RobLoach provided in #19. (Haven't tested it with all the Node modules/Bower components, but it looks like it should work in general.)

In that case, this issue would just become a potential performance improvement (do we know how much of one?) for Drupal 7 too.

SebCorbin’s picture

Related https://github.com/drush-ops/drush/issues/1445

For drush on Windows, a drush cc all will give errors for paths too long

dman’s picture

FWIW, we have encountered the (well described in the summary) crashes also - using Compass for SASS generation.

My work-around was to hide the vendor directory in a dot-folder at install time. This excludes it from scanning.
Things seem to work fine from then on, Bundler somehow knows where to look for them next time without trouble.

bundle install --path .bundle/vendor

I'm certainly in favour of more bullet-proof scanning from the Drupal end though. Segfaulting due to a malformed info file was embarrassing.

JohnAlbin’s picture

Status: Needs work » Needs review

Well the D8 part needs a CR because if you have a module and have a sub directory called node_modules which contains a set of modules that provide node related functionality this change would break that.

Change record draft is here: https://www.drupal.org/node/2544502

Is this a standard in node.js? Same goes for bower_components?

Unfortunately, yep. Both bower_components and node_modules are the default names for these folders. There's a hidden config to change Bower, but node_modules can't be changed.

Going to let the test bot review this again (since its been 3 months) before setting it to RTBC again.

JohnAlbin’s picture

Status: Needs review » Reviewed & tested by the community

Patch passes tests. https://qa.drupal.org/pifr/test/996598 Back to RTBC.

JohnAlbin’s picture

FYI, the D7 backport of this patch is over in #619542-146: Malformed theme .info files break menu_router generation and needs to be moved over here.

alexpott’s picture

I'm not 100% sure this is need for D8, with this patch we only break things which are not broken at the moment. I'm unsure of what to do here.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

Sounds like this still could use discussion.

joelpittet’s picture

I've been using the approach in #2393605: Allow file_scan_directory() to ignore folder to resolve this issue. Although manual it's a bit more generic for things like I added it to sites/all/libraries and sites/all/vendor, and to node_modules in my theme and bower_components.

dddbbb’s picture

#2393605: Allow file_scan_directory() to ignore folder Is the smartest approach to this problem that I've seen yet:

  • It solves the seg fault issue if there are .info files in your vendor directories
  • It solves the performance issue that arises when file_scan_directory() meets large/heavily nested vendor directories (this is the biggy for me personally)
  • It won't break sites that require vendor directories like node_modules in production
markhalliwell’s picture

Could we come up with a string that conflicts more with existing Drupal terminology? Is this a standard in node.js? Same goes for bower_components?

"We" didn't come up with these folder names. It's a silly coincidence, but nothing more. These folders are what is used on the front end and node_modules cannot be changed (nor should it).

I'm not 100% sure this is need for D8, with this patch we only break things which are not broken at the moment.

It doesn't break until the front end tools install them (which happens all the time when you're a front end developer).

We could really break someone's production site with this, in order to fix a bug that (mostly?) affects developers.

This is a huge "what if". If a developer knows they can group sub-modules inside a parent folder, it would be highly unlikely they would arbitrarily name it something as generic as "node_modules". Typically anything "grouped" would have a lot of context associated with it, especially if it pertained to a client, tool[set], feature, etc.

These aren't actual projects either:
https://www.drupal.org/project/node_modules
https://www.drupal.org/project/bower_components

If, on the off chance, someone actually did name a folder this.... a really simple solution: rename the folder. For the all these reasons, I cannot imagine how this could be considered even a remotely strong argument for one hypothetical site that did this. A little blurb at the top of the release notes should be sufficient way to notify anyone of this highly unlikely "potential issue".

#2393605: Allow file_scan_directory() to ignore folder Is the smartest approach to this problem that I've seen yet:

This is a nice feature, sure, but to use it as the "solution" for this issue is really just an anti-pattern. This would require documentation since developers who normally can just do npm install (and be done with it) wouldn't know about this Drupalism they would have to add to their vendor folders because Drupal will recursively look through them to find .info[.yml] files. Yes, there is a "solution" in that issue to help "automate" this via post install scripts, however that is still yet another "Drupalism".

I, for one, would much rather not have more Drupalisms to consider and constantly worry if I added this really weird "exclusion file" to my vendor dirs. Drupal should just ignore them.... period.

These folders definitely need to be added to this "blacklist". This all being said, I actually think this "blacklist" needs to be mutable. Drupal will never get this 100% right, the best we can hope for is sane defaults and allow projects to independently define blacklisted (and whitelisted) directories as needed. Perhaps through settings.php or services.yml?

dddbbb’s picture

100% agreed with #54. Avoiding a Drupalism is ultimately the best way to go and if we can get away with ignoring these directory names altogether then great (already doing this with https://www.drupal.org/node/619542#comment-9771891 and it works well for me on D7).

joelpittet’s picture

@markcarver if we go the blacklist route, then it would be nice to have it configurable to cover the cases that the .scanignore file deals with.

I don't mind the 'sane' defaults, but I'd suggest putting that in a variable_get()/Config/Setting(not sure the right one to use in D8). Then is someone does have a node_modules folder for another weird reason they can change that config, OR more likely if there is another tool that gets added to the Front End Stack, we can deal with it in the project at the time it happens.

Dave Reid’s picture

We could probably add an update hook which throws a warning for existing sites *just in case* they have actual modules that would now be ignored due to the change. Easy enough (in Drupal 7) to scan all the existing records in the system table for anything matching system.filename LIKE '%' . db_like('/node_modules/') . '%'. I would argue D8 doesn't need this check since it's not official yet and this can be documented behavior. For existing D7 sites, the chances of this causing a problem are slim, but it doesn't mean we shouldn't do anything about it.

markhalliwell’s picture

@joelpittet, RecursiveExtensionFilterIterator already has this black/white list (above patch). I think maybe moving them to settings.php does make a little more sense though. That way the location and method to modify these values would be the same for all core versions. That should probably be a separate issue though.

@Dave Reid, that's a really great suggestion for D7 and agree that D8 shouldn't be an issue.

edit: The D7 update should probably do file_scan_directory instead though since there could (in theory) simply be a grouping folder with no actual module associated with it (e.g. like we do with custom, contrib folders).

corbacho’s picture

This hit me hard.. https://twitter.com/dcorbacho/status/632132568587051008
Not only me https://twitter.com/crizzirc/status/632138118695682049

It was not a problem of .info files... it was the fact that my node_modules has 39K files! it took extra 30 seconds to load a page.
As you can see https://monosnap.com/file/rRqDF50bipkVr949Z8Y60zxe33QCV2.png

I agree with mark carver on avoiding "another drupalism".

A temporary working fix without touching Drupal, thanks to Andrew https://www.drupal.org/u/agilmore87

chmod 700 the node_modules folder, so apache/Drupal can't access the folder.

I'm working in a virtualised (vagrant) environment with Mac as Host with NFS, so it doesn't allow me to chown the folder, but you might want to explore that route also.

markhalliwell’s picture

So I just wasted the better part of today reinstalling my local dev stack because I thought I had messed up when I recently added a new PHP extension (xphrof). Even after re-installation, I was still getting a blank browser window and the cryptic [core:notice] [pid 57] AH00052: child pid XXXX exit signal Segmentation fault (11) messages in my Apache logs. I couldn't for the life of me figure out what I did. I even went as far as gdb httpd to figure out that it was a PCRE recursion issue, but that didn't really help pinpoint what was going on either.

It turns out I had installed two npm modules, the day before, that added the following files in the theme I was working on:

node_modules/bower/node_modules/handlebars/coverage/lcov.info
node_modules/queue/coverage/lcov.info

Because I was working soely on the theme's grunt tasks and not anything PHP related, I installed these modules many hours before refreshing the browser and noticing there was an "issue". Thus, I did not make the connection of what was actually causing the issue.

It's really ironic that even after knowing about this specific issue, someone as seasoned as myself can still run into this and it is a huge "WTF"...

So, while this isn't necessarily a "bug" in 8..x (because of YAML), it IS for 7.x... and a massive one at that. Even though there are other 7.x issues, #619542: Malformed theme .info files break menu_router generation and #2349457: preg_match_all segmentation fault on drupal_parse_info_format, they aren't necessarily related to this specific issue (crap in developer vendor directories that Drupal shouldn't care about).

So since the policy of core is to backport, we have to first "fix" 8.x before this can be properly addressed in 7.x. Even though it doesn't actually "break" anything in 8.x, it would still be a huge DX performance gain which is a win in my book.

markhalliwell’s picture

Status: Needs review » Reviewed & tested by the community

The patch in #27 is all 8.x needs.

The "issue" about a hypothetical "node_modules" module/folder only needs to be addressed when we backport to 7.x with the update hook solution found in #57 and #58.

dman’s picture

@markcarver I sympathise *so much* with this.

After losing $TOO_MANY_HOURS_TO_ADMIT to it a few months ago, I was recently lucky enough to be in earshot of some other (2 dev + 1 design) team members whose day had just ground to a halt and were discussing raising tickets with a serveradmin or asking for a server rebuild. Their day(s) would have been totally shot, but I had been stung hard enough to point at an innocuous 'grunt' addition the designer had added to make their image-building job easier.

That a bit of utility code (in an unsupported language!) that was only ever meant to run on demand on a designers desktop could bring the entire server to segfault is definitely this years WTF winner.

MattA’s picture

+++ b/core/lib/Drupal/Core/Extension/Discovery/RecursiveExtensionFilterIterator.php
@@ -65,11 +65,13 @@ class RecursiveExtensionFilterIterator extends \RecursiveFilterIterator {
+    'bower_components',
...
+    'node_modules',

I won't hold up the issue on this, but I would rather see those two directories live under a comment explaining why they are there, instead of under "Front-end". Otherwise it just looks like two random directories that someone who hasn't seen this issue may want to remove in the future.

markhalliwell’s picture

#63 is just a personal nitpick and honestly makes no sense to me. That is why they're already under the "front-end" comment to begin with. Also, their directory names are actually self-explanatory and even if you didn't know what they were, a quick search could tell you exactly what you needed to know.

I get that core is mostly backend folk, but just because y'all don't use it doesn't make it "random". These front end technologies are actually very common and highly used (they're our "composer" if that helps any). Why do we insist on nitpicking the front end so much... can we stop please?

nicholasruunu’s picture

I just spent hours on this issue since I was getting 502 Bad Gateway on all routes under /admin on nginx.
Would be great with a proper solution, being able to blacklist node_modules folder or other.

alexpott’s picture

@markcarver you seemed to misunderstand my comment about node_modules - which of course I know we did not come up with. That much is plain. But, here is the rub. Today it is bower and node, but 2 years down the line it might be something completely different. There must be other tools that add folders and files that break Drupal's "let's check everything that might possibly be Drupal" code.

The current solution looks wrong to me. The blacklist in HEAD contains concepts that Drupal creates or relies on:

  protected $blacklist = array(
    // Object-oriented code subdirectories.
    'src',
    'lib',
    'vendor',
    // Front-end.
    'assets',
    'css',
    'files',
    'images',
    'js',
    'misc',
    'templates',
    // Legacy subdirectories.
    'includes',
    // Test subdirectories.
    'fixtures',
    // @todo ./tests/Drupal should be ./tests/src/Drupal
    'Drupal',
  );

Drupal does not create or rely on the concept of bower_components or node_modules.

To me this fix looks shortsighted, in that does nothing to solve the root problem. We should fix it so the Drupal 7 does not segfault on parsing .info files. We should think about only parsing info files when there is a corresponding .module or .theme in the same folder (d7 only). We should think about a project having to supply package information that details it's sub-modules so we don't have to recursively scan everything. But continuing to add random stuff to this list will eventually break someones site in a really really hard to spot way.

alexpott’s picture

Also I think if we want to quickfix it - then let's quickfix it for D7 and handle D8 in a better and more complete way.

catch’s picture

Version: 8.0.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

The backport policy applies when the same issue affects both 8.x and 7.x. Since 8.x doesn't have a fatal error (or any error in fact), I think this patch could go into 7.x directly.

I'm not sure at all about the 'huge performance gain' - this is a couple of folders in a couple of modules, and a folder that shouldn't be checked into git iirc, so it's only a performance gain on dev at best.

The ability to skip sub-directories for scanning sounds useful (and agreed that'd be better than manually adding things to the list each time), just don't think we need that to fix the fatal in 7.x.

markhalliwell’s picture

Version: 7.x-dev » 8.0.x-dev
Status: Patch (to be ported) » Needs work

you seemed to misunderstand my comment about node_modules

No I got that you meant it sarcastically; I'm just tired of the front end being treated in a disrespectful manner. It's offensive and, quite honestly, this type of attitude towards the front end is why I had to "quit/take a break" from core.

Today it is bower and node, but 2 years down the line it might be something completely different.

The same could be said about composer. I get that compared the the back end, the front end is faster moving and could be considered by some as "unstable". However that is why tools like this came to be: to help bring stability and conformity.

I don't see node or bower going away anytime soon. This is a huge "what if". The fact of the matter is this: these tools are widely used now, even if something else comes along and we start to see a shift towards this new "technology", it wouldn't change overnight. This means that these directories would still be valid for likely many years. To say that we can't address this "what if" in the future is a complete cop-out. Please, stop arguing hypotheticals.

There must be other tools that add folders and files that break Drupal's "let's check everything that might possibly be Drupal" code.

Which is why I suggested this list get moved into settings.php in #57, however that doesn't mean we still shouldn't provide sane defaults for something very common on the front end. Based on this, I guess this issue really needs to be CNW instead of RTBC then.

Drupal does not create or rely on the concept of bower_components or node_modules.

Yes, I understand that "Drupal" itself doesn't necessarily rely on these, but core isn't just "backend" components. The fact that core provides an API for themes (theme system) means that it is aware of this extensibility. Even if core itself doesn't use these directories, it inherently touches them because of this extensibility. By adding these directories, you're in fact saying that core doesn't use them and that they should, in fact, be ignored..... which is what this whole issue is about.

But continuing to add random stuff to this list will eventually break someones site in a really really hard to spot way.

Again... not "random" and a huge "what if". There's already been a solution suggested for a way to warn any "potential" sites that may use these directories in #57 and #58.

I'm not sure at all about the 'huge performance gain' - this is a couple of folders in a couple of modules, and a folder that shouldn't be checked into git iirc, so it's only a performance gain on dev at best.

Sigh... no, this isn't necessarily a "production" performance gain, but it is a massive front end DX performance gain.

Backend developers don't see this issue because they aren't developing with these tools.

To arbitrarily dismiss this as "trivial" just because it doesn't affect you is wrong. It severely cuts into a front end developer's productivity time (see #59). We shouldn't have the opportunity to refill our coffee after initiating a cache clear.

---

I know this response may seem "harsh" or "emotional", but it isn't directed towards anyone in particular. I'm just fed up with my voice (and other front end developer voices) not being heard and our issues not being acknowledged or taken seriously. These kinds of issues are just one of the reasons why we have a low front end developer ratio in core (recommend reading the entire post when you get a chance btw).

markhalliwell’s picture

The backport policy applies when the same issue affects both 8.x and 7.x. Since 8.x doesn't have a fatal error (or any error in fact), I think this patch could go into 7.x directly.

Also, I'm not saying that this issue should go in to "fix" the D7 bug; it would just be a happy coincidence and easy fix for info files found in the node_modules folder. This issue is simply about excluding front end vendor directories (that Drupal shouldn't traverse to begin with) to improve front end DX.

The malformed .info file bug in D7 are entirely separate issues:
#619542: Malformed theme .info files break menu_router generation
#2349457: preg_match_all segmentation fault on drupal_parse_info_format

catch’s picture

I don't see the problem with fixing the fatal error in 7.x, which David already indicated he'd be happy to commit without an equivalent 8.x patch (since there isn't one), then handling the performance issue separately. Bundling them together does not help them get fixed faster, it just muddies the discussion as this issue shows. I'd imagine the .info fix (and even the quick fix for performance) could get into 7.x relatively quick, then we could open a separate issue for 8.x to look at performance there.

Sigh... no, this isn't necessarily a "production" performance gain, but it is a massive front end DX performance gain.

The directory scan for .info files should not be happening every request, if it is, something else is wrong. 30 seconds is indeed a long time and thanks to pointing to that comment though. Would have been useful if it was in the issue summary with a clear indication of what context the performance improvement is in (and that isn't clear even from the comment). These details are important. Also developer facing performance issues are minor relative to something that happens on production - doesn't matter whether that developer is front or back end. We should still fix them, but there's a massive difference between visitor vs. content editor vs. site builder vs. developer in terms of where the focus is on optimisation.

Also missing from this issue is an indication of how deep these directories are. I noticed in drupal_system_listing() that we don't pass any maximum depth argument to file_scan_directory(). Normally modules live at most in mymodule/modules/mysubmodule/mysubmodule.info. At least for 8.x we could enforce a maximum depth quite easily. I don't know if that would help or not because the information is not here.

To arbitrarily dismiss this as "trivial" just because it doesn't affect you is wrong

Loads of core issues don't personally affect me, whether or not they're front-end developer facing is only one factor in that - I'm much more likely to be looking at js (or preprocess/templates/CSS) than some site building interfaces during a normal work day. Just pushing back on patches is not 'disrespecting', disrespecting an issue would be ignoring it.

David_Rothstein’s picture

Right, so I would be willing to commit the current approach to Drupal 7 (with the changes mentioned in #57 and #58) as a (possibly temporary?) workaround. I would definitely prefer to see #619542: Malformed theme .info files break menu_router generation and #2349457: preg_match_all segmentation fault on drupal_parse_info_format fixed though - they are real independent bugfixes and both look pretty close to ready except perhaps for automated tests (the latter definitely needs manual testing too), because at that point this issue would just be a performance discussion for both Drupal 7 and Drupal 8 and we could focus on figuring out the best way to deal with that for both. But whatever happens, happens.

We should think about only parsing info files when there is a corresponding .module or .theme in the same folder (d7 only).

This seems interesting and I think it would work for modules, but I believe .info files are the only ones required for a Drupal 7 theme.

markhalliwell’s picture

Title: Ignore vendor folders to improve directory search performance » Ignore front end vendor folders to improve directory search performance
Issue summary: View changes
Issue tags: +Needs issue summary update
Related issues: +#619542: Malformed theme .info files break menu_router generation, +#2349457: preg_match_all segmentation fault on drupal_parse_info_format, +#2393605: Allow file_scan_directory() to ignore folder

I will have to come back to this later to finish, but I've somewhat updated the issue summary to rectify the confusion of what this issue should be tackling.

Just pushing back on patches is not 'disrespecting', disrespecting an issue would be ignoring it.

Push back should be constructive and ask questions (as you did and I have subsequently updated the issue summary because of it).

The previous comments leading up to this point were merely speculative and rather dismissive, not "push back".

---

Also, after looking at what RecursiveExtensionFilterIterator is, I can see a little confusion here. I thought this black list was a comprehensive file_scan_directory "black list". It's not...

The patch should instead focus on adding these directories to file_scan_directory() as a default "nomask" option. As this issue is pertinent to anytime Drupal does directory/file traversal (e.g. drupal_find_theme_templates() and theme registry rebuilds).

quicksketch’s picture

The patch should instead focus on adding these directories to file_scan_directory() as a default "nomask" option. As this issue is pertinent to anytime Drupal does directory/file traversal (e.g. drupal_find_theme_templates() and theme registry rebuilds).

As an alternative approach, perhaps we could also just specify a custom nomask in drupal_find_theme_templates() and drupal_system_listing()?

If that approach were taken, we could further optimize the scanning of files by skipping css, js, images, and other directories as well when parsing for .tpl.php or .info files. I also like the idea of adding a nomask list that is appropriate to each situation, rather than including something specific like node_modules in the default list.

quicksketch’s picture

Here's a stab at the approach from #74. No change to the default nomask, but instead using a specialized nomask in drupal_find_theme_templates and drupal_system_listing().

quicksketch’s picture

Version: 8.0.x-dev » 7.x-dev
FileSize
2.1 KB

Patch attached. Just setting to 7.x-dev for testing.

quicksketch’s picture

Version: 7.x-dev » 8.0.x-dev
JohnAlbin’s picture

Status: Needs work » Needs review

The patch in #76 looks interesting.

+++ b/includes/theme.inc
@@ -1325,8 +1325,20 @@ function drupal_find_theme_templates($cache, $extension, $path) {
+  $files = file_scan_directory($path, $regex, array('key' => 'name', 'nomask' => $no_mask));

Took me a bit to realize why you changed this line. But then noticed you want to skip 'templates' folders for system files, but not for theme hooks.

Let's have the bot have a look at it.

Status: Needs review » Needs work

The last submitted patch, 76: optimize_scan-2329453.patch, failed testing.

JohnAlbin’s picture

Oh, duh. That's a D7 patch. :-p

markhalliwell’s picture

Yes, #76 is definitely an interesting approach. I saw this yesterday and my first impression was like "Doh, that makes a lot of sense!" However, I decided to think on this a bit and I still think putting this as a sensible default inside of file_scan_directory() makes the most sense.

I think what it it boils down to is this: because node_modules and bower_components are commonly used, they can actually appear anywhere (modules, themes, libraries, profiles, etc.). There are 22 calls to file_scan_directory() in 8.x and 26 calls in 7.x.

Since this list can change, I don't think an "individualized" approach is the right direction for something as common as these front end directories. They can appear anywhere and should always be ignored by core, which is what this issue is about. We shouldn't have to remember this each and every time a new call to file_scan_directory() is used (which will lead to future issues about this very topic again).

jenlampton’s picture

@JohnAlbin the 'templates' probably threw you off. ;) In D7 'theme' is also used (and perhaps should be added?), but in Backdrop we switched to 'templates' much like D8.

I like the idea of adding a nomask to file_scan_directory() because then each time it's used it can be optimized for that specific purpose. I haven't looked at all 26 calls to see how different they are, but it could be a handy tool regardless of this particular issue.

@markcarver is there a use case for a contrib project trying to include files from node_modules or bower_components using file_scan_directory?

If not, then maybe we should do both: allow file_scan_directory() to be *specifically* optimized everywhere, and always exclude node_modules or bower_components too.

markhalliwell’s picture

is there a use case for a contrib project trying to include files from node_modules or bower_components using file_scan_directory?

Is this a possibility, sure... I suspect there may someone who may be using it this way. However this isn't really a "Best Practice™" and cannot imagine this being very commonplace. These folders are for development purposes only (e.g. to compile the files that Drupal would actually consume) in the same way that they should never be committed to the repo either.

then maybe we should do both

Yes, I think that is the most sensible approach.

My main concerns with not specifying node_modules or bower_components in file_scan_directory as sane $nomask defaults is that we will most certainly run into this issue again otherwise.

That way, when someone decides to override $nomask (as in #76), then they'll see these defaults and will likely include them along with their other specific uses cases.

We would undoubtedly have to add a comment explaining why these front end directories are sane defaults in file_scan_directory though. That way if/when someone chooses to ignore them (for whatever reason) they will know just how much it can impact local development.

jenlampton’s picture

Ah, I see! If $no_mask is not provided when file_scan_directory() is caled, we throw those two in. If $no_mask is specified, we assume they are included. I like it :)

jenlampton’s picture

Status: Needs work » Needs review

Question: do we really still need the 'CVS' in there?

quicksketch’s picture

Question: do we really still need the 'CVS' in there?

I can't think of a reason why it would be necessary. Because CVS is no longer used for modules, it's very unlikely that a CVS directory is going to be within a module or theme directory, which is where file_scan_directory() is called most often (whether to find .info files or .tpl.php files).

The only even somewhat likely location for a CVS directory would be in the root of the project, in the event that the developer preferred to use CVS for deployment to the server or localhost development. In such a case, file_scan_directory() is never called on the *entire* installation (it's always in the modules directory, profiles directory, files directory, or themes directory), so it would never come into play. And finally, even if a CVS directory were there, this would only have a performance impact, not actually change the functionality of the site. So it's very unlikely to affect anyone negatively, and almost impossible it would actually break a site.

markhalliwell’s picture

Status: Needs review » Needs work

Following the thread over at https://github.com/backdrop/backdrop-issues/issues/1110#issuecomment-139..., I would say that we should also add scss and less to the $ignore_directories list.

kaidjohnson’s picture

While #76 and #84 both solve the issue for me (I was getting segfaults when using drush due to scanning complex node_modules directory in my theme), they also both prevent some modules from being picked up. Before the patch, my modules list was 84 deep. After the patch, my modules list was 82 deep.

The two modules that disappear are 'link_css' and 'retina_images'. It looks like the regex isn't quite right to avoid this unintended exclusion, but my regex foo isn't quite strong enough to put together a reliable solution.

This patch (D7) at least prevents the segfaults I was getting form having bower_components and node_modules in my theme, but is less aggressive in terms of what else it ignores. While I think excluding common folders like 'css' and 'images' is potentially a good thing, perhaps less is more here and we should target the core issue (node_modules/bower_components) here and worry about further performance improvements by excluding common folders (css/js/templates/images) in another thread?

kaidjohnson’s picture

Whoops, managed to bump a '?' into one of the regexes that wreaked havoc on the admin theme. Reverted back to regex pattern from #84 but without the possible string matching conflicts.

makbeta’s picture

I would second the request to handle js/css/templates/js exclusion separately.
Ex: excluding templates folder completely destroys the functionality of fences module & screws up the HTML of the site

Good idea to sass/scss/less directories since they are commonly used as source folders throughout
Might be a good idea to add a /src directory as well

gmclelland’s picture

Thanks @kaidjohnson! I had the same problem with #84, it wouldn't detect the hires_images module.

After the patch in #89, the hires_images module was detected.

corbacho’s picture

I agree with #88, we need a quickfix

perhaps less is more here and we should target the core issue (node_modules/bower_components) here and worry about further performance improvements by excluding common folders (css/js/templates/images) in another thread?

This continues being a big problem. Today I saw someone else hitting this problem https://twitter.com/illepic/status/651175874281123840 and this stackexchange question is getting very popular http://drupal.stackexchange.com/questions/126880/how-do-i-prevent-drupal...

malcomio’s picture

Seems like there is some confusion about the related D7 and D8 issues. This issue relates to Drupal 8, and segmentation faults no longer occur in D8.

For the Drupal 7 issue (with a patch that is currently RTBC) please see #619542: Malformed theme .info files break menu_router generation.

As regards the Drupal 8 issue, my vote would be to just add node_modules and bower_components to the blacklist, as per the patch in #27, as long as a change record is issued to advise people to avoid using those names for modules.

kaidjohnson’s picture

Thanks, @malcomio.

I'm concerned that #619542 is a slightly different issue and while it may 'fix' the issue we're seeing, it also 'masks' the issue further. If a theme has node_modules with multiple .info files nested within, that fix will simply register those info files with a safe default name, as far as I can tell. This seems wholly unnecessary and disconcerting as we're registering files that were never intended to be a part of the Drupal system.

Here we're attempting to completely ignore and disregard entire subsets of files that are known to not be registered by Drupal. I see the symptoms being the same between these two issues, but the reasoning behind them are very different, and the solutions should both be implemented for maximum stability and minimum overhead, imo.

webcultist’s picture

Not sure if it matters for this issue, but I tried D7.40 where the fix from #619542 should be included and the page still breaks with an internal server error when though the json3 nodejs module through it .info file.
Not on all pages but when deleting the cache and obviously when calling the module page...

David_Rothstein’s picture

webcultist’s picture

Ah sorry, didn't mention this ticket. Thank you very much!

andypost’s picture

Version: 8.0.x-dev » 7.x-dev

I read a whole thread and looks there's no consensus.

#71-72 said that issue should be fixed for D7 first so #89 looks great except "CVS" and some tuning for code comments (explain about node)

#27 patch for D8 still valid and requirement to do frontend without loosing sanity

I think we need to improve #89 get it commited and use #27 for d8 with @todo link to separate issue to make this configurable.

pablo.guerino’s picture

Status: Needs work » Needs review
FileSize
3.44 KB

Hello,

Adding patch following comment #98 from andypost for drupal 7,
hope it passes testing !

andypost’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs change record

suppose that ready for 7.x
change record is there, maybe we need separate for 8.x later

kaidjohnson’s picture

Updated comment style to more carefully conform to style rules.

From https://www.drupal.org/node/1354#inline:

C style comments (/* */) and standard C++ comments (//) are both fine, though the former is discouraged within functions (even for multiple lines, repeat the // single-line comment).

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 101: ignore_frontend_folders-2329453-101-7.x.patch, failed testing.

Status: Needs work » Needs review

geerlingguy’s picture

Status: Needs review » Reviewed & tested by the community

From some quick local testing, after one of the FE devs I work with complained of 1 minute+ cache rebuild times:

Before patch:

  • Total Incl. Wall Time (microsec): 50,479,410 microsecs
  • Total Incl. CPU (microsecs): 10,083,991 microsecs
  • Number of Function Calls: 555,303

After patch:

  • Total Incl. Wall Time (microsec): 6,508,084 microsecs
  • Total Incl. CPU (microsecs): 3,721,190 microsecs
  • Number of Function Calls: 416,977

It makes development sooooo much nicer for the poor FE devs who use themes that download hundreds of nested dependencies. This was on a pretty new D7 site with only about 30 modules enabled, almost nothing custom (just a bunch of features/exported content types).

What's up with the 2nd set of tests showing 3 fails? Is testbot just misbehaving?

kaidjohnson’s picture

Thanks for those benchmarks! Those are very promising results. Always nice to see a bug fix also result in performance improvements. :)

Yeah, I don't know what is with those tests... I've been using the patch on a development site for a few days now with no side effects, so I'm assuming the testbot is acting up, but I'm not 100% sure.

joelpittet’s picture

Assigned: JohnAlbin » Unassigned
Status: Reviewed & tested by the community » Needs review
Issue tags: -Needs backport to D7
FileSize
3.53 KB
2.19 KB

As I mentioned in #56 and agreed by @andypost in #98 (I think). Here is a patch that makes that patch configurable through your settings.php for your consideration.

joelpittet’s picture

small double space on concat nitpick fix.

geerlingguy’s picture

Status: Needs review » Needs work
+++ b/includes/common.inc
@@ -5512,12 +5512,22 @@ function drupal_system_listing($mask, $directory, $key = 'name', $min_depth = 1)
+  $ignore_directories = variable_get('ignore_directories', array(

ignore_directories seems a much too generic variable name... maybe file_scan_directory_ignore_directories instead?

joelpittet’s picture

Was thinking that too, maybe just drupal_ignore_directories to namespace it? But I'm fine with that really long one too. Feel free to roll that in depending how you feel.

geerlingguy’s picture

Status: Needs work » Needs review
FileSize
3.57 KB

Attached a patch that makes the variable name drupal_file_scan_ignore_directories. That way it's namespaced, and a little more specific. Since it doesn't apply globally to every file_scan_directory() invocation, it makes sense to not name it after the function itself.

Forgot to commit twice to my local branch, so there's no interdiff, but I only changed the variable name itself in the three lines/files where it was added.

joelpittet’s picture

Status: Needs review » Reviewed & tested by the community

Was RTBC #105, re-RTBC because the variable was just made configurable and a name change to avoid conflicts.

markhalliwell’s picture

Version: 7.x-dev » 8.0.x-dev
Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs backport to D7

This should be fixed in 8.x first.

I've been working on #2609316: Port Bootstrap to Drupal 8, constantly doing cache clears and it's really starting to tick me off. Each time I do one, it's a very noticeable performance hit because I have my developer hat on where all these files are present (which is what this issue is all about).

The only reason this has been moved back to 7.x is because there was some confusion that this isn't an issue in 8.x. That is wrong. This issue is not about the malformed .info[.yml] files (that was a separate issue in 7.x and was already fixed). This issue is about the fact that core traverses unnecessary front end vendor directories.

chellman’s picture

Having to clear cache repeatedly is certainly annoying, but this is a crasher in Drupal 7, so don't think rewriting this for Drupal 8 first is good, especially when we're at RTBC for the Drupal 7 patch. I don't want to diddle the tags again (yet), but I'd really like to see this fixed in 7.

markhalliwell’s picture

This issue is strictly about file system traversal performance (which is still an issue in 8.x and has absolutely nothing to do with parsing). The related issues (which have already been mentioned and attached to this issue) for fixing 7.x crashing are:

#619542: Malformed theme .info files break menu_router generation

#2349457: preg_match_all segmentation fault on drupal_parse_info_format

hussainweb’s picture

Status: Needs work » Needs review
FileSize
626 bytes

I am just adding a patch with hard-coded directories now. @geerlingguy, @markcarver, can you test and see if this makes a difference and then we can figure out how to make this configurable?

hussainweb’s picture

Found another usage. Please test this one.

I also read this issue again and I see that it has oscillated between D7 and D8 several times. I am not sure where we stand right now but I am posting the patch anyway.

catch’s picture

I definitely conflated this with the related issues when I first looked at it, so agreed on 8.x assuming the problem is the same

geerlingguy’s picture

Status: Needs review » Needs work

D8 Before patch:

  • Total Incl. Wall Time (microsec): 10,431,715 microsecs
  • Total Incl. CPU (microsecs): 4,072,000 microsecs
  • Number of Function Calls: 883,587

D8 After patch:

  • Total Incl. Wall Time (microsec): 2,950,189 microsecs
  • Total Incl. CPU (microsecs): 1,964,000 microsecs
  • Number of Function Calls: 634,441

Difference: 111% faster cache clear after patch, ~250,000 fewer function calls

This was with Drupal 8 HEAD (as of today), with and without the patch in #117 above. For flexibility's sake, I would like to see those two options as configurable options in Drupal 8 (like they are in the D7 patch in #111), so setting back to needs work for that.

I tested with the same theme directory node_modules and bower_components directories that I had in the D7 tests above.

geerlingguy’s picture

Issue summary: View changes
joelpittet’s picture

Status: Needs work » Needs review
Issue tags: +Needs manual testing
FileSize
2.83 KB
2.55 KB

I think this is more generic and may cover off the config need. Hopefully didn't overstep my bounds. @hussainweb let me know if that works for you?

And hopefully does the trick still @geerlingguy?

joelpittet’s picture

Oh and I removed (\..*)| because it's covered with the $filename[0] != '.' I figured

hussainweb’s picture

@joelpittet: Not at all. :) Thank you for the changes. It is still missing in RecursiveExtensionFileIterator, and we should use it there as well for the sake of consistency. I don't think we should directly use Settings here, but inject it. I will see if I can attempt a solution later today.

joelpittet’s picture

@hussainweb thanks. I agree with the Settings injection if possible and hopefully I got the wrapping syntax on long conditionals correct because I don't do that often.

Not sure if we need the configuration on RecursiveExtensionFileIterator but if you find a way to do it well, all the power to ya;)

brahmjeet789’s picture

Lukas von Blarer’s picture

I have been using this patch for a few days now and works perfectly. I am using Vagrant for development which is affected by this extremely. Page load times after a cache rebuild decreased from 60s to 10s.

hussainweb’s picture

FileSize
2.75 KB
4.43 KB

Here is my attempt to use Settings in RecursiveExtensionFilterIterator. But with this, I am afraid that I am changing API and this might go to 8.1.x. Can we commit the patch in #121 to 8.0.x and this patch to 8.1.x? :P

joelpittet’s picture

@hussainweb maybe a postponed follow-up issue for 8.1.x changes?

Lukas von Blarer’s picture

Sounds reasonable.

hussainweb’s picture

I created a follow-up for 8.1.x at #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance. We can review the patch in #121 here to keep things simple.

joelpittet’s picture

FileSize
913 bytes

I've done another performance test.

Scenario: I installed basic theme, npm install the modules it comes with.

Apply this little microtime test patch and clear the cache on the command line via drush cr.
Apply the patch in #121
and clear cache again.

No Patch
---------
0.07743501663208
0.526780128479 <---- THIS IS HUGE!
0.00018978118896484

With Patch
---------
0.069730997085571
0.010643005371094 <---- :-)
0.00014495849609375
markhalliwell’s picture

Here's the above performance test with npm install for Bootstrap 8.x-3.x-dev:

No Patch
---------
0.049306869506836
0.30480408668518 <--
0.00015616416931152

With Patch
---------
0.041584014892578
0.0044867992401123 <--
0.00018906593322754
markhalliwell’s picture

Status: Needs review » Needs work
+++ b/core/includes/file.inc
@@ -953,7 +964,10 @@ function file_scan_directory($dir, $mask, $options = array(), $depth = 0) {
+          && !(isset($options['nomask']) && preg_match($options['nomask'], $filename))
+          && !(!empty($default_nomask) && preg_match($default_nomask, $filename))

We should only provide a default "nomask" if one wasn't already provided.

As it stands now, this code will always exclude the "default nomask". It should extend (+=) the existing $options['nomask'] array with the defaults instead.

johannez’s picture

I have a vanilla Drupal 8.0.2 install and use the Basic theme after "npm install" and "bower install".

Added #131 for performance testing and did a "drush cr" before and after the patch #121:

Before

0.0304069519043
0.170646905899
0.000127077102661

After

0.0295081138611
0.00653886795044
0.000136852264404

apratt’s picture

I have a vanilla Drupal 8.0.3 install and use the Basic theme after "npm install".

Added #131 for performance testing and did a "drush cr" before and after the patch #121:
No patch
0.026458978652954
0.20892095565796
0.00011992454528809

After patch

0.019454002380371
0.0019950866699219
7.4863433837891E-5

joelpittet’s picture

Status: Needs work » Needs review
FileSize
1.97 KB
2.23 KB

Hopefully addresses #133. This doesn't effect the performance at all because that function doesn't get called for modules/themes discovery anymore.

lauriii’s picture

Status: Needs review » Needs work

Reviewed the code and it looks good. We have still 2 needs tags that needs to be solved before this is RTBC.

joelpittet’s picture

Issue tags: -Needs manual testing

I know people are using as am I the d7 patch in production. Plus the performance tests should count for as much extra too.

If someone is near a computer to update the issue summary, please do

joelpittet’s picture

Status: Needs work » Needs review
Issue tags: -Needs issue summary update

Read through the issue summary, it seems good to me. The tag was added in #73 and updated in #120

lauriii’s picture

Status: Needs review » Reviewed & tested by the community
catch’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs change record
+++ b/core/includes/file.inc
@@ -945,6 +946,19 @@ function file_scan_directory($dir, $mask, $options = array(), $depth = 0) {
+    $ignore_directories = Settings::get('drupal_file_scan_ignore_directories', [

This should be documented in one of the settings.php examples.

Otherwise looks good though.

andriyun’s picture

Status: Needs work » Needs review
FileSize
2.7 KB
661 bytes

Patch with added description and example to default.settings.php

star-szr’s picture

Status: Needs review » Needs work

Thanks @andriyun, good start :)

+++ b/sites/default/default.settings.php
@@ -705,6 +705,19 @@
+ * The list of folders disabled for directory scanning.
+ * For example we added node_modules and bower_components folders to avoid
+ * issues with Drupal recursive scanning.
+ * Uncomment below strings to enable.

I would say "Uncomment the below strings to enable", that makes it a bit less terse. However it's also a bit misleading IMO to say they are enabling anything when the default is the same as uncommenting.

I also think this either needs to be combined into one paragraph or split into multiple paragraphs, the way the lines are bunched together doesn't match with our docs standards or the rest of default.settings.php:

https://www.drupal.org/node/1354#general

To make a paragraph break in a docblock, leave a blank line (see example above).

andriyun’s picture

Status: Needs work » Needs review
FileSize
2.69 KB
726 bytes

Added adjustments to description:

  • Removed summary line b/c first line of text contains the essence of description.
  • Added mention about default value.
  • Added blank line before last to make more focus on uncomment action.
andypost’s picture

+++ b/core/lib/Drupal/Core/Extension/Discovery/RecursiveExtensionFilterIterator.php
@@ -71,6 +71,8 @@ class RecursiveExtensionFilterIterator extends \RecursiveFilterIterator {
     'templates',
+    'node_modules',
+    'bower_components',
     // Legacy subdirectories.
     'includes',

would be great to add line comment here about where that comes from

joelpittet’s picture

@andypost how about
// Deeply nested Node.js folders.

andypost’s picture

@joelpittet great, just for consistency with other ones in the list

joelpittet’s picture

Thanks, there we go.

andypost’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs change record

Thanx! Let's get this in

alexpott’s picture

Status: Reviewed & tested by the community » Needs work

I don't think we can ignore existing sites and BC, unfortunately. It is perfectly possible to have created a directory in modules called node_modules with a load of custom modules to do with node integration in it. Such a site would be broken by this change.

In order to get around this we need to add the setting to settings.php without it being commented out and the default value in code should be an empty array. That way new sites get the new settings but old sites are not broken.

Another problem is that the setting makes this look extensible but in fact it is not...

@@ -71,6 +71,9 @@ class RecursiveExtensionFilterIterator extends \RecursiveFilterIterator {
+    // Deeply nested Node.js subdirectories.
+    'node_modules',
+    'bower_components',

Hardcoding here... shouldn't we be getting the new setting and adding it to the blacklist?

hussainweb’s picture

@alexpott: We are using a blacklist in #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance which targets 8.1.x. Since using a blacklist properly would mean a change on the interface, we didn't do that for 8.0.x. See comments #127 to #130.

catch’s picture

Adding it as a default in settings.php rather than inline makes sense to me.

alexpott’s picture

@hussainweb there's no change to any interface - but I'm pretty certain that this is not something we'd do in a patch release considering the nature of the change - it is a task not a bug.

hussainweb’s picture

Sorry, I don't know why I said interface. We would need to be able to inject the blacklist in the RecursiveExtensionFilterIterator which (kind of) changes the constructor. I agree that can't be done in a patch release and that is why there is another issue for 8.1.x. See the patch in #2640572-2: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance.

The patch in this issue does not break anything (in the API) but unfortunately has to hardcode the directories. Calling Settings::get from inside the iterator seems like a smell. Theoretically speaking, yes, it is possible that there would be a directory called node_modules with custom modules but somewhat unlikely. In any case, it is possible and it counts as a break. Maybe we could remove it from the iterator object in the patch here (for 8.0.x) and get some performance gains at least. We go all the way in the other issue which is for 8.1.x. Thoughts?

markhalliwell’s picture

I don't think we can ignore existing sites and BC, unfortunately. It is perfectly possible to have created a directory in modules called node_modules with a load of custom modules to do with node integration in it. Such a site would be broken by this change.

The odds of this actually occurring seem astronomically microscopic considering how, we as a community, generally think and name things using "Best Practices". The name "node_modules" has never been a "thing" in Drupal, even as a remote concept AFAIK and was and has been only ever introduced as a known FE "thing".

I find this argument extremely weak and hypothetical. Just because something is possible doesn't always make it a reality. It would be far different if there was a previous core module, naming convention, or published contrib module... none of which has happened.

If this truly is a concern, the solution is simple: warn the one or two hypothetical users that have done this about the change in an update hook, as mentioned in the IS (I personally still think this is overkill).

Also, this really isn't an "API break" anyway. Any "fix" would be relatively simple: rename the very hypothetical folder/module.... boom, done.
This change would not actually affect any logic inside the very hypothetical folder/module (which is what an "API break" really is about).

markhalliwell’s picture

Only local images are allowed.

alexpott’s picture

@hussainweb well then this patch should remove all configurability because having it configurable in one place and not another does not make sense. Let's just combine the changes and do this in 8.1.x - it is around the corner and the correct place to make non-bug fixes anyway.

@markcarver trading a broken site against a slow cache clear is not a hypothetical comparison. This can change can break sites. Please stop telling me this is hypothetical. Unfortunately we allow our folder structures for discovery extensions to be a complete wild west. But also note, I've note been saying don't make this change I've been trying to help this change go in in a way that will not break existing sites - so, exactly what is the problem?

alexpott’s picture

Version: 8.0.x-dev » 8.1.x-dev
Status: Needs work » Needs review
Issue tags: +Needs tests
FileSize
5.13 KB
4.85 KB

All I'm suggesting is that we do this. And we need to test the new setting probably by extending Drupal\system\Tests\File\ScanDirectoryTest for file_scan_directory() changes and we need to add a test for the ExtensionDiscovery changes somewhere...

Also I don;t think we need to prefix the setting with drupal_ this is a Drupal setting after all.

alexpott’s picture

Here's the test for file_scan_directory(). Still need a test for ExtensionDiscovery

markhalliwell’s picture

Status: Needs review » Needs work

Please stop telling me this is hypothetical.

I wasn't saying it towards you specifically. I was using it to point out that issue keeps slanting towards a tone that has no real merit.

So, no. I'm not going to "stop" using that word, especially when it's being used quite appropriately:

That this issue can, or rather, might break a site. That statement isn't fact. It's bolstering a "what if" scenario that has no real basis on truth, whatsoever, other than the coincidental name of the folder. There is a minute possibility (if not, complete improbability) that some individual decided (in 8.x no less which has far fewer contrib development than 7.x btw) to name their module or group modules in a sub-folder with the same names?

Really?

I've seen stuff go in far more damaging than this, often without any perceivable foresight into the damage it "might" cause. That fact that this issue has already identified this scenario and provided a viable "solution" (update hook, see IS) is a far better deal than other core issues that had no trouble "breaking" the FE left and right in the name of "progress" (or "tasks" I should say). In fact, it actually astounds me to see what core picks and chooses as a "BC break".

It's hypocritical, a double standard and quite frankly... complete rubbish.

---

I'm not going to argue anymore about this. I have repeated myself multiple times now, with clearly no ability to convey English in a comprehensible manner.

So, since 8.1.x is "just around the corner", then let's just stop the bikeshedding and taking things personally. The only reason this issue is on 8.0.x is because of the backport policy.

---

When we do backport this issue finally to 7.x, it will still need some sort of mechanism to warn users of ignored folders/modules that might have the name. I thought It made sense to figure that out here/now... but I guess it's being decided that it's now "out-of-scope" until we backport?

markhalliwell’s picture

+++ b/core/includes/file.inc
@@ -946,6 +947,13 @@ function file_scan_directory($dir, $mask, $options = array(), $depth = 0) {
+  $default_nomask = '/^' . implode('|', $ignore_directories) . '$/';

@@ -954,7 +962,10 @@ function file_scan_directory($dir, $mask, $options = array(), $depth = 0) {
+        if ($filename[0] != '.'
+          && !(isset($options['nomask']) && preg_match($options['nomask'], $filename))
+          && !(!empty($default_nomask) && preg_match($default_nomask, $filename))
+          ) {

This re-introduces the bug where $default_nomask is always applied even if an explicit $options['nomask'] value was provided, see #133.

hussainweb’s picture

+++ b/core/lib/Drupal/Core/Extension/Discovery/RecursiveExtensionFilterIterator.php
@@ -90,6 +87,20 @@
+  public function __construct(\RecursiveIterator $iterator, array $blacklist = []) {
+    parent::__construct($iterator);
+    $this->blacklist += $blacklist;
+  }

Doesn't this constitute an API break? If you see the patch in the issue for 8.1.x, this is exactly what we do there (because the function signature changes).

hussainweb’s picture

Oh, I see you changed this issue to 8.1.x-dev. Do you mean we shouldn't really try this for 8.0.x?

Side-question: What happens to 8.0.x after 8.1.0 comes out? Will it be supported for bug-fixes for some time?

markhalliwell’s picture

Yes, that's what seems to be the consensus. I was going to change it to 8.1.x, just for the sake of stopping the the "argument". I really don't care when/how this is done... just that it's done so we can move on to the 7.x backport.

alexpott’s picture

Status: Needs work » Needs review
FileSize
1.76 KB
6.83 KB

re #161 mmm that means the patch on #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance was wrong - which shows the important of having a test. Which this patch now has for this.

alexpott’s picture

Status: Needs review » Needs work

Given that my tone has no real merit, I guess my addition of tests has no real merit either. So I'm not going to add the test for ExtensionDiscovery because I don't want to spend my time doing that. But a test does need to be added before this should be committed.

markhalliwell’s picture

o_O

I did not say that you or your tone "has no real merit". I said [this] issue keeps slanting towards it when it's really not necessary, especially given the solutions already provided and the context for what this change implies (and its actual reality of impact). Ok.. done.

---

Re: #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance
I think that issue should be closed since this one is now 8.1.x. It's only adding to the confusion, especially considering that work has been make on the above patches (in this issue) after that other issue/patch was created.

alexpott’s picture

@hussainweb once 8.1.x comes out 8.0.x is unsupported - see https://www.drupal.org/core/release-cycle-overview#minor

hussainweb’s picture

@alexpott: Thanks! I think then there is no point trying a backward compatible way for 8.0.x. 8.1.x it is.

Mile23’s picture

+1 on this. It's good to make it configurable, and putting it in settings.php means it can be backported and D7 benefits as well.

+++ b/sites/default/default.settings.php
@@ -705,6 +705,17 @@
 /**
+ * The list of folders disabled for directory scanning.
+ *
+ * By default ignore node_modules and bower_components folders to avoid issues
+ * with Drupal recursive scanning.
+ */
+$settings['file_scan_ignore_directories'] = [

Using settings is a good idea, but then we have to explain it to the user better. This comment and maybe other comments need to explain the scope of the ignore. It's not obvious what's Drupal scans and what it doesn't, so what's an end user to think?

Maybe something like "Will ignore directories with this name in any directory scan, such as finding extensions." and also add @see file_scan_directory and @see ExtensionDiscovery.

alexpott’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
FileSize
2.44 KB
8.61 KB

As no one else wrote the test I decided too and hey look the changes to RecursiveExtensionFilterIterator didn't work. Patch attached fixes that and tests it.

Status: Needs review » Needs work

The last submitted patch, 171: 2329453-171.patch, failed testing.

alexpott’s picture

alexpott’s picture

Status: Needs work » Needs review
andriyun’s picture

Looks perfect for me.
tests, docs, proper comments
Thank you @alexpott
+1 to RTBC

neclimdul’s picture

Status: Needs review » Needs work
  1. Not in patch, file_scan_directory documentation:
     *   - 'nomask': The preg_match() regular expression for files to be excluded.
     *     There is no default.
    

    This seems out of date now, there is in fact a default.

  2. +++ b/core/includes/file.inc
    @@ -946,6 +947,15 @@ function file_scan_directory($dir, $mask, $options = array(), $depth = 0) {
    +    $default_nomask = '/^' . implode('|', $ignore_directories) . '$/';
    

    Does this need to be escaped? We should be able to trust settings files but it is not clear from the settings documentation that it is a regex you're providing not a directory name. IMO we should document that it is a regex, or we should preg_quote().

  3. +++ b/core/lib/Drupal/Core/Extension/ExtensionDiscovery.php
    @@ -433,11 +433,17 @@ protected function scanDirectory($dir, $include_tests) {
    +    // Allow directories specified in settings.php to be ignored. You can use
    +    // this to not check for files in common special-purpose directories. For
    +    // example, node_modules and bower_components. Ignoring irrelevant
    +    // directories is a performance boost.
    +    $ignore_directories = Settings::get('file_scan_ignore_directories', []);
    +
    

    Globals are arguments so we are hiding a API change to ExtensionDiscovery::scan() here. I tried to come up with a suggestion for clarifying it but ExtensionDiscovery is quantifiably complex. Maybe it is just ok? Maybe we make the blacklist an argument to scan() similar to file_scan_directory? I don't know.

  4. +++ b/sites/default/default.settings.php
    @@ -705,6 +705,17 @@
    + * The list of folders disabled for directory scanning.
    + *
    + * By default ignore node_modules and bower_components folders to avoid issues
    + * with Drupal recursive scanning.
    + */
    

    This seems less clear than it could be. This provides the default list of directories that will be ignored by Drupal's file API's, it is only hard coded to disable scanning in ExtensionDiscovery as mentioned in the previous bullet.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.0-beta1 was released on March 2, 2016, which means new developments and disruptive changes should now be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

alexpott’s picture

Status: Needs work » Needs review
FileSize
3.17 KB
9.68 KB

Thanks for the review @neclimdul

  1. Fixed - good catch
  2. Went for the preg_quote - I don't think we should be supporting regexes in the file_scan_ignore_directories setting. That would open all sorts of interesting ways to break stuff.
  3. Yeah it's not nice - ideal I think there should be an ExtensionDiscoveryFactory that can take care of injecting what we need - but this is okay for now - it is not too bad to set up a settings object to test this - or change at will.
  4. Fixed.
alexpott’s picture

8.2.x automated testing was turned off - uploading the patch again...

donquixote’s picture

Three things.

------------

First: file_scan_directory() is already a poor piece of code, architecturally and performance-wise.
A list of flaws: #2696335: Optimize file_scan_directory() / drupal_system_listing()
Adding more complexity (special case logic, configurability) to it only will make it harder to refactor.

--------------

Second: You are saying that the folders will be named node_modules and bower_components. Fine.
But why do these folders need to live in the same parent folders that are scanned for Drupal extensions?

E.g. with Composer Manager, or with Libraries API, this 3rd party stuff would live in folders outside of the Drupal extension paths.
The same should apply to nodejs and bower.
Can someone explain why this is not possible? Maybe I am missing something.

---------------

Third: In my view, the Drupal extension folders are "owned" by Drupal, or by the respective extension (module, theme). The extensions should control what is scanned by Drupal and what isn't.
In an ideal world, this would be a whitelist: E.g. a file at the root of the module/theme package, similar to a composer.json, would tell Drupal about folders that contain templates, themes or modules.
Since this is not the case right now, the option left to us is a blacklist. Either a central file that tells Drupal which folders *not* to scan, or local files to suppress scanning individual subfolders.
Or maybe: If no central file is found, scan the entire module/theme. If a central file is found, let this file control what is to be scanned and what isn't.
Either way: The thing that tells Drupal what to scan or what not to scan would be Drupal-specific. It would be named ".drupal-whatever", and it would only be read by Drupal. I guess one could call it a Drupalism. The extension written for Drupal. So what? A composer.json is a Composerism. A package.json is a nodeism. So what?

A "do not scan the node_modules subfolder" or "please scan the following folders for Drupal-related stuff" should be specific to an extension (module, theme), not system-wide.

--------------

I am happy to discuss this on IRC, if we want to reduce noise in this issue. I was too impatient this time.

donquixote’s picture

And btw:
Since file_scan_directory() already uses (indirectly) a service (via file_stream_wrapper_uri_normalize()), I think we should make the entire thing a service, and reduce the function in file.inc to a \Drupal::service(..)->..() call.

Another btw:

  // Normalize $dir only once.
  if ($depth == 0) {
    $dir = file_stream_wrapper_uri_normalize($dir);
    $dir_has_slash = (substr($dir, -1) === '/');
  }

What if the function is called with a $depth !== 0 ?
The only solution is to split this into a public part, and a "private" recursive part.

EDIT: Oops, sorry. The docblock says to not call this with an explicit depth.

donquixote’s picture

Oh, another btw:

          if ($options['recurse'] && is_dir($uri)) {
            // Give priority to files in this folder by merging them in after
            // any subdirectory files.
            $files = array_merge(file_scan_directory($uri, $mask, $options, $depth + 1), $files);
          }
          elseif ($depth >= $options['min_depth'] && preg_match($mask, $filename)) {

If a $options['recurse'] is false, then the is_dir() never happens.
If now a directory matches the $mask for files (e.g. a directory named "foo.html.twig", then this directory is added to the $files array.

donquixote’s picture

I thought about this some more, and did local experiments.
Whether my previous comments make sense or not, I no longer see them as a reason to block the patch (#179).

dddbbb’s picture

@donquixote Regarding your second point/question from #180. bower_components can be configured to live pretty much anywhere and be renamed. node_modules can do neither. It's common for themes to rely on node_modules (e.g. to use a task runner like grunt/gulp, process files etc), this normally means that node_modules will need to be present within the theme (normally theme root).

Mile23’s picture

Re #181:

Since file_scan_directory() already uses (indirectly) a service (via file_stream_wrapper_uri_normalize()), I think we should make the entire thing a service, and reduce the function in file.inc to a \Drupal::service(..)->..() call.

If extension discovery is a service, it should not have other services as dependencies, and must, never, ever, ever, ever call anything on \Drupal. :-)

That way we can use it at install and update times, run-tests.sh can use it, and we might even be able to make it a component so that other libraries can re-use it. (See https://github.com/paul-m/drupal-merge-plugin )

Really this issue is an implementation detail of the refactoring of ExtensionDiscovery and it's friends: #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList

Adding parent: #2186491: [meta] D8 Extension System: Discovery/Listing/Info

donquixote’s picture

@danbohea

Regarding your second point/question from #180. bower_components can be configured to live pretty much anywhere and be renamed. node_modules can do neither. It's common for themes to rely on node_modules (e.g. to use a task runner like grunt/gulp, process files etc), this normally means that node_modules will need to be present within the theme (normally theme root).

It is "node_modules" relative to the location of the main packages.json. My idea was (and no longer is) to put the main packages.json somewhere else. Either outside the theme, or in a subfolder of the theme. But if people think that theme root is the best place, so be it.

@Mile23:

If extension discovery is a service, it should not have other services as dependencies, and must, never, ever, ever, ever call anything on \Drupal. :-)

Yeah.. my remark was simply an observation. If we can get rid of this dependency, all the better.

--------------

I still think that file_scan_directory() is a too flexible API.
Basically there are 3 use cases:
- Extension discovery (D7 only. In D8, this has moved elsewhere).
- Discover "stuff" (e.g. templates) within one extension.
- Discover "stuff" (e.g. css files, style images, etc) within another folder, e.g. public://styles (ImageAdminStylesTest::getImageCount()).

The discussion we are having here applies only to some of these use cases. It feels wrong to put this all within one one-size-fits-all directory scanning function.

We should rather have one (or more) functions/methods that do the actual scanning, and other APIs on top of this that understand which subfolder names should be ignored.

For D8 extension discovery, we could factor out the actual file scanning part. In its current state, I think it is too big currently.
For D8 file_scan_directory(), we could provide alternative functions/methods for dedicated use cases. And also split out the recursive part. And get rid of dependencies.
For D7 file_scan_directory() .. long discussion.

This said: We can move on with this issue, and handle possible refactoring in a follow-up.

dddbbb’s picture

The latest patch (#179 right now) makes a change to default.settings.php. In Drupal 8, are these really defaults that are loaded unless you say otherwise or would a user also be expected to ensure the following lines are also copied to their settings.php file (a la D7)?:

$settings['file_scan_ignore_directories'] = [
  'node_modules',
  'bower_components',
];

Also, I'm not seeing any real performance gains with the patch in #179 (yes, I've been ensuring the code above is copied to settings.php). I've been running time drush cr to get basic insight into performance with & without the patch applied and also with & without the presence of vendor directories for comparison (node_modules & bower_components). The results suggest that the patch is making little to no difference with the vendor directories present. Not applying the patch and removing the vendor directories seems to guarantee vastly improved performance. Am I missing something here? I appreciate that not all themes are created equal but surely the comparative tests I'm doing here should rule that out?

bradallenfisher’s picture

A simple stop-gap solution is to edit you php.ini file and set
pcre.recursion_limit=10000

by default it is set pretty high, causing the segfault errors.

remember to restart apache after this change, and if you are using php-fpm restart php-fpm as well.

I know this isn't the solution, but it allows me to happily go back to FE dev, and I can still use drush like normal. :)

MiSc’s picture

I think this patch complicates things more than solving it in a easy way. There are folders that never should be scanned, so why add another setting for it? Is there are case then you want scanning of node_modules or bower_components in Drupal? I have never had such a case myself.

mcrittenden’s picture

Status: Needs review » Needs work

The patch in #179 won't apply since on Drupal 8.1.2 and above. Back to needs work.

joelpittet’s picture

Status: Needs work » Reviewed & tested by the community

re #190@marcvangend It's targeted to the 8.2.x branch.
re #189 @MiSc if a custom module uses/used that namespace they may want to avoid renaming, it's away to provide a BC layer.
re #187 @danbohea They are expected to copy them to settings.php if they are upgrading an existing site. The performance is similar to #135 did you use #131 performance test?

RTBC as @donquixote has another issue for further improvements in the architecture realm.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 179: 2329453-178.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
1.74 KB
9.61 KB

Rerolled the patch and improved the test to use PHPUnit assertions.

Status: Needs review » Needs work

The last submitted patch, 193: 2329453-193.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
2.19 KB
9.59 KB

More improvements and fixes to the added test.

MiSc’s picture

@joelpittet, that should be a custom module by really bad design. That is not a drupal-way of doing things. But just wanted to point out that this complicates bits more than providing a simple solution on the problem. But I leave it.

dddbbb’s picture

@joelpittet I didn't use that test but will do so next time, thanks.

steveoliver’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
1.34 KB
9.59 KB

Looks good. Still applies to 8.2.x.

A few nits:

  1. +++ b/core/tests/Drupal/KernelTests/Core/File/ScanDirectoryTest.php
    @@ -142,4 +142,25 @@ function testOptionMinDepth() {
    +    $this->assertCount(2, $files, '2 text files found when an "nomask" option passed in.');
    

    option (is) passed in.

  2. +++ b/sites/default/default.settings.php
    @@ -719,6 +719,21 @@
    + * The default list of directories that will be ignored by Drupal's file API's.
    

    API.

Before:
0.034054040908813
0.13604807853699 <--
0.00011086463928223

After:
0.034997940063477
0.0023999214172363 <--
0.00012397766113281

neclimdul’s picture

Issue summary: View changes

Node doesn't seem to be hosting the FAQ anymore so linked to the github repo... hopefully it won't go away.

Looks good but summary nit, we comment that node_modules will never move or be different but seem to ignore the fact that bower_components is totally moveable. https://bower.io/docs/config/#bowerrc-specification

The last submitted patch, 111: ignore_front_end_vendor-2329453-111.patch, failed testing.

sylus’s picture

Would this still be the correct issue to upload a compatible patch for 8.1.x for composer patch workflow? Or should I create a postponed issue related to this? The patch doesn't apply in 8.1.x only because of the non existence of a test file where lines want to be appended to.

I really appreciate the work done in this issue as this has just destroyed my performance when playing around with angularjs 2 and a rather big node_modules folder. ^_^

mirie’s picture

I removed the part that tries to alter the nonexistent test file (core/tests/Drupal/KernelTests/Core/File/ScanDirectoryTest.php) in this patch file. It applies to 8.1.6

mirie’s picture

Sorry forgot to include the interdiff for the patch.

alexpott’s picture

@mirie thanks for testing the patch - I was a bit confused at first - but then saw you've applied this patch to a version without \Drupal\KernelTests\Core\File\ScanDirectoryTest - actually this test exists in 8.2.x - I guess you are used a version of Drupal 8 where this test has not been moved to a PHPUnit based kernel test (for example 8.1.x). Re-uploading #198 as that is still the patch to apply to 8.2.x. This patch is only eligible for inclusion in the next minor release of Drupal 8 because it makes an API addition and is a task - the next minor release is 8.2.0.

Thanks again for testing the patch.

The last submitted patch, 202: ignore_front_end_vendor-2329453-202.patch, failed testing.

xjm’s picture

Assigned: Unassigned » catch

I think it would be best for catch to take a look at this RTBC issue based on his earlier reviews. Thanks all!

  • catch committed 68a543a on 8.2.x
    Issue #2329453 by alexpott, joelpittet, hussainweb, kaidjohnson,...
catch’s picture

Status: Reviewed & tested by the community » Fixed

Would be really good to keep going with modernizing and clarifying the extension discovery system, this patch still feels like a bit of a workaround. However it works around something very annoying, and a lot happier with current iteration than earlier versions.

Committed 68a543a and pushed to 8.2.x. Thanks!

#2482549: Ignore node_module folder in core to use Drupal with npm/grunt/nodejs looks fine for the 7.x backport issue, so marking fixed!

japerry’s picture

FileSize
9.5 KB

For those still on 8.1.x (8.1.7) I found that patches were not working correctly. Here is a re-rolled version against 8.1.x HEAD (7064dfc4e896630b7dce9f06383f88715674f597) but it also works for 8.1.7

  • catch committed 68a543a on 8.3.x
    Issue #2329453 by alexpott, joelpittet, hussainweb, kaidjohnson,...

  • catch committed 68a543a on 8.3.x
    Issue #2329453 by alexpott, joelpittet, hussainweb, kaidjohnson,...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Elijah Lynn’s picture

Here are some commands to analyze things like this.

` tree node_modules | tail --lines=1`

> 6302 directories, 82349 files

However, an `strace drush cache-clear all 2>&1 | grep "node_modules" | wc --lines`

> 1804155 (almost 2 million file requests!)

`time drush cache-clear all`

> real: 1.45 seconds

After this patch:

> real 1:25 seconds

This is on an amazee.io docker container with a Ubuntu/GNU/Linux host (aka my Thinkpad) and a pretty bloated Drupal 7 site. Our Mac users were having cache-clear times up 30+ minutes due to Docker for Mac NFS issues. @schnitzel did a good job pointing us here.

marcelovani’s picture

See fix for Drupal 7 here https://www.drupal.org/node/2482549

quietone’s picture

Published the change record