Problem/Motivation
Drupal recursively scans the front end vendor directories: node_modules
and bower_components
.
This causes a noticeable and significantly slower response time when front end developers do common tasks such as clearing the cache (#59) or when rebuilding the theme registry: drupal_find_theme_templates():
drush cc all with node_modules in Drupal theme's root folder: 45 secs
drush cc all without node_modules: 10 secs
Also see statistics from XHProf before and after the proposed fixes in comments #119 (D8) and #105 (D7). In Drupal 8, in certain scenarios, there's over a 2x performance improvement for expensive operations like cache clears, and in Drupal 7, over 3x.
These front end developer tools are very common and highly used (even outside of Drupal).
NPM Stats (https://www.npmjs.com):
175,409 total packages
29,619,131 downloads in the last day
490,004,267 downloads in the last week
1,939,777,453 downloads in the last month
Bower Stats via NPM (https://www.npmjs.com/package/bower):
23,318 downloads in the last day
360,264 downloads in the last week
1,474,986 downloads in the last month
38,599 packages (http://bower.io/stats/)
Grunt Stats (https://www.npmjs.com/package/grunt)
19,102 downloads in the last day
318,143 downloads in the last week
1,290,279 downloads in the last month
All these tools are NodeJS oriented and use the node_modules
folder, with the addition of the bower_components
folder for Bower.
Due to the combined and shared nature of the node_modules
folder, the average size of node_modules
is roughly ~5,000-10,000 files. This can, on occasion, easily skyrocket to upwards of ~30,000-50,000 files depending on the size of the project and the node modules used.
Suffice it to say, this isn't some "small deal" Drupal can afford to simply ignore.
Can't people just use a different directory other than node_modules
?
No. Node has already made it quite clear that they will never let this happen. Also, this would still be considered a "Drupalism" because front end developers new to Drupal wouldn't know about this issue.
Proposed resolution
Add the node_modules
and bower_components
folders as a default for the "nomask" option in file_scan_directory().
Remaining tasks
- Update patch to reflect current issue summary
User interface changes
None
API changes
Technically this is considered an API change, however the event of this happening is very highly unlikely. However:
If you installed Drupal themes or modules via npm, you will be required to move them out of npm's default node_modules folder.
If you grouped sub-modules under the directory name "node_modules", you will be required to rename it.
Adding an update hook as mentioned in #57/#58 to warn users about possible existing directories may be the best way to do this?
Comment | File | Size | Author |
---|---|---|---|
#209 | 2329453-171-reroll.patch | 9.5 KB | japerry |
#204 | 2329453-198.patch | 9.59 KB | alexpott |
#203 | interdiff-2329453-198-202.txt | 1.52 KB | mirie |
#202 | ignore_front_end_vendor-2329453-202.patch | 7.5 KB | mirie |
#198 | 2329453-198.patch | 9.59 KB | steveoliver |
Comments
Comment #1
afoster CreditAttribution: afoster commentedComment #2
afoster CreditAttribution: afoster commentedComment #3
iamcarrico CreditAttribution: iamcarrico commentedAs a note, as users of npm packages we do not have control over which files some npm packages are sent with. We also have no way of changing the node_modules folder name. In this instance... it might not be a terrible idea to prevent file_scan_directory from scanning inside a directory named "node_modules". This project name has yet to be taken on D.o, and there may be a way of safely doing so. That being said, this is an alteration of behavior and for D7 it might be better to just create a patch for those of us that are affected, and put in a full issue for D8.
(For anybody stumbling on this, I *think* if you alter line 2125 of includes/file.inc from
to
It should work. But I say that without any proper testing or full checking of the code. Just as a "you could try this and let me know if it works".
Comment #4
afoster CreditAttribution: afoster commentedHere's a work-around that takes a different approach. Rather then have drupal handle the node_module .info files, I run this bash script (in OSX mavericks) each time after I run npm update or npm install. It finds and renames all *.info files in node_modules to *.inf0 (that's a zero on the end)
I haven't noticed anything bad happen to the node modules themselves as the .info files seem to have something to do with test coverage.
After $npm install or $npm update...
1. Copy file to your theme folder. (Rename attached file or copy code to new file) - node_modules_cleanup.sh
2. In terminal from the root of your theme containing node_modules
$bash node_modules_cleanup.sh
This runs the following bash script.
Comment #5
iamcarrico CreditAttribution: iamcarrico commented@afoster, I did a similar thing (noted here:https://www.drupal.org/node/2309023 ), but I added it to my package.json so it runs automatically. I also like deleting them, because I like to live life dangerously.
Edited: Quotes issue.
Comment #6
gregori.goossens CreditAttribution: gregori.goossens commented@iamcarrico, the only way for me, and maybe not so bad is like your #3 prevent scanning node_modules folder, but need a core update in include/file.inc
i change the nomask option in file_scan_directory function like this.
Comment #7
justb3a CreditAttribution: justb3a commentedI ran into the same issue. Here is my solution:
1. I created a folder in sites/all directory.
2. Then I moved all relevant files into that folder (in my case: .jshintrc, config.json, gulpfile.js, js, node_modules, package.json, sass).
3. Now you can execute `npm install`. Make sure that your minified files will be placed into site/all/theme/mysubtheme/css and /js (configure in gulpfile or in my case in config.json).
That's all.
Comment #8
SebCorbin CreditAttribution: SebCorbin at Makina Corpus commented#6 is a great way to solve this. I've attached the corresponding patch.
Comment #9
stevesmename CreditAttribution: stevesmename commentedI have a hard time saying that #6 is a "great way" to solve this but it does solve the issue. This issue is really only for development purposes; as node_modules shouldn't be pushed to production. I think that npm should fix their issue, https://github.com/npm/npm/issues/775.
To commit this into Drupal core (7.x) would be the lazy way to solve this. There are other methods to get this done and developers should consider doing the alternatives until open source communties work together. + Drupal 8 won't have this issue as .info files are not used.
Related links (outside of Drupal):
http://stackoverflow.com/questions/16738177/grunt-possible-to-relocate-n...
http://stackoverflow.com/questions/15521655/can-i-put-the-npm-node-modul...
http://stackoverflow.com/questions/18974436/change-node-modules-location
Comment #10
LewisNymanI would be a shame if Drupal required extra steps or extra knowledge just to support a workflow that frontend developers can use without a problem in other frameworks.
Comment #11
LewisNymanI'm also getting a few info files in my gem files by the way.
Comment #12
Danny EnglanderI just ran up against this issue myself (got the drush error and admin areas were WSOD) with a grunt / LibSass setup for my contrib theme. It seems like a no brainer to have this added to core but in the mean time I split the difference between #4 and #5. I liked the code better in #4 so I simply added that to run in my package.json file for post install.
... where npm_clean.sh contains the code from #4. One thing to note, For some reason I had to add a .npmrc file that has the code
unsafe-perm = true
, otherwise, I was getting nasty errors. That was even runningnpm install
assudo npm install
.After all this, everything works great, no more drupal admin WSOD, drush works and grunt works.
Comment #13
reikiman CreditAttribution: reikiman commentedFirst - isn't it a bug in Drush or Drupal Core that it is crashing on finding what it thinks is a malformed .info file? Rather than crash it should print an error and abort whatever it was trying to do.
Second - I can't think of a valid reason for a npm module to include something like test results in the packaged distribution. That's a sloppy package maintainer not adding things to their .npmignore file. The package maintainer should be receiving a bug on this.
Third - local modifications to a package.json will just be overwritten the next time the package is updated. Therefore the suggestions y'all made along that line will be ephemeral and break at some random time in the future.
Fourth - just because you're finding lcov.info files in this case doesn't mean some other package maintainer won't have a legitimately named a file with a .info extension.
Fifth - the idea of changing the name of the node_modules directory is a non-starter. That directory name is baked into Node core, and is the methodology it uses to find Node modules. See http://nodejs.org/api/modules.html Npm then builds on the package.json defined in the Node API to support the additional features supplied by npm.
The patch in #8 looks good as far as it goes. Seems to me there's a similar risk from bower packages containing unexpected files.
What's needed is for Drupal to (for certain purposes) simply ignore stuff inside directory structures managed by other tools like npm or bower.
Comment #14
ksenzee#13.3: The package.json in question is at the root of the theme folder - it's not going to be overwritten by a package update. Adding a postinstall script to it is a quick, easy fix that doesn't require patching core. The only problem I see with it is your #4: someone, somewhere, someday, might add a .info file to their npm package that is actually needed.
In theory we could rewrite the code that parses .info files such that it doesn't segfault, but the fix would only apply to D7; D8 uses YAML files instead.
Comment #15
Danny EnglanderYeah, I agree with @ksenzee, #13.3 won't be an issue. I've done a lot of testing around this over the past few days and codified it in a blog post here: http://dannyenglander.com/blog/drupal-drush-segmentation-fault-11-error-...
I tested this by deleting my
node_modules
folder and then runningsudo npm install
and then immediately runningdrush cc all
. No more error. The nice thing about this method is that it's very portable across workflows and version control.Comment #16
reikiman CreditAttribution: reikiman commentedI must have missed the location of the package.json. That does render my point# 3 moot.
BTW - when using npm for a local install it's not necessary or desirable to use sudo. npm used to warn you when ran with sudo.
Comment #17
LewisNymanI like the idea of Drupal catching the error instead of crashing
Comment #18
RobLoachLooking through the tests for
drupal_parse_info_format()
, it seems like its huge regex is not tested against malformed Drupal.info
files.Comment #19
RobLoachThis demonstrates the Segmentation Fault that we get with drupal_parse_info_format(). Expected to fail and kill PHP....
Pull Request and branch up at https://github.com/RobLoach/drupal/pull/3
Can be fixed in other packages to not have .info files too:
Comment #21
albertski CreditAttribution: albertski commentedI had the same problem but I only seen it when creating and updating features locally. I was able to track it down to the .info file (node_modules/grunt-eslint/node_modules/eslint/node_modules/doctrine/coverage/lcov.info) and I came across this post :). #8 fixed my problem.
Comment #22
marcvangendRe #9 "Drupal 8 won't have this issue as .info files are not used.":
Can we be really sure? Is there a standard which says that node modules must never contain a .info.yml file? It would be a shame if have to fix the same problem again for a different file extension.
Comment #23
JohnAlbinSo _this_ explains why my Drupal 7 dev site was seg faulting for the last week. *sigh*
I was installing eslint and that ended up adding this file to my theme: "node_modules/eslint/node_modules/doctrine/coverage/lcov.info" And the file is zero bytes, btw.
This feels more like a major bug than a feature request. If I install the eslint tool (which Drupal 8 is using) into my theme, then Drupal seg faults for all pages.
So the next version of Zen for Drupal 7 will have people run
npm install
in their sub-themes. So this is a huge blocker for me.We should definitely test that theory. But a random .info.yml file will still be in Yaml format, so it might get confused, but its unlikely it would segfault.
Comment #24
JohnAlbinComment #25
JohnAlbinComment #26
JohnAlbinIn D8, we already have tests to check for a broken .info.yml file (InfoParserUnitTest::testInfoParser), but here's a quick patch to see if we can take Rob's .info file and try the same test on it. That should verify that D8 is immune to broken .info files.
After the test bots have their way, I've got an actually-useful patch that prevents Drupal 8 from searching node_modules and bower_components folders.
Comment #27
JohnAlbinI think #26 proves we don't have an issue in D8. So #2 on the proposed solution is already fixed in D8.
Here's the patch the prevents node_modules and bower_components from being traverse when looking for special Drupal files. (#1 on the proposed solution.)
Comment #28
JohnAlbinIn D8 this is a performance improvement (if you have node_modules in your Drupal site).
It's definitely not a major bug in D8, but it blocks a major bug in D7.
So… I'm not sure how to categorize/prioritize this. Do we leave this as "major bug"? or fork this issue to create a separate D8 task report?
Comment #29
marcvangendThanks for the patch, John!
To be honest I fail to see why the D8 patch blocks the D7 fix. The bug in D7 does not occur in D8 (even though it's great if we can do a related performance improvement while we're at it) so I'd say the "fix it in dev first" rule does not really apply here.
Since a chain is as strong as its weakest link, I think this issue should either remain a major bug blocked by the D8 patch, or the D8 patch should be split out into its own issue.
Comment #31
malcomio CreditAttribution: malcomio commentedRegarding Drupal 7, there is already a separate issue for this - #619542: Malformed theme .info files break menu_router generation.
I'd be inclined to add the patch to that issue to cover D7, and leave this issue open for D8.
Comment #32
RobLoachThis PR fixes it so that coverage tests arn't deployed in that package: https://github.com/Constellation/doctrine/pull/104
Comment #33
JohnAlbinThat works for me. I've RTBCed that issue.
Tweaking the title and category since this issue isn't about fixing a WSOD on D8.
Comment #34
bserem CreditAttribution: bserem commentedPatch in #8 works fine for me in the last two months.
Shall I tag it as RTBC and against D7 so that this makes into D7 sometime soon?
Comment #35
alexpott@bserem there's a Drupal 7 issue open already #619542: Malformed theme .info files break menu_router generation
An rtbc should be for the latest patch and Drupal version the patch was tested against. I'm completely unsure of what you've rtbc'd here.
Comment #39
RobLoach@alexpott, #27 is for Drupal 8, which is RTBC. It's not critical in Drupal 8, but it does speed up looking for .info files.
The Drupal 7 one you pointed to is also RTBC. They're being handled in separate issues. While the problems are similar, their solutions are different.
Comment #40
alexpottWell the D8 part needs a CR because if you have a module and have a sub directory called
node_modules
which contains a set of modules that provide node related functionality this change would break that.Could we come up with a string that conflicts more with existing Drupal terminology? Is this a standard in node.js? Same goes for bower_components?
Comment #41
alexpottAlso we should probably open a followup to document a module directory structure and the reserved names we currently have.
Comment #42
marcvangendRe #40: according to http://stackoverflow.com/a/21818724, the directory name "node_modules" cannot be changed.
Comment #43
David_Rothstein CreditAttribution: David_Rothstein commentedSince this is a bugfix for Drupal 7 I would commit it there before Drupal 8, but the unfortunate "node_modules" naming conflict gives me pause too. We could really break someone's production site with this, in order to fix a bug that (mostly?) affects developers.
Also, although the approach here is the quickest way to fix the bug, it's not necessarily the only or best one. For example, I tried these patches:
#2349457-2: preg_match_all segmentation fault on drupal_parse_info_format
#619542-68: Malformed theme .info files break menu_router generation
And with both of those applied, it fixes the problem for the .info file RobLoach provided in #19. (Haven't tested it with all the Node modules/Bower components, but it looks like it should work in general.)
In that case, this issue would just become a potential performance improvement (do we know how much of one?) for Drupal 7 too.
Comment #44
SebCorbin CreditAttribution: SebCorbin at Makina Corpus commentedRelated https://github.com/drush-ops/drush/issues/1445
For drush on Windows, a drush cc all will give errors for paths too long
Comment #45
dman CreditAttribution: dman as a volunteer commentedFWIW, we have encountered the (well described in the summary) crashes also - using Compass for SASS generation.
My work-around was to hide the vendor directory in a dot-folder at install time. This excludes it from scanning.
Things seem to work fine from then on, Bundler somehow knows where to look for them next time without trouble.
bundle install --path .bundle/vendor
I'm certainly in favour of more bullet-proof scanning from the Drupal end though. Segfaulting due to a malformed info file was embarrassing.
Comment #46
JohnAlbinChange record draft is here: https://www.drupal.org/node/2544502
Unfortunately, yep. Both
bower_components
andnode_modules
are the default names for these folders. There's a hidden config to change Bower, but node_modules can't be changed.Going to let the test bot review this again (since its been 3 months) before setting it to RTBC again.
Comment #48
JohnAlbinPatch passes tests. https://qa.drupal.org/pifr/test/996598 Back to RTBC.
Comment #49
JohnAlbinFYI, the D7 backport of this patch is over in #619542-146: Malformed theme .info files break menu_router generation and needs to be moved over here.
Comment #50
alexpottI'm not 100% sure this is need for D8, with this patch we only break things which are not broken at the moment. I'm unsure of what to do here.
Comment #51
webchickSounds like this still could use discussion.
Comment #52
joelpittetI've been using the approach in #2393605: Allow file_scan_directory() to ignore folder to resolve this issue. Although manual it's a bit more generic for things like I added it to sites/all/libraries and sites/all/vendor, and to node_modules in my theme and bower_components.
Comment #53
dddbbb CreditAttribution: dddbbb commented#2393605: Allow file_scan_directory() to ignore folder Is the smartest approach to this problem that I've seen yet:
Comment #54
markhalliwell"We" didn't come up with these folder names. It's a silly coincidence, but nothing more. These folders are what is used on the front end and node_modules cannot be changed (nor should it).
It doesn't break until the front end tools install them (which happens all the time when you're a front end developer).
This is a huge "what if". If a developer knows they can group sub-modules inside a parent folder, it would be highly unlikely they would arbitrarily name it something as generic as "node_modules". Typically anything "grouped" would have a lot of context associated with it, especially if it pertained to a client, tool[set], feature, etc.
These aren't actual projects either:
https://www.drupal.org/project/node_modules
https://www.drupal.org/project/bower_components
If, on the off chance, someone actually did name a folder this.... a really simple solution: rename the folder. For the all these reasons, I cannot imagine how this could be considered even a remotely strong argument for one hypothetical site that did this. A little blurb at the top of the release notes should be sufficient way to notify anyone of this highly unlikely "potential issue".
This is a nice feature, sure, but to use it as the "solution" for this issue is really just an anti-pattern. This would require documentation since developers who normally can just do
npm install
(and be done with it) wouldn't know about this Drupalism they would have to add to their vendor folders because Drupal will recursively look through them to find.info[.yml]
files. Yes, there is a "solution" in that issue to help "automate" this via post install scripts, however that is still yet another "Drupalism".I, for one, would much rather not have more Drupalisms to consider and constantly worry if I added this really weird "exclusion file" to my vendor dirs. Drupal should just ignore them.... period.
These folders definitely need to be added to this "blacklist". This all being said, I actually think this "blacklist" needs to be mutable. Drupal will never get this 100% right, the best we can hope for is sane defaults and allow projects to independently define blacklisted (and whitelisted) directories as needed. Perhaps through settings.php or services.yml?
Comment #55
dddbbb CreditAttribution: dddbbb commented100% agreed with #54. Avoiding a Drupalism is ultimately the best way to go and if we can get away with ignoring these directory names altogether then great (already doing this with https://www.drupal.org/node/619542#comment-9771891 and it works well for me on D7).
Comment #56
joelpittet@markcarver if we go the blacklist route, then it would be nice to have it configurable to cover the cases that the
.scanignore
file deals with.I don't mind the 'sane' defaults, but I'd suggest putting that in a variable_get()/Config/Setting(not sure the right one to use in D8). Then is someone does have a node_modules folder for another weird reason they can change that config, OR more likely if there is another tool that gets added to the Front End Stack, we can deal with it in the project at the time it happens.
Comment #57
Dave ReidWe could probably add an update hook which throws a warning for existing sites *just in case* they have actual modules that would now be ignored due to the change. Easy enough (in Drupal 7) to scan all the existing records in the system table for anything matching
system.filename LIKE '%' . db_like('/node_modules/') . '%'
. I would argue D8 doesn't need this check since it's not official yet and this can be documented behavior. For existing D7 sites, the chances of this causing a problem are slim, but it doesn't mean we shouldn't do anything about it.Comment #58
markhalliwell@joelpittet, RecursiveExtensionFilterIterator already has this black/white list (above patch). I think maybe moving them to settings.php does make a little more sense though. That way the location and method to modify these values would be the same for all core versions. That should probably be a separate issue though.
@Dave Reid, that's a really great suggestion for D7 and agree that D8 shouldn't be an issue.
edit: The D7 update should probably do file_scan_directory instead though since there could (in theory) simply be a grouping folder with no actual module associated with it (e.g. like we do with custom, contrib folders).
Comment #59
corbacho CreditAttribution: corbacho commentedThis hit me hard.. https://twitter.com/dcorbacho/status/632132568587051008
Not only me https://twitter.com/crizzirc/status/632138118695682049
It was not a problem of .info files... it was the fact that my node_modules has 39K files! it took extra 30 seconds to load a page.
As you can see https://monosnap.com/file/rRqDF50bipkVr949Z8Y60zxe33QCV2.png
I agree with mark carver on avoiding "another drupalism".
A temporary working fix without touching Drupal, thanks to Andrew https://www.drupal.org/u/agilmore87
chmod 700
the node_modules folder, so apache/Drupal can't access the folder.I'm working in a virtualised (vagrant) environment with Mac as Host with NFS, so it doesn't allow me to chown the folder, but you might want to explore that route also.
Comment #60
markhalliwellSo I just wasted the better part of today reinstalling my local dev stack because I thought I had messed up when I recently added a new PHP extension (xphrof). Even after re-installation, I was still getting a blank browser window and the cryptic
[core:notice] [pid 57] AH00052: child pid XXXX exit signal Segmentation fault (11)
messages in my Apache logs. I couldn't for the life of me figure out what I did. I even went as far as gdb httpd to figure out that it was a PCRE recursion issue, but that didn't really help pinpoint what was going on either.It turns out I had installed two npm modules, the day before, that added the following files in the theme I was working on:
Because I was working soely on the theme's grunt tasks and not anything PHP related, I installed these modules many hours before refreshing the browser and noticing there was an "issue". Thus, I did not make the connection of what was actually causing the issue.
It's really ironic that even after knowing about this specific issue, someone as seasoned as myself can still run into this and it is a huge "WTF"...
So, while this isn't necessarily a "bug" in 8..x (because of YAML), it IS for 7.x... and a massive one at that. Even though there are other 7.x issues, #619542: Malformed theme .info files break menu_router generation and #2349457: preg_match_all segmentation fault on drupal_parse_info_format, they aren't necessarily related to this specific issue (crap in developer vendor directories that Drupal shouldn't care about).
So since the policy of core is to backport, we have to first "fix" 8.x before this can be properly addressed in 7.x. Even though it doesn't actually "break" anything in 8.x, it would still be a huge DX performance gain which is a win in my book.
Comment #61
markhalliwellThe patch in #27 is all 8.x needs.
The "issue" about a hypothetical "node_modules" module/folder only needs to be addressed when we backport to 7.x with the update hook solution found in #57 and #58.
Comment #62
dman CreditAttribution: dman as a volunteer commented@markcarver I sympathise *so much* with this.
After losing $TOO_MANY_HOURS_TO_ADMIT to it a few months ago, I was recently lucky enough to be in earshot of some other (2 dev + 1 design) team members whose day had just ground to a halt and were discussing raising tickets with a serveradmin or asking for a server rebuild. Their day(s) would have been totally shot, but I had been stung hard enough to point at an innocuous 'grunt' addition the designer had added to make their image-building job easier.
That a bit of utility code (in an unsupported language!) that was only ever meant to run on demand on a designers desktop could bring the entire server to segfault is definitely this years WTF winner.
Comment #63
MattA CreditAttribution: MattA commentedI won't hold up the issue on this, but I would rather see those two directories live under a comment explaining why they are there, instead of under "Front-end". Otherwise it just looks like two random directories that someone who hasn't seen this issue may want to remove in the future.
Comment #64
markhalliwell#63 is just a personal nitpick and honestly makes no sense to me. That is why they're already under the "front-end" comment to begin with. Also, their directory names are actually self-explanatory and even if you didn't know what they were, a quick search could tell you exactly what you needed to know.
I get that core is mostly backend folk, but just because y'all don't use it doesn't make it "random". These front end technologies are actually very common and highly used (they're our "composer" if that helps any). Why do we insist on nitpicking the front end so much... can we stop please?
Comment #65
nicholasruunu CreditAttribution: nicholasruunu commentedI just spent hours on this issue since I was getting 502 Bad Gateway on all routes under /admin on nginx.
Would be great with a proper solution, being able to blacklist node_modules folder or other.
Comment #66
alexpott@markcarver you seemed to misunderstand my comment about
node_modules
- which of course I know we did not come up with. That much is plain. But, here is the rub. Today it is bower and node, but 2 years down the line it might be something completely different. There must be other tools that add folders and files that break Drupal's "let's check everything that might possibly be Drupal" code.The current solution looks wrong to me. The blacklist in HEAD contains concepts that Drupal creates or relies on:
Drupal does not create or rely on the concept of
bower_components
ornode_modules
.To me this fix looks shortsighted, in that does nothing to solve the root problem. We should fix it so the Drupal 7 does not segfault on parsing .info files. We should think about only parsing info files when there is a corresponding .module or .theme in the same folder (d7 only). We should think about a project having to supply package information that details it's sub-modules so we don't have to recursively scan everything. But continuing to add random stuff to this list will eventually break someones site in a really really hard to spot way.
Comment #67
alexpottAlso I think if we want to quickfix it - then let's quickfix it for D7 and handle D8 in a better and more complete way.
Comment #68
catchThe backport policy applies when the same issue affects both 8.x and 7.x. Since 8.x doesn't have a fatal error (or any error in fact), I think this patch could go into 7.x directly.
I'm not sure at all about the 'huge performance gain' - this is a couple of folders in a couple of modules, and a folder that shouldn't be checked into git iirc, so it's only a performance gain on dev at best.
The ability to skip sub-directories for scanning sounds useful (and agreed that'd be better than manually adding things to the list each time), just don't think we need that to fix the fatal in 7.x.
Comment #69
markhalliwellNo I got that you meant it sarcastically; I'm just tired of the front end being treated in a disrespectful manner. It's offensive and, quite honestly, this type of attitude towards the front end is why I had to "quit/take a break" from core.
The same could be said about composer. I get that compared the the back end, the front end is faster moving and could be considered by some as "unstable". However that is why tools like this came to be: to help bring stability and conformity.
I don't see node or bower going away anytime soon. This is a huge "what if". The fact of the matter is this: these tools are widely used now, even if something else comes along and we start to see a shift towards this new "technology", it wouldn't change overnight. This means that these directories would still be valid for likely many years. To say that we can't address this "what if" in the future is a complete cop-out. Please, stop arguing hypotheticals.
Which is why I suggested this list get moved into settings.php in #57, however that doesn't mean we still shouldn't provide sane defaults for something very common on the front end. Based on this, I guess this issue really needs to be CNW instead of RTBC then.
Yes, I understand that "Drupal" itself doesn't necessarily rely on these, but core isn't just "backend" components. The fact that core provides an API for themes (theme system) means that it is aware of this extensibility. Even if core itself doesn't use these directories, it inherently touches them because of this extensibility. By adding these directories, you're in fact saying that core doesn't use them and that they should, in fact, be ignored..... which is what this whole issue is about.
Again... not "random" and a huge "what if". There's already been a solution suggested for a way to warn any "potential" sites that may use these directories in #57 and #58.
Sigh... no, this isn't necessarily a "production" performance gain, but it is a massive front end DX performance gain.
Backend developers don't see this issue because they aren't developing with these tools.
To arbitrarily dismiss this as "trivial" just because it doesn't affect you is wrong. It severely cuts into a front end developer's productivity time (see #59). We shouldn't have the opportunity to refill our coffee after initiating a cache clear.
---
I know this response may seem "harsh" or "emotional", but it isn't directed towards anyone in particular. I'm just fed up with my voice (and other front end developer voices) not being heard and our issues not being acknowledged or taken seriously. These kinds of issues are just one of the reasons why we have a low front end developer ratio in core (recommend reading the entire post when you get a chance btw).
Comment #70
markhalliwellAlso, I'm not saying that this issue should go in to "fix" the D7 bug; it would just be a happy coincidence and easy fix for info files found in the node_modules folder. This issue is simply about excluding front end vendor directories (that Drupal shouldn't traverse to begin with) to improve front end DX.
The malformed .info file bug in D7 are entirely separate issues:
#619542: Malformed theme .info files break menu_router generation
#2349457: preg_match_all segmentation fault on drupal_parse_info_format
Comment #71
catchI don't see the problem with fixing the fatal error in 7.x, which David already indicated he'd be happy to commit without an equivalent 8.x patch (since there isn't one), then handling the performance issue separately. Bundling them together does not help them get fixed faster, it just muddies the discussion as this issue shows. I'd imagine the .info fix (and even the quick fix for performance) could get into 7.x relatively quick, then we could open a separate issue for 8.x to look at performance there.
The directory scan for .info files should not be happening every request, if it is, something else is wrong. 30 seconds is indeed a long time and thanks to pointing to that comment though. Would have been useful if it was in the issue summary with a clear indication of what context the performance improvement is in (and that isn't clear even from the comment). These details are important. Also developer facing performance issues are minor relative to something that happens on production - doesn't matter whether that developer is front or back end. We should still fix them, but there's a massive difference between visitor vs. content editor vs. site builder vs. developer in terms of where the focus is on optimisation.
Also missing from this issue is an indication of how deep these directories are. I noticed in drupal_system_listing() that we don't pass any maximum depth argument to file_scan_directory(). Normally modules live at most in mymodule/modules/mysubmodule/mysubmodule.info. At least for 8.x we could enforce a maximum depth quite easily. I don't know if that would help or not because the information is not here.
Loads of core issues don't personally affect me, whether or not they're front-end developer facing is only one factor in that - I'm much more likely to be looking at js (or preprocess/templates/CSS) than some site building interfaces during a normal work day. Just pushing back on patches is not 'disrespecting', disrespecting an issue would be ignoring it.
Comment #72
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commentedRight, so I would be willing to commit the current approach to Drupal 7 (with the changes mentioned in #57 and #58) as a (possibly temporary?) workaround. I would definitely prefer to see #619542: Malformed theme .info files break menu_router generation and #2349457: preg_match_all segmentation fault on drupal_parse_info_format fixed though - they are real independent bugfixes and both look pretty close to ready except perhaps for automated tests (the latter definitely needs manual testing too), because at that point this issue would just be a performance discussion for both Drupal 7 and Drupal 8 and we could focus on figuring out the best way to deal with that for both. But whatever happens, happens.
This seems interesting and I think it would work for modules, but I believe .info files are the only ones required for a Drupal 7 theme.
Comment #73
markhalliwellI will have to come back to this later to finish, but I've somewhat updated the issue summary to rectify the confusion of what this issue should be tackling.
Push back should be constructive and ask questions (as you did and I have subsequently updated the issue summary because of it).
The previous comments leading up to this point were merely speculative and rather dismissive, not "push back".
---
Also, after looking at what RecursiveExtensionFilterIterator is, I can see a little confusion here. I thought this black list was a comprehensive file_scan_directory "black list". It's not...
The patch should instead focus on adding these directories to file_scan_directory() as a default "nomask" option. As this issue is pertinent to anytime Drupal does directory/file traversal (e.g. drupal_find_theme_templates() and theme registry rebuilds).
Comment #74
quicksketchAs an alternative approach, perhaps we could also just specify a custom nomask in
drupal_find_theme_templates()
anddrupal_system_listing()
?If that approach were taken, we could further optimize the scanning of files by skipping css, js, images, and other directories as well when parsing for .tpl.php or .info files. I also like the idea of adding a nomask list that is appropriate to each situation, rather than including something specific like node_modules in the default list.
Comment #75
quicksketchHere's a stab at the approach from #74. No change to the default nomask, but instead using a specialized nomask in
drupal_find_theme_templates
anddrupal_system_listing()
.Comment #76
quicksketchPatch attached. Just setting to 7.x-dev for testing.
Comment #77
quicksketchComment #78
JohnAlbinThe patch in #76 looks interesting.
Took me a bit to realize why you changed this line. But then noticed you want to skip 'templates' folders for system files, but not for theme hooks.
Let's have the bot have a look at it.
Comment #80
JohnAlbinOh, duh. That's a D7 patch. :-p
Comment #81
markhalliwellYes, #76 is definitely an interesting approach. I saw this yesterday and my first impression was like "Doh, that makes a lot of sense!" However, I decided to think on this a bit and I still think putting this as a sensible default inside of file_scan_directory() makes the most sense.
I think what it it boils down to is this: because
node_modules
andbower_components
are commonly used, they can actually appear anywhere (modules, themes, libraries, profiles, etc.). There are 22 calls to file_scan_directory() in 8.x and 26 calls in 7.x.Since this list can change, I don't think an "individualized" approach is the right direction for something as common as these front end directories. They can appear anywhere and should always be ignored by core, which is what this issue is about. We shouldn't have to remember this each and every time a new call to file_scan_directory() is used (which will lead to future issues about this very topic again).
Comment #82
jenlampton@JohnAlbin the 'templates' probably threw you off. ;) In D7 'theme' is also used (and perhaps should be added?), but in Backdrop we switched to 'templates' much like D8.
I like the idea of adding a nomask to
file_scan_directory()
because then each time it's used it can be optimized for that specific purpose. I haven't looked at all 26 calls to see how different they are, but it could be a handy tool regardless of this particular issue.@markcarver is there a use case for a contrib project trying to include files from
node_modules
orbower_components
using file_scan_directory?If not, then maybe we should do both: allow
file_scan_directory()
to be *specifically* optimized everywhere, and always excludenode_modules
orbower_components
too.Comment #83
markhalliwellIs this a possibility, sure... I suspect there may someone who may be using it this way. However this isn't really a "Best Practice™" and cannot imagine this being very commonplace. These folders are for development purposes only (e.g. to compile the files that Drupal would actually consume) in the same way that they should never be committed to the repo either.
Yes, I think that is the most sensible approach.
My main concerns with not specifying
node_modules
orbower_components
in file_scan_directory as sane $nomask defaults is that we will most certainly run into this issue again otherwise.That way, when someone decides to override $nomask (as in #76), then they'll see these defaults and will likely include them along with their other specific uses cases.
We would undoubtedly have to add a comment explaining why these front end directories are sane defaults in file_scan_directory though. That way if/when someone chooses to ignore them (for whatever reason) they will know just how much it can impact local development.
Comment #84
jenlamptonAh, I see! If
$no_mask
is not provided whenfile_scan_directory()
is caled, we throw those two in. If$no_mask
is specified, we assume they are included. I like it :)Comment #85
jenlamptonQuestion: do we really still need the 'CVS' in there?
Comment #86
quicksketchI can't think of a reason why it would be necessary. Because CVS is no longer used for modules, it's very unlikely that a CVS directory is going to be within a module or theme directory, which is where file_scan_directory() is called most often (whether to find .info files or .tpl.php files).
The only even somewhat likely location for a CVS directory would be in the root of the project, in the event that the developer preferred to use CVS for deployment to the server or localhost development. In such a case, file_scan_directory() is never called on the *entire* installation (it's always in the modules directory, profiles directory, files directory, or themes directory), so it would never come into play. And finally, even if a CVS directory were there, this would only have a performance impact, not actually change the functionality of the site. So it's very unlikely to affect anyone negatively, and almost impossible it would actually break a site.
Comment #87
markhalliwellFollowing the thread over at https://github.com/backdrop/backdrop-issues/issues/1110#issuecomment-139..., I would say that we should also add
scss
andless
to the$ignore_directories
list.Comment #88
kaidjohnson CreditAttribution: kaidjohnson commentedWhile #76 and #84 both solve the issue for me (I was getting segfaults when using drush due to scanning complex node_modules directory in my theme), they also both prevent some modules from being picked up. Before the patch, my modules list was 84 deep. After the patch, my modules list was 82 deep.
The two modules that disappear are 'link_css' and 'retina_images'. It looks like the regex isn't quite right to avoid this unintended exclusion, but my regex foo isn't quite strong enough to put together a reliable solution.
This patch (D7) at least prevents the segfaults I was getting form having bower_components and node_modules in my theme, but is less aggressive in terms of what else it ignores. While I think excluding common folders like 'css' and 'images' is potentially a good thing, perhaps less is more here and we should target the core issue (node_modules/bower_components) here and worry about further performance improvements by excluding common folders (css/js/templates/images) in another thread?
Comment #89
kaidjohnson CreditAttribution: kaidjohnson commentedWhoops, managed to bump a '?' into one of the regexes that wreaked havoc on the admin theme. Reverted back to regex pattern from #84 but without the possible string matching conflicts.
Comment #90
makbeta CreditAttribution: makbeta commentedI would second the request to handle js/css/templates/js exclusion separately.
Ex: excluding templates folder completely destroys the functionality of fences module & screws up the HTML of the site
Good idea to sass/scss/less directories since they are commonly used as source folders throughout
Might be a good idea to add a /src directory as well
Comment #91
gmclelland CreditAttribution: gmclelland commentedThanks @kaidjohnson! I had the same problem with #84, it wouldn't detect the hires_images module.
After the patch in #89, the hires_images module was detected.
Comment #92
corbacho CreditAttribution: corbacho commentedI agree with #88, we need a quickfix
This continues being a big problem. Today I saw someone else hitting this problem https://twitter.com/illepic/status/651175874281123840 and this stackexchange question is getting very popular http://drupal.stackexchange.com/questions/126880/how-do-i-prevent-drupal...
Comment #93
malcomio CreditAttribution: malcomio commentedSeems like there is some confusion about the related D7 and D8 issues. This issue relates to Drupal 8, and segmentation faults no longer occur in D8.
For the Drupal 7 issue (with a patch that is currently RTBC) please see #619542: Malformed theme .info files break menu_router generation.
As regards the Drupal 8 issue, my vote would be to just add node_modules and bower_components to the blacklist, as per the patch in #27, as long as a change record is issued to advise people to avoid using those names for modules.
Comment #94
kaidjohnson CreditAttribution: kaidjohnson commentedThanks, @malcomio.
I'm concerned that #619542 is a slightly different issue and while it may 'fix' the issue we're seeing, it also 'masks' the issue further. If a theme has node_modules with multiple .info files nested within, that fix will simply register those info files with a safe default name, as far as I can tell. This seems wholly unnecessary and disconcerting as we're registering files that were never intended to be a part of the Drupal system.
Here we're attempting to completely ignore and disregard entire subsets of files that are known to not be registered by Drupal. I see the symptoms being the same between these two issues, but the reasoning behind them are very different, and the solutions should both be implemented for maximum stability and minimum overhead, imo.
Comment #95
webcultist CreditAttribution: webcultist commentedNot sure if it matters for this issue, but I tried D7.40 where the fix from #619542 should be included and the page still breaks with an internal server error when though the json3 nodejs module through it .info file.
Not on all pages but when deleting the cache and obviously when calling the module page...
Comment #96
David_Rothstein CreditAttribution: David_Rothstein as a volunteer commented@webcultist: See #2349457: preg_match_all segmentation fault on drupal_parse_info_format which is mentioned above.
Comment #97
webcultist CreditAttribution: webcultist commentedAh sorry, didn't mention this ticket. Thank you very much!
Comment #98
andypostI read a whole thread and looks there's no consensus.
#71-72 said that issue should be fixed for D7 first so #89 looks great except "CVS" and some tuning for code comments (explain about node)
#27 patch for D8 still valid and requirement to do frontend without loosing sanity
I think we need to improve #89 get it commited and use #27 for d8 with @todo link to separate issue to make this configurable.
Comment #99
pablo.guerino CreditAttribution: pablo.guerino at Skilld commentedHello,
Adding patch following comment #98 from andypost for drupal 7,
hope it passes testing !
Comment #100
andypostsuppose that ready for 7.x
change record is there, maybe we need separate for 8.x later
Comment #101
kaidjohnson CreditAttribution: kaidjohnson commentedUpdated comment style to more carefully conform to style rules.
From https://www.drupal.org/node/1354#inline:
Comment #105
geerlingguy CreditAttribution: geerlingguy as a volunteer and at Acquia commentedFrom some quick local testing, after one of the FE devs I work with complained of 1 minute+ cache rebuild times:
Before patch:
After patch:
It makes development sooooo much nicer for the poor FE devs who use themes that download hundreds of nested dependencies. This was on a pretty new D7 site with only about 30 modules enabled, almost nothing custom (just a bunch of features/exported content types).
What's up with the 2nd set of tests showing 3 fails? Is testbot just misbehaving?
Comment #106
kaidjohnson CreditAttribution: kaidjohnson commentedThanks for those benchmarks! Those are very promising results. Always nice to see a bug fix also result in performance improvements. :)
Yeah, I don't know what is with those tests... I've been using the patch on a development site for a few days now with no side effects, so I'm assuming the testbot is acting up, but I'm not 100% sure.
Comment #107
joelpittetAs I mentioned in #56 and agreed by @andypost in #98 (I think). Here is a patch that makes that patch configurable through your settings.php for your consideration.
Comment #108
joelpittetsmall double space on concat nitpick fix.
Comment #109
geerlingguy CreditAttribution: geerlingguy as a volunteer and at Acquia commentedignore_directories
seems a much too generic variable name... maybefile_scan_directory_ignore_directories
instead?Comment #110
joelpittetWas thinking that too, maybe just
drupal_ignore_directories
to namespace it? But I'm fine with that really long one too. Feel free to roll that in depending how you feel.Comment #111
geerlingguy CreditAttribution: geerlingguy as a volunteer and at Acquia commentedAttached a patch that makes the variable name
drupal_file_scan_ignore_directories
. That way it's namespaced, and a little more specific. Since it doesn't apply globally to everyfile_scan_directory()
invocation, it makes sense to not name it after the function itself.Forgot to commit twice to my local branch, so there's no interdiff, but I only changed the variable name itself in the three lines/files where it was added.
Comment #112
joelpittetWas RTBC #105, re-RTBC because the variable was just made configurable and a name change to avoid conflicts.
Comment #113
markhalliwellThis should be fixed in 8.x first.
I've been working on #2609316: Port Bootstrap to Drupal 8, constantly doing cache clears and it's really starting to tick me off. Each time I do one, it's a very noticeable performance hit because I have my developer hat on where all these files are present (which is what this issue is all about).
The only reason this has been moved back to 7.x is because there was some confusion that this isn't an issue in 8.x. That is wrong. This issue is not about the malformed .info[.yml] files (that was a separate issue in 7.x and was already fixed). This issue is about the fact that core traverses unnecessary front end vendor directories.
Comment #114
chellman CreditAttribution: chellman commentedHaving to clear cache repeatedly is certainly annoying, but this is a crasher in Drupal 7, so don't think rewriting this for Drupal 8 first is good, especially when we're at RTBC for the Drupal 7 patch. I don't want to diddle the tags again (yet), but I'd really like to see this fixed in 7.
Comment #115
markhalliwellThis issue is strictly about file system traversal performance (which is still an issue in 8.x and has absolutely nothing to do with parsing). The related issues (which have already been mentioned and attached to this issue) for fixing 7.x crashing are:
#619542: Malformed theme .info files break menu_router generation
#2349457: preg_match_all segmentation fault on drupal_parse_info_format
Comment #116
hussainwebI am just adding a patch with hard-coded directories now. @geerlingguy, @markcarver, can you test and see if this makes a difference and then we can figure out how to make this configurable?
Comment #117
hussainwebFound another usage. Please test this one.
I also read this issue again and I see that it has oscillated between D7 and D8 several times. I am not sure where we stand right now but I am posting the patch anyway.
Comment #118
catchI definitely conflated this with the related issues when I first looked at it, so agreed on 8.x assuming the problem is the same
Comment #119
geerlingguy CreditAttribution: geerlingguy as a volunteer and at Acquia commentedD8 Before patch:
D8 After patch:
Difference: 111% faster cache clear after patch, ~250,000 fewer function calls
This was with Drupal 8 HEAD (as of today), with and without the patch in #117 above. For flexibility's sake, I would like to see those two options as configurable options in Drupal 8 (like they are in the D7 patch in #111), so setting back to needs work for that.
I tested with the same theme directory node_modules and bower_components directories that I had in the D7 tests above.
Comment #120
geerlingguy CreditAttribution: geerlingguy as a volunteer and at Acquia commentedComment #121
joelpittetI think this is more generic and may cover off the config need. Hopefully didn't overstep my bounds. @hussainweb let me know if that works for you?
And hopefully does the trick still @geerlingguy?
Comment #122
joelpittetOh and I removed
(\..*)|
because it's covered with the$filename[0] != '.'
I figuredComment #123
hussainweb@joelpittet: Not at all. :) Thank you for the changes. It is still missing in
RecursiveExtensionFileIterator
, and we should use it there as well for the sake of consistency. I don't think we should directly useSettings
here, but inject it. I will see if I can attempt a solution later today.Comment #124
joelpittet@hussainweb thanks. I agree with the Settings injection if possible and hopefully I got the wrapping syntax on long conditionals correct because I don't do that often.
Not sure if we need the configuration on
RecursiveExtensionFileIterator
but if you find a way to do it well, all the power to ya;)Comment #125
brahmjeet789 CreditAttribution: brahmjeet789 commentedComment #126
Lukas von BlarerI have been using this patch for a few days now and works perfectly. I am using Vagrant for development which is affected by this extremely. Page load times after a cache rebuild decreased from 60s to 10s.
Comment #127
hussainwebHere is my attempt to use
Settings
inRecursiveExtensionFilterIterator
. But with this, I am afraid that I am changing API and this might go to8.1.x
. Can we commit the patch in #121 to8.0.x
and this patch to8.1.x
? :PComment #128
joelpittet@hussainweb maybe a postponed follow-up issue for 8.1.x changes?
Comment #129
Lukas von BlarerSounds reasonable.
Comment #130
hussainwebI created a follow-up for 8.1.x at #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance. We can review the patch in #121 here to keep things simple.
Comment #131
joelpittetI've done another performance test.
Scenario: I installed basic theme,
npm install
the modules it comes with.Apply this little microtime test patch and clear the cache on the command line via
drush cr
.Apply the patch in #121
and clear cache again.
Comment #132
markhalliwellHere's the above performance test with
npm install
for Bootstrap 8.x-3.x-dev:Comment #133
markhalliwellWe should only provide a default "nomask" if one wasn't already provided.
As it stands now, this code will always exclude the "default nomask". It should extend (+=) the existing
$options['nomask']
array with the defaults instead.Comment #134
johannez CreditAttribution: johannez as a volunteer commentedI have a vanilla Drupal 8.0.2 install and use the Basic theme after "npm install" and "bower install".
Added #131 for performance testing and did a "drush cr" before and after the patch #121:
Before
0.0304069519043
0.170646905899
0.000127077102661
After
0.0295081138611
0.00653886795044
0.000136852264404
Comment #135
apratt CreditAttribution: apratt commentedI have a vanilla Drupal 8.0.3 install and use the Basic theme after "npm install".
Added #131 for performance testing and did a "drush cr" before and after the patch #121:
No patch
0.026458978652954
0.20892095565796
0.00011992454528809
After patch
0.019454002380371
0.0019950866699219
7.4863433837891E-5
Comment #136
joelpittetHopefully addresses #133. This doesn't effect the performance at all because that function doesn't get called for modules/themes discovery anymore.
Comment #137
lauriiiReviewed the code and it looks good. We have still 2 needs tags that needs to be solved before this is RTBC.
Comment #138
joelpittetI know people are using as am I the d7 patch in production. Plus the performance tests should count for as much extra too.
If someone is near a computer to update the issue summary, please do
Comment #139
joelpittetRead through the issue summary, it seems good to me. The tag was added in #73 and updated in #120
Comment #140
lauriiiComment #141
catchThis should be documented in one of the settings.php examples.
Otherwise looks good though.
Comment #142
andriyun CreditAttribution: andriyun at Skilld, Drupal Ukraine Community commentedPatch with added description and example to default.settings.php
Comment #143
star-szrThanks @andriyun, good start :)
I would say "Uncomment the below strings to enable", that makes it a bit less terse. However it's also a bit misleading IMO to say they are enabling anything when the default is the same as uncommenting.
I also think this either needs to be combined into one paragraph or split into multiple paragraphs, the way the lines are bunched together doesn't match with our docs standards or the rest of default.settings.php:
https://www.drupal.org/node/1354#general
Comment #144
andriyun CreditAttribution: andriyun at Skilld, Drupal Ukraine Community commentedAdded adjustments to description:
Comment #145
andypostwould be great to add line comment here about where that comes from
Comment #146
joelpittet@andypost how about
// Deeply nested Node.js folders.
Comment #147
andypost@joelpittet great, just for consistency with other ones in the list
Comment #148
joelpittetThanks, there we go.
Comment #149
andypostThanx! Let's get this in
Comment #150
alexpottI don't think we can ignore existing sites and BC, unfortunately. It is perfectly possible to have created a directory in modules called node_modules with a load of custom modules to do with node integration in it. Such a site would be broken by this change.
In order to get around this we need to add the setting to settings.php without it being commented out and the default value in code should be an empty array. That way new sites get the new settings but old sites are not broken.
Another problem is that the setting makes this look extensible but in fact it is not...
Hardcoding here... shouldn't we be getting the new setting and adding it to the blacklist?
Comment #151
hussainweb@alexpott: We are using a blacklist in #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance which targets 8.1.x. Since using a blacklist properly would mean a change on the interface, we didn't do that for 8.0.x. See comments #127 to #130.
Comment #152
catchAdding it as a default in settings.php rather than inline makes sense to me.
Comment #153
alexpott@hussainweb there's no change to any interface - but I'm pretty certain that this is not something we'd do in a patch release considering the nature of the change - it is a task not a bug.
Comment #154
hussainwebSorry, I don't know why I said interface. We would need to be able to inject the blacklist in the RecursiveExtensionFilterIterator which (kind of) changes the constructor. I agree that can't be done in a patch release and that is why there is another issue for 8.1.x. See the patch in #2640572-2: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance.
The patch in this issue does not break anything (in the API) but unfortunately has to hardcode the directories. Calling Settings::get from inside the iterator seems like a smell. Theoretically speaking, yes, it is possible that there would be a directory called node_modules with custom modules but somewhat unlikely. In any case, it is possible and it counts as a break. Maybe we could remove it from the iterator object in the patch here (for 8.0.x) and get some performance gains at least. We go all the way in the other issue which is for 8.1.x. Thoughts?
Comment #155
markhalliwellThe odds of this actually occurring seem astronomically microscopic considering how, we as a community, generally think and name things using "Best Practices". The name "node_modules" has never been a "thing" in Drupal, even as a remote concept AFAIK and was and has been only ever introduced as a known FE "thing".
I find this argument extremely weak and hypothetical. Just because something is possible doesn't always make it a reality. It would be far different if there was a previous core module, naming convention, or published contrib module... none of which has happened.
If this truly is a concern, the solution is simple: warn the one or two hypothetical users that have done this about the change in an update hook, as mentioned in the IS (I personally still think this is overkill).
Also, this really isn't an "API break" anyway. Any "fix" would be relatively simple: rename the very hypothetical folder/module.... boom, done.
This change would not actually affect any logic inside the very hypothetical folder/module (which is what an "API break" really is about).
Comment #156
markhalliwellComment #157
alexpott@hussainweb well then this patch should remove all configurability because having it configurable in one place and not another does not make sense. Let's just combine the changes and do this in 8.1.x - it is around the corner and the correct place to make non-bug fixes anyway.
@markcarver trading a broken site against a slow cache clear is not a hypothetical comparison. This can change can break sites. Please stop telling me this is hypothetical. Unfortunately we allow our folder structures for discovery extensions to be a complete wild west. But also note, I've note been saying don't make this change I've been trying to help this change go in in a way that will not break existing sites - so, exactly what is the problem?
Comment #158
alexpottAll I'm suggesting is that we do this. And we need to test the new setting probably by extending
Drupal\system\Tests\File\ScanDirectoryTest
forfile_scan_directory()
changes and we need to add a test for the ExtensionDiscovery changes somewhere...Also I don;t think we need to prefix the setting with
drupal_
this is a Drupal setting after all.Comment #159
alexpottHere's the test for
file_scan_directory()
. Still need a test forExtensionDiscovery
Comment #160
markhalliwellI wasn't saying it towards you specifically. I was using it to point out that issue keeps slanting towards a tone that has no real merit.
So, no. I'm not going to "stop" using that word, especially when it's being used quite appropriately:
That this issue can, or rather, might break a site. That statement isn't fact. It's bolstering a "what if" scenario that has no real basis on truth, whatsoever, other than the coincidental name of the folder. There is a minute possibility (if not, complete improbability) that some individual decided (in 8.x no less which has far fewer contrib development than 7.x btw) to name their module or group modules in a sub-folder with the same names?
Really?
I've seen stuff go in far more damaging than this, often without any perceivable foresight into the damage it "might" cause. That fact that this issue has already identified this scenario and provided a viable "solution" (update hook, see IS) is a far better deal than other core issues that had no trouble "breaking" the FE left and right in the name of "progress" (or "tasks" I should say). In fact, it actually astounds me to see what core picks and chooses as a "BC break".
It's hypocritical, a double standard and quite frankly... complete rubbish.
---
I'm not going to argue anymore about this. I have repeated myself multiple times now, with clearly no ability to convey English in a comprehensible manner.
So, since 8.1.x is "just around the corner", then let's just stop the bikeshedding and taking things personally. The only reason this issue is on 8.0.x is because of the backport policy.
---
When we do backport this issue finally to 7.x, it will still need some sort of mechanism to warn users of ignored folders/modules that might have the name. I thought It made sense to figure that out here/now... but I guess it's being decided that it's now "out-of-scope" until we backport?
Comment #161
markhalliwellThis re-introduces the bug where
$default_nomask
is always applied even if an explicit$options['nomask']
value was provided, see #133.Comment #162
hussainwebDoesn't this constitute an API break? If you see the patch in the issue for 8.1.x, this is exactly what we do there (because the function signature changes).
Comment #163
hussainwebOh, I see you changed this issue to 8.1.x-dev. Do you mean we shouldn't really try this for 8.0.x?
Side-question: What happens to 8.0.x after 8.1.0 comes out? Will it be supported for bug-fixes for some time?
Comment #164
markhalliwellYes, that's what seems to be the consensus. I was going to change it to 8.1.x, just for the sake of stopping the the "argument". I really don't care when/how this is done... just that it's done so we can move on to the 7.x backport.
Comment #165
alexpottre #161 mmm that means the patch on #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance was wrong - which shows the important of having a test. Which this patch now has for this.
Comment #166
alexpottGiven that my tone has no real merit, I guess my addition of tests has no real merit either. So I'm not going to add the test for ExtensionDiscovery because I don't want to spend my time doing that. But a test does need to be added before this should be committed.
Comment #167
markhalliwello_O
I did not say that you or your tone "has no real merit". I said [this] issue keeps slanting towards it when it's really not necessary, especially given the solutions already provided and the context for what this change implies (and its actual reality of impact). Ok.. done.
---
Re: #2640572: Allow extending the blacklist in RecursiveExtensionFilterIterator to improve directory search performance
I think that issue should be closed since this one is now 8.1.x. It's only adding to the confusion, especially considering that work has been make on the above patches (in this issue) after that other issue/patch was created.
Comment #168
alexpott@hussainweb once 8.1.x comes out 8.0.x is unsupported - see https://www.drupal.org/core/release-cycle-overview#minor
Comment #169
hussainweb@alexpott: Thanks! I think then there is no point trying a backward compatible way for 8.0.x. 8.1.x it is.
Comment #170
Mile23+1 on this. It's good to make it configurable, and putting it in settings.php means it can be backported and D7 benefits as well.
Using settings is a good idea, but then we have to explain it to the user better. This comment and maybe other comments need to explain the scope of the ignore. It's not obvious what's Drupal scans and what it doesn't, so what's an end user to think?
Maybe something like "Will ignore directories with this name in any directory scan, such as finding extensions." and also add @see file_scan_directory and @see ExtensionDiscovery.
Comment #171
alexpottAs no one else wrote the test I decided too and hey look the changes to
RecursiveExtensionFilterIterator
didn't work. Patch attached fixes that and tests it.Comment #173
alexpottUnrelated random test fail due to #1993928: Language of parts: Introduce a language toolbar button
Comment #174
alexpottComment #175
andriyun CreditAttribution: andriyun at Skilld, Drupal Ukraine Community commentedLooks perfect for me.
tests, docs, proper comments
Thank you @alexpott
+1 to RTBC
Comment #176
neclimdulThis seems out of date now, there is in fact a default.
Does this need to be escaped? We should be able to trust settings files but it is not clear from the settings documentation that it is a regex you're providing not a directory name. IMO we should document that it is a regex, or we should preg_quote().
Globals are arguments so we are hiding a API change to ExtensionDiscovery::scan() here. I tried to come up with a suggestion for clarifying it but ExtensionDiscovery is quantifiably complex. Maybe it is just ok? Maybe we make the blacklist an argument to scan() similar to file_scan_directory? I don't know.
This seems less clear than it could be. This provides the default list of directories that will be ignored by Drupal's file API's, it is only hard coded to disable scanning in ExtensionDiscovery as mentioned in the previous bullet.
Comment #178
alexpottThanks for the review @neclimdul
file_scan_ignore_directories
setting. That would open all sorts of interesting ways to break stuff.Comment #179
alexpott8.2.x automated testing was turned off - uploading the patch again...
Comment #180
donquixote CreditAttribution: donquixote as a volunteer commentedThree things.
------------
First: file_scan_directory() is already a poor piece of code, architecturally and performance-wise.
A list of flaws: #2696335: Optimize file_scan_directory() / drupal_system_listing()
Adding more complexity (special case logic, configurability) to it only will make it harder to refactor.
--------------
Second: You are saying that the folders will be named node_modules and bower_components. Fine.
But why do these folders need to live in the same parent folders that are scanned for Drupal extensions?
E.g. with Composer Manager, or with Libraries API, this 3rd party stuff would live in folders outside of the Drupal extension paths.
The same should apply to nodejs and bower.
Can someone explain why this is not possible? Maybe I am missing something.
---------------
Third: In my view, the Drupal extension folders are "owned" by Drupal, or by the respective extension (module, theme). The extensions should control what is scanned by Drupal and what isn't.
In an ideal world, this would be a whitelist: E.g. a file at the root of the module/theme package, similar to a composer.json, would tell Drupal about folders that contain templates, themes or modules.
Since this is not the case right now, the option left to us is a blacklist. Either a central file that tells Drupal which folders *not* to scan, or local files to suppress scanning individual subfolders.
Or maybe: If no central file is found, scan the entire module/theme. If a central file is found, let this file control what is to be scanned and what isn't.
Either way: The thing that tells Drupal what to scan or what not to scan would be Drupal-specific. It would be named ".drupal-whatever", and it would only be read by Drupal. I guess one could call it a Drupalism. The extension written for Drupal. So what? A composer.json is a Composerism. A package.json is a nodeism. So what?
A "do not scan the node_modules subfolder" or "please scan the following folders for Drupal-related stuff" should be specific to an extension (module, theme), not system-wide.
--------------
I am happy to discuss this on IRC, if we want to reduce noise in this issue. I was too impatient this time.
Comment #181
donquixote CreditAttribution: donquixote as a volunteer commentedAnd btw:
Since file_scan_directory() already uses (indirectly) a service (via file_stream_wrapper_uri_normalize()), I think we should make the entire thing a service, and reduce the function in file.inc to a \Drupal::service(..)->..() call.
Another btw:
What if the function is called with a $depth !== 0 ?
The only solution is to split this into a public part, and a "private" recursive part.
EDIT: Oops, sorry. The docblock says to not call this with an explicit depth.
Comment #182
donquixote CreditAttribution: donquixote as a volunteer commentedOh, another btw:
If a $options['recurse'] is false, then the is_dir() never happens.
If now a directory matches the $mask for files (e.g. a directory named "foo.html.twig", then this directory is added to the $files array.
Comment #183
donquixote CreditAttribution: donquixote as a volunteer commentedI thought about this some more, and did local experiments.
Whether my previous comments make sense or not, I no longer see them as a reason to block the patch (#179).
Comment #184
dddbbb CreditAttribution: dddbbb commented@donquixote Regarding your second point/question from #180. bower_components can be configured to live pretty much anywhere and be renamed. node_modules can do neither. It's common for themes to rely on node_modules (e.g. to use a task runner like grunt/gulp, process files etc), this normally means that node_modules will need to be present within the theme (normally theme root).
Comment #185
Mile23Re #181:
If extension discovery is a service, it should not have other services as dependencies, and must, never, ever, ever, ever call anything on \Drupal. :-)
That way we can use it at install and update times, run-tests.sh can use it, and we might even be able to make it a component so that other libraries can re-use it. (See https://github.com/paul-m/drupal-merge-plugin )
Really this issue is an implementation detail of the refactoring of ExtensionDiscovery and it's friends: #2208429: Extension System, Part III: ExtensionList, ModuleExtensionList and ProfileExtensionList
Adding parent: #2186491: [meta] D8 Extension System: Discovery/Listing/Info
Comment #186
donquixote CreditAttribution: donquixote as a volunteer commented@danbohea
It is "node_modules" relative to the location of the main packages.json. My idea was (and no longer is) to put the main packages.json somewhere else. Either outside the theme, or in a subfolder of the theme. But if people think that theme root is the best place, so be it.
@Mile23:
Yeah.. my remark was simply an observation. If we can get rid of this dependency, all the better.
--------------
I still think that file_scan_directory() is a too flexible API.
Basically there are 3 use cases:
- Extension discovery (D7 only. In D8, this has moved elsewhere).
- Discover "stuff" (e.g. templates) within one extension.
- Discover "stuff" (e.g. css files, style images, etc) within another folder, e.g. public://styles (ImageAdminStylesTest::getImageCount()).
The discussion we are having here applies only to some of these use cases. It feels wrong to put this all within one one-size-fits-all directory scanning function.
We should rather have one (or more) functions/methods that do the actual scanning, and other APIs on top of this that understand which subfolder names should be ignored.
For D8 extension discovery, we could factor out the actual file scanning part. In its current state, I think it is too big currently.
For D8 file_scan_directory(), we could provide alternative functions/methods for dedicated use cases. And also split out the recursive part. And get rid of dependencies.
For D7 file_scan_directory() .. long discussion.
This said: We can move on with this issue, and handle possible refactoring in a follow-up.
Comment #187
dddbbb CreditAttribution: dddbbb commentedThe latest patch (#179 right now) makes a change to default.settings.php. In Drupal 8, are these really defaults that are loaded unless you say otherwise or would a user also be expected to ensure the following lines are also copied to their settings.php file (a la D7)?:
Also, I'm not seeing any real performance gains with the patch in #179 (yes, I've been ensuring the code above is copied to settings.php). I've been running
time drush cr
to get basic insight into performance with & without the patch applied and also with & without the presence of vendor directories for comparison (node_modules & bower_components). The results suggest that the patch is making little to no difference with the vendor directories present. Not applying the patch and removing the vendor directories seems to guarantee vastly improved performance. Am I missing something here? I appreciate that not all themes are created equal but surely the comparative tests I'm doing here should rule that out?Comment #188
bradallenfisher CreditAttribution: bradallenfisher commentedA simple stop-gap solution is to edit you php.ini file and set
pcre.recursion_limit=10000
by default it is set pretty high, causing the segfault errors.
remember to restart apache after this change, and if you are using php-fpm restart php-fpm as well.
I know this isn't the solution, but it allows me to happily go back to FE dev, and I can still use drush like normal. :)
Comment #189
MiSc CreditAttribution: MiSc at Wunder commentedI think this patch complicates things more than solving it in a easy way. There are folders that never should be scanned, so why add another setting for it? Is there are case then you want scanning of node_modules or bower_components in Drupal? I have never had such a case myself.
Comment #190
mcrittenden CreditAttribution: mcrittenden commentedThe patch in #179 won't apply since on Drupal 8.1.2 and above. Back to needs work.
Comment #191
joelpittetre #190@marcvangend It's targeted to the 8.2.x branch.
re #189 @MiSc if a custom module uses/used that namespace they may want to avoid renaming, it's away to provide a BC layer.
re #187 @danbohea They are expected to copy them to settings.php if they are upgrading an existing site. The performance is similar to #135 did you use #131 performance test?
RTBC as @donquixote has another issue for further improvements in the architecture realm.
Comment #193
alexpottRerolled the patch and improved the test to use PHPUnit assertions.
Comment #195
alexpottMore improvements and fixes to the added test.
Comment #196
MiSc CreditAttribution: MiSc at Wunder commented@joelpittet, that should be a custom module by really bad design. That is not a drupal-way of doing things. But just wanted to point out that this complicates bits more than providing a simple solution on the problem. But I leave it.
Comment #197
dddbbb CreditAttribution: dddbbb commented@joelpittet I didn't use that test but will do so next time, thanks.
Comment #198
steveoliver CreditAttribution: steveoliver commentedLooks good. Still applies to 8.2.x.
A few nits:
option (is) passed in.
API.
Before:
0.034054040908813
0.13604807853699 <--
0.00011086463928223
After:
0.034997940063477
0.0023999214172363 <--
0.00012397766113281
Comment #199
neclimdulNode doesn't seem to be hosting the FAQ anymore so linked to the github repo... hopefully it won't go away.
Looks good but summary nit, we comment that
node_modules
will never move or be different but seem to ignore the fact thatbower_components
is totally moveable. https://bower.io/docs/config/#bowerrc-specificationComment #201
sylus CreditAttribution: sylus commentedWould this still be the correct issue to upload a compatible patch for 8.1.x for composer patch workflow? Or should I create a postponed issue related to this? The patch doesn't apply in 8.1.x only because of the non existence of a test file where lines want to be appended to.
I really appreciate the work done in this issue as this has just destroyed my performance when playing around with angularjs 2 and a rather big node_modules folder. ^_^
Comment #202
mirie CreditAttribution: mirie at Phase2 commentedI removed the part that tries to alter the nonexistent test file (core/tests/Drupal/KernelTests/Core/File/ScanDirectoryTest.php) in this patch file. It applies to 8.1.6
Comment #203
mirie CreditAttribution: mirie at Phase2 commentedSorry forgot to include the interdiff for the patch.
Comment #204
alexpott@mirie thanks for testing the patch - I was a bit confused at first - but then saw you've applied this patch to a version without \Drupal\KernelTests\Core\File\ScanDirectoryTest - actually this test exists in 8.2.x - I guess you are used a version of Drupal 8 where this test has not been moved to a PHPUnit based kernel test (for example 8.1.x). Re-uploading #198 as that is still the patch to apply to 8.2.x. This patch is only eligible for inclusion in the next minor release of Drupal 8 because it makes an API addition and is a task - the next minor release is 8.2.0.
Thanks again for testing the patch.
Comment #206
xjmI think it would be best for catch to take a look at this RTBC issue based on his earlier reviews. Thanks all!
Comment #208
catchWould be really good to keep going with modernizing and clarifying the extension discovery system, this patch still feels like a bit of a workaround. However it works around something very annoying, and a lot happier with current iteration than earlier versions.
Committed 68a543a and pushed to 8.2.x. Thanks!
#2482549: Ignore node_module folder in core to use Drupal with npm/grunt/nodejs looks fine for the 7.x backport issue, so marking fixed!
Comment #209
japerryFor those still on 8.1.x (8.1.7) I found that patches were not working correctly. Here is a re-rolled version against 8.1.x HEAD (7064dfc4e896630b7dce9f06383f88715674f597) but it also works for 8.1.7
Comment #213
Elijah LynnHere are some commands to analyze things like this.
` tree node_modules | tail --lines=1`
> 6302 directories, 82349 files
However, an `strace drush cache-clear all 2>&1 | grep "node_modules" | wc --lines`
> 1804155 (almost 2 million file requests!)
`time drush cache-clear all`
> real: 1.45 seconds
After this patch:
> real 1:25 seconds
This is on an amazee.io docker container with a Ubuntu/GNU/Linux host (aka my Thinkpad) and a pretty bloated Drupal 7 site. Our Mac users were having cache-clear times up 30+ minutes due to Docker for Mac NFS issues. @schnitzel did a good job pointing us here.
Comment #214
marcelovaniSee fix for Drupal 7 here https://www.drupal.org/node/2482549
Comment #215
quietone CreditAttribution: quietone at PreviousNext commentedPublished the change record