Updated: Comment #88
Problem/Motivation
With system_list(), memory usage increases significantly when large numbers of modules are enabled.
Issue #887870: Make system_list() return full module records triggered this memory spike, since full module records are returned, rather than just module names. Since #328357: Allow module .info files to add CSS/JS allowed module .info files to pull in CSS/JS, this can lead to very high memory usage.
Examples:
- 31 modules enabled: Roughly 400kb of memory used
- 90 modules enabled: Roughly 676kb of memory used
Proposed resolution
A lot of information in the system table isn't needed with system_list(). Instead, we can just store the array directly with the relevant info and improve how the cache is handled.
Here is an example of the performance improvements:
Before Patch
- Cache clear: 38206ms, 188.5MB RAM
- Cached page load: 315ms, 15.25MB RAM
After Patch
- Cache clear: 35875ms, 186.5MB RAM
- Cached page load: 308ms, 14.5MB RAM
Please note that this patch requires update.php to be run after it is applied.
Remaining tasks
Currently this patch is queued for re-testing.
User interface changes
None
API changes
Rather than loading full module records into the system list, only the requested module information is returned.
Related Issues
- #887870: Make system_list() return full module records
- #328357: Allow module .info files to add CSS/JS
- #624848: Allow to retrieve a list of modules defining a certain .info file property
- #402896: Introduce DrupalCacheArray and use it for drupal_get_schema()
- #1503224: Cleanup module_list()
- #1493098: Convert cron settings to configuration system
- #1029460: Find a better hook than hook_init() for drupal_add_js() and drupal_add_css()
- #1861676: Remove stylesheets[] and scripts[] .info file property support for modules
Original report by [catch]
system_list() with 31 modules enabled uses around 400kb of memory. On a D7 site with 90 modules installed this increases to 676kb - 13% of total memory usage on that site.
This is due to #887870: Make system_list() return full module records - we did benchmarks there but no-one including me actually checked the memory implications.
We have to keep info there, which is the biggest issue, due to css/js additions in system_init(), I didn't like that change but it's too late to change now at least for D7. Although with a small API change we could strip some of the other info keys, but that needs to happen in another issue. edit: that's out of the question for D7 too, we're using things like configuration_path via system_get_info() too.
However there's a lot of crap in system table that we don't need, so let's try to cut that down.
Here's how block module looks in HEAD:
[block] => stdClass Object
(
[filename] => modules/block/block.module
[name] => block
[type] => module
[owner] =>
[status] => 1
[bootstrap] => 0
[schema_version] => 7007
[weight] => -5
[info] => Array
(
[name] => Block
[description] => Controls the visual building blocks a page is constructed with. Blocks are boxes of content rendered into an area, or region, of a web page.
[package] => Core
[version] => 7.0-dev
[core] => 7.x
[files] => Array
(
[0] => block.test
)
[configure] => admin/structure/block
[dependencies] => Array
(
)
[php] => 5.2.4
[bootstrap] => 0
)
)
Instead of this we can just store the array directly.
These numbers include the entire memory usage from system_info() - including bootstrap, themes and filenames, but the saving is only against modules, still very measurable though:
With 30 modules:
before patch: 415,016 / peak: 495,464
After patch: 330,240 / peak: 394,712
With 90 modules:
Before patch: 691,620 / peak: 829,164
After patch: 569,200 / 684,872
So that's > 100kb off inclusive peak memory usage, and between 85kb to 120kb off inclusive memory usage. With 150 or 200 contrib modules installed we could be looking at more like 200kb of savings.
This is going to be a bit tricky with upgrades due to the array being used in full bootstrap, you can get more modest memory savings from keeping it as an object with just the info property, that's about 15kb more expensive with 30 modules but less of a change. Uploading both patches for now.
Comments
Comment #1
catchComment #3
sunI seriously wonder why this patch passed. $record->name is not used anywhere?
Powered by Dreditor.
Comment #4
catchHere's the 'safe' patch that should pass at least the help tests.
Comment #5
catchSo #4 is a simple change that should be pretty uncontroversial.
Here's a different approach, requires an API addition to system_get_info(), but one that is completely backwards compatible, this has much, much bigger memory savings. This is revisiting Frando's suggestion from http://drupal.org/node/887870#comment-3358654 except it keeps system_get_info() handling the caching and tries to optimize that for the most common cases.
system_get_info() now accepts an additional, optional $keys argument - an array of info keys that are needed. This allows system_add_module_assets() to only request stylesheets and scripts for example.
Using xhprof on a site with 32 modules enabled.
With HEAD, system_list() takes 400k/6% (500k/6.9% peak) memory.
With the patch system_list() uses 190k/2.8% (245k/3.5% peak) memory.
system_get_info() with the patch is using it's own cache, that takes 13k / 0.2% (18k/0.3% peak) memory.
So combining the two, we're looking at around 50% memory savings or around 200k of memory. Screenshots attached for that.
I then enabled all core modules + ctools, media, media gallery and some others to get a total of 65 modules enabled.
HEAD:
system_list: 700k / 7.4% (778,776 / 8.3% peak)
Patch:
system_list: 250k 2.8% (310k / 3.3% peak)
system_get_info() 22k / 0.2% (43k / 0.5% peak).
So with 63 modules there's a 60% reduction in memory usage, and getting close to half a megabyte of memory saved.
Also note that in HEAD the percentage change from memory usage jumps from 6% to 7.4%. However with the patch we're static at 2.8% - this means we're at a point where the memory usage from this at least isn't growing faster than everything else when modules are added.
If we doubled the number of modules again (126 is nowhere near the most I've seen enabled on a site), then we're looking at a full mb or more of memory saved here. At that point we might also see measurable performance changes as well, since the quantity of data to fetch and unserialize is going to be quite large by this point.
One issue here is the old system_list cache entry is incompatible with the new one, so had to change the cid slightly, better suggestions than adding ':' to the end welcome.
Comment #6
moshe weitzman CreditAttribution: moshe weitzman commentedI read over the code and it looks like a safe (i.e. backwards compatible) and highly desireable memory improvement.
Comment #7
catchAlso note that this is a straight regression from Drupal 6, and a recent regression in Drupal 7 - original patch was committed September 2010 which was either 9 months or a year after code freeze depending on when you count from.
Overall we have in the region of 2mb memory regression from Drupal 6 when using an opcode cache - see the numbers at http://ca.tchpole.net/node/2.
Comment #8
catchComment #9
Dries CreditAttribution: Dries commentedWould be cleaner if we'd get rid of the ':'. I think we can.
I had to read this twice and look at the source code. It wasn't clear what 'info object keys' are. Also, the parameter is called $options instead of $keys.
Also,I'd mention why this is useful to specify.
Any reason we don't implode with ':'? Looks like it would be safe to.
Comment #10
sunI disagree with the removal of the ':' suffix, as this cache clear is using a wildcard. Thus, removing the ':' suffix may unintentionally kill other caches, and it's going to be very hard to notice and detect this bug when it happens.
Comment #11
catchReminding myself I need to refresh this. I agree with #10 - the namespacing is important to me.
The rest of #9 looks fine - will fix those in a re-roll.
Comment #12
catchThis should cover everything except the ':' suffix on the wildcard clear.
Comment #13
catchBumping this to major since 500kb of memory with just 63 modules enabled is significant. I have seen more than one Drupal 6 site with over 200 modules enabled, we're looking at a couple of megabytes of memory wasted on sites like that.
Comment #14
sunThe unserialize() is only required for themes then.
We should still key the module list by module name (i.e., key + value being identical). Themes are keyed by their name, too.
Shouldn't the items live in cache_bootstrap as well?
This reminds me a lot of #624848: Allow to retrieve a list of modules defining a certain .info file property
However, that was purely meant for .info file data retrieval that doesn't happen on every page load.
From a performance perspective, in this patch I could imagine that we'll quickly see duplicate caches and needless lookups. For example, scripts + stylesheets are retrieved here, while another module may only retrieve stylesheets, and yet another only scripts, etc.
It might be more sensible to only allow for a single .info file key ($option) instead of multiple.
"A list of .info file properties to retrieve."
1) We're missing a delimiter after $type.
2) 'system_info' and ':' don't need to be concatenated.
3) Replace '|' with ':' in the implode?
4) If we're going to keep multiple $options, then we should at least sort() them.
Powered by Dreditor.
Comment #15
catchFixed.
Fixed.
No, we should never, ever have been loading data from .info files in bootstrap, and this is just a bandaid because 7.x came out with this regression. I have another issue open to move the css/js code out of system_init() into hook_page_alter() or similar, so when that's done, this will never be loaded during bootstrap again.
I wish someone had brought that up on the issue where this regression was introduced.
Bettter than loading the entire thing on every request.
Those are contrib modules, this is core. If you install a stupid contrib module, that's a shame. Let's not add i/o and cpu usage to core just in case.
No because then instead of this patch adding one extra network request, it would add two extra network requests, for data we know system_init() is going to be requesting every bootstrap on every Drupal 7 site everywhere.
Fixed.
Comment #16
aspilicious CreditAttribution: aspilicious commentedRemove trailing whitespace in next iteration :)
Powered by Dreditor.
Comment #17
sunmmm, the sort() was supposed to be at the beginning of the function; i.e., for the initial implode().
Powered by Dreditor.
Comment #18
catchfwiw this is probably something that can use CacheArrayObject from the theme registry issue, this would also allow us to have just a single $key parameter and share a single cache entry without a lot of extra code, so I may re-roll using that rather than continuing with the current version of the patch (assuming the other has a decent chance to get in).
Comment #19
catchHere's a patch using that, the includes/bootstrap.inc hunk is just copied from #402896: Introduce DrupalCacheArray and use it for drupal_get_schema(), putting it here so tests will run.
Comment #21
catchFixed test failures.
Uploading xhprof screenshots.
Memory usage for system_list() before the patch on a vanilla install is 400kb.
Memory usage for system_list() + system_get_module_info() combined on the same install is 226kb (this is with stylesheets and scripts cached).
Comment #22
xjmTagging issues not yet using summary template.
Comment #23
Stefan Freudenberg CreditAttribution: Stefan Freudenberg commentedUpdated the patch to current 8.x and fixed whitespace and indentation issues. I assume that at least system_get_info() is covered by tests, not so sure about system_add_module_assets(). Does Simpletest provide metrics about test coverage?
Comment #24
catchNew patch now that DrupalCacheArray is in (woot!) based off #23.
@Stefan Freudenberg we don't have code coverage on qa.drupal.org, there's http://drupal.org/project/code_coverage but that hasn't been run for a while afaik.
However system_add_module_assets() is run on every request from hook_init(), so it's executed a lot by the core simpletests, not sure if it has dedicated core test coverage though in terms of asserting the behaviour.
Comment #25
Stefan Freudenberg CreditAttribution: Stefan Freudenberg commentedI say it's good then.
Comment #26
catchThere's something wrong here, looks like I didn't make the changes to system_list() itself.
Comment #27
catchRe-rolled with the module.inc hunks this time.
Comment #28
bfroehle CreditAttribution: bfroehle commentedMinor typo here (For For).
Comment #29
bfroehle CreditAttribution: bfroehle commentedI cleaned up the For For typo and simplified the control flow of system_list() slightly. See the attached interdiff.
Comment #30
bfroehle CreditAttribution: bfroehle commentedI think it'd be nice to have some more documentation here. In particular, reading just the docblock I'd have no idea what the array offsets into a ModuleInfo object would correspond to. At a minimum there should probably be a note saying that users should instead call system_get_module_info().
Comment #31
catchPatch no longer applied due to cache API changes.
I've re-rolled and added that note, looks better?
Comment #32
catch#31: system_get_module_info.patch queued for re-testing.
Comment #33
catchRe-rolled for /core.
Comment #34
sunDoesn't ModuleInfo or DrupalCacheArray cache the info statically already? I thought that was part of the primary purpose of DrupalCacheArray...?
29 days to next Drupal core point release.
Comment #35
catchDrupalCacheArray doesn't static cache by itself, instead you static cache the class in a factory function (or in this case wrapper since it's not returning the instance of the class).
Comment #36
catch#33: system_info.patch queued for re-testing.
Comment #37
swentel CreditAttribution: swentel commentedJust did some (basic) memory profiling with devel at "admin/reports/dblog path" as user 1 on a Plain Drupal install with a couple of additional core modules enabled.
Before patch: Memory used at: devel_boot()=1 MB, devel_shutdown()=3.39 MB, PHP peak=3.75 MB.
After patch: Memory used at: devel_boot()=1 MB, devel_shutdown()=3.25 MB, PHP peak=3.5 MB.
So it saves memory, good! I'm not setting this RTBC, maybe some else could do a quick profile as well?
Comment #38
sunRe-uploading #33, as the testbot got stuck on it.
Comment #39
sunComment #40
catch@sun. Themes have had .info available for several releases, there is no need to change that in Drupal 7.
However modules had .info (and the rest of the contents of the system table) loaded into memory only after #328357: Allow module .info files to add CSS/JS which was a feature added to Drupal 7 way after code freeze that introduced this major performance regression.
This patch just attempts to fix that performance regression in Drupal 7 without breaking backwards compatibility.
The entire module system depends on system module, as does system_get_info(), that is not introduced here.
Comment #41
Niklas Fiekas CreditAttribution: Niklas Fiekas commentedIf it's a class, should we have the ()? Also, there should be an empty line before the @see.
There should be an empty line before this @see.
Comment #42
catchAdded the blank line before the @see.
Comment #44
catchPatch needed updating for DrupalCacheArray moving to namespaces.
Comment #46
catchComment #47
moshe weitzman CreditAttribution: moshe weitzman commented#46: system_info.patch queued for re-testing.
Comment #48
moshe weitzman CreditAttribution: moshe weitzman commentedStill green, but the new class needs PSR-0 shitification.
Comment #49
tim.plunkettNow PSR-ified.
Comment #50
moshe weitzman CreditAttribution: moshe weitzman commentedThanks Tim.
Comment #51
sun#49: drupal-1061924-49.patch queued for re-testing.
Comment #52
catchUnassigning me now this is RTBC since any of the three of us can commit it.
Comment #53
sunRe-rolled for #1503224: Cleanup module_list()
...which allows to simplify the module_list() code, and also converted an instance in search.admin.inc.
Comment #54
sunComment #56
tim.plunkett#53: drupal8.system-info.53.patch queued for re-testing.
Comment #58
tim.plunkettAh! These exceptions are fixed by #1503224-52: Cleanup module_list(). Here's a combined patch to prove it, once that's in #53 should pass just fine.
Comment #59
tim.plunkettRerolling around #1493098: Convert cron settings to configuration system.
Comment #60
tim.plunkett#59: drupal-1061924-59.patch queued for re-testing.
Comment #61
sun#1493098: Convert cron settings to configuration system landed, so this is ready to be committed again.
Comment #62
catchThanks for helping to push this one home. Committed/pushed to 8.x. Moving to 7.x for backport.
Comment #63
BerdirBackport for 7.x based on #46, I believe everything that followed was 8.x specific, correct me if I'm wrong.
Seems to work, had to run update.php, drush seems to be broken, gives me a fatal "DrupalCacheArray not found", looks like bootstrap.inc isn't included or something like that?
Comment #64
BerdirApplied the patch on a big site, memory usage of system_list() went down from 3,4MB to 600KB.
Comment #65
tim.plunkett#63: system-list-memory-usage-1061924-63.patch queued for re-testing.
Comment #66
Cameron Tod CreditAttribution: Cameron Tod commented#63: system-list-memory-usage-1061924-63.patch queued for re-testing.
Comment #67
Cameron Tod CreditAttribution: Cameron Tod commentedI had a quick debug of drush with this patch, and yes, as berdir suspects in #63 bootstrap.inc is not being included by _drush_bootstrap_drupal_root(). I will have another look later, as I think it would be great if we could get this in soon, without having to patch drush.
Comment #68
moshe weitzman CreditAttribution: moshe weitzman commentedI'm happy to answer Drush questions. Not sure that the error is drush's fault though. See me in #drush on IRC.
Comment #69
BerdirActually, it's quite straight forward. I did add the .info.inc to files[] but I didn't actually move the class definition to that file :)
This should be good, works fine with drush, let's see what the tests have to say.
Comment #70
BerdirErm.
Comment #71
Cameron Tod CreditAttribution: Cameron Tod commentedWith drush 5.7:
Comment #72
BerdirDid you run update.php/drush rr? Works fine for me after drush rr but I haven't tested if update.php works too.
Comment #73
Cameron Tod CreditAttribution: Cameron Tod commentedSo after a rebuild of the site, the .info.inc file is picked up ok. If I use an existing install, then apply the patch, Drupal can't boot from either the web interface or drush. If I rebuild the registry with rfay's script, everything comes together, the class file is included, and Drupal boots.
What's a better approach, manually including system.info.inc on demand, or moving SystemModuleInfo into, say, system.module?
edit: or implementing an update hook or something?
Comment #74
BerdirIt was in system.module, that's why it broke drush :) As I said, have you tried running update.php?
Comment #75
podarok#70
+}
\ No newline at end of file
Comment #76
Cameron Tod CreditAttribution: Cameron Tod commentedYep, works fine after running update.php. Is it ok to commit a patch that will kill sites until they run update? Is there a way we could fail more gracefully?
Here's a patch with a couple of small code style fixes.
Comment #77
BerdirI'm not sure. Not sure if we've done that with the other classes that we added.
Also, note that the Return/Returns change is wrong in the 8.x patch that got commited too, so that will need to be forward-ported.
Comment #78
podarok#76 looks good for me
lets wait for a testbot
Comment #79
bojanz CreditAttribution: bojanz commentedYou could never just replace the files and expect the site to work without running update.php. That was never claimed to be supported, no?
So the current behavior is something I'd expect.
Comment #80
geerlingguy CreditAttribution: geerlingguy commentedI've always expected update.php is expected immediately after Drupal point upgrades, and, while I don't have specific references, I know that other changes have disabled sites until update.php is run. That's why we have the instruction to put the site in maintenance mode while updating :)
Comment #81
geerlingguy CreditAttribution: geerlingguy commentedHere are some testing results on a MacBook Air with a hefty Drupal 7 site. I ran each test three times, averaging the results.
Before Patch
Cache clear: 38206ms, 188.5MB RAM
Cached page load: 315ms, 15.25MB RAM
After Patch
Cache clear: 35875ms, 186.5MB RAM
Cached page load: 308ms, 14.5MB RAM
Also, I found one more whitespace error in the patch in #76 that should be fixed in this patch.
Comment #82
pounardModule info should never be accessed on normal pages, it should be used for administrative purpose only: if we consider that as true and discourage module info read at normal runtime, it should never end up cached and we shouldn't need the SystemModuleInfo class.
Comment #83
catch@pounard: that used to be true until #328357: Allow module .info files to add CSS/JS landed (without any profiling). I'd love to see that patch reverted, but it definitely won't be reverted from Drupal 7 so we need to fix the performance issue there. There's no issue currently open to revert the .info css/js adding but if you open one I'll follow it. Note there's also #1029460: Find a better hook than hook_init() for drupal_add_js() and drupal_add_css() open which would at least mean this doesn't happen on non-themed requests - right now the logic runs for autocomplete callbacks and all kinds of other irrelevant things.
Comment #84
pounardOk I see. So we are fixing in a complex fashion this problem because another peice of code is doing it wrong, that's the kind of almost necessary patch I don't like to see happening.
Comment #85
catchYeah me neither, but unfortunately I only did serious profiling of Drupal 7 for memory/cache size in December
20112010 (and no-one else had previously, or if they did they kept it a secret), about a week before it was released, so it was too late to fight the original patch at that point.Comment #86
sunSee also #1861676: Remove stylesheets[] and scripts[] .info file property support for modules
Comment #87
catch#81: system-list-memory-usage-1061924-81.patch queued for re-testing.
Comment #88
blueminds CreditAttribution: blueminds commented#81: system-list-memory-usage-1061924-81.patch queued for re-testing.
Comment #88.0
RainbowArrayWorking on issue summary template
Comment #88.1
RainbowArrayCompleted formatting of issue summary
Comment #90
AnybodyThis issue seems to be sleeping for quite a while. So let me ask, what's the current status on this? Is there still someone working on it?
Comment #91
mgiffordThe patch from #81 still applies. Seems to nicely break out the code. @geerlingguy's profiling from 2 years ago look like it will speed up page loads & reduce memory demands.
Why isn't there movement on bringing this into Core?
Comment #93
podarok#81 speedup a lot massive cache rebuilds on large site.
Looks good for me.
Let's push it
+1 RTBC
Comment #94
podarokI have tested it on another large site
after drush cc all
looks like this needs work
Comment #95
BerdirUpdating from this issue is a bit brittle. Did you try to run update.php first?
Comment #97
mgiffordAfter looking at the code, I'm a bit baffled at how it would affect user_access(). If we can't replicate this, it's probably worth resetting to RTBC.
@podarok - how large a site? 1 million nodes? 200 modules?
How does it work in one site and not another? Could there be a module conflict with it?
Comment #98
swentel CreditAttribution: swentel commentedThis patch has problems with rules, going to admin/config/workflow/rules errors in
Getting notices also like this at runtime
Comment #103
BerdirReroll, didn't check #98 yet.
Comment #104
joseph.olstadI just triggered all other php 5.x and 7.x versions to be tested.
Comment #105
joseph.olstadI'm reviewing the patches marked for 7.60, so far this is the only one giving my system fits.
patch #103 crashes on my system
test system is running php version 7.0.18-0ubuntu0.16.04.1
Here's the command that causes the crash:
drush cc all; drush status;
Here's the message:
I've tried WITH or without the other patches I'm using, this issue is repeatable on my system with drupal core 7.56 and a very simple installation.
The other patches I use are working great.
Here's a list of the other patches, I've included all of the 7.60 patches, except patch 103 above that crashes my system.
with or without my other patches, patch 103 is not playing nice.
Comment #106
joseph.olstadComment #107
joseph.olstadHere's a copy of my make file that works without the above patch, but as soon as I put patch 103 in, boom.
http://cgit.drupalcode.org/sandbox-joseph.olstad-2893160/plain/media_bol...
If needed I can share a copy of my test sql database , make a dump file.
Comment #108
joseph.olstadah yes I forgot to mention, memcached is installed on the test environment I'm using but it is not fully configured (the memcache module is not installed )
not sure if this has any impact.
Comment #109
joseph.olstadprobably won't make it into 7.60, so target 7.70
Comment #110
podarokRe #97
> @podarok - how large a site? 1 million nodes? 200 modules?
Sorry, it was in 2014, I'm already everywhere on drupal 8.
But if I recall it right - more than 200 modules and about 10000+ nodes
Comment #111
justindodge CreditAttribution: justindodge commentedI was playing around with this patch and also got the user access error:
Call to undefined function user_access() in includes/menu.inc on line 644
The stack trace looks like this:
I'm also working on a very large site that has many contributed modules enabled, but just by glancing at the trace, it doesn't seem like there is another contrib module to blame, but who knows.
I haven't been digging any deeper, just wanted to provide a little more info - hope it helps.
Comment #112
Fabianx CreditAttribution: Fabianx at Tag1 Consulting for Pac-12 Networks commentedThe patch is overall a nice idea, however has problem in the details:
- A better and more BC compatible approach IMHO would be to:
- Create a ModuleInfo class, replace the stdclass object with the ModuleInfo object
- Implement magic __get() method and return 'info' property lazy-loaded
- Lazy load via cache array as usual
- Add SELECT query directly to cache array resolveCacheMiss and only load _one_ at a time
Unless I miss something this should be 100% transparent to contrib
Example 3v4l: https://3v4l.org/ZjBkH