At http://drupal.org/node/197720 we identified that Drupal eats considerably more amount of memory on the modules page. This is due to all .install files being included and checked for possible update functions not run yet. Even if the PHP memory limit is made higher, this page might display a WSOD.
The reason .install files are included is to check whether they have newer update functions not yet run on the local database. So the admin is informed that certain modules need their updates run. So we need the list of update functions defined in each module enabled.
A few options to not include the .install file to get this list:
(a) Not serious: introduce a new hook to specify the latest update number. But this would need to be updated by hand, and the info is already in .install files.
(b) Preg through the .install files (so we can load and *unload* them), grepping for update functions. This sounds easy but could have false positives in commented update functions. The only real problem here would be if update functions later then the ones run already are commented out. I'd say his is highly unlikely.
(c) Load in .install files and tokenize them one by one. Then in the tokenized form, we can look up the function definitions. This surely avoids the comment problem from the previous option, but takes considerably more memory, as we would need to have the tokenized source of one .install file in memory at a time. This might end up requiring more memory memory in core itself as the tokenized source takes up more memory is a PHP array then when we include the file and tokenization takes place in PHP's internals (without userspace data access). *But*, we can unset() the file string and the token array, once done with a file, and we can go to the next. This also takes more time most probably then (b).
All these options can decrease the memory requirement for the modules page considerably. Although (c) might not decrease it for core-only installations, it will definitely decrease it for contrib (when lots more install files are loaded, so the fact that we can unload stuff becomes more of an advantage).
Opinions? Anyone up for implementation?
Comments
Comment #1
JirkaRybka CreditAttribution: JirkaRybka commentedI like the (b) solution most. If the preg checks for something like
[newline]function [module]_update_[digits](
, it will catch the // or # comments, leaving only just /* big block */ problem - I think it's not likely to occur on *latest* function in *released* code, it's quite edge case, and we risk maximally a false alarm prompting the user to run update.php (which then do nothing). Also this may be documented somewhere.Comment #2
moshe weitzman CreditAttribution: moshe weitzman commentedyes, b) sounds reasonable to me too.
Comment #3
scor CreditAttribution: scor commentedcould the cron be used to do these operations in batch and cache somehow the relevant updates in the db?
Comment #4
moshe weitzman CreditAttribution: moshe weitzman commentedscor - thats a performance optimization thats not needed. this mega load is a very infrequent event.
Comment #5
JirkaRybka CreditAttribution: JirkaRybka commentedAdditional thoughts:
I think that update.php should be fixed in the same patch, re-using the same detection mechanism. It's a pretty similar case: All .install files included after full bootstrap (and going to be even worse, if the disabled modules update patch lands - http://drupal.org/node/194310 ).
This will also make update.php offering the same versions as reported on modules page, thus partially fixing the "commented function" problem (in the sense of avoiding user's confusion, when update.php offered nothing after a warning shown on modules page).
The actual inclusion of .install files would be then done one-by-one in the batch processing, cutting on the update.php footprint. I'm not sure whether this is important, but keeping update.php on the safe side is always nice.
This will all result in the commented-out functions included in the update.php batch, though. But I think this may be easily solved by a simple function_exists() check, so that the commented-out functions will then behave just like if only just their content was commented-out.
So, we need to re-factor drupal_get_schema_versions() to the preg method, remove drupal_load_updates() here and there, and add module_load_install() and function_exists() to the batch. Opinions?
I'm not sure whether I'll find time to try and implement this, though.
Comment #6
Crell CreditAttribution: Crell commentedI'm actually working on an enhanced version of the page-split system for Drupal 7 that uses token_get_all(), so I'd actually favor C. While the tokenized version of each file will be non-small, we'll only ever have one in memory at a time so the cost is relatively flat.
Also not sure if I'll have a chance to work on this in the next day or two, but I'll see if I can play with it and come up with anything.
Comment #7
dpearcefl CreditAttribution: dpearcefl commentedIs this still a problem in current D6?
Comment #8
Crell CreditAttribution: Crell commentedThe "enhanced page split" I talked about in #6 is the registry, which sadly got ripped out later in Drupal 7's lifetime (at least for functions). I believe the class registry ended up using option B, string parsing, over tokenizing.
This probably is worth revisting for D7/D8, but not for D6. Refiling.
Comment #9
rfayComment #10
catchThis is very closely related to #1188084: Try to avoid loading every specified include file during theme registry rebuilds. (where we load very many if not all .pages.inc and .admin.inc also just looking for functions).
We are not going to be able to avoid loading all .install files from this though, because they're also loaded for hook_requirements() iirc. However it would be a good start.
Comment #23
smustgrave CreditAttribution: smustgrave at Mobomo commentedWonder if this can be closed? To my knowledge the update functions are displayed on the modules page? (Least I've never seen)