Problem/Motivation

"The $bootstrap parameter of module_list() says it controls whether or not a list of bootstrap modules is returned... but it doesn't always actually do that." [comment #65]

"Currently, module_list() caches its results based on its parameters. However, those parameters can change, and just calling module_list() on its own will return NULL if it hasn't already been "primed", since the default is to not refresh. However, who knows how it's been primed before you wanted it, or even if? That means you need to either pray or always tell it to refresh, which is nasty for performance. It also has no "I'm empty so rebuild" validation, so if you call it early in the page request you stand a very good chance of getting back NULL." [comment #50]

"Although in theory it's a simple fix, the entire Drupal house of cards is built on top of this and would come crashing down :)" [comment #65] Therefore, this issue will be corrected in D8 only. module_list() will remain as-is in D7, documented in the code so that developers are alerted to its unexpected behavior.

Proposed resolution

"module_list() needs to be totally rewritten, and probably broken into 2 separate functions: module_list() and module_list_bootstrap(). Both should self-initialize and cache the first time they're called, as well as accept a $reset parameter. That would make them safe to call. Trying to throw it all into one function is a horrid idea that causes all sorts of problems for, say, DBTNG." [comment #50]

"We probably want to...get rid of the nasty parameter list altogether." [comment #65]

"Also we need to consider restricting module_list_bootstrap() to page cache hits (which could turn into a miss after hook_boot()), and just using regular full module list for requests that definitely aren't going to be a page cache hit - currently we make two round trips to the cache where there should only be one and for any higher bootstrap level it's worth loading the full list." [comment #79]

Remaining tasks

  • Needs a patch for D8 breaking module_list() into module_list() and module_list_bootstrap(). The patch at #42 may provide a starting point. Comments in #44 and #45 should be considered.
  • Simpletest coverage based on comment from Dries in #44 and core gates.
  • This API change wil require a change notification.

API changes

In Drupal 7, module_list() currently is designed to return different results dependent on context:

  • Early in the bootstrap: Returns a list of vital modules required for bootstrap.
  • All other times: Returns a list of all enabled modules.

The old api includes a number of parameters for modifying its behavior, but they generally should not be changed from their defaults.

In Drupal 8, we split into these functions:

  • module_list_bootstrap(): Returns a list of vital modules required for bootstrap.
  • module_list(): Returns a list of all enabled modules.

Further changes to parameters and function behavior are TBD.

Original report by earnie

This patch supports http://drupal.org/node/196862 at #36.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

David_Rothstein’s picture

Status: Needs review » Closed (fixed)

Please reopen this if I'm mistaken... but as I said at the other issue, this patch doesn't seem to help.

Anonymous’s picture

Status: Closed (fixed) » Needs review

Reopened. This patch does indeed help and is more proper than #237336: Url Aliases no longer work comment #4 which does the same thing but every call and thus eliminating the cache.

Anonymous’s picture

FileSize
827 bytes

I've included a modification to the $bootstrap default value.

catch’s picture

I'm not convinced that drupal_lookup_path needs to be in path.inc at all. Why not just move it to path.module? (along with drupal_get_path_alias and drupal_clear_path_cache). Would need to be renamed of course.

drupal_get_normal_path needs to stay, and would need a function_exists for drupal_lookup_path (there's already a function_exists for custom_url_rewrite_inbound in there so I guess it'd be balanced if nothing else).

catch’s picture

Title: Using module_exists in path.inc causes warnings about list not being an array » URL aliases broken

retitling to something more eye-catching.

recidive’s picture

This change make sense, looking at the code if ($refresh || $fixed_list) seems to assume the first call to module_list() should be module_list(TRUE) which is not consistent. But $list is not an array only when it is not set, so a check for !isset($list) should suffice, and should be less expensive than !is_array($list).

I'm not sure about defaulting $bootstrap to FALSE though. If it is correct we should search the calls to module_list(FALSE, FALSE) and module_list(TRUE, FALSE) and change them to module_list(FALSE) and module_list(TRUE).

Anonymous’s picture

The change for the default value of $bootstrap is needed for the !isset($list) case. However we need to check all the uses of module_list to find those that might have assumed it to be TRUE. Those would something like module_list(TRUE) with no other parameter given and would need to change to module_list(TRUE, TRUE).

David_Rothstein’s picture

Status: Needs review » Needs work

OK, I just tried applying the patch from #3, but unfortunately it doesn't fix the problem either. URL aliases are still broken.

It's definitely a caching-related problem, but I think a more serious one than described above... As I said at the other issue, I think the fundamental problem with module_list() is this:

It does its internal caching in a rather odd way, by storing everything in the same static $list variable (regardless of what type of module list it is generating; e.g. $bootstrap = TRUE or FALSE). This means that when you call it normally (without asking it to do a refresh), it returns whatever module list it happened to generate last time, which is not necessarily the list you wanted!

This shouldn't be too hard to fix... we need an isset() check as you suggest here, but we also need to statically cache the bootstrap and non-bootstrap lists separately. I think that should be about it, right? I'll see if I have time to whip up a patch later.

David_Rothstein’s picture

@catch, your idea in #4 sounds worthwhile in its own right (maybe a separate feature request?). But for fixing this bug, I would vote for a direct fix rather than a workaround... and there's no fundamental reason why calling module_exists() inside path.inc should be forbidden, right?

David_Rothstein’s picture

FileSize
9.6 KB

OK, here is a patch. It fixes module_list()'s internal cache to actually work correctly... and on an existing Drupal 7 installation, everything seems OK. URL aliases work again, and I didn't notice any other problems (although I haven't tested it extensively).

The only teensie-weensie problem is that this patch totally breaks install.php (and probably update.php too, I haven't checked). It causes a "call to undefined function db_query()" error inside module_list()..... I have to track this down further, but I think the code in install.php was basically relying on the old, weird caching behavior of module_list(), in particular with regard to the $fixed_list parameter. I think $fixed_list doesn't really belong in this function anyway (it's kind of a weird parameter), and what the function really needs is a more general way to get a list of modules to use when it's called before there is an active database connection.

But I think this is on the right track. Frankly, I'm amazed that the caching problem never caused any bugs before this... module_list() really seems like it's completely broken internally! (Either that or I'm totally going crazy here ;)

catch’s picture

Status: Needs work » Needs review
FileSize
1.23 KB

OK I took a look at splitting the path.inc stuff into a different issue but that's kinda tricky.

This from bootstrap.inc - it runs before common.inc is included or any modules loaded:

case DRUPAL_BOOTSTRAP_PATH:
      require_once './includes/path.inc';
      // Initialize $_GET['q'] prior to loading modules and invoking hook_init().
      drupal_init_path();
      break;

If we follow the trail from drupal_init_path
it does this:

drupal_init_path
drupal_get_normal_path
drupal_lookup_path
elseif (module_exists('path')) << before any modules are loaded

Note I did this via api.drupal.org rather than a debugger but it appears to explain what's going on.

My inclination would be to roll back #196862 (where this issue never came up, and the final patch was committed without testing) and look at this from scratch. Judging by this I'm not at all convinced that module_list is broken, or at least not in such a way that it's causing this specific issue.

Attaching a rollback patch for #196862 and marking to needs review.

Anonymous’s picture

Status: Needs review » Needs work

@David: Your patch to me appears to return a module named boolean TRUE since it is possible to call module_list(TRUE, TRUE) or even module_list(FALSE, TRUE) and then somewhere else call module_list() but you've set $list[$bootstrap] when $bootstrap == TRUE and you don't have a correct list of modules. Why not use another static array named $bootstrap_list to take care of the bootstrap modules? We could even go wild and fill both $list and $bootstrap_list if $refresh is TRUE since we have knowledge of the bootstrap modules with the full list read anyway.

@catch: There is no reason to not use module_list as the indicator that the path module is enabled. You're just ignoring the bigger issue with module_list. David's patch @#10 is the correct method to correct the issue.

chx’s picture

Status: Needs work » Needs review

The patch to review is #11. earnie, you again?

catch’s picture

path.module is not in module_list() during that stage of the bootstrap.

Easy way to check is to add this to the top of path.module

function path_boot() {
}

and url aliases work again (after a module_rebuild_cache).

catch’s picture

Status: Needs review » Closed (duplicate)

Marking as duplicate of http://drupal.org/node/196862

Anonymous’s picture

Status: Closed (duplicate) » Needs work

module_list is broken. The sucker needs fixed.

David_Rothstein’s picture

Title: URL aliases broken » module_list() is broken

Indeed, I'm retitling this, since module_list() is definitely broken (regardless of whether or not fixing it is the best way to deal with the URL alias problem).

@catch, your analysis makes a whole lot of sense, but I think there is one problem. module_exists() does not actually check whether a module has been LOADED... it checks whether the module is enabled (i.e., whether it WILL BE LOADED at some point during the bootstrap process). Indeed, this is very confusing, since module_exists() calls module_list(), and the code comments for module_list() claim that it returns a list of "loaded modules"... however, if you look at the code, that's not what it's doing. It simply checks whether status=1 in the database, which is not the same thing.

Therefore, @earnie's original use of module_exists('path') should theoretically work, as it's only designed to check whether or not the path module is ENABLED -- and indeed, it does work when my patch in #10 is applied.

Basically, it seems like we have two bugs here which are sort of canceling each other out... or something like that ;) Arguably, module_exists() should be rewritten to behave the way you thought it was behaving (i.e., check whether the module has actually been loaded)... as to a large extent, that would make more sense.

The end result is that there is definitely broken code here which should be fixed. @earnie, thanks for your review in #12... I will look over those comments more carefully later when I have time.

catch’s picture

module_exists() calls module_list(), and the code comments for module_list() claim that it returns a list of "loaded modules"... however, if you look at the code, that's not what it's doing. It simply checks whether status=1 in the database, which is not the same thing.

else {
      if ($bootstrap) {
        $result = db_query("SELECT name, filename, throttle FROM {system} WHERE type = 'module' AND status = 1 AND bootstrap = 1 ORDER BY weight ASC, filename ASC");
      }
      else {
        $result = db_query("SELECT name, filename, throttle FROM {system} WHERE type = 'module' AND status = 1 ORDER BY weight ASC, filename ASC");
      }

note bootstrap = 1, which is set in module_rebuild_cache, which sets it on the basis of whether a module uses hook_exit or hook_boot, as evidenced by my empty hook_boot example for path.module

So I think it's actually a documentation/naming issue. module_exists() should be module_is_loaded()

Freso’s picture

Adding to my issue queue (catch thought I should have a look at it, so... ;)).

Anonymous’s picture

@catch: module_exists returns TRUE if the module is both installed and enabled. That is what it is being used for throughout Drupal core. I don't understand the reasoning to rename it.

Module_list returns the list of enabled modules. Some of that list has special meaning in the bootstrap phases. The documentation for it should state a list of enabled modules rather than loaded to give it proper definition. Even bootstrap uses it to get a list of modules to load. It doesn't use it to get a list of modules loaded.

David_Rothstein’s picture

Ah, okay, I think I finally have an explanation that hopefully we can all agree on:

@catch, you are right, module_exists() actually does behave more like "module_is_loaded" in this instance... but it doesn't always. It's a very fragile house of cards that allows this to occur.

For example, I can easily make module_exists() go back to its bootstrap behavior even after all modules are fully loaded. Try this:


// Full bootstrap.
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

// This works as expected.
if (module_exists('system')) {
  print 'The system module is loaded... ';
}

// Retrieve a list of modules.
$junk = module_list(TRUE, TRUE);

// Uh oh.
if (!module_exists('system')) {
  print 'But Drupal suddenly forgot that it is!';
}

Or the opposite:


// Minimal bootstrap.
require_once './includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE);

// This works as expected.
if (!module_exists('system')) {
  print 'The system module is not loaded yet... ';
}

// Retrieve a list of modules.
$junk = module_list(TRUE, FALSE);

// Uh oh.
if (module_exists('system')) {
  print 'But Drupal suddenly thinks it is!';
}

The behavior of module_exists() depends on the past state of function calls to module_list() -- and as stated above, the behavior of module_list() itself does so as well. This is double-plus-ungood.

This whole thread is a poster child for why unit tests are a good idea, I think ;) I can guarantee you that module_list() would fail any unit tests written for it right now.

Anonymous’s picture

@David: That is why I advocate for module_list to have two caches, one for the full list of modules and one for the bootstrap list. If $refresh is true it sets both lists from the full query. But if $bootstrap is TRUE only that one will be returned. The $fixed_list parameter is a special case for being able to call drupal_load('module', 'system') and drupal_load('module', 'filter') from places like install.php, update.php and includes/theme_maintenance.inc. However, I can't find from the drupal_load function that it calls module_list and I think we can do away with the special case of the $fixed_list parameter and simply call the drupal_load for the two modules.

David_Rothstein’s picture

@earnie, with regard to your comments in #12 and #22:

That is why I advocate for module_list to have two caches, one for the full list of modules and one for the bootstrap list.

I'm pretty sure that's what my patch in #10 does (?). $list[TRUE] is an array of bootstrap modules, and $list[FALSE] is an array of all modules. When you call the function you will always have $list[$bootstrap] returned, which is what you want. Your idea of using a separate $bootstrap_list variable would certainly work too (and would probably be more readable code, although at the cost of some code duplication, I think).

We could even go wild and fill both $list and $bootstrap_list if $refresh is TRUE since we have knowledge of the bootstrap modules with the full list read anyway.

That certainly seems reasonable. (Although I'm not sure if it would have a practical effect, since I suspect there is never a situation where the first call to this function during the course of a page load is with $bootstrap = FALSE?)

The $fixed_list parameter is a special case for being able to call drupal_load('module', 'system') and drupal_load('module', 'filter') from places like install.php, update.php and includes/theme_maintenance.inc. However, I can't find from the drupal_load function that it calls module_list and I think we can do away with the special case of the $fixed_list parameter and simply call the drupal_load for the two modules.

You know, you may be right. I think module_list() does get indirectly called somewhere during the early stages of install.php -- there might be a module_hook call somewhere within one of the functions called by drupal_maintenance_theme(), I think? However, I strongly suspect that those hooks are not intended to be fired during the install process... especially since system and filter are the only modules loaded then, so hooks are probably pretty pointless. So it may indeed be possible to get rid of $fixed_list, which would be very nice.

I'll have more thoughts related to the above in a moment.

David_Rothstein’s picture

OK, I have been doing some thinking and I am now pretty convinced that there is no way to fix the problems with module_list() without also dealing with the module_exists() vs. module_is_loaded() distinction. There are just too many places in Drupal that assume that module_exists() behaves one way or the other... and all these places can (and will) break when they are reached during install.php and friends.

Thus, I've been thinking of the following totally new plan:

  • Remove all database calls out of the module_list() function, and put them in module_load_all() and bootstrap_invoke_all(), which is where they properly belong. The only time you should ever need to query the database is right when you're deciding which modules to load.
  • drupal_load() already has a pretty good internal cache; with a little reworking and a helper function this could probably be used as a general-purpose place to store the current list of loaded modules at any given time.
  • With the above, module_list() and a new module_is_loaded() become extremely simple functions... all they have to do is check the helper function's statically cached list. No need for a $bootstrap or $fixed_list parameter or anything like that... module_list() is guaranteed to always return you a list of the currently loaded modules at the instant in time at which you call it. (How to handle the sorting correctly might require a little thought, but I don't think it would be hard.)
  • module_exists(), perhaps renamed to module_is_enabled(), would still be there. Probably there would be a helper function that stores the list of modules from the database and is used both by module_load_all() and by module_exists(), so again this function would be very simple. It would also be easy to add an $enabled parameter to module_list() which defaults to FALSE but which can be used to get a list of all enabled modules rather than the currently-loaded ones.

The above might be a fair amount of work, but I think the result would be a much cleaner, clearer and simpler system that "just works" correctly and is not prone to bugs. I probably do not have time to work on this for a little while, so I wanted to just throw it out there now for comment. I have not thought through every detail of this, so if there are holes to be poked in it, please poke them ;)

David_Rothstein’s picture

I also just came across #116820: module_list review, which seems relevant? I haven't look through it in depth, but I wonder if there are ideas there that can be merged with those here? @earnie, being the author, I guess you would probably know... ;)

chx’s picture

Status: Needs work » Needs review
FileSize
10.09 KB

David, very nice writeup. I rely on drupal_get_filename in this patch. I believe it's one nice cleanup. Look at how special cases collapse into 1-2 function calls.

mmilano’s picture

Installed fresh drupal and applied CHX's patch in post #26.

Installed successfully.

Listed modules successfully.

Enabled profile and statistics module successfully

Added article and page nodes successfully

Statistics module appears to be working except for top visitors. May be un-related to this patch, but this module was specifically mentioned when asking how to test. Missing table 'access'.

user warning: Table 'drupal7.access' doesn't exist query: SELECT COUNT(a.uid) AS hits, a.uid, u.name, a.hostname, SUM(a.timer) AS total, ac.aid FROM accesslog a LEFT JOIN access ac ON ac.type = 'host' AND LOWER(a.hostname) LIKE (ac.mask) LEFT JOIN users u ON a.uid = u.uid GROUP BY a.hostname, a.uid, u.name, ac.aid ORDER BY hits DESC LIMIT 0, 30 in /home/mike/projects/drupal/drupal7/drupal/modules/statistics/statistics.admin.inc on line 87.

chx’s picture

I think you just discovered a bug in statistics -- access table was used to store the access rules which is now removed. As the accesslog table is there, it's not that I broke module install...

chx’s picture

FileSize
9.93 KB

I removed one unnecessary hunk (remnant from an earlier attempt using drupal_load) and remove one line of code: the drupal_get_filename from module_load_minimal . Heine pointed out that drupal_load will call drupal_get_filename and if the DB is down then it will fall back to file search which is fine. But, I needed to fix that with a check whether the db system is even loaded already :)

catch’s picture

Issue for statistics breakage here: http://drupal.org/node/248436 (sorry, my fault).

chx’s picture

Status: Needs review » Needs work

drupal_get_schema will include every schema there is, even disabled modules. Stay tuned.

Anonymous’s picture

@David: I've marked the #116820: module_list review as duplicate referring to this node. Some of the ideas have already been redone.

@Chx: Thank-you for your revamp patch. I'm waiting for the stay tuned and will be giving it my reviews. I'm glad people with intimate knowledge of Drupal core has given this a closer look.

David_Rothstein’s picture

Wow, thanks @chx! This looks really great.

I too am waiting for the "stay tuned" in #31, but in the meantime I found a couple things that need work with the patch in #29. I won't try revising the patch now, since I'm waiting to hear what the changes in #31 mean for all this. But:

  • It seems that this patch causes modules with bootstrap = 1 to always appear first in the list, regardless of their weight (since they're the first ones that get inserted in the drupal_get_filename() cache). So a standard call to module_list() will not always return the modules in the correct order.
  • We probably need to switch update.php to use module_load_minimal(), like the others.
  • In module_list(), the static $stored_list variable never gets set...
  • Finally, this patch is a huge improvement over the current system, but there's still some dependencies that are a little fragile. A lot of things rely on the fact that module_fetch_all() is called within DRUPAL_BOOTSTRAP_PATH, and many functions will not work correctly if used before then. I think with a small amount of rewriting, we can find a way around this; for example, ideally there would be a call to module_fetch_all() within module_load_all(), right? I also think using drupal_get_filename() for the cache definitely makes a lot of sense... but I'm wondering, is there a way we can store information in there about the state of the module (loaded or not)? For example, something like this as the internal storage mechanism inside drupal_get_filename():
    array(
      'module_1' => FALSE, // Stored this way by module_fetch_all() to indicate an enabled module.
      'module_2' => TRUE,  // Stored this way by drupal_load() to indicate a loaded module.
      etc...
    );
    

    I think that keeping track of this difference would allow things to be even simpler and more flexible, with even fewer special cases to worry about... any time you wanted to know if a module was (a) enabled or (b) both enabled and loaded, you could always get exactly what you want.

Dries’s picture

Is this something we can write tests for now? :)

chx’s picture

Status: Needs work » Needs review
FileSize
9.53 KB

I used drupal_load instead and defer loading when that's needed.

chx’s picture

FileSize
12.44 KB

Now, path tests pass with one failure regarding alias deletion which has hardly anything to do with this patch -- before the patch, they failed on every 'alias works' test. Alas, I still a cache reset possibility -- when you remove modules, that is not picked up by this approach. I have fixed the "bootstrap modules first" problem. Also, I forgot about update.php that is fixed, too. I have not implemented the distinction between the 'fetched' and the 'loaded' states as I see no use case for it. It'd be easy if it's needed, though, one just needs to keep a separate list in drupal_load.

JohnAlbin’s picture

Status: Needs review » Needs work

First, this patch needs more php docs. This would help me figuring out what some of the params are doing. But here’s what I can determine so far…

This patch modifies/simplifies the API for drupal_load() and module_list() and adds module_load_minimal() and module_fetch_all().

1. I think module_fetch_all() needs:

/**
 * Fetches a list of modules from the database.
 *
 * @param $bootstrap
 *   Whether to fetch the reduced set of modules loaded in "bootstrap mode"
 *   for cached pages. See bootstrap.inc.
 * @param $force
 *   Force module list to be reloaded from database.
 */
function module_fetch_all($bootstrap = FALSE, $force = FALSE) {

2. drupal_load() needs its $mode documented. Here’s my stab at it:

/**
 * Includes a file with the provided type and name.
 *
 * @param $type
 *   The type of item to load (i.e. theme, theme_engine, module).
 * @param $name
 *   The name of the item to load.
 * @param $mode
 *   Whether to load the item(s) immediately, defer loading the item until
 *   later, or to forget that all the items of that type were loaded.
 * @return
 *   TRUE if the item is loaded or has already been loaded.
 */
function drupal_load($type, $name = NULL, $mode = DRUPAL_LOAD_IMMEDIATELY) {

3. drupal_web_test_case.php: there should be one extra space before // We might have removed some modules.

4. If drupal_load('module', NULL, DRUPAL_LOAD_DEFERRED) is called then all the modules are included. This seems contrary to passing in DRUPAL_LOAD_DEFERRED.

  if (!isset($name)) {
   if ($mode == DRUPAL_LOAD_DEFERRED) {
      foreach ($files[$type] as $filename) {
        include_once "./$filename";
      }
    }

5. module_load_minimal() is much better than copy and pasting code 3 times into theme.maintenance.inc, install.php, and update.php. +1

6. What’s with the update to simpletest_entrypoint()? This seems unrelated to this issue.

7. In module_fetch_all(), 'AND bootstrap = 0' should be 'AND bootstrap = 1', no?

8. When running Path and System simple tests before the patch:

  Path: 76 passes, 4 fails and 0 exceptions.
    Alias works. at [modules/path/path.test line 38]
    Changed alias works. at [modules/path/path.test line 49]
    Alias works. at [modules/path/path.test line 89]
    Changed alias works. at [modules/path/path.test line 98]
  System: 85 passes, 0 fails and 0 exceptions.

After the patch:

  Path: 79 passes, 1 fails and 0 exceptions.
    Alias was successfully deleted. at [modules/path/path.test line 120]
  System: 82 passes, 3 fails and 0 exceptions.
    Module "aggregator" is enabled. at [modules/system/system.test line 43]
    Module "translation" is enabled. at [modules/system/system.test line 88]
    Module "locale" is enabled. at [modules/system/system.test line 88]
chx’s picture

Status: Needs work » Needs review
FileSize
15.81 KB

Thanks for the through review, the documentation and drawing my attention to the fact that system.test has a lurking module_list reset. And that's necessary here as well because some other Drupal have changed the module list and we need to resync. In normal operations, such resets will be much less necessary than they were before. I simplified how drupal_load works with deferred stuff and I have documented everything stressing that all this new stuff to drupal_load is not something people should do. Finally, I have enrolled http://drupal.org/node/211439 into this patch. system test pass and path has a single, unrelated failure.

Anonymous’s picture

subscribing. will review tomorrow.

chx’s picture

note that the aforementioned single unrelated path test failure is fixed in http://drupal.org/node/251631

chx’s picture

FileSize
14.33 KB

I found two unrelated hunks in the patch. Removed. No other change.

David_Rothstein’s picture

FileSize
16.01 KB

I reviewed and tested this patch. (Sorry, I had intended to do this earlier!). Overall looks very good, and everything seems to basically work.

I'm attaching a new version which includes a few changes to the documentation (no code changes, though, so it doesn't matter if anyone else started testing the earlier version):

  • I fixed some errors and added some comments in the module_list(), module_fetch_all(), and drupal_load() documentation... the most important of these probably being that the documentation for the $sort parameter in module_list() was no longer correct.
  • This patch seems to have removed a random code comment about the aggregator module from modules/system/system.test, so I put that back in.

Now some comments/suggestions about the code:

  • The comment in DRUPAL_BOOTSTRAP_PATH implies that the call to module_fetch_all() is needed only for path aliases to work, but actually, it's needed for, like, all of Drupal to work... try commenting out that line and you'll see what I mean ;) The reason is that other functions, in particular module_load_all(), rely on module_fetch_all() having been called earlier. I think we need to fix this so these functions become more robust (especially since that call to module_fetch_all in the bootstrap can maybe be removed at some point). Since module_fetch_all() is cached, there is no harm in doing this. So I recommend that we rewrite module_load_all() as:
    function module_load_all($bootstrap = FALSE) {
      module_fetch_all($bootstrap);
      drupal_load();
    }
    

    and bootstrap_invoke_all() as:

    function bootstrap_invoke_all($hook) {
      module_load_all(TRUE);
      module_invoke_all($hook);
    }
    

    Which looks a lot cleaner anyway.

  • I think the cache for the sorted list in module_list() is not working correctly. Maybe that part of the function should be rewritten like this?:
    if ($sort) {
      // Only do the sorting if the list has changed since the last time we sorted.
      if ($stored_list != $list) {
        $stored_list = $list;
        ksort($list);
        $sorted_list = $list;
      }
      else {
        $list = $sorted_list;
      }
    }
    

    and then we need static $stored_list = array(), $sorted_list = array(); at the top of the function too. On the other hand, I'm not sure this cache is even necessary?! I can only find one place in core where module_list() is actually called with $sort = TRUE (in user.admin.inc).

  • When module_enable() is called, I believe the newly enabled module always gets put at the end of the list in drupal_load(), so that for the rest of that page load module_list() doesn't return things correctly ordered by weight. I'm not sure how much this matters (?), but we might want to consider using a module_fetch_all() with force refresh in there, just to be safe.
  • I'm pretty sure we can get rid of DRUPAL_LOAD_NONE entirely. In module_list(), can't $list = drupal_load('module', NULL, DRUPAL_LOAD_NONE); just become $list = drupal_load('module');?
  • Should a call to drupal_load() with DRUPAL_LOAD_RESET reset the $deferred array too? (I think that would probably be a good idea, but I'm not sure.)
  • If I'm understanding things correctly, then the $files cache in drupal_load() -- and by extension the return values of things like module_list() and module_exists() -- actually involves a list of all fetched modules, rather than loaded ones? This seems to be because the line $files[$type][$name] = $filename; near the bottom of drupal_load() is called regardless of whether the file is being loaded immediately or deferred. If this is correct, we should probably document that better, since I found it pretty confusing when reading through the code! It also probably affects the documentation of module_load() and module_exists()... we probably should explain more carefully in those functions exactly what is being returned. I think this is somewhat important, since this seems to be the whole reason that this patch actually allows path aliases to work again.
  • While it would be nice to get rid of the call to module_fetch_all() in DRUPAL_BOOTSTRAP_PATH, since it's basically a hack to let module_exists() work within path.inc before all modules have been loaded, I can't think of a great way to do it. As @chx and I discussed in IRC a couple days ago, although we could do this either by creating separate "module_is_loaded" and "module_is_enabled" functions as I suggested above (or alternatively, by adding an extra parameter to module_exists), it's probably not worth changing the API of these public functions just for this special case. Doing so would have the slight benefit of allowing "module_is_enabled" to work within bootstrap hooks; however, @chx pointed out that if you really need that information during the bootstrap phase, the best way to get it is probably just to query the system table anyway, and I tend to agree with this point.
David_Rothstein’s picture

It also occurs to me that since drupal_load() is kind of a crazy function with all sorts of parameter combinations intended for internal use only, it might be good to take that code and put it in a private function, with a few public wrapper functions intended for calling it in its various ways. That would make the code more readable and make it harder for modules that might be using this function to make any mistakes.

I don't think that necessarily needs to be part of the current patch, though.

Dries’s picture

Excellent review, David. I had a look at the patch and I have some additional suggestions.

1. In the code comment of DRUPAL_LOAD_DEFERRED, I would explain what it means to load a module later, and when that should (or shouldn't) be used. The current code comment doesn't answer these questions and I'm left wondering why I'd want to defer loading a module ... *confused*.

2. The explanation should probably be repeated in drupal_load()'s comments.

3. + $list = drupal_load('module', NULL, DRUPAL_LOAD_NONE); looks somewhat "ugly" -- it wasn't self-explanatory and could use a code comment (if not a more intuitive API).

4. I wonder if we could write a couple of Simpletests that use (aggressive) page caching, and that use the new API to check the number of loaded modules. Let's set the example and start writing tests.

It looks like this patch is getting close to be RTBC.

David_Rothstein’s picture

Here is some possible text for Dries' first two suggestions (it's not so easy to explain, I guess... I struggled figuring out how to write this):

/**
 * Mark the file to be loaded later on in the bootstrap process.
 *
 * This is used when information about enabled modules needs to be
 * retrieved from the database before the module code itself is actually
 * required.  This method prevents unnecessary loading of files if the
 * bootstrap process is cut short.
 */
define('DRUPAL_LOAD_DEFERRED', 2);

And then in the drupal_load() comments:

 * @param $mode
 *   Whether to load the file immediately (DRUPAL_LOAD_IMMEDIATELY, the
 *   default), mark the file to be loaded later on in the bootstrap process
 *   (DRUPAL_LOAD_DEFERRED) or just reset the internal cache of files
 *   (DRUPAL_LOAD_RESET). These are for internal use only; do not specify
 *   this argument.  DRUPAL_LOAD_DEFERRED is used when information about
 *   enabled modules needs to be retrieved from the database before the
 *   module code itself is actually required; this prevents unnecessary
 *   loading of files if the bootstrap process is cut short.
chx’s picture

I think we should really postpone this past the registry. As we were wroking on that , all this deferring problem is solved: we will not load modules, no need.

chx’s picture

Assigned: Unassigned » chx
Status: Needs review » Needs work

So, here we are, the registry is in! Stay tuned, I will post more today.

chx’s picture

Status: Needs work » Closed (duplicate)

I have written http://drupal.org/node/259412 which as a side effect fixes this too :)

catch’s picture

Status: Closed (duplicate) » Needs work

#259412: Total module system revamp is stalled and this issue is blocking dbtng now.

Crell’s picture

I was just about to post a new issue for this, but here's what I was going to say:

Currently, module_list() caches its results based on its parameters. However, those parameters can change, and just calling module_list() on its own will return NULL if it hasn't already been "primed", since the default is to not refresh. However, who knows how it's been primed before you wanted it, or even if? That means you need to either pray or always tell it to refresh, which is nasty for performance. It also has no "I'm empty so rebuild" validation, so if you call it early in the page request you stand a very good chance of getting back NULL.

module_list() needs to be totally rewritten, and probably broken into 2 separate functions: module_list() and module_list_bootstrap(). Both should self-initialize and cache the first time they're called, as well as accept a $reset parameter. That would make them safe to call. Trying to throw it all into one function is a horrid idea that causes all sorts of problems for, say, DBTNG.

I really think the fix is to just trash module_list() and make 2 functions. That will simplify the code here and in various other places, too.

drewish’s picture

subscribe / bump.. just found this from the TODO in boostrap.inc

kenorb’s picture

I agree with #50
I get into the same problem.

6.x
module_list(FALSE, TRUE) will cache list of module with condition: $bootstrap = TRUE,
so when you call this function again it should give you list of all active modules, but instead of that it will return list of modules which have only defined boot and exit hooks.

It cause in many cases WSOD, because depends on that list Drupal loading rendering page using hook_theme, which will be not find hook_theme in the rest of modules.
Definitely module_list() should be fixed and backported to 6.x as well

First place when bootstrap module list is cached badly in bootstrap_invoke_all('boot'); called from bootstrap.inc
Affected functions:
- module_exists() will return TRUE only for bootstrap module, will return FALSE for module which doesn't have boot or exit hook, even if is activated

kenorb’s picture

kenorb’s picture

Created issue related strictly to 6.x: #496198: module_list() called with wrong arguments causing WSOD and breaking theme_registry
To not making a mess with 7.x, where structure of this function is quite different.

CorniI’s picture

Subscribe
Is there any work done on this?
module_list() still looks pretty broken, for now I can't even see that the static cache is used for anything else than caching the sorted module list.

moshe weitzman’s picture

Status: Needs work » Closed (fixed)

I think that Catch rewrote module_list recently. Please reopen if needed.

David_Rothstein’s picture

Status: Closed (fixed) » Needs work

Still an issue. See the top of the function:

function module_list($refresh = FALSE, $bootstrap = FALSE, $sort = FALSE, $fixed_list = NULL) {
  static $list = array(), $sorted_list;

There are a bunch of different lists this function lets you ask for, but only one list is ever stored in the static variable at a time, which is the heart of the problem.

Given all the changes to core in the year and a half since the last patch here, though, I'm thinking the fix for this might now be a lot simpler... hopefully :)

catch’s picture

Yeah it's still incredibly broken. When working on the system_list() stuff, I considered attacking the static, however the main problem is $fixed_list - once you set that once, every subsequent call to module_list() uses whatever was in $fixed_list until it was reset. To fix it properly, I think we'd have to completely refactor the code which depends on $fixed_list - this may or may not be hard to do, but was the point where I decided a helper function for the system_list() issue was less painful.

sun.core’s picture

Title: module_list() is broken » $fixed_list breaks module_list()
Assigned: chx » Unassigned

Attempt at a better title.

catch’s picture

Version: 7.x-dev » 8.x-dev

Title change looks right - at least it'd be so much simpler without that. However short of a miracle I don't see us fixing it for D7, and it's been at least this broken since Drupal 6, so I'm bumping this, but leaving it as a critical bug report for D8.

catch’s picture

Priority: Critical » Major

Downgrading all D8 criticals to major per http://drupal.org/node/45111

This is really horrible though.

Damien Tournoud’s picture

Version: 8.x-dev » 7.x-dev

Bugs cannot target D8, as the tree is not open yet.

moshe weitzman’s picture

Version: 7.x-dev » 8.x-dev

@Damien - there are bugs that we don't plan to tackle until D8. I dont think D8's open/close status is relevant. Lets not be slaves to our own rules.

catch’s picture

Category: bug » task

This doesn't cause any user-facing bugs, it's more annoying API wtf, so we can stick to the rules and change it to a task.

David_Rothstein’s picture

Title: $fixed_list breaks module_list() » module_list() $bootstrap parameter doesn't work as documented
Version: 8.x-dev » 7.x-dev
Category: task » bug
Status: Needs work » Needs review
FileSize
5.24 KB

Nope, this is a straight-up bug.

Although $fixed_list is pretty weird, that's not what the bug is about, so I'm changing the title accordingly. The bug is that the $bootstrap parameter of module_list() says it controls whether or not a list of bootstrap modules is returned... but it doesn't always actually do that.

Making the function behave like we want it to is indeed D8 material, because although in theory it's a simple fix, the entire Drupal house of cards is built on top of this and would come crashing down :) And once D8 comes around, we probably actually just want to do what Crell suggests in #50 and split the function up and get rid of the nasty parameter list altogether.

However, there's no reason we can't rewrite the documentation of this function for D7 and rename its parameters to explain what it is they actually do today. The attached patch does that.

Status: Needs review » Needs work

The last submitted patch, module-list-is-ugly-222109-65.patch, failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review
FileSize
5.19 KB

Wow, the commit that caused the conflict happened about 5 minutes after I posted the above patch...

Status: Needs review » Needs work

The last submitted patch, module-list-is-ugly-222109-67.patch, failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review
David_Rothstein’s picture

Couldn't reproduce those test failures locally, and they seem very unlikely to be related.

David_Rothstein’s picture

Component: base system » documentation

Moving to the "documentation" queue... that's basically what this patch is now.

jhodgdon’s picture

Status: Needs review » Reviewed & tested by the community

As a documentation patch, this looks fine to me. I'm provisionally setting to RTBC, but am also going to click re-test to refresh the testing bot.

jhodgdon’s picture

tom_o_t’s picture

webchick’s picture

Priority: Major » Normal
Status: Reviewed & tested by the community » Needs work

I'm not sure how this documentation can be major. Looks good, though.

Sun is doing a quick re-roll for some missing (optional)s on the parameters in the PHPDoc.

sun’s picture

Priority: Normal » Major
Status: Needs work » Reviewed & tested by the community
FileSize
5.46 KB
webchick’s picture

Version: 7.x-dev » 8.x-dev
Status: Reviewed & tested by the community » Needs work

Thanks. Committed #76 to HEAD. I guess this belongs in the D8 queue now for the other stuff?

pillarsdotnet’s picture

Title: module_list() $bootstrap parameter doesn't work as documented » module_list() should be rewritten as 2 functions: module_list() and module_list_bootstrap()
Component: documentation » base system

Bump.

catch’s picture

Category: bug » task

Now the docs are right, this really is a task. I have a patch somewhere that started to remove the $fixed_list parameter.

Also we need to consider restricting module_list_bootstrap() to page cache hits (which could turn into a miss after hook_boot()), and just using regular full module list for requests that definitely aren't going to be a page cache hit - currently we make two round trips to the cache where there should only be one and for any higher bootstrap level it's worth loading the full list.

One extra thing - in Drupal 6 the bootstrap list is queried from the database twice for every request, this is also in another issue and was fixed in D7 only a few months ago.

xjm’s picture

This could use a summary.

rdickert’s picture

Assigned: Unassigned » rdickert

Taking on the issue summary.

rdickert’s picture

Issue summary: View changes

Posting issue summary.

rdickert’s picture

Assigned: rdickert » Unassigned

Unassigning the issue. Posted a new issue summary for this.

rdickert’s picture

Issue summary: View changes

Updated username of first post in issue summary.

ZenDoodles’s picture

Thank you for the help on the issue summary rdickert. It almost seems like this should have been two issues. We're going backwards here (D7 to D8 instead of the other way around). In D7 it's a bug; in D8 it's more of a task. Also, we are taking a different approach for D8 (by design) than we did for D7. But meh. We're here now, and it will probably be helpful to have the history.

All that is a very long winded way of saying I'm removing the Issue Summary tag...

rdickert’s picture

I agree. It's quite an unwieldy issue and has been renamed several times. I'd say that anyone thinking about renaming an issue should also consider whether the change to the meaning of the issue is big enough to close it and start a new issue (or a new task, in this case). However, there is some useful background to this task that is captured here.

sun’s picture

Status: Needs work » Fixed

It looks like the remaining task is dealt with in #1503224: Cleanup module_list()

A couple of patches were committed but also reverted here, so I'm not sure whether this actually means it's a duplicate now or not. Changing to fixed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Updated issue summary.