Preface:

Some time ago I started this discussion about menu_rebuild, which is now quite blown-up and hard to read (when I started that issue, I knew very little about the menu system). Now I thought it would be a lot better to have two separate issues: One for menu_router_build (this can be the old thread, because it already contains patches for that), and one for _menu_navigation_links_rebuild (this one). Both functions are called from menu_rebuild, both have a poor performance (in D6 at least), and I believe that both could be much faster with a smart algorithm.

...

Now, the issue:

The following search
http://drupal.org/project/issues/drupal?text=_menu_navigation_links_rebu...
shows we have a lot of bugs related to _menu_navigation_links_rebuild.

Upon that, this function takes quite a long time, slowing down
- the modules page
- any request that adds or removes content types or fields
- anything else that triggers a menu_rebuild.

The function is slow because it does a lot of individual calls to menu_link_save(), which is not optimized to be used en masse. We end up with a lot of redundant and all separate individual queries of all kinds (select, insert, update, delete).

To speed things up, we need an algorithm that is smart enough to avoid unnecessary DB queries, and that does the remaining queries (be it select or insert or update or delete) all in one go (at least not in separate individual queries). This can only be done with a custom function, not something that uses menu_link_save().

If this is combined with a (diff-based) rewrite of menu_router_build, we could try to use the diff information from there (telling us what has changed in the menu_router table) for a very targetted update of menu_links. But maybe this is not even necessary.

As has been discussed in the other issue, menu_links is generally to be considered too big to be in memory all at once. However, we are only rebuilding those links that are coming directly from menu_router, which is generally not too big for memory. This means, it is acceptable to have some portions of the menu data in memory, if this can speed up the algorithm. For instance, we could load everything with menu_name="navigation", or we could load everything with module="system"...

---

Things to consider when inventing a new solution:

-----

Other things that slow down the modules page, and do not need to be discussed here:

  • menu_router_build, until we patch it (in D7 it's already faster, but can still be improved by switching to a diff-based rebuild). Called by menu_rebuild(). D6 patch here, waiting for someone to port it to D7.
  • drupal_system_listing called by module_rebuild_cache, if there are too many files and folders in contributed modules (civicrm has a lot of 3rd party stuff blowing it up, but other modules can be quite heavy too). This can be fixed by telling the function where to look, and where not to look.
  • _admin_menu_rebuild_links, which does something similar to _menu_navigation_links_rebuild. The new version of admin_menu will work differently, so maybe this problem will go away. Still, the new admin_menu relies on a properly rebuilt navigation menu, so it's a good idea to speed this up.
  • various other rebuild processes.

These are the findings from my own projects, but from other discussions I assume that others have similar bottlenecks.

------

I would like to post my own solution, but I don't have one, sorry... Still, I would love if this bottleneck could be fixed! Maybe you guys have some ideas?

Thanks.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

EvanDonovan’s picture

Thanks for this. I believe earlier in #512962: Optimize menu_router_build() / _menu_router_save(), you said that it would be possible to create a version of menu_link_save, delete, etc. that was optimized to do the INSERT all in one query, or at least in batches (if it would be too much to load in memory all at once). Any more thoughts along those lines?

Also, there is an issue that discusses the number of no-op SELECT queries that menu_rebuild generates: #302638: No-op and slow queries during menu rebuild. That issue had a few patches, one of which even made it in to an earlier version of 6.x, but was rolled back. Now that issue is stalled. Perhaps that would be relevant too.

chx, sun, pwolanin, moshe_weitzman, Damien Tournoud, et al.: do you have any suggestions?

donquixote’s picture

<?php
    foreach ($menu_links as $item) {
      [..]
      if (!$existing_item || !$existing_item['customized']) {
        menu_link_save($item);
      }
    }
?>

could we change this to

<?php
    foreach ($menu_links as $item) {
      [..]
      $menu_item_modified = ...;
      if (!$existing_item || !$existing_item['customized']) {
        if ($menu_item_modified) {
          menu_link_save($item);
        }
      }
    }
?>

Maybe the menu_link_save is necessary as a cleanup step, even for non-modified items .. I dunno. But if this is possible, it would be a cheap and effective improvement.

EDIT:
We need to check menu_link_alter before we can know if the item is modified or not. This usually happens inside menu_link_save, so again we need our own menu_link_save equivalent.

donquixote’s picture

Next thing:

<?php
    foreach ($menu_links as $item) {
      $existing_item = db_fetch_array(db_query("SELECT mlid, menu_name, plid, customized, has_children, updated FROM {menu_links} WHERE link_path = '%s' AND module = '%s'", $item['link_path'], 'system'));
?>

This happens for each item from menu_router. What about doing it all at once, like this:

<?php
    $existing_items = array();
    $q_existing_items = db_query("SELECT mlid, link_path, menu_name, plid, customized, has_children, updated FROM {menu_links} WHERE module = 'system'");
    while ($row = db_fetch_array($q_existing_items)) {
      // allow more than one item with the same path - we never know!!
      $existing_items[$row['link_path']][] = $row;
    }
    foreach ($menu_links as $item) {
?>

This is not as straightforward as it looks: menu_link_save can do some modifications to the menu links in the database, so the cached $existing_items could quickly be out of date.

The solution is that we need to keep track of the changes done by menu_link_save, and update the $existing_items array accordingly. This will not be possible with the existing version of menu_link_save, we need something custom.

donquixote’s picture

Title: Performance Rewrite for _menu_navigation_links_rebuild » Optimize _menu_navigation_links_rebuild()

A much nicer term..

mikeytown2’s picture

subscribe... EDIT: 6.x issue below
http://drupal.org/node/512962#comment-2522418

looking at http://api.drupal.org/api/function/_menu_navigation_links_rebuild/6 seems like the quickest way to speed this up is to use db_multi_select for the second loop of this function. All the other queries seem to grab multiple rows. This is the code block I'm thinking of optimizing; if some one could verify my guess that would be nice.

  if ($menu_links) {
    // Make sure no child comes before its parent.
    array_multisort($sort, SORT_NUMERIC, $menu_links);

    foreach ($menu_links as $item) {
      $existing_item = db_fetch_array(db_query("SELECT mlid, menu_name, plid, customized, has_children, updated FROM {menu_links} WHERE link_path = '%s' AND module = '%s'", $item['link_path'], 'system'));
      if ($existing_item) {
        $item['mlid'] = $existing_item['mlid'];
        // A change in hook_menu may move the link to a different menu
        if (empty($item['menu_name']) || ($item['menu_name'] == $existing_item['menu_name'])) {
          $item['menu_name'] = $existing_item['menu_name'];
          $item['plid'] = $existing_item['plid'];
        }
        $item['has_children'] = $existing_item['has_children'];
        $item['updated'] = $existing_item['updated'];
      }
      if (!$existing_item || !$existing_item['customized']) {
        menu_link_save($item);
      }
    }
  }

This doesn't address menu_link_save() ATM. Main issue I see with doing a multiple database operation is http://api.drupal.org/api/function/db_last_insert_id/6

The other thing I see is to do a db_multi_update on this code

  $placeholders = db_placeholders($menu, 'varchar');
  $paths = array_keys($menu);
  // Updated and customized items whose router paths are gone need new ones.
  $result = db_query("SELECT ml.link_path, ml.mlid, ml.router_path, ml.updated FROM {menu_links} ml WHERE ml.updated = 1 OR (router_path NOT IN ($placeholders) AND external = 0 AND customized = 1)", $paths);
  while ($item = db_fetch_array($result)) {
    $router_path = _menu_find_router_path($item['link_path']);
    if (!empty($router_path) && ($router_path != $item['router_path'] || $item['updated'])) {
      // If the router path and the link path matches, it's surely a working
      // item, so we clear the updated flag.
      $updated = $item['updated'] && $router_path != $item['link_path'];
      db_query("UPDATE {menu_links} SET router_path = '%s', updated = %d WHERE mlid = %d", $router_path, $updated, $item['mlid']);
    }
  }

Knowing which one is slower would be nice. Currently too busy to benchmark ATM.

aspedia’s picture

Subscribe

aspedia’s picture

The following is a patch for D6, as that's what I'm currently needing this to work on, not sure how simple it would be to port to D7.

This patch would still need work, as it could probably use some memory optimization. I don't have any benchmarks, but regular rebuilds seem to be much much quicker. There's also room for improvement by doing something similar for the deleting of orphaned links, and adjusting changed router paths.

xjm’s picture

Tracking.

catch’s picture

Managed to completely miss this issue, however there's an RTBC Drupal 7 patch here that removes a lot of the queries done here #1010480: Optimize _menu_navigation_links_rebuild(). Didn't compare it to the Drupal 6 patch.

Andrew Answer’s picture

#7 patch seems not speed up my system. I use Pressflow and Ubercart together (installed, but all modules disabled) and use devel for testing speed. It's around 70 sec and 1400 queries, not depends from patch. I test it with enabling/disabling Aggregator module.

donquixote’s picture

@Andrew Answer (#10):
Thanks for testing. Could you do some microtime(TRUE), or xdebug, to measure exactly the function we want to optimize? I think this would be even more useful.

Andrew Answer’s picture

Devel module shows me two functions which need to be optimized:
1. _menu_update_parental_status
2. menu_link_save
Both have "UPDATE menu_links" queries.
(1) have only "UPDATE menu_links SET has_children = 1 WHERE mlid = [different numbers]"
(2) have more complex UPDATE like "UPDATE menu_links SET menu_name = 'admin_menu', plid = 457, link_path = 'admin/settings/site-information', router_path = 'admin/settings/site-information', hidden = 0, external = 0, has_children = 0, expanded = 0, weight = 0, depth = 2, p1 = 457, p2 = 511, p3 = 0, p4 = 0, p5 = 0, p6 = 0, p7 = 0, p8 = 0, p9 = 0, module = 'admin_menu', link_title = 'Site information', options = 'a:1:{s:5:\"alter\";b:1;}', customized = 0 WHERE mlid = 511" Everywhere I see "SET menu_name = 'admin_menu'" i.e. admin menu rebuilt.

donquixote’s picture

menu_link_save()
might be slow in itself, but even worse is that it is called a hundred or thousand times during a menu rebuild. Each time it talks to the database. Some of these write operations are even redundant.
The optimization strategy should be to do all or a lot of this in PHP, determine what really needs to go to the database, and write this stuff in one go, or big chunks - not one by one. Same for the SELECT lookups.

The challenge: We can not keep all items of one menu in PHP memory. Menus can grow quite big, and menu_links can be much bigger than menu_router usually is. This means, things need to be done in chunks. But still we should do something better than compute every item individually.

catch’s picture

Status: Active » Closed (duplicate)

While this is the older issue, I'm marking it as duplicate of the issue linked in #9 since that already has an rtbc patch against Drupal 7.