Problem/Motivation

I was having some huge slowdowns so I decided to see what kinds of things were slowing down node_save().

module_invoke_all('node_insert') seemed to be big slow down and inside that, this is what I got:

hook_node_insert()

hook_node_update()

hook_node_delete()

expire_node_insert() was getting 2.2seconds!

Proposed resolution

Not sure yet, just wanted to share...

Remaining tasks

User interface changes

API changes

Comments

joelpittet’s picture

Title: node_insert performance issue. » expire_node_insert performance issue on node_save
Issue tags: +Performance
joelpittet’s picture

Maybe it would be worth having a whitelist of node types that can be excluded from expiring?

spleshka’s picture

That is the issue that definately have to be solved. To start this conversation I need more info about your environment. Could you please describe how do you use Cache Expiration - does it integrated with some external caching system, what modules are enabled which integrates with Cache Expiration, etc.

spleshka’s picture

Also would be interesting to know about other hooks - node_update, node_delete, etc.

joelpittet’s picture

@Spleshka I'm using this with a commerce setup. Using this a a guide for AuthCache:
https://www.drupal.org/node/2158615

I've got Memcache Storage + Redis (For when cache items which grow > 1MB) and DB storage for a few bins like cache_form.

It's PHP 5.5.9 with Zend OPcache v7.0.3 enabled and configured.

I'll have to get back to you on the node_update and node_delete as I was digging pretty deep with microtime calls in core to get at those specific times and sorted the results.

Also the thing I was testing the node_insert with was not a commerce product or page, it was a retail location node with address and geo fields on it. So one of the less complicated node types.

joelpittet’s picture

Issue summary: View changes
StatusFileSize
new48.15 KB
new42.39 KB

Here is the hook_node_update and hook_node_delete added to the issue summary.

joelpittet’s picture

StatusFileSize
new871 bytes

Here's what I was using for a performance test with a breakpoint condition on the return statement of:

$hook == 'node_update' || $hook == 'node_insert' || $hook == 'node_delete'

So that we aren't seeing all the $hooks that get called.

joelpittet’s picture

Issue summary: View changes
StatusFileSize
new42.13 KB

Digging a bit further I have the following in ExpireNode::expire()

Which is just hunking different pieces out to see where the bottle neck is at:

joelpittet’s picture

Issue summary: View changes
Status: Active » Needs review
StatusFileSize
new18.45 KB

Further down the rabbit hole, inside ExpireAPI::executeExpiration()

Which brings me into one call to authcache_builtin_expire_v2_expire_cache().

joelpittet’s picture

I think this may be an authcache issue and it boils down to number of roles I have and how many cache key variations occur. In my case there are 512 role choices because it takes every combination which produces enum_user_keys at 1024.

Then it times 4 keys to expire which does it 4K times... I guess for now I'll just turn off caching on a bunch of the roles I have.

joelpittet’s picture

Changing the number of roles from 11 to 5 had a huge differences because of the exponential growth of the choices.

2.2s down to 0.05s!

Maybe we should move this over to authcache module as I don't think expire can do much here?

Though I will likely still be looking at ways to temporarily disable expire module while some of my feed imports are running to avoid all the unnecessary expire calls. Maybe expire module could determine that it's in a batch process and collect the nodes and expire them at the end?

spleshka’s picture

Title: expire_node_insert performance issue on node_save » Performance issue with cache expiration
Project: Cache Expiration » Authenticated User Page Caching (Authcache)
Status: Needs review » Active

Thanks for your research, was really interesting to read, just like a real detective :)

I think we definately should move this issue to authcache, because the "heavy" job happens there and Cache Expiration can't influence it.

Maybe expire module could determine that it's in a batch process and collect the nodes and expire them at the end?

Actually, no proper way to check that, because each batch operation is executed inside of independant thread, so we can't use static caching or tricks like that. And this situation happens too rare to provide some weird logic for a very specific use-case as a generic solution for everyone.

znerol’s picture

Depending on your configuration, you may try to customize authcache key calculation as described in Authcache 2 tuning guide - optimize cache-hit ratio.

In this case, it might also be worthwhile to replace the cache-key enumeration algorithm, especially take a look at _authcache_enum_default_role_combine:

/**
 * Default enumeration method for authcache keys.
 *
 * Default method for enumerating possible combinations of roles. Each item is
 * an array with one or more role-ids forming a role-combination.
 *
 * This approach will build an array with 2^n elements. If you enable more
 * than a couple of roles (say 10) in authcache, then memory consumption and
 * computation time will rise quickly.
 *
 * In order to override this function with your own implementation, set the
 * variable authcache_enum_role_combine to the desired function, e.g. in
 * settings.php:
 *
 *   $conf['authcache_enum_role_combine'] = '_my_function';
 */

I've been thinking about an automated way to restrict the number of combinations needed in #2388687-3: Exhausted memory with many roles enable. The idea is to compute k_max (the maximum of roles selected for one user) and then only generate all combinations up to k_max. Another idea might be to first determine the actual roles used for each k in 1...k_max and then only compute the necessary combinations.

A completely different approach might be to simply record the keys used every time a page is cached and accumulate them in the database. Upon expiration, use the recorded keys. However, if the recorded keys are lost, expiration will stop working.

I need to find a place to document that authcache will need additional tweaking if many roles are selected. Maybe this even could go into the configuration form.

joelpittet’s picture

The configuration form would be a great place to start with documentation! I had roles dialed to 11 and that tapped my site's spine pretty good.

znerol’s picture

Status: Active » Closed (duplicate)

Closing this as a duplicate of #2388687: Exhausted memory with many roles enable now that the issue in #15 has landed.

Thanks for the report and the thorough research.

joelpittet’s picture

Status: Closed (duplicate) » Active

I finally wrote my own authcache_enum_role_combine. That took some time, OMG, and if I missed any there would just be a cache miss so those obscure combos or if someone goes creazy with the roles, they will get some slower page loads, which is what I got before when I turned off some of the roles.

It may not be perfect but may help someone trying to see another approach:

It was returning 1024 combos before with all enabled, now it's returning 39!

/**
 * Default enumeration method for authcache keys.
 *
 * Default method for enumerating possible combinations of roles. Each item is
 * an array with one or more role-ids forming a role-combination.
 *
 * This approach will build an array with 2^n elements. If you enable more
 * than a couple of roles (say 10) in authcache, then memory consumption and
 * computation time will rise quickly.
 *
 * In order to override this function with your own implementation, set the
 * variable authcache_enum_role_combine to the desired function, e.g. in
 * settings.php:
 *
 *   $conf['authcache_enum_role_combine'] = '_feature_cache_enum_default_role_combine';
 */
function _feature_cache_enum_default_role_combine() {
  module_load_include('inc', 'authcache_enum', 'authcache_enum.comb');

  $roles = authcache_get_roles();
  $choices = array();

  // Anonymous users do not have any authcache-key. Therefore there is no need
  // to include this role in the property-info.
  unset($roles[DRUPAL_ANONYMOUS_RID]);

  // The authenticated-user role is exclusive, only include it once. Do not
  // generate combinations including this role.
  if (isset($roles[DRUPAL_AUTHENTICATED_RID])) {
    $choices[] = array(DRUPAL_AUTHENTICATED_RID);
    unset($roles[DRUPAL_AUTHENTICATED_RID]);
  }
  
  // Reference.
  // $roles [
  //   1 => 'anonymous user',
  //   2 => 'authenticated user (without additional roles',
  //   3 => 'administrator',
  //   4 => 'Site Admin',
  //   9 => 'Editor',
  //   12 => 'B2B Commerce',
  //   8 => 'Sales Rep',
  //   5 => 'Partner - USA',
  //   10 => 'Partner - Canada',
  //   11 => 'Partner - International',
  //   7 => 'Staff',
  //   6 => 'Trainers',
  // ];

  $admin_roles = [
    3 => 'administrator',
    4 => 'Site Admin',
    9 => 'Editor',
  ];

  $customer_roles = [
    5 => 'Partner - USA',
    10 => 'Partner - Canada',
    11 => 'Partner - International',
  ];

  $staff_roles = [
    4 => 'Site Admin',
    9 => 'Editor',
    6 => 'Trainers',
    8 => 'Sales Rep',
  ];

  // Any combo of the staff roles + B2B Commerce.
  $choices = array_merge($choices, _feature_cache_enum_comb(array_keys($staff_roles) + [12], 4));

  // Admin + B2B Commerce.
  foreach (array_keys($admin_roles) as $rid) {
    $choices = array_merge($choices, _feature_cache_enum_comb([$rid, 12], 2));
  }

  // Other Staff roles + Staff and/or B2B Commerce.
  foreach (array_keys($staff_roles) as $rid) {
    $choices = array_merge($choices, _feature_cache_enum_comb([$rid, 7, 12], 3));
  }

  // Customer + B2B Commerce.
  foreach (array_keys($customer_roles) as $rid) {
    $choices = array_merge($choices, _feature_cache_enum_comb([$rid, 12], 2));
  }

  // Staff + B2B.
  $choices[] = [7, 12];

  // Combine remaining roles.
  $choices = array_merge($choices,  _feature_cache_enum_comb(array_keys($roles), 1));
  sort($choices);
  $choices = array_unique($choices, SORT_REGULAR);
  return $choices;
}

// Limit the combos.
function _feature_cache_enum_comb(array $set, $limit) {
  $result = [];
  for ($k = 1; $k <= $limit; $k++) {
    $result = array_merge($result, _authcache_enum_comb_k($set, $k));
  }

  return $result;
}

A little test script to see the combos:

// drush scr test.php
module_load_include('inc', 'authcache_enum', 'authcache_enum.comb');

$combos = _authcache_enum_default_role_combine();
foreach ($combos as $combo) {
  drush_print(implode(',', $combo));
}
drush_print('--------');
drush_print(count($combos));

$combos = _feature_cache_enum_default_role_combine();
foreach ($combos as $combo) {
  drush_print(implode(',', $combo));
}
drush_print('--------');
drush_print(count($combos));

One thing I was thinking that may be cool for an admin UI thing to help manage these at least as a start:
What about just a text area to write the combos in. People can just write in their most common combos, one per line and leave the single value ones as they could be auto added.

Then I could just dump in something like this:

3,12
4,6
4,9
4,12
5,12
6,12
7,12
8,12
8,7
9,6
9,7
9,8
9,12
10,12
11,12
4,6,8
4,7,12
4,9,6
4,9,8
6,7,12
8,7,12
9,6,8
9,7,12
4,9,6,8

Or if you want to get really fancy you could use a tag style autocomplete (thinking Chosen JS style) with add "combo button" aka phase 2.

joelpittet’s picture

May not be quite right... because I noticed quite a few of these notices after using my function:

Notice: Undefined offset: 38 in _authcache_enum_cartesian() (line 94 of contrib/authcache/modules/authcache_enum/authcache_enum.comb.inc).

Sneakyvv’s picture

Status: Active » Needs review
StatusFileSize
new6.62 KB

This patch does what znerol suggested in note 13.

A completely different approach might be to simply record the keys used every time a page is cached and accumulate them in the database. Upon expiration, use the recorded keys. However, if the recorded keys are lost, expiration will stop working.

Your concerns about the recorded keys getting lost... I don't see how this might happen.
I was thinking about a fallback mechanism along the lines of

SELECT DISTINCT group_concat(rid) FROM users_roles GROUP BY uid

but since I don't see how the keys could get lost, I didn't include that fallback. Btw, the result would be all existing role combinations, which is probably much less than every possible role combination. And the result would have to be filtered further on the selected roles for authcache.

To summarise, this patch
* stores the used role id combinations
* uses those combinations on expiration, i.e. in the authcache_enum_authcache_enum_key_properties function
* clears the table with the used combinations on hook_flush_caches. At that point, the whole cache is cleared, so there's no need to keep the used keys any longer.

To finish, just these reflections... Maybe the name autcache_enum is no longer appropriate? Perhaps I should have created a submodule and hook into the variable authcache_enum_role_combine instead of patching the authcache_enum module. On the other hand, the listing of all possible roles becomes moot with this change.

Status: Needs review » Needs work
Sneakyvv’s picture

Status: Needs work » Needs review
StatusFileSize
new8.29 KB

"fixed" tests. Actually removed tests which are based on the enum module creating a list of all possible role id combinations

TODO: add test to verify storage of used role id combinations

Status: Needs review » Needs work
Sneakyvv’s picture

Status: Needs work » Needs review
StatusFileSize
new8.29 KB

Previous patch was created against 7.x-2.0-beta5. This patch applies to latest dev (minor difference in part about authcache_enum.info).

Status: Needs review » Needs work
Sneakyvv’s picture

Status: Needs work » Needs review
StatusFileSize
new8.19 KB

Jeez... Apparently I should have created the patch against the authcache project repo instead of in a local repository (as stated at https://www.drupal.org/patch/apply). Never had a problem with that before. But than again I never created a patch that changed an .info file.
Final attempt (I hope).

jannis’s picture

Hello, I have experienced similar pain with Authcache enumeration. I built this module to actually identify existing user role combinations and provide enumeration for only the roles that exist, instead of the brute force method.

You can check it out in my sandbox: https://www.drupal.org/sandbox/jannis/2536674

1. Install and enable the module like normal
2. go to admin/config/system/existing-role-combinations
3. hit submit to calculate each role that actually exists among your users (depending on how active your role changes are you might want to put this in a cronjob) this is important otherwise it won't have any roles to expire from your cache
4. it automatically replaces the enumeration default combine function with one that looks at only the roles this module has identified.

No more memory overloads.

znerol’s picture

Status: Needs review » Closed (won't fix)

I'm closing this as a won't fix since the first thing people should do if they run into this problem is to optimize their site/role-model for caching. Less role combinations means better cache efficiency (hit/miss ratio).

I think that for most sites the very simple brute-force approach currently implemented in Authcache Enum is good enough. If not, #15 will help people to discover the underlying problem and #26 might be an option for those who really need to use many different roles.