When doing a lot of wildcard cache clears like cache_clear_all("node:render:$nid", 'cache_performance_hacks', TRUE);, a lock is created for the entire cache bin.
Two issues with this:
1. usleep(1000) isn't a long wait, so there could be a lock stampede. The patch converts this to use the locking framework (completely untested).
2. Locking clears on the whole bin, which in our case has entries for 3 million nodes, makes it quite likely that clears will run into a lock.
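The stampede in point 1 happens because every waiter sleeps exactly the same fixed interval and then retries at the same instant. A minimal sketch of the alternative (in Python for brevity, standing in for the PHP locking framework; all names are illustrative, not the module's actual API):

```python
import random
import time

def acquire_with_backoff(try_lock, max_attempts=10, base_delay=0.001):
    """Try to take a lock. On contention, sleep a randomized, growing
    delay instead of a fixed usleep(1000), so waiting processes do not
    all retry at the same instant (a lock stampede)."""
    for attempt in range(max_attempts):
        if try_lock():
            return True
        # Randomized exponential backoff spreads retries apart.
        time.sleep(base_delay * (2 ** attempt) * random.random())
    return False
```

Drupal's lock framework additionally lets a waiter block on lock_wait() rather than polling, which is what the patch moves toward.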
Jeremy is going to look at a hash_wrapper lookup for prefixes so we don't have to lock and search a single array for the entire bin; that would fix this much more solidly.
Comments
Comment #1
jeremy commented
Here's an attempt at cleaning this up. The hashing function needs more love to better spread out unique cids, and overall it should be optimized to use multiget rather than making so many wildcard queries.
Feedback and testing very welcome.
Comment #2
catch commented
Does the $cid = '' hunk now need to happen above empty($cid)?
Will try to review the patch properly tomorrow. The rest are thoughts not directly related to the patch, but I think it's feasible to keep gets down, at least in the performance_hacks case.
I haven't looked at the code for this, but I think we can optimize at least the common case where wildcard clears end with an int.
In cache_performance_hacks() we have cids like 'node:render:1234:foo:bar:baz'
These are always cleared as cache_clear_all('node:render:1234', 'cache_performance_hacks');
On clearing, it would be easy to determine whether the end of the wildcard is an int, then segment the wildcard directory numerically - say in chunks of 1000. These segments will of course be different sizes in reality, depending on how many nodes with nid 1-1000 (and so on) are actually cached.
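The numeric segmentation idea can be sketched like this (Python standing in for the PHP logic; the chunk size, key layout, and function name are assumptions for illustration, not the module's actual code):

```python
CHUNK = 1000  # segment the wildcard directory in chunks of 1000 nids

def wildcard_directory_key(bin_name, cid):
    """Map a wildcard clear like 'node:render:1234' to a segmented
    wildcard-directory key, so a clear only needs to lock one numeric
    segment of the bin rather than the whole bin."""
    parts = cid.split(':')
    if parts[-1].isdigit():
        # nids 0-999 -> segment 0, 1000-1999 -> segment 1, ...
        segment = int(parts[-1]) // CHUNK
        return '%s:wildcards:%s:%d' % (bin_name, ':'.join(parts[:-1]), segment)
    # Fallback: wildcards that do not end in an int share one directory.
    return '%s:wildcards:other' % bin_name
```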
On ->get(), we need to account for the fact that we could be doing cache_get('node:render:1234:foo:bar:bar') or cache_get('node:render:1234:z:x'); when you mentioned this I more or less gave up.
However, from the clears, we could determine that they always match the pattern node:render:$int I think.
So... as well as numeric segmentation, we could also store the wildcard patterns that have been used to clear bins, meaning we only need to confirm the node:render:$int pattern if that's the only way things ever get cleared. I'm not sure how best to figure out that node:render:1234 and node:render:654321 are the same pattern. It might mean requiring this pattern (colon delimiters with an int at the end: split into an array, confirm the text strings are the same, and check that the last array element is an int), with a fallback to more generic hashing otherwise.
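One way to recognise that node:render:1234 and node:render:654321 are the same pattern is to normalize the trailing int away (a Python sketch under the assumptions above; the '$int' placeholder and the None-means-generic-hash fallback are illustrative choices, not the module's code):

```python
def wildcard_pattern(cid):
    """Normalize a wildcard clear like 'node:render:1234' to the
    pattern 'node:render:$int' when the last colon-delimited segment
    is an int; return None to signal a fallback to generic hashing."""
    parts = cid.split(':')
    if len(parts) > 1 and parts[-1].isdigit():
        return ':'.join(parts[:-1] + ['$int'])
    return None
```

Two clears map to the same pattern exactly when everything before the final int matches, which is the "confirm the text strings are the same" check.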
So this would require a wildcard patterns variable somewhere; we could probably store a single array with patterns for all bins - there'll never be many patterns - so it should be small and cheap to fetch. Then we'd be able to fetch one set of actual wildcards per bin on ->get().
Also, I'm starting to think that even if we fix this, it might still have issues with the specific case of performance hacks and a very high number of nodes and comments being posted, so we may still need to refactor that to not use wildcard clears at all.
Comment #3
jeremy commented
The "hash" function we currently use really just grabs a few chunks of characters from the front of the cid to split up the wildcard directories in a given bin. Reading your feedback, I suspect what we really want is to use a few characters from both the front and the end of the cid -- that should better spread out the wildcard directories, minimizing both lock contention and memory usage.
As for optimizing the number of gets, I'm still leaning toward doing a single multiget of all possible wildcards, rather than looping through all possible wildcards as we currently do. That will reduce the necessary roundtrips to the memcache server(s) which is the only performance concern I currently have.
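The multiget idea, as described, is to build every colon-delimited prefix of a cid that could have been wildcard-flushed and fetch them all in one round trip. A Python sketch (get_multi here stands in for the pecl extension's multiget; the key scheme and timestamp values are assumptions):

```python
def wildcard_prefixes(cid):
    """All colon-delimited prefixes of a cid that a wildcard clear
    could have targeted, e.g. 'a:b:c' -> ['a', 'a:b', 'a:b:c']."""
    parts = cid.split(':')
    return [':'.join(parts[:i]) for i in range(1, len(parts) + 1)]

def latest_wildcard_flush(cache, cid):
    """One multiget instead of one get per prefix: return the newest
    flush timestamp recorded against any prefix of this cid, or 0 if
    no prefix has ever been wildcard-flushed."""
    flushes = cache.get_multi(wildcard_prefixes(cid))  # one round trip
    return max(flushes.values(), default=0)
```

An item is then stale if its created timestamp is older than the value this returns, and the number of network round trips no longer grows with cid length.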
Comment #4
jeremy commented
Actually, we have to take the characters from the left. We could skip every other, every third, every fourth, whatever -- so long as it's consistent. But we can't take from both the left and right sides of the cid, as that would make it impossible to find all matching wildcards.
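The reason the sample must be left-anchored: a wildcard prefix of the cid has to map to a prefix of the hash, or lookups can no longer find the matching wildcard entries. Skipping every other character preserves that property; taking characters from both ends would not. An illustrative sketch (not the module's actual hash function):

```python
def cid_hash(cid, length=6):
    """Take every other character from the left of the cid. Because the
    sample is left-anchored and consistent, cid_hash(prefix) is always
    a prefix of cid_hash(full_cid) -- the property wildcard matching
    relies on. Sampling from both ends would break it, since the
    right-hand characters differ between a prefix and the full cid."""
    return cid[::2][:length]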
Comment #5
jeremy commentedFor example, here's a patch to my patch that improves the hashing algorithm to better split up the $cid.
Comment #6
jeremy commentedI fixed another bug affecting long cid names, and made the hash length configurable through a variable if desired. In my testing, however, the default size has worked best.
The downside of this patch is that it requires considerably more memcache gets to validate that we've not had a wildcard flush -- the longer the cid, the more gets we have to make.
I started a patch to add multi-get support to minimize the number of added gets, but that will only help if using the memcached pecl extension. At this point, I'd appreciate general reviews of the patch, and am open to better ideas on how to accomplish this.
Note that I renamed 'prefixes' to 'wildcards', as there was already another type of 'prefix' in the memcache module and this was highly confusing when reading the code.
The attached "test.php" script can be run from drush like "drush scr test" and will do a basic regression test.
Comment #7
jeremy commented
This patch improves over previous versions in several significant ways:
This patch also adds proper multiget support to the standard memcache pecl plugin, in addition to the multiget support that already exists for the memcached pecl plugin.
Comment #8
jeremy commented
Patch committed, slightly modified from above. Also tested with minimum cache lifetime.
Comment #9
jeremy commented
Needs to be ported to the 6.x branch.
Comment #10
jeremy commented
In DRUPAL-7-x-1.x-dev, fixed typo, and made it possible to completely disable cid hashing:
http://drupal.org/cvs?commit=411938
Comment #11
catch commented
Got this running drush after merge:
I think this is just because there wasn't a full cache flush. We can add an isset() there, which will instantly invalidate all items, or maybe always return TRUE if !isset()?
Comment #12
catch commented
Comment #13
jeremy commented
It seems to me that this will force-flush all previously cached content following an upgrade to this version of the memcache module -- is that the goal? If not, perhaps we're looking for something like:
More specifically, $this->wildcard_flushes($cid) will return 0 if there have been no relevant wildcard flushes.
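The choice under discussion reduces to how a missing wildcard_flushes entry (which reads as 0) is interpreted. A Python sketch of the behavior jeremy is proposing, where 0 means "no flush recorded" and previously cached items survive the upgrade (names and the timestamp scheme are illustrative assumptions):

```python
def item_is_valid(created, wildcard_flushes, cid):
    """An item is stale only if a wildcard flush was recorded after it
    was written. A missing entry (0) is treated as "no relevant flush",
    so items cached before this code existed stay valid -- the risk
    catch notes is that a flush from before the map existed is also 0,
    and the two cases cannot be told apart."""
    last_flush = wildcard_flushes.get(cid, 0)  # 0: no flush recorded
    return created > last_flush
```

The alternative (treating a missing entry as "assume flushed") would force-rebuild every cached item once after the upgrade; the committed approach instead leaves it to sites to run a full cache clear if they don't want to take the risk.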
Comment #14
catch commented
I could go either way on this - the variable will be zero both for items which have been wildcard flushed and for those that haven't, so we have no way to know which is which. It might be better to go with your patch, then let sites do a cc all if they don't want to take the risk.
Comment #15
jeremy commented
Applied, thanks for reporting the issue.
Marking as to-be-ported again.
Comment #16
jeremy commented
Patch back-ported and applied to 6.x-1.x-dev:
http://drupal.org/cvs?commit=412680