Looking at lock_release, it seems like the memcache implementation would force-break a lock of the given name, regardless of whether it is held by the current process. The current process may have already lost the lock, and so breaking it in this way would remove the lock from another process that legitimately claimed that lock.

See http://cgit.drupalcode.org/memcache/tree/memcache-lock-code.inc#n133

function lock_release($name) {
  global $locks;

  dmemcache_delete($name, 'semaphore');
  // We unset unconditionally since caller assumes lock is released anyway.
  unset($locks[$name]);
}

If you compare that with core/SQL lock.inc, the core version will only unset a lock that has a value of _lock_id.

IMO we should do the same in memcache lock, but there is a tricky issue: you have to check the value coming out of memcache, and that allows a potential race condition:

  1. My process has the lock still when it calls release
  2. I check that I have the lock, and I do
  3. This happens between me checking the lock and proceeding to delete:
    1. The lock expires and I lose it
    2. Another process claims the lock
  4. I proceed with deleting the lock, because it was mine to delete
  5. The second process continues as though it had the lock, but currently the lock is free for a third process to grab because I broke it

This condition doesn't seem impossible to solve, but I don't have the solution currently.

Looking at the memcache implementation of lock_release_all, it seems that it has correct handling if lock IDs, but is vulnerable to the same race condition described above.

CommentFileSizeAuthor
#2 memcache-2379897-lock-break.patch631 bytesAlan Evans
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Alan Evans’s picture

So, some PoC code:


$lock = lock_acquire('test');
var_dump($lock);
sleep(30);

$lock = lock_acquire('test');
var_dump($lock);
lock_release('test');
$lock = lock_acquire('test');
var_dump($lock);

Run the first, then the second in a separate process within the 30 second sleep - you'll see output from the first indicating that the lock was acquired and then it will hang for 30 sec (note: default lock_acquire timeout is 30 sec):

bool(true)

Run the second shortly after starting the first. You'll see this:

bool(false)
bool(true)

which indicates that the second process forcibly broke the lock held by the first.

Alan Evans’s picture

Status: Active » Needs review
FileSize
631 bytes

The attached patch allows process 1 to keep the lock until it completes. So, during the 30 second sleep, it fails at both attempts to get the lock:

bool(false)
bool(false)

And then once it completes, the lock is available immediately:

bool(true)
bool(true)

This does still allow the race condition as described

Alan Evans’s picture

Note: the behaviour with the patch emulates the behaviour of core/SQL lock.inc more closely. I don't see any option in memcache for "delete this key entry, but only if the value matches X lock ID", so the small race condition might be something we have to live with for now - it's closer to core, but not 100% yet while that race condition still exists.

  • Jeremy committed aa0f7ff on 7.x-1.x authored by Alan Evans
    Issue #2379897 by Alan Evans: cconfirm lock_id before releasing lock
    
Jeremy’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
Status: Needs review » Patch (to be ported)

Thanks! Committed:
http://cgit.drupalcode.org/memcache/commit/?id=aa0f7ff

Needs back port.

  • Jeremy committed c8e723e on 6.x-1.x
    Issue #2379897 backported from d7, only release locks we own
    
Jeremy’s picture

Status: Patch (to be ported) » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.