The timeout values used in the lock system rely on the local system time with the resulting timestamp being written to the semaphore table to be shared amongst processors in a web farm. Clock drift of more than 30 milliseconds (the default value for expiry in the system) between systems will render it useless in a web farm environment since locks will not be honoured and the behaviour will be unpredictable.

Comments

lostchord’s picture

Priority: Critical » Normal

Ooops - 30 seconds. Less likely but still an issue.

damien tournoud’s picture

Component: base system » documentation

That's a documentation issue.

jhodgdon’s picture

OK... Any suggestions on what text should be added to this page?

lostchord’s picture

Well that's a little tricky and I'm not sure that it's all down to documentation.

There are some assumptions in the code that the time-slop is in the order of milliseconds. Adjustments made by NTP can result in sudden jerks in time several orders of magnitude greater than that. If you are in a hosted environment you have little or no control over time...it's down to the admins at the hosting site I guess.

Given all that I would suggest some words to the effect that time adjustments and lack of close synchronisation between systems will impact the reliability of the locking sub-system in single and multi-server environments. It will not be 100% reliable even if the applications/modules are well coded. Don't try and launch rockets or run pacemakers using it :-)

jhodgdon’s picture

Would you care to make a patch? Nothing will be changed until someone does that (that's how things work around here -- Drupal is a community-based open-source project, and filing issues is only part of the process). I personally don't think I understand the issues well enough to attempt it.

lostchord’s picture

StatusFileSize
new2.52 KB

See if this works...based on drupal-7.0-alpha4 which is what I have downloaded at present. Patch created in NetBeans, I hope I got the files the right way around :-)

If it all goes pear shaped here is the text of the revised comments at the start of the module:

/**
 * @defgroup lock Functions to coordinate long-running operations across requests.
 * @{
 * In most environments, multiple Drupal page requests (a.k.a. threads or
 * processes) will execute in parallel. This leads to potential conflicts or
 * race conditions when two requests execute the same code at the same time. A
 * common example of this is a rebuild like menu_rebuild() where we invoke many
 * hook implementations to get and process data from all active modules, and
 * then delete the current data in the database to insert the new afterwards.
 *
 * This is a cooperative, advisory lock system. Any long-running operation
 * that could potentially be attempted in parallel by multiple requests should
 * try to acquire a lock before proceeding. By obtainng a lock, one request
 * notifies any other requests that a specific operation is in progress which
 * must not be executed in parallel.
 *
 * To use this API, pick a unique name for the lock. A sensible choice is the
 * name of the function performing the operation. 
 *
 * If a function acquires a lock it should always release it when the
 * operation is complete by calling lock_release(), as in the example.
 *
 * A function that has acquired a lock may attempt to renew a lock (extend the
 * duration of the lock) by calling lock_acquire() again during the operation.
 * Failure to renew a lock is indicative that another request has acquired
 * the lock, and that the current operation may need to be aborted.
 *
 * If a function fails to acquire a lock it may either immediately return, or
 * it may call lock_wait() if the rest of the current page request requires
 * that the operation in question be complete. After lock_wait() returns,
 * the function may again attempt to acquire the lock, or may simply allow the
 * page request to proceed on the assumption that a parallel request completed
 * the operation.
 *
 * lock_acquire() and lock_wait() will automatically break (delete) a lock
 * whose duration has exceeded the timeout specified when it was acquired.
 *
 * Alternative implementations of this API (such as APC) may be substituted
 * by setting the 'lock_inc' variable to an alternate include filepath. Since
 * this is an API intended to support alternative implementations, code using
 * this API should never rely upon specific implementation details (for example
 * no code should look for or directly modify a lock in the {semaphore} table).
 *
 * LIMITATIONS:
 * The following limitations in this implementation should be carefully noted:
 *
 * Time - Timestamps are derived from the local system clock of the processor
 *        the code is executing on. The orderly progression of time from this
 *        viewpoint can be disrupted by external events such as NTP
 *        synchronisation and operator intervention. Where multiple web servers
 *        are involved in serving the site they will have their own independent
 *        clocks introducing another source of error in the time keeping process.
 *        Timeout values applied to locks must therefore be considered approximate
 *        and should not be relied upon.
 * Uniqueness of lock names
 *        This is not enforced. The impact of the use of a common lock name will
 *        depend on what processes and resources the lock is being used to manage.
 *        Best case is a denial of service situation.
 * Limited support for resources shared across sites
 *        The locks are stored as rows in the semaphore table and, as such, they
 *        have the same visibility as the table. If managed resources are shared
 *        across sites then the semaphore table must be shared across sites. This
 *        is a binary situation, either all resources are shared and the semaphore
 *        table is shared or no resources are shared and the semaphore table is
 *        not shared. Mixed mode operation is not supported.
 *
 * SIDE EFFECTS:
 * A global array, $locks, is created by the lock management API.
 *
 * EXAMPLES:
 * A very simple example use of this API:
 * @code
 * function mymodule_long_operation() {
 *   if (lock_acquire('mymodule_long_operation')) {
 *     // Do the long operation here.
 *     // ...
 *     lock_release('mymodule_long_operation');
 *   }
 * }
 * @endcode
 *
 */
jhodgdon’s picture

Status: Active » Needs review

Launching the test bot on this one.

Note to lostchord: When you attach a patch, you need to set the status to "needs review", which launches the test bot to test your patch, and also alerts humans that there's something here to look at. Thanks.

EDIT/ADDED: Also, it is not necessary to paste anything from your patch into the text of your comment -- we can read the patch. The only exception would be if you are making a patch that affects a page Drupal generates -- in that case it is helpful to attach before and after screen shots.

jhodgdon’s picture

Status: Needs review » Needs work

Please read http://drupal.org/patch/create to find out how to create patches that will work. Needs to be in unified diff format. Thanks.

jhodgdon’s picture

Also, you need to read
http://drupal.org/node/1354
which explains our documentation standards. Thanks.

lostchord’s picture

I will dig a bit deeper. I had assumed, based on this extensive documentation in the patch creation documentation, that NetBeans produced the right format.

lostchord’s picture

Status: Needs work » Needs review
StatusFileSize
new3.13 KB

Trying again.....

Status: Needs review » Needs work

The last submitted patch, lock.inc_.patch, failed testing.

jhodgdon’s picture

Ummm... We don't generally put headers in all caps into our documentation blocks. In fact, I don't know of a single instance where we have ever done that.

Patch also needs some punctuation and grammar editing. I can do this later (no time today).

Wrapping is also an issue. Lines within a paragraph should wrap as close to 80 chars as possible without going over.

I'm not sure you need to mention the global array really... why is it important to mention in this doc?

lostchord’s picture

Re global array: What happens if it gets stomped on? This is a warning that if you use the locks API then you'd better not use a global array called locks in your own code. This should be a standard part of the documentation, if the API implementation leaves things hanging around that the user can trip over then document it. If the API also uses particular resources and hangs onto them, document it, etc etc. This avoids nasty surprises.

Regards to wrapping. Is this because the output from the API module doesn't just glob all the lines into a single para or because it looks ugly for those reading the raw code?

I will have one more go and try and fix the end-of-line issue, if that doesn't do it then I'm not going to waste any more time on it.

lostchord’s picture

Status: Needs work » Needs review
StatusFileSize
new3.06 KB

Replaced CRLF with LF

Status: Needs review » Needs work

The last submitted patch, lock.inc_.patch, failed testing.

lostchord’s picture

Status: Needs work » Needs review
StatusFileSize
new3.2 KB

Status: Needs review » Needs work

The last submitted patch, lock.inc_.patch, failed testing.

jhodgdon’s picture

lostchord: We don't need to document every side effect of every function. That is what the code is for.

Not sure why your patches are failing.

lostchord’s picture

I think I know why the patch is failing, I don't know if there will be any further obstacles once I get past this one, I'm not going to find out.

If you rely on the code for glossing over deficiencies in the documentation, and cultivate the attitude that it's OK to leave things out because it's all in the code, then your API documentation is not going to get any better.

jhodgdon’s picture

Status: Needs work » Needs review
StatusFileSize
new3.52 KB

Here's a new patch. I've cleaned up the documentation, made it conform to our standards, etc.

Status: Needs review » Needs work

The last submitted patch, 808416-fixup.patch, failed testing.

jhodgdon’s picture

Status: Needs work » Needs review

#21: 808416-fixup.patch queued for re-testing.

jhodgdon’s picture

#21: 808416-fixup.patch queued for re-testing.

jhodgdon’s picture

#21: 808416-fixup.patch queued for re-testing.

daniels220’s picture

Looks reasonable to me.

daniels220’s picture

Status: Needs review » Reviewed & tested by the community
dries’s picture

This looks good. One small detail:

+++ includes/lock.inc	25 Jul 2010 23:50:12 -0000
@@ -58,6 +57,27 @@
+ * - Time: Timestamps are derived from the local system clock of the processor
+ *   the code is executing on. The orderly progression of time from this

The clock is maintained by the operating system / BIOS, not the processor. I'd change 'processor' to 'machine'.

lostchord’s picture

If you factor virtualisation into the picture it can become even more confusing....and BIOS is very much a PC thing. There will be a hardware clock that the O/S will initially take its time from at boot, it does not have to rely on this clock beyond that point although it does need an interrupt source to maintain its own notion of time. Applications may also maintain an independent view of time that is distinct from the other views of time around them. So it is not simple.

If you want a rewrite I would therefore suggest:

"Timestamps are derived from the local system clock of the environment the code is executing in"

daniels220’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new3.36 KB

Rerolled with that change.

damien tournoud’s picture

+ * - Uniqueness: Uniqueness of lock names is is not enforced. The impact of the
+ *   use of a common lock name will depend on what processes and resources the
+ *   lock is being used to manage. The best case is a denial of service
+ *   situation.

"is is"

+ * - Sharing: There is limited support for resources shared across sites.
+ *   The locks are stored as rows in the semaphore table and, as such, they
+ *   have the same visibility as the table. If managed resources are shared
+ *   across sites then the semaphore table must be shared across sites. This
+ *   is a binary situation: either all resources are shared and the semaphore
+ *   table is shared or no resources are shared and the semaphore table is
+ *   not shared. Mixed mode operation is not supported.

This sounds too verbose to me. This is really nothing more then a cornercase. I would drop that completely.

daniels220’s picture

StatusFileSize
new3.36 KB

I'll let others discuss your second point, but here's a patch that fixes the typo.

lostchord’s picture

Fundamental flaw = Corner case.

Nice.

===== Edit =====

From the D6 settings.php:

 * To provide prefixes for specific tables, set $db_prefix as an array.
 * The array's keys are the table names and the values are the prefixes.
 * The 'default' element holds the prefix for any tables not specified
 * elsewhere in the array. Example:
 *
 *   $db_prefix = array(
 *     'default'   => 'main_',
 *     'users'     => 'shared_',
 *     'sessions'  => 'shared_',
 *     'role'      => 'shared_',
 *     'authmap'   => 'shared_',
 *   );

This is a documented supported configuration option...but not supported by the lock routine. Feature????? Next you'll be saying that a 'memory leak' isn't a bug ;-)

jhodgdon’s picture

Version: 7.x-dev » 8.x-dev
Issue tags: +Needs backport to D7

Cleaning up old docs bugs. I guess this one needs to be d8 now... and needs a retest. and a review.

jhodgdon’s picture

Issue tags: -Needs backport to D7

#32: lock_system_D7_new_808416.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Needs backport to D7

The last submitted patch, lock_system_D7_new_808416.patch, failed testing.

jhodgdon’s picture

Apparently this needs a reroll as well as a review.

oriol_e9g’s picture

Status: Needs work » Needs review
StatusFileSize
new3.37 KB

Reroll

jhodgdon’s picture

Thanks! This still needs a review.

jhodgdon’s picture

#38: lock_system_docu-808416-38.patch queued for re-testing.

jhodgdon’s picture

Title: Clock drift will cause lock system to fail » Document that clock drift will cause lock system to fail

Status: Needs review » Needs work
Issue tags: -Needs backport to D7

The last submitted patch, lock_system_docu-808416-38.patch, failed testing.

Version: 8.0.x-dev » 8.1.x-dev

Drupal 8.0.6 was released on April 6 and is the final bugfix release for the Drupal 8.0.x series. Drupal 8.0.x will not receive any further development aside from security fixes. Drupal 8.1.0-rc1 is now available and sites should prepare to update to 8.1.0.

Bug reports should be targeted against the 8.1.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.2.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

jp.stacey’s picture

Version: 8.2.x-dev » 8.3.x-dev
Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new3.41 KB
new1.41 KB

The last patch pre-dates a lot of work rearchitecting D8 core, and hence is always going to fail (includes/lock.inc doesn't exist any more!) However, the relevant code was moved to core/lib/Drupal/Core/Lock/LockBackendInterface.php, fortunately without much change to the docblocks.

I've therefore manually rerolled the patch for 8.3.x, and attach both the patch and an interdiff.

kiwimind’s picture

Status: Needs review » Needs work
Issue tags: +mssprintjan17
+++ b/core/lib/Drupal/Core/Lock/LockBackendInterface.php
@@ -53,6 +52,27 @@
+ * - Time: Timestamps are derived from the local system clock of the evironment

s/evironment/environment

Otherwise looks good to go.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

jacobsanford’s picture

Status: Needs work » Needs review
StatusFileSize
new3.41 KB
new832 bytes

Enclosed is a patch correcting the typo as outlined in #47.

kiwimind’s picture

Status: Needs review » Needs work
+++ b/core/lib/Drupal/Core/Lock/LockBackendInterface.php
@@ -15,15 +15,14 @@
+ * To avoid this type of conflicts, Drupal has a cooperative, advisory lock

Sorry, another one.

Either "To avoid this type of conflict" or "To avoid these types of conflicts"

pushpinderchauhan’s picture

Status: Needs work » Needs review
StatusFileSize
new808 bytes
new3.41 KB

"To avoid these types of conflicts" looks good to me.

kiwimind’s picture

Status: Needs review » Reviewed & tested by the community

This reads well and applies correctly to 8.4.x.

gábor hojtsy’s picture

Status: Reviewed & tested by the community » Needs work

I see this issue has a very long history. Would have been happy to score a win for y'all given the issue is exactly 7 years old today, but stumbled into some stuff:

  1. +++ b/core/lib/Drupal/Core/Lock/LockBackendInterface.php
    @@ -53,6 +52,27 @@
    + * - Uniqueness: Uniqueness of lock names is not enforced. The impact of the
    + *   use of a common lock name will depend on what processes and resources the
    + *   lock is being used to manage. The best case is a denial of service
    + *   situation.
    

    What does the last sentence mean? Denial of service does not sound like a best case in and of itself(?)

  2. +++ b/core/lib/Drupal/Core/Lock/LockBackendInterface.php
    @@ -53,6 +52,27 @@
    + * - Sharing: There is limited support for resources shared across sites.
    + *   The locks are stored as rows in the semaphore table and, as such, they
    + *   have the same visibility as the table. If managed resources are shared
    + *   across sites then the semaphore table must be shared across sites. This
    + *   is a binary situation: either all resources are shared and the semaphore
    + *   table is shared or no resources are shared and the semaphore table is
    + *   not shared. Mixed mode operation is not supported.
    

    Does this try to convey that whatever is the process acquiring the lock is working on should be shared among processes too that could potentially acquire the lock? I am not sure "managed resources" makes this very clear, it does not say who manages said resources, is this confined to the process asking for the lock or more general?

gaurav.kapoor’s picture

StatusFileSize
new3.38 KB
new1.45 KB

Removed DDoS example as it wasn't providing any useful information. Edited the "Sharing" point accordingly.

gaurav.kapoor’s picture

Status: Needs work » Needs review
kiwimind’s picture

Status: Needs review » Reviewed & tested by the community

Personally I like how you're rewritten the shared resources point. I had a look into that too and think that yours reads well.

Patch looks good, so I'm happy to mark as RTBC again in the hope that this old ticket can move forwards.

Thanks.

  • Gábor Hojtsy committed c25da47 on 8.3.x
    Issue #808416 by lostchord, daniels220, gaurav.kapoor, jp.stacey,...

  • Gábor Hojtsy committed 40e14fc on 8.4.x
    Issue #808416 by lostchord, daniels220, gaurav.kapoor, jp.stacey,...
gábor hojtsy’s picture

Version: 8.4.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Superb, thanks! Comitted and pushed to 8.4.x and 8.3.x.

As per the issue tags, it needs to be ported to 7.x too.

gaurav.kapoor’s picture

Status: Patch (to be ported) » Needs review
StatusFileSize
new3.13 KB

Did respective changes for D7.

gaurav.kapoor’s picture

..

poker10’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Pending Drupal 7 commit

I think the D7 backport looks good, thanks! Compared the changes with the current D10 code and the text is the same, see: https://git.drupalcode.org/project/drupal/-/blob/11.x/core/lib/Drupal/Co... . RTBC.

  • mcdruid committed 3895455f on 7.x
    Issue #808416 by lostchord, gaurav.kapoor, daniels220, JacobSanford, jp....
mcdruid’s picture

Status: Reviewed & tested by the community » Fixed
Issue tags: -Needs backport to D7, -Pending Drupal 7 commit

Thank you everybody that worked on this!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.