Drupal currently faces several concurrency issues (see #249185 for variable caching, #238760 for menu rebuilding, ...). All relatively long operations (cache building, ...) are potentially affected by two different types of issues:
- database inconsistency: when the operation rebuilds a whole table using the DELETE-then-INSERT pattern, requests served during the operation will see an inconsistent table (to demonstrate the issue, simply reload the modules page twice in Drupal 6.x/7.x);
- race conditions: several operations can run at the same time, thus leading to unneeded duplication of effort.
The first issue can be solved by reducing the window of inconsistency (that's what is suggested in #238760 for session_write(), or in #230029 for node_save()), or by using transactions (that's one of the points of the new database layer).
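As a rough illustration of the transactional fix, here is a minimal sketch in Python with SQLite standing in for Drupal's database layer. The table name `menu_router` is borrowed only for flavour, and `rebuild_table()`, `count_rows()` and `demo()` are hypothetical helpers, not part of any patch. The point it shows: a request served between the DELETE and the INSERT still sees the old, complete table.

```python
import os
import sqlite3
import tempfile


def rebuild_table(conn, rows, mid_rebuild=None):
    """Replace the whole table inside one transaction (hypothetical helper)."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM menu_router")
        if mid_rebuild is not None:
            mid_rebuild()  # stands in for a request served during the rebuild
        conn.executemany(
            "INSERT INTO menu_router (path, title) VALUES (?, ?)", rows
        )


def count_rows(conn):
    """Number of rows visible to this connection."""
    return conn.execute("SELECT COUNT(*) FROM menu_router").fetchall()[0][0]


def demo():
    """Return (rows seen mid-rebuild, rows seen after commit)."""
    path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
    writer = sqlite3.connect(path)  # the request doing the rebuild
    reader = sqlite3.connect(path)  # a concurrent request
    writer.execute(
        "CREATE TABLE menu_router (path TEXT PRIMARY KEY, title TEXT)"
    )
    rebuild_table(writer, [("node", "Content"), ("admin", "Administer")])

    seen = []
    rebuild_table(
        writer,
        [("node", "Content"), ("admin", "Administer"), ("user", "My account")],
        mid_rebuild=lambda: seen.append(count_rows(reader)),
    )
    result = (seen[0], count_rows(reader))
    writer.close()
    reader.close()
    return result
```

Without the surrounding transaction, the concurrent reader would see an empty table between the two statements. Note that this only addresses consistency: two requests can still each perform the full rebuild.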
The second issue cannot be solved by transactions alone. You need some sort of locking (or semaphores), as discussed in #248739 and #238760. Without locking, these long operations cause performance hits (because several instances of the operation run in parallel) and space hits (because the result of the operation is saved several times in the database, and thus in its binary log). On this last point, one of the busiest Drupal sites in France (France 24) was forced to disable locale caching because its invalidation led to several hundred megabytes of binary log data.
The attached patch implements a pluggable soft locking framework for Drupal. Two points worth noting:
- Two implementations are available: an APC-based one, which is fast but assumes a single web node, and a slower database-based one, which should work across a cluster of nodes. High-performance implementations for a cluster could be written, for example based on Memcached or on a multicast bus.
- All long-running operations should be locked; the patch only demonstrates how to do this for the locale() cache refresh. Several locking strategies can be considered: acquire the lock and give up the operation if that fails (that's the implemented strategy for the locale cache refresh), acquire the lock and, if that fails, wait for the other operation to complete (menu_rebuild()), or acquire the lock and retry acquiring until success.
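To make the three strategies concrete, here is a minimal sketch in Python rather than the patch's PHP. Everything in it is hypothetical: `MemoryLockBackend` is a single-process stand-in for the APC-based backend, and the function names and the `timeout`/`poll` parameters are illustrative, not the patch's API. It also assumes that "soft" means the lock expires on its own, so that a crashed holder cannot block everyone forever.

```python
import threading
import time


class MemoryLockBackend:
    """In-process stand-in for a pluggable lock backend (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}  # lock name -> time at which the lock expires
        self._clock = clock
        self._mutex = threading.Lock()  # makes check-and-set atomic

    def is_held(self, name):
        with self._mutex:
            expiry = self._expiry.get(name)
            return expiry is not None and expiry > self._clock()

    def acquire(self, name, timeout):
        """Take the lock unless an unexpired holder exists."""
        with self._mutex:
            now = self._clock()
            expiry = self._expiry.get(name)
            if expiry is not None and expiry > now:
                return False
            self._expiry[name] = now + timeout
            return True

    def release(self, name):
        with self._mutex:
            self._expiry.pop(name, None)


def run_or_skip(backend, name, operation, timeout=30.0):
    """Strategy 1: give up if someone else holds the lock."""
    if not backend.acquire(name, timeout):
        return False
    try:
        operation()
        return True
    finally:
        backend.release(name)


def run_or_wait(backend, name, operation, timeout=30.0, poll=0.05):
    """Strategy 2: if someone else is doing the work, wait for them."""
    if run_or_skip(backend, name, operation, timeout):
        return True
    while backend.is_held(name):
        time.sleep(poll)
    return False  # the other holder did (or abandoned) the work


def run_with_retry(backend, name, operation, timeout=30.0, poll=0.05):
    """Strategy 3: keep retrying until the lock is acquired, then run."""
    while not backend.acquire(name, timeout):
        time.sleep(poll)
    try:
        operation()
        return True
    finally:
        backend.release(name)
```

A database- or Memcached-based backend would only need to provide the same three methods, with `acquire()` implemented as an atomic "insert if absent or expired" on the shared store.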