Problem/Motivation

MTimeProtectedFastFileStorage::save() has concurrency issues when multiple processes are dumping the container at the same time. This is because the mechanism relies on non-atomic file operations without locking. The two problematic parts are:

  1. The enclosing directory is recursively removed and recreated.
  2. The mechanism that synchronizes the directory modification time with the filename hash relies on a busy loop involving file operations.

Note that a change of the directory modification time invalidates any existing container. As a result, if multiple processes concurrently modify the same directory, there is a chance that newly written containers are immediately invalidated by another process.

Steps to reproduce (based on #2497243-214: Replace Symfony container with a Drupal one, stored in cache):

  1. Remove the compiled container from disk: cd sites/default/files/php && rm -rf service_container
  2. Add a line of code to MTimeProtectedFastFileStorage::save() that writes one line per save into a log file (a hypothetical snippet follows below).
  3. Start hitting your site hard with ab, 100 concurrent requests: ab -n 100000 -c 100 http://drupal-8.localhost/
  4. Observe that the log file ends up with far more than 100 lines; possibly the rebuilding never stops at all.
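For step 2, something like this hypothetical debug line inside save() is enough ($name is the method's first parameter):

// Hypothetical debug line for counting container writes.
file_put_contents('/tmp/phpstorage-saves.log', $name . ' ' . microtime(TRUE) . "\n", FILE_APPEND);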

Proposed resolution

  1. Never remove the enclosing directory.
  2. Instead, reset the modification time of the enclosing directory to that of the newly written file using touch().

This resolves the endlessness in #2497243: Replace Symfony container with a Drupal one, stored in cache. On busy systems there is still the possibility that multiple processes concurrently rebuild the container. However, because the storage relies on atomic filesystem operations, this will neither result in a deadlock nor in multiple processes invalidating each other's newly written containers. Per #2547827-7: PhpStorage: past, present and future, the multiple builds are a major bug; the endlessness is critical.
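
The essence of the new save() flow can be sketched as follows. This is a minimal, hypothetical simplification for illustration; mtime_protected_save() and its parameters are not the actual class code:

// Hypothetical sketch of the proposed save() flow: write to a temporary
// file, rename() it into place atomically, then touch() the enclosing
// directory so its mtime matches the value embedded in the filename hash.
function mtime_protected_save($directory, $filename, $data, $mtime) {
  // Ensure the enclosing directory exists, but never remove it.
  if (!is_dir($directory)) {
    @mkdir($directory, 0777, TRUE);
  }
  // Write to a uniquely named temporary file in the same directory so
  // that the subsequent rename() stays on one filesystem.
  $temporary_path = $directory . '/' . uniqid('', TRUE) . '.tmp';
  if (file_put_contents($temporary_path, $data) === FALSE) {
    return FALSE;
  }
  // Atomically move the file into place. Concurrent writers may race,
  // but the loser at worst replaces the file with an equally valid copy;
  // no reader ever sees a partially written file.
  if (!rename($temporary_path, $directory . '/' . $filename)) {
    return FALSE;
  }
  // Fix up the directory mtime to match the one used for the hash. The
  // trailing separator prevents touch() from creating a plain file if
  // the directory disappeared in the meantime.
  return touch($directory . '/', $mtime);
}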

Remaining tasks

Review.

User interface changes

None.

API changes

API addition: PhpStorageInterface::garbageCollection().

Data model changes

None.


Comments

znerol’s picture

Status: Needs work » Needs review
FileSize
2.84 KB
dawehner’s picture

Priority: Normal » Major

Nice research, nice work!

This is of course at least major, if not even critical, given what it solves.

Fabianx’s picture

Tagging security.

This is probably some of the most security-reviewed code in core, so it will unfortunately need another security review.

znerol’s picture

FYI, I can reproduce the original issue with the following setup:

  • LAMP in a vagrant box.
  • Drupal exported via NFS from host into the VM in data/htdocs/drupal-8-bootstrap.
  • ab -c 100 -n 1000000 -s120 http://127.0.0.1:8080/drupal-8-bootstrap/
  • rm -rf data/htdocs/drupal-8-bootstrap/sites/default/files/php/service_container

Then observe the container being rewritten constantly using:

watch -n0.1 find data/htdocs/drupal-8-bootstrap/sites/default/files/php/service_container

I've turned down the number of servers to only 8 and the issue still exists:

<IfModule mpm_prefork_module>
    StartServers          4
    MinSpareServers       4
    MaxSpareServers       8
    MaxClients            8
    MaxRequestsPerChild   0
</IfModule>

This is on a not-so-new but still decent laptop with an SSD.

znerol’s picture

To prevent touch() from creating a file (in place of the directory) when the directory does not exist, ensure that the path terminates with a directory separator.

PHP source of touch(). On Unix, updating the mtime is done via the utime() system call, which seems to be POSIX. I'm trying to find a Windows machine now to see whether it works the same way there.
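
A tiny illustration of the trailing-separator behavior (hypothetical usage; time() stands in for the computed mtime):

// A path ending in a directory separator can only refer to a directory,
// so touch() fails instead of creating a stray regular file when the
// directory has been removed.
if (!@touch($directory . '/', time())) {
  // The directory is gone; the next request simply gets a cache miss
  // and rebuilds the data.
}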

znerol’s picture

Confirmed that this approach also works on Windows.

znerol’s picture

Add garbage collection.

znerol’s picture

Issue summary: View changes
znerol’s picture

Issue summary: View changes
googletorp’s picture

Status: Needs review » Needs work
FileSize
6.79 KB
3.81 KB

The patch from #7 is a bit messed up: interdiff.txt is the actual patch and the patch file is empty. Fixed this.

googletorp’s picture

Status: Needs work » Needs review
znerol’s picture

FileSize
3.81 KB

Oh, sorry about that. Here is the interdiff for #5-#10.

alexpott’s picture

+++ b/core/lib/Drupal/Component/PhpStorage/MTimeProtectedFastFileStorage.php
@@ -78,43 +78,26 @@ public function save($name, $data) {
+    rename($temporary_path, $full_path);

Using an atomic file operation here is clever. Let's document why we're doing this.

znerol’s picture

Added a comment for the rename.

bjaspan’s picture

I am a crufty Drupal core developer who has not been involved for five years, and I had not looked at D8 before this week. So, it is entirely possible that I do not know what I'm talking about. However, I have some concerns with this patch.

First, specific comments on the code:

  1. +++ b/core/lib/Drupal/Component/PhpStorage/MTimeProtectedFastFileStorage.php
    @@ -78,43 +78,29 @@ public function save($name, $data) {
    +    rename($temporary_path, $full_path);
    

    This operation can fail, and you are not checking the return value.

  2. +++ b/core/lib/Drupal/Component/PhpStorage/MTimeProtectedFastFileStorage.php
    @@ -78,43 +78,29 @@ public function save($name, $data) {
    +    touch($directory . '/', $mtime);
    

    Ditto, even if you know the rename succeeded. For example, a network partition might have just separated you from the NFS server.

  3. +++ b/core/lib/Drupal/Component/PhpStorage/MTimeProtectedFastFileStorage.php
    @@ -162,6 +148,37 @@ public function delete($name) {
    +        foreach (new \DirectoryIterator($directory) as $fileinfo) {
    

    The directory might cease to exist between file_exists() and the creation of the iterator.

  4. +++ b/core/lib/Drupal/Component/PhpStorage/MTimeProtectedFastFileStorage.php
    @@ -162,6 +148,37 @@ public function delete($name) {
    +              @chmod($directory, 0777);
    +              @unlink($fileinfo->getPathName());
    

    Return codes.

  5. +++ b/core/lib/Drupal/Component/PhpStorage/MTimeProtectedFastFileStorage.php
    @@ -162,6 +148,37 @@ public function delete($name) {
    +          $this->unlink($name);
    

    Return codes.

Second, and more importantly, even if you address all of those comments, I'm still concerned. It is commonplace now for Drupal to run in multi-web environments using a distributed filesystem, and distributed filesystems are just not that reliable. A "POSIX compatible distributed filesystem" is basically a myth. It is not safe to assume that rename() is atomic or that touch() operations work or, basically, almost anything else "clever" about a filesystem. So, this patch will result in D8 being not that reliable.

Furthermore, even if I'm wrong and there is some way to make this work reliably using the filesystem (I doubt it), I'm *still* concerned, because then you will have forced D8 to be absolutely dependent on using a distributed filesystem that provides strictly POSIX-compatible rename() operations. There are a number of potentially compelling choices for Drupal's shared filesystem that do not meet this criteria (e.g. S3).

The common component in a Drupal stack that is capable of performing atomic operations is the SQL database. So, for operations that need to be locked or atomic, use the database. Or, use a locking service.

Does D8 have an existing service interface for locking across all web nodes?
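
For reference, D8 does ship a lock API that coordinates across web nodes when backed by a shared database: \Drupal\Core\Lock\LockBackendInterface, available via \Drupal::lock(). A minimal sketch of guarding a rebuild with it ('container_rebuild' is an illustrative lock name, not one core uses here):

$lock = \Drupal::lock();
if ($lock->acquire('container_rebuild')) {
  // ... rebuild and save the container ...
  $lock->release('container_rebuild');
}
else {
  // Another process holds the lock; wait for it to finish, then retry.
  $lock->wait('container_rebuild');
}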

chx’s picture

> because then you will have forced D8 to be absolutely dependent on using a distributed filesystem that provides strictly POSIX-compatible rename() operations

No one should use the mtime-protected storage on a shared filesystem. That's a piss-poor idea. #2301163: Create a phpstorage backend that works with a local disk but coordinates across multiple webheads and #2513326: Performance: create a PHP storage backend directly backed by cache both offer better ideas.

znerol’s picture

Please note that the term concurrency in the title does not refer to distributed filesystems, merely to parallel processing. We currently have a problem with multiple server processes invalidating each other's work when rebuilding the container.

Storing code on the filesystem has the advantage that we can leverage opcode caching. Especially for small sites this is probably the fastest option for retrieving generated code we currently have. The picture is certainly different in a load balanced / cluster setup. We already have the option in HEAD to swap out container and twig storage for something different.

Even if we eventually move container storage somewhere else, as #2497243: Replace Symfony container with a Drupal one, stored in cache aims to do, we will probably still be using PHP storage for other things. Therefore I think it is important to make it as robust as possible. The other option would be to ditch it completely.

Good call on the return values. In the garbageCollection() method we do not care, though, because nothing wants to know whether the operation went well or not. Regardless, it is possible to write this function more defensively.
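
A more defensive variant of the cleanup loop might look like this (hypothetical sketch, not the patch itself; $cutoff is an assumed timestamp threshold):

if (is_dir($this->directory)) {
  try {
    foreach (new \DirectoryIterator($this->directory) as $fileinfo) {
      // Only remove stale regular files; silently tolerate races where
      // another process deleted the file first.
      if ($fileinfo->isFile() && $fileinfo->getMTime() < $cutoff) {
        @unlink($fileinfo->getPathname());
      }
    }
  }
  catch (\UnexpectedValueException $e) {
    // The directory vanished between is_dir() and the iterator; fine,
    // there is nothing left to collect.
  }
}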

chx’s picture

> Storing code on the filesystem has the advantage that we can leverage opcode caching.

Still beating my own drum: both approaches linked in #16 keep this, via different means.

> Especially for small sites this is probably the fastest option for retrieving generated code we currently have.

That is certainly true -- querying a distributed data store to coordinate across webheads will certainly be slower than not querying one :) Once coordination is done, opcache hits, so it's a wash from there.

On to this patch: this is a very nice trick, making the directory mtime a (possibly) false mtime. Very nice. It requires PHP 5.3.0, but we are way past that.

pwolanin’s picture

Should we find a way to totally remove mtime storage before release if it's not reliable? Is this issue or the linked issues critical?

It feels like we should consider also facilitating the pattern of committing the built PHP to git and blocking config changes in production?

Berdir’s picture

> It feels like we should consider also facilitating the pattern of committing the built PHP to git and blocking config changes in production?

Locking down is all good and well, but at some point you need to actually do a config deployment, which can absolutely include new modules.

I really don't see how that is supposed to be possible unless you also deploy the whole database from staging which already has the new modules installed. Or have multiple prepared containers ready, which then depend on the list of enabled modules or something like that.

catch’s picture

> Or have multiple prepared containers ready, which then depend on the list of enabled modules or something like that.

We've got a bit closer to that with the addition of $deployment_identifier in settings.php, although that would mean the new container is used immediately on code deployment and before update.php is run or config is synced, so it'd still be out of sync until both of those are done. You could go further and write the deployment hash to memcache then read it in settings.php but that gets really nasty fast.

#2497243: Replace Symfony container with a Drupal one, stored in cache means the container is no longer written to PHP (and is critical), so the container will be completely out of phpstorage.

However, per #2429659: Race conditions in the twig template cache, Twig templates are still using phpstorage.

It would be possible to use read-only phpstorage for twig templates, but they can differ based on enabled modules (due to node visitors http://twig.sensiolabs.org/doc/advanced.html) - so while that could be pre-generated in a build step on a per-site basis, we can't add template building to d.o packaging etc. as it stands.

znerol’s picture

I believe that mtime-protected PHP storage can be implemented in a way such that it reliably works (i.e., does not obstruct itself). It is not even a problem if the underlying system fails to implement rename() atomically; the worst thing that can happen is a reduced cache hit rate.

The same goes for the touch() call. In most cases the timestamp of the enclosing directory is already correct; the call merely fixes the timestamp in the rare case of a mismatch. Even if the filesystem does not support updating the metadata, this simply results in a cache miss on a subsequent request, which will rebuild the data and retry storing it. In fact, I believe we only need the touch() to prevent random test failures; in production it is most likely irrelevant.

The problem with the current implementation is the recursive unlink(), which will happily remove work done by other processes.

catch’s picture

Priority: Major » Critical

Going to bump this to critical on the basis that:

- It fixes #2497243: Replace Symfony container with a Drupal one, stored in cache (even if I think that issue is release blocking for other reasons).

- Also fixes #2429659: Race conditions in the twig template cache even though we worked around it in twig - although the workaround requires an eval() which still concerns me a bit.

- Any other use of phpstorage will have the same race condition/stampede issue until this is in.

chx’s picture

alexpott’s picture

Status: Needs review » Reviewed & tested by the community

I did a line-by-line review of this. The use of atomic operations to avoid race conditions is the way to go. Nice work @znerol.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 17: resolve_concurrency-2527478-17.patch, failed testing.

znerol’s picture

That's weird. This used to pass on the 5.4 testbot, but it seems to break now on PHP 5.5 :/

znerol’s picture

chx’s picture

Title: Resolve concurrency issues in PHP storage » Resolve infinite stampede in mtime protected PHP storage
chx’s picture

Issue summary: View changes

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 17: resolve_concurrency-2527478-17.patch, failed testing.

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 17: resolve_concurrency-2527478-17.patch, failed testing.

alexpott’s picture

If you want to repeat the test result, you need to run the test without a parent site installed and with no settings.php at all, using the SQLite test runner. Something like:
php ./core/scripts/run-tests.sh --sqlite /tmp/coretest.sqlite --dburl sqlite://localhost/sites/default/files/db.sqlite --color --url http://your.url

alexpott’s picture

    if (!$this->addServiceFiles(Settings::get('container_yamls'))) {
      throw new \Exception('The container_yamls setting is missing from settings.php');
    }

Is this code actually valid? Is it really mandatory to have a container_yamls setting in settings.php? This looks wrong to me. Simpletest sets it to an empty array a lot so that it does not throw an exception.
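
For reference, the setting looks like this in a D8 settings.php (default.settings.php ships a similar line; an empty array also satisfies the check, which is what Simpletest does):

// Typical: register the per-site services file.
$settings['container_yamls'][] = __DIR__ . '/services.yml';
// Alternatively, an empty array avoids the exception without any files:
$settings['container_yamls'] = [];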

dawehner’s picture

Afaik #2547447: Stop creating services.yml by default will get rid of that requirement, so strictly speaking I don't think we would really need it.

Status: Needs work » Needs review
dawehner’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +D8 Accelerate

So the other issue indeed fixed the test failures ... back to RTBC.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Really good to see this green. Committed/pushed to 8.0.x, thanks!

  • catch committed 8d12555 on 8.0.x
    Issue #2527478 by znerol, googletorp: Resolve infinite stampede in mtime...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.