Problem/Motivation

Per #1314214: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols), there is currently no uniqueness constraint on the uri field in the file_managed table enforced in the database schema in D8, because of limitations on index key lengths under utf8mb4 with lowest-common-denominator MySQL configurations. Instead we rely on a Drupal/application-level constraint in \Drupal\file\Plugin\Validation\Constraint\FileUriUnique.

However, this constraint is not currently enforced in file_save_upload(), as the validate() method is never explicitly called on the $file object.

Coupled with the way new filenames are chosen in file_destination() and file_create_filename() (based on checking file_exists() on a candidate name), this can lead to a race condition whereby more than one entry is created in the file_managed table with the same value for the uri field.
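The check-then-act pattern at the heart of this race can be illustrated with a small standalone sketch (plain PHP, not core code; the function name is made up, it simply mirrors the file_exists() probing that file_create_filename() does):

```php
<?php

/**
 * Illustrative only: pick a destination the way file_create_filename()
 * does, by probing the filesystem for a free candidate name.
 */
function pick_destination(string $dir, string $basename): string {
  $candidate = "$dir/$basename";
  $counter = 0;
  // Keep appending _0, _1, ... until a name is free on disk.
  while (file_exists($candidate)) {
    $pos = strrpos($basename, '.');
    $name = ($pos === FALSE) ? $basename : substr($basename, 0, $pos);
    $ext = ($pos === FALSE) ? '' : substr($basename, $pos);
    $candidate = "$dir/{$name}_{$counter}{$ext}";
    $counter++;
  }
  return $candidate;
}

$dir = sys_get_temp_dir();
// Two parallel uploads both pick a name before either writes to disk:
$a = pick_destination($dir, 'foo.txt');
$b = pick_destination($dir, 'foo.txt');
// $a === $b: both requests will move their upload to, and record, the
// same uri, because neither had written anything when the other probed.
```

Nothing hits the disk between the two calls, so both "requests" settle on the identical destination; that is exactly the window that slow validation widens.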

When this happens there is a risk of data loss: for example, garbage collection may delete temporary files which share a uri with other (permanent) files.

Slow implementations of hook_file_validate() (e.g. in the ClamAV module) can exacerbate the race condition, by increasing the likelihood that multiple parallel(ish) processes select the same filename before anything is written to disk.

Proposed resolution

Enforce the uniqueness constraint on files' uri field so that multiple rows with the same value cannot be created in the file_managed table.
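In file_save_upload() terms the fix can be sketched roughly as follows (a simplified fragment from inside the upload loop, assuming the D8-era procedural file API; the actual patch may differ in detail):

```php
// Set the final destination uri first, then run entity validation,
// which includes the FileUriUnique application-level constraint.
$file->setFileUri($file->destination);
$violations = $file->validate();
if (count($violations) > 0) {
  foreach ($violations as $violation) {
    drupal_set_message($violation->getMessage(), 'error');
  }
  // Reject this upload rather than saving a duplicate uri.
  $files[$i] = FALSE;
  continue;
}
// If we made it this far it's safe to record this file in the database.
$file->save();
```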

Remaining tasks

  • Review the patch.
  • Commit the patch.

User interface changes

There may be a new error message presented to users if they try to upload a file and the race condition occurs. However, it actually seems that changes since the initial report of this bug make it less likely that this will happen in the UI (see #52 for example).

API changes

None.

Data model changes

None.

Release notes snippet

Use "Problem/Motivation" section above?

Original report by manarth

First discovered in the ClamAV module: #2864506: Long ajax process causing files to disappear.

A slow implementation of hook_file_validate() can result in inappropriate entries in the file_managed table, causing an in-use file to be treated as orphaned and therefore deleted on a cron run. This causes permanent loss of data.

More significantly, the upload appears to succeed and the data is available for a period of time, whilst the data loss happens on cron, between six hours and three months after the data was uploaded. This can make the cause of the issue harder to recognise, and reduces the likelihood that the original file is still available for recovery.

Steps to reproduce:

Not possible to manually reproduce since 8.6.x (see #52) - see the new test instead: SaveUploadTest.php::testDuplicate()

  1. Prepare a Drupal install with a content-type that contains a file field.
  2. Enable a module which has a slow implementation of hook_file_validate(); either use the example code provided below, or install this sandbox (machine name: slow_file_upload).
  3. Go to the node-add page for the relevant content-type
  4. Click the "browse" button, and choose a file to upload
  5. Click the form's "Save and publish" button quickly, before the AJAX upload + validate process has completed

The outcome is that two entries are added to the file_managed table, and one entry added to the file_usage table. Here is an example:

mysql> select * from file_managed;
+-----+--------------------------------------+----------+------+----------+----------------------------+------------+----------+--------+------------+------------+
| fid | uuid                                 | langcode | uid  | filename | uri                        | filemime   | filesize | status | created    | changed    |
+-----+--------------------------------------+----------+------+----------+----------------------------+------------+----------+--------+------------+------------+
|   1 | 3bcdd711-988e-4906-b8c6-0b22df271b93 | en       |    1 | foo.txt  | public://2017-04/foo_0.txt | text/plain |        9 |      0 | 1492187343 | 1492187343 |
|   2 | 0a7a9874-7e73-43da-8eeb-a7caa03ae63d | en       |    1 | foo.txt  | public://2017-04/foo_0.txt | text/plain |        9 |      1 | 1492187346 | 1492187346 |
+-----+--------------------------------------+----------+------+----------+----------------------------+------------+----------+--------+------------+------------+
2 rows in set (0.00 sec)


mysql> select * from file_usage;
+-----+--------+------+----+-------+
| fid | module | type | id | count |
+-----+--------+------+----+-------+
|   2 | file   | node | 1  |     1 |
+-----+--------+------+----+-------+
1 row in set (0.00 sec)

One entry is correctly recorded in the file_managed table, with a corresponding file_usage entry; the second is an orphaned file. However, both entries share the same uri. When the cron cleanup process deletes orphaned files, the valid, in-use file is deleted, causing permanent data loss.

The issue was originally reported in the context of the ClamAV module; however, I've been able to reproduce it using nothing more than a sleep() call in a new debug module:

/**
 * Implements hook_file_validate().
 */
function mymodule_file_validate(Drupal\file\FileInterface $file) {
  $errors = [];

  if (!isset($_POST['op']) || ($_POST['op'] != 'Save')) {
    // Add a delay if we're not actually saving the node.
    sleep(30);
  }

  return $errors;
}

This particularly affects the ClamAV module, where certain virus-scan configurations can add 10 seconds to the file-validate process, but this issue could affect any scenario where validation invokes a slow external process.

Comment | File | Size | Author
#91 | 2869855-91.patch | 7.64 KB | mcdruid
#91 | 2869855-91_test_only.patch | 5.96 KB | mcdruid
#91 | interdiff-2869855-88-91.txt | 919 bytes | mcdruid
#88 | 2869855-88.patch | 7.64 KB | mcdruid
#88 | 2869855-88_test_only.patch | 5.96 KB | mcdruid
#88 | interdiff-2869855-85-88.txt | 917 bytes | mcdruid
#85 | 2869855-85.patch | 7.64 KB | mcdruid
#85 | 2869855-85_test_only.patch | 5.95 KB | mcdruid
#85 | interdiff-2869855-78-85.txt | 1.45 KB | mcdruid
#81 | 2869855-81.patch | 7.32 KB | mcdruid
#81 | 2869855-81_test_only.patch | 5.64 KB | mcdruid
#81 | interdiff-2869855-78-81.txt | 1.44 KB | mcdruid
#78 | 2869855-78.patch | 7.65 KB | mcdruid
#78 | 2869855-78_test_only.patch | 6.08 KB | mcdruid
#78 | interdiff-2869855-65-78.txt | 3.57 KB | mcdruid
#65 | 2869855-65.patch | 3.68 KB | Krzysztof Domański
#65 | 2869855-65-test-only.patch | 2 KB | Krzysztof Domański
#65 | interdiff-49-65.txt | 1.8 KB | Krzysztof Domański
#63 | 2869855-63-test-only.patch | 2 KB | Krzysztof Domański
#63 | interdiff-61-63-test-only.txt | 885 bytes | Krzysztof Domański
#61 | 2869855-D8-61.patch | 3.68 KB | Krzysztof Domański
#61 | 2869855_test_only.patch | 2 KB | Krzysztof Domański
#61 | interdiff-49-61.txt | 1.8 KB | Krzysztof Domański
#59 | race_condition_error.png | 9.73 KB | rjg
#49 | 2869855-49.patch | 3.72 KB | mcdruid
#49 | 2869855-49_test_only.patch | 3.73 KB | mcdruid
#44 | 2869855-44.patch | 3.74 KB | mcdruid
#44 | 2869855-44_test_only.patch | 2.11 KB | mcdruid
#37 | 2869855-35_test_only.patch | 2.04 KB | wturrell
#37 | 2869855-35.patch | 3.85 KB | wturrell
#34 | 2869855-34_test_only.patch | 2.05 KB | wturrell
#34 | 2869855-34.patch | 3.86 KB | wturrell
#29 | 2869855-29.patch | 3.81 KB | mcdruid
#29 | 2869855-29_test_only.patch | 2 KB | mcdruid
#27 | 2869855-27_test_only.patch | 1.76 KB | mcdruid
#24 | interdiff-2869855-19-24.txt | 2.7 KB | mcdruid
#24 | 2869855-24.patch | 1.79 KB | mcdruid
#19 | 2869855-19.patch | 1.45 KB | mcdruid
#10 | file_save_upload_race_condition-2869855-10.patch | 471 bytes | mcdruid

Comments

manarth created an issue. See original summary.

manarth’s picture

Issue summary: View changes
xjm’s picture

Version: 8.3.0 » 8.3.x-dev

Yikes. Thanks for reporting this!

xjm’s picture

cilefen’s picture

Issue tags: +Triaged D8 critical

@manarth: Thank you for discovering and reporting this issue. @xjm, @alexpott, @effulgentsia, @lauriii, @catch and I discussed this issue at a recent meeting and agreed it is critical priority because of data loss.

NancyDru’s picture

Does this also affect 7.x?

Wim Leers’s picture

The outcome is that two entries are added to the file_managed table, and one entry added to the file_usage table. Here is an example:

Exactly the same thing is being reported at #2884052: Uploading a managed file on a custom form that allows multiple files also triggers the remove button which results in a duplicate temporary record and results in the file being deleted.

It sounds like this is in fact a duplicate of that, which would mean that hook_file_validate() is irrelevant here.

catch’s picture

vijaycs85’s picture

Issue summary: View changes
mcdruid’s picture

Version: 8.3.x-dev » 8.5.x-dev
Status: Closed (duplicate) » Needs work
FileSize
471 bytes

Un-marking this as a dupe of #2884052: Uploading a managed file on a custom form that allows multiple files also triggers the remove button which results in a duplicate temporary record and results in the file being deleted as I think they're very closely related, but not exactly the same thing. I'd argue the other issue should stay focused on the JS fix, whereas this one should look at the problem of duplicate rows in the file_managed table, and the race condition in file_save_upload.

I'll update the IS shortly.


I was able to reproduce the problem of ending up with duplicate rows in the file_managed table for the same file if you add a slow implementation of hook_file_validate, and submit the node form containing a file field before the ajax upload has completed. It's even easier to manually reproduce if you make the validation hook slow only when it's not processing a full node form submit e.g.:

/**
 * Implements hook_file_validate().
 */
function mymodule_file_validate(Drupal\file\FileInterface $file) {
  $errors = array();

  if (!isset($_POST['op']) || ($_POST['op'] != 'Save')) {
    // Add a delay if we're not actually saving the node.
    sleep(30);
  }

  return $errors;
} 

The result is indeed that you end up with two rows for the same file in file_managed, with the same uri. The one from the node form submission is marked as FILE_STATUS_PERMANENT (1) whereas the one from the ajax upload is temporary (0):

mysql> SELECT fid, uuid, filename, uri, status, created FROM file_managed WHERE fid > 27;
+-----+--------------------------------------+-----------------+----------------------------------+--------+------------+
| fid | uuid                                 | filename        | uri                              | status | created    |
+-----+--------------------------------------+-----------------+----------------------------------+--------+------------+
|  28 | 18911c72-ad43-464b-92af-c7653bba85ee | test_upload.txt | public://2017-10/test_upload.txt |      1 | 1508419478 |
|  29 | 46d4ddf7-fd4d-490c-a3fa-a4d8f0061df3 | test_upload.txt | public://2017-10/test_upload.txt |      0 | 1508419474 |
+-----+--------------------------------------+-----------------+----------------------------------+--------+------------+

The temp file (status 0) will be cleaned up by a cron run, and that will blow away the permanent file (status 1) too, as they share the same uri.

I think the problem here is that there's potentially quite a big gap between file_save_upload deciding what filename (/ uri) it's going to use for an upload:

http://cgit.drupalcode.org/drupal/tree/core/modules/file/file.module?h=8...

$file->destination = file_destination($destination . $file->getFilename(), $replace);

...and actually writing that file to disk:

http://cgit.drupalcode.org/drupal/tree/core/modules/file/file.module?h=8...

    if (!drupal_move_uploaded_file($file_info->getRealPath(), $file->getFileUri())) {

...and eventually saving the file to the db:

http://cgit.drupalcode.org/drupal/tree/core/modules/file/file.module?h=8...

I think the significant gap though is between the call to file_destination and actually moving the upload to that destination.

In between, validation takes place and this is why a slow validation exacerbates the problem.

file_destination - and file_create_filename which it calls - decide on the filename which should be used based on checking file_exists.

This means if more than one thread processes an upload with the same filename / upload settings more-or-less in parallel, they can both decide to use the same destination based on the absence of a file at that path when they call file_destination.

One very simple - in fact, crude - way of getting around this is to "claim" the destination once it's been derived, by touching the file on disk e.g.:

     $file->destination = file_destination($destination . $file->getFilename(), $replace);
     // If file_destination() returns FALSE then $replace === FILE_EXISTS_ERROR and
     // there's an existing file so we need to bail.
     if ($file->destination === FALSE) {
       drupal_set_message(t('The file %source could not be uploaded because a file by that name already exists in the destination %directory.', ['%source' => $form_field_name, '%directory' => $destination]), 'error');
       $files[$i] = FALSE;
       continue;
     }
+    touch($file->destination);

With this in place, trying to reproduce the issue with the duplicate entries gives us something more like:

mysql> SELECT fid, uuid, filename, uri, status, created FROM file_managed WHERE fid > 27;
+-----+--------------------------------------+-----------------+-------------------------------------+--------+------------+
| fid | uuid                                 | filename        | uri                                 | status | created    |
+-----+--------------------------------------+-----------------+-------------------------------------+--------+------------+
|  28 | 18911c72-ad43-464b-92af-c7653bba85ee | test_upload.txt | public://2017-10/test_upload_13.txt |      1 | 1508419478 |
|  29 | 46d4ddf7-fd4d-490c-a3fa-a4d8f0061df3 | test_upload.txt | public://2017-10/test_upload_12.txt |      0 | 1508419474 |
+-----+--------------------------------------+-----------------+-------------------------------------+--------+------------+

...where the cleanup of the temp file would no longer blow away the permanent one.

I'm not suggesting we actually implement that, but I think it illustrates the point. I'll include a patch with just that change to get the ball rolling. (Reasons not to do this include that some stream wrappers may not support touch(), and that you may end up creating a mess on disk by touching files which never get corresponding entries in the db and are therefore never cleaned up.)

What other approaches could be taken to "claiming" the destination before the file is actually written to disk?

The existing locking API could be used; as well as checking file_exists, the upload process could acquire a lock for a not-yet-used destination. That seems almost like a fit, but I'm not sure it's really what the lock API is for.
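A hedged sketch of that idea using the core lock service (\Drupal::lock() and its acquire()/release() methods are real API; the lock name and the placement of the critical section here are illustrative only):

```php
// Claim the candidate destination before validation starts, so a
// parallel upload probing the same name cannot pick it in the gap
// between file_destination() and the move to disk.
// md5() keeps the lock name within the backend's length limit.
$lock = \Drupal::lock();
$lock_name = 'file_destination:' . md5($file->destination);
if ($lock->acquire($lock_name)) {
  // ... validate, move the upload to $file->destination, save ...
  $lock->release($lock_name);
}
else {
  // Another request is mid-upload to the same uri; treat it as a
  // collision and derive a different destination (or report an error).
}
```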

We could implement something new where an entry is made in the db (/ pluggable storage backend) when a destination is claimed but the file has yet to be written to disk (assuming there isn't already something like this that I'm not aware of).

Fundamentally, it probably shouldn't be possible to end up with multiple entries in the file_managed table with the same value for uri. Perhaps the scope of this issue should be limited to ensuring that doesn't happen during file_save_upload though.

mcdruid’s picture

Issue summary: View changes
mcdruid’s picture

Another thought; it might improve the UI if we disabled the submit button on the node form while the ajax upload is being processed?

mcdruid’s picture

Noting also that file_save_upload works in almost exactly the same way in D7 in terms of the call to file_destination (which is also pretty much the same), validation, and then drupal_move_uploaded_file. So it seems quite likely that the same issue applies to D7?

mcdruid’s picture

Another way of looking at this would be to tolerate the duplicate entries in file_managed, but ensure that file_cron doesn't actually delete files from disk if there's another file ( / row in the table) with the same uri.
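A sketch of what that cron-side guard might look like (illustrative only; this is the alternative being described, not the proposed fix, and the query wiring is hypothetical):

```php
// Before the cleanup deletes a stale temporary file, check whether any
// other managed file shares its uri.
$duplicates = \Drupal::entityQuery('file')
  ->condition('uri', $file->getFileUri())
  ->condition('fid', $file->id(), '<>')
  ->count()
  ->execute();
if ($duplicates == 0) {
  // No other row points at this uri, so deleting the entity (and with
  // it the file on disk) cannot destroy data another row depends on.
  $file->delete();
}
```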

That's sort of similar, but not quite the same as what's being looked at in #2821423: Dealing with unexpected file deletion due to incorrect file usage.

Personally, I think I'd prefer to avoid the duplicate entries.

mcdruid’s picture

Title: Race condition in hook_file_validate causes data loss » Race condition in file_save_upload causes data loss

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

mcdruid’s picture

Had a quick look at this in D7. AFAICS the same problem exists, but there's a significant difference in that D7's schema includes this:

  $schema['file_managed'] = array(
    'description' => 'Stores information for uploaded files.',

[..snip..]

    'unique keys' => array(
      'uri' => array('uri'),
    ),

http://cgit.drupalcode.org/drupal/tree/modules/system/system.install?h=7...

So if you end up with the race condition between two upload processes trying to use the same uri, one will hit an error e.g.:

PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'public://field/image/druplicon.png' for key 'uri' ...

In a couple of tests I did using a slow implementation of hook_file_validate (identical to the D8 one outlined in a previous comment), it was the ajax upload's temp file (status = 0) which hit the Integrity constraint violation, and therefore the problem being discussed in this issue was avoided, albeit somewhat crudely.

In the absence of a more elegant solution to this race condition, adding a constraint in the schema in D8 would be one workaround here.

mcdruid’s picture

Turns out the uniqueness constraint on the uri field in the file_managed table was removed as part of #1314214-257: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols) because of the limitations on lengths of keys under utf8mb4 with lowest-common-denominator MySQL configs. Per that issue:

We have a Drupal-level unique constraint on the File URI field instead of a database-level unique constraint, until we fix #2492171: Provide options to sanitize filenames (transliterate, lowercase, replace whitespace, etc) (at which point we can put back the database constraint).

That application-level constraint is provided by core/modules/file/src/Plugin/Validation/Constraint/FileUriUnique.php and is tested in:

https://api.drupal.org/api/drupal/core%21modules%21file%21tests%21src%21...

  // Ensure the database URI uniqueness constraint is triggered.
  $uppercase_file_duplicate = File::create($uppercase_values);
  file_put_contents($uppercase_file_duplicate->getFileUri(), 'hello world');
  $violations = $uppercase_file_duplicate->validate();
  $this->assertEqual(count($violations), 1);
  $this->assertEqual($violations[0]->getMessage(), t('The file %value already exists. Enter a unique file URI.', [
    '%value' => $uppercase_file_duplicate->getFileUri(),
  ]));

However, AFAICS that constraint is never enforced in file_save_upload, as the validate method is never explicitly called on the $file object.

Plus the uri is actually set on the file object after validation:

    // Call the validation functions specified by this function's caller.
    $errors = file_validate($file, $validators);

    // Check for errors.
    if (!empty($errors)) {
      $message = [
        'error' => [
          '#markup' => t('The specified file %name could not be uploaded.', ['%name' => $file->getFilename()]),
        ],
        'item_list' => [
          '#theme' => 'item_list',
          '#items' => $errors,
        ],
      ];
      // @todo Add support for render arrays in drupal_set_message()? See
      //  https://www.drupal.org/node/2505497.
      drupal_set_message(\Drupal::service('renderer')->renderPlain($message), 'error');
      $files[$i] = FALSE;
      continue;
    }

    // Move uploaded files from PHP's upload_tmp_dir to Drupal's temporary
    // directory. This overcomes open_basedir restrictions for future file
    // operations.
    $file->setFileUri($file->destination);

(The comment at the end there is misleading; the last line typically sets the uri to the actual path where the file will be stored e.g. public://2018-01/druplicon.png)

Adding in a quick crude test after the uri has been set, but before the file is saved:

    $violations = $file->validate();
    if (count($violations)) {
      foreach ($violations as $violation) {
        error_log($violation->getMessage());
      }
    }

    // If we made it this far it's safe to record this file in the database.
    $file->save();

...shows that the uniqueness constraint would trigger a violation when this race condition results in two uploads using the same uri e.g.:

[Fri Jan 19 12:13:32.608268 2018] [:error] [pid 859] The file <em class="placeholder">public://2018-01/druplicon_5.png</em> already exists. Enter a unique file URI., referer: http://example.com/node/add/article

So it seems to me that we need to fix the fact that the "Drupal-level unique constraint on the File URI field instead of a database-level unique constraint" isn't enforced at the moment for uploads processed by file_save_upload.

(Or, of course, we could put the constraint back into the db if transliteration of file names landed.)

mcdruid’s picture

Status: Needs work » Needs review
FileSize
1.45 KB

Here's a patch which adds a call to the validate method on the file object itself.

To make this work, we have to move $file->setFileUri to before $file->validate, but (if we change nothing else) it seems that we can't set the uri before calling file_validate. That's probably worthy of a comment if we go with this approach.

I've also blown away the comment about moving the file to the temp directory as that seems to be legacy cruft AFAICS.

In my testing, this prevents the race condition creating two files with matching URIs in the file_managed table. I've also confirmed that the error message from the FileUriUnique constraint is displayed when appropriate (in order to see it I had to hack a value into the file_managed table which I knew would collide with the next uri that file_destination would come up with).

Status: Needs review » Needs work

The last submitted patch, 19: 2869855-19.patch, failed testing. View results

mcdruid’s picture

Will have a look at why the patch caused a handful of upload tests to fail...

Also, discussed with @catch - we should file a follow up to reconcile $file->validate and (hook)_file_validate.

mcdruid’s picture

AFAICS the tests are failing because of the extra validation e.g.:

Error message

The specified file image-test.png could not be uploaded.
The file temporary://image-test.png already exists. Enter a unique file URI.

Epic upload FAIL!

So those tests may need a few tweaks in order to prevent them from triggering the violation of the unique constraint (which we've not been checking properly before).

These are the tests in question:

  • File.Drupal\file\Tests\RemoteFileSaveUploadTest.testHandleExtension
  • File.Drupal\file\Tests\RemoteFileSaveUploadTest.testExistingReplace
  • File.Drupal\file\Tests\SaveUploadFormTest.testHandleExtension
  • File.Drupal\file\Tests\SaveUploadFormTest.testExistingReplace
  • File.Drupal\file\Tests\SaveUploadTest.testHandleExtension
  • File.Drupal\file\Tests\SaveUploadTest.testExistingReplace
mcdruid’s picture

The reason the tests are failing is that the validation of the constraint is currently happening when $replace is FILE_EXISTS_REPLACE.

A little later in file_save_upload Drupal does this:

    // If we are replacing an existing file re-use its database record.
    // @todo Do not create a new entity in order to update it. See
    //   https://www.drupal.org/node/2241865.
    if ($replace == FILE_EXISTS_REPLACE) {
      $existing_files = entity_load_multiple_by_properties('file', ['uri' => $file->getFileUri()]);
      if (count($existing_files)) {
        $existing = reset($existing_files);
        $file->fid = $existing->id();
        $file->setOriginalId($existing->id());
      }
    }

...so the record in the file_managed table will be updated for this replacement upload, and a duplicate record is not created.

It's therefore not valid to flag a violation of the constraint.

I don't think it makes sense to only call $file->validate() if $replace is not FILE_EXISTS_REPLACE, as there could theoretically be other constraints which should be tested (I don't think that's the case at present, but it could be).

One way around this would be to implement the constraint as one of the validators passed to file_validate() which would mean we could decide not to check for uniqueness when $replace is FILE_EXISTS_REPLACE.
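A sketch of that alternative (the function name and wiring are hypothetical; only the file_validate() callback convention of returning an array of error strings is real):

```php
/**
 * Hypothetical uniqueness validator in the file_validate() style.
 */
function mymodule_validate_uri_unique(\Drupal\file\FileInterface $file, $replace = FILE_EXISTS_RENAME) {
  $errors = [];
  // When replacing, file_save_upload() re-uses the existing record, so
  // a matching uri in file_managed is expected rather than a violation.
  if ($replace != FILE_EXISTS_REPLACE) {
    $existing = \Drupal::entityTypeManager()
      ->getStorage('file')
      ->loadByProperties(['uri' => $file->getFileUri()]);
    if (count($existing)) {
      $errors[] = t('The file %uri already exists.', ['%uri' => $file->getFileUri()]);
    }
  }
  return $errors;
}
```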

I'm not sure if the constraint is ever tested by Drupal without this patch... I'll see if I can find that out before thinking about removing the existing constraint in order to implement it differently.

Another option would be to see if we can test the constraint(s) as part of the actual $file->save().

mcdruid’s picture

Status: Needs work » Needs review
FileSize
1.79 KB
2.7 KB

Here's a simple way of making this work; leave the call to $file->validate() until just before $file->save() which means we don't get a violation for FILE_EXISTS_REPLACE as the existing file entity is going to be updated.

All the tests which failed with #19 pass locally with this patch.

It's a bit ugly doing two rounds of validation, but AFAICS if the first round fails, the second will never be reached so we shouldn't end up displaying two sets of error messages.

(This patch restores the misleading comment about moving files to the temp directory, but that's probably out of scope here anyway - I'll file a follow up for that).

Oh, and this does successfully prevent the duplicate rows in file_managed when the race condition's caused by submitting the node form while the ajax upload is still underway.

mcdruid’s picture

webchick’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Looks like this still is in need of tests.

mcdruid’s picture

Status: Needs work » Needs review
FileSize
1.76 KB

Here's a test which I think confirms the problem.

It simulates the race condition which allows two managed files to be created with the same URI by:

* Uploading a test file.
* Removing the file from disk, but leaving the entry in the file_managed table.
* Uploading the same file again.

At present this results in two entries in the file_managed table with the same URI, which is the situation we're trying to avoid.

So we expect this test to fail.

I don't think the patch from 24 will make this test pass successfully, as it should make the second upload fail validation.

Hopefully this test is useful to illustrate the problem we're trying to solve though.

I suppose the next question is whether to adapt the test so that it "matches" the patch, or the other way around.

To change the patch so that it makes this test pass would mean altering the way that Drupal chooses the file name (URI) it should use for an upload, as opposed to running the validation on the file object to enforce the URI uniqueness constraint once the wrong (duplicate) file name has already been derived. Perhaps that would be a better approach, but it seems like it will require more significant changes than the current patch.

Either way, I think we do still want to ensure that the URI uniqueness constraint is enforced during uploads, which is not the case at present (as this test hopefully illustrates).

Status: Needs review » Needs work

The last submitted patch, 27: 2869855-27_test_only.patch, failed testing. View results

mcdruid’s picture

Status: Needs work » Needs review
FileSize
2 KB
3.81 KB

Here's another test-only patch which should fail, showing that the URI uniqueness constraint is not being enforced when an upload creates a duplicate URI in the file_managed table.

...plus a patch which enforces the URI uniqueness constraint in file_save_upload and should therefore make the new test(s) pass.

The last submitted patch, 29: 2869855-29_test_only.patch, failed testing. View results

mcdruid’s picture

Confirming that was the expected result; test-only patch failing and the other passing.

mcdruid’s picture

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

wturrell’s picture

Rerolled for 8.7.x

The test could need refactoring - as the reason the patch wouldn't apply is the changes in #2870448: Convert web tests to browser tests for file module - I've not studied it in detail, all I've done is move it to the correct file. No changes to the patch itself.

The last submitted patch, 34: 2869855-34.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 34: 2869855-34_test_only.patch, failed testing. View results

wturrell’s picture

wturrell’s picture

Status: Needs work » Needs review

The last submitted patch, 37: 2869855-35.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 37: 2869855-35_test_only.patch, failed testing. View results

wturrell’s picture

Can someone else verify the extensive refactoring in #2894193: Multiple file upload does not validate extensions correctly has fixed this issue in core? (I asked there about it as well.)
Using the steps in the issue summary, I can no longer reproduce the duplicate DB entries.

mcdruid’s picture

Thanks for looking at this!

Unfortunately I don't think the changes have fixed the duplicate db entries issue.

I ran the test from this issue (having pulled all the latest changes), but commented out the assertion that checks for the extra validation the patch in this issue introduces:

    $image2 = $image1;
    $edit = ['files[file_test_upload]' => \Drupal::service('file_system')->realpath($image2->uri)];
    $this->drupalPostForm('file-test/upload', $edit, t('Submit'));
    $this->assertResponse(200, 'Received a 200 response for posted test file.');
    //$message = t('The file %file already exists. Enter a unique file URI.', ['%file' => $file1->getFileUri()]);
    //$this->assertRaw($message, 'Cannot upload a duplicate file.');
    $max_fid_before_duplicate = $max_fid_after;
    $max_fid_after = (int) \Drupal::entityQueryAggregate('file')->aggregate('fid', 'max')->execute()[0]['fid_max'];
    $this->assertEqual($max_fid_before_duplicate, $max_fid_after, 'A new managed file was not created.');

(I also updated the way the test checks the max fid based on the test from the other issue).

That last assertion failed, and checking in the test db:

mysql> SELECT fid, filename, uri, filemime, filesize, status, created FROM test10688815file_managed;
+-----+----------------+------------------------------+-----------+----------+--------+------------+
| fid | filename       | uri                          | filemime  | filesize | status | created    |
+-----+----------------+------------------------------+-----------+----------+--------+------------+
|   1 | image-test.png | temporary://image-test.png   | image/png |      125 |      0 | 1544804354 |
|   2 | image-test.png | temporary://image-test_0.png | image/png |      125 |      0 | 1544804354 |
|   3 | image-test.png | temporary://image-test_0.png | image/png |      125 |      0 | 1544804355 |
+-----+----------------+------------------------------+-----------+----------+--------+------------+

So it looks like the problem is still there AFAICS; the test ended up with two managed files with the same value for uri.

wturrell’s picture

@mcdruid ok, firstly, having got tests running locally, I can confirm what you found, i.e. the new test is still failing on 8.7.x.

I'm confused (largely because I'm guessing) about how to reroll the patch.

e.g. I tried adding your $file->setValidationRequired(TRUE); in the same place (line 965) and the $violations foreach loop (the one that appends to the $errors array) after the call to file_validate(), but if I do that, every test screenshot shows a failed file upload message with the error: 'This value should be of the correct primitive type.'

mcdruid’s picture

Status: Needs work » Needs review
FileSize
2.11 KB
3.74 KB

Thanks again for wrestling with these; the core module had changed quite significantly in the meantime!

I think these patches bring the changes up-to-date.

I've made a change in the test-only patch by commenting out the check for the message which would only ever appear with the extra validation in place. In other words, the test-only patch focuses on whether duplicate entries were created in the file_managed table.

I'm getting the expected results locally, let's see if d.o agrees...

The last submitted patch, 44: 2869855-44_test_only.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

mcdruid’s picture

...and that was the expected outcome.

Leaving this as "Needs review" as that's what needs to happen for the actual changes in the patch now.

wturrell’s picture

What's the shortest sleep interval we can reasonably use? i.e. with sleep(600) as now, all core test jobs will take 10 minutes longer?

andypost’s picture

Sleep 300 should be enough

mcdruid’s picture

Ooohhhhhh! I'm glad you pointed that out. I hadn't noticed the sleep() being introduced (it wasn't there in the patches from #29).

The sleep in the steps to reproduce is not actually necessary in the test; that's just to make one thread pause in order to allow the race condition with another thread to be reproduced manually.

In the test, we do this instead:

    // Simulate a race condition where two files are uploaded at almost the same
    // time, by removing the first uploaded file from disk (leaving the entry in
    // the file_managed table) before trying to upload another file with the
    // same name.
    unlink(\Drupal::service('file_system')->realpath($file1->getFileUri()));

Here are the same patches again without the sleep (unsurprisingly the tests run a lot quicker!)

The last submitted patch, 49: 2869855-49_test_only.patch, failed testing. View results

mcdruid’s picture

...and confirming again that we expected the test-only patch to fail.

...and that "Needs review" is appropriate.

wturrell’s picture

Issue summary: View changes

Just to point out, as mentioned previously, I've been unable to reproduce this manually for a while - I've tested today on a few different versions of core (without the patch) and it seems that's been the case since 8.6.0 (8.5.x is still affected, including 8.5.8).

I'd like to be more precise about what changed, however I gave up trying to run git bisect across releases because of all the complications with composer and the database.

I've added a note to the issue summary, updated the old test code there with the more recent version (and also created a module others can use, which contains the hook that adds a 30 second delay). With new versions of core, if you test manually, you won't see any new entry in the file_managed table until the sleep() command has finished executing.

mcdruid’s picture

Thanks wturrell, looks like you're right... the original steps to manually reproduce this no longer seem to produce quite the same outcome.

I'm not sure exactly what's changed in the interim either.

I think the test (e.g. in the patches from #49) still proves that there's a potential race condition because the uniqueness constraint on file_managed.uri is not being enforced... which - I believe - can still lead to the issue described whereby files are unintentionally / unexpectedly deleted.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

rjg’s picture

I am able to recreate this issue in Drupal 8.7.5. I've had a handful of files go missing in production and have been trying to track down the cause, and I'm now able to replicate it consistently.

  1. I have a content type with 2 file fields: one supports PDFs, one for images
  2. I added a hook_file_validate() with a 5 second sleep
  3. Whether adding or updating a node, make sure the image field has a value
  4. Upload a PDF file that is large enough to take a few seconds to upload
  5. Click "Remove" on the image file field
  6. Save the node

When the AJAX request for the second file field (remove image) is initiated before the browser receives the response from the first file field (upload PDF), the form has not yet been updated with the AJAX response, so the second request includes the "new" PDF file and core acts on it. It will create a second file entity for the uploaded file, though this one will be marked as "temporary".

The race condition comes in where _file_save_upload_single() checks for duplicate file names before running hook_file_validate() and before actually saving the file. So, if the second AJAX request makes it to _file_save_upload_single() before the first AJAX request is finished, both requests will create separate file entities with the exact same URI.
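The check-then-act pattern at the heart of this can be sketched in plain PHP. This is a hypothetical pick_candidate() standing in for the file_destination()/file_create_filename() logic, not core's actual code; the point is that nothing reserves the chosen name, which is the race window:

```php
// Hypothetical sketch of candidate-name selection based only on file_exists().
// Two parallel requests can both observe "image.png" as free and both choose
// the same candidate before either writes anything to disk.
function pick_candidate(string $dir, string $name): string {
  $candidate = "$dir/$name";
  $counter = 0;
  while (file_exists($candidate)) {
    // Mirror core's "_N" suffixing when the name is taken.
    $info = pathinfo($name);
    $candidate = "$dir/{$info['filename']}_{$counter}.{$info['extension']}";
    $counter++;
  }
  // The name is chosen but not reserved: the race window opens here and stays
  // open until something is actually written to disk.
  return $candidate;
}
```

Anything that lengthens the gap between the file_exists() check and the eventual write (such as a slow hook_file_validate() implementation) widens that window.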

rjg’s picture

Adding some more information to my previous comment (#55):

  • Without adding a hook_file_validate() to force a delay, the first file AJAX request almost always finishes before getting a chance to click the second file field.
  • The PDF uploads average 1-3 MB in size
  • It is easy to click on the second file field before the first AJAX upload is done A) when on a slow connection, and/or B) when uploading a file from a network drive or a remote server - probably due to the delay of the browser downloading the file to the browser's local temporary folder before it can load the file data and include that in the POST request
  • In cases where the race condition does not come into play, two file entities are always created; on node save, one stays temporary with 0 usages and one goes permanent, but without the race condition the file entities have different URIs, e.g. my.pdf and my_0.pdf

So I feel the race condition in _file_save_upload_single must be leading to both file entities getting the same URI.

I don't know if the concurrent file field AJAX requests problem should be a separate issue from this one. Anyone have thoughts on this? My first reaction is that, ideally, multiple form AJAX requests would be forced to be synchronous, whether through some queuing mechanism or by disabling other AJAX form fields until the first AJAX response is received. That's definitely a larger scope than this issue. EDIT: related to https://www.drupal.org/project/drupal/issues/2884052 ?

I think this race condition originally occurred with a single file upload plus a node save POST, and it may have been fixed coincidentally by other core updates, because I can't recreate that scenario in core 8.7. If I upload 1 file and click save before the AJAX request is done, I get an error that my file field is required but empty.

rjg’s picture

Drilling down more into _file_save_upload_single():

  1. Takes in params for file info, form field name, destination, and flag to replace or rename
  2. Check that file upload had no errors and file exists on file system (assuming in temporary folder)
  3. Creates file entity with appropriate values (should be flagged as temporary at this point)
  4. Runs some validations (file extension, destination path, etc.)
  5. Call getDestinationFilename() which will append “_#” if the file already exists on the file system (core should maybe have this also do a file entity query on file uri)
  6. Validate filename length
  7. Call all `hook_file_validate()` implementations
  8. Call moveUploadedFile() (moves file from “temporary://” to “private://”)
  9. Updates the file entity with any fields changes during validation (extension, filename if “_#” was appended via getDestinationFilename(), etc.)
  10. Save file entity
  11. Return file object

When two file entity AJAX requests execute at the same time, the second request (for example, one to remove an image field) still carries the form data for the 'new' file upload, so that file gets uploaded a second time (a scoping issue? The AJAX request passes the field name in to the file widget processor, but it executes actions on a separate field as well). If getDestinationFilename() gets called for the second AJAX request before moveUploadedFile() is called for the first, the second file entity that was created earlier in the function will keep the same file URI.

Two file AJAX requests can execute at the same time if the first is a file upload of either a large file, a slow client connection, or uploading a remote file, and the second file field is modified before the first request finishes.

It makes sense to call getDestinationFilename() early on, before the file name length and hook_file_validate() logic. Maybe it should immediately touch the destination to alleviate filename conflicts, as long as it deletes the blank file if any errors are thrown.
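The "immediately touch the destination" idea above can be made atomic in plain PHP: fopen() with the 'x' mode fails if the file already exists, so only one process can claim a given candidate name. A hypothetical reserve_candidate() (this is an illustration of the suggestion, not what core does):

```php
// Hypothetical sketch: atomically reserve a candidate path by creating it with
// fopen's 'x' (exclusive create) mode, which fails if the file already exists.
// The loser of the race gets FALSE and must pick another name.
function reserve_candidate(string $candidate): bool {
  $handle = @fopen($candidate, 'x');
  if ($handle === false) {
    // Another process created the file first; this name is taken.
    return false;
  }
  fclose($handle);
  return true;
}
```

As noted above, the caller would still need to clean up the placeholder file if validation later fails.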

wturrell’s picture

@rjg: does applying the patch make any difference to your situation?

rjg’s picture

FileSize
9.73 KB

@wturrell great news, the patch in #49 fixed the duplicate file issue for me:

form validation error

I tested on core 8.7.7

There's room for improvement in the content management UX in the entity form, but I believe this addresses the critical issue (data loss).

Testing steps & results:

  1. Implement hook_file_validate() with sleep(5); (or 10) in an active module
  2. Have a content type with 2 file fields. I used a file field and an image field
  3. Create a node and upload files for each field
  4. Edit the node
  5. Remove the file field
  6. Upload a new file
  7. Immediately click the remove button to remove the image file.
  8. Wait the 5 or 10 seconds for the AJAX requests to complete
  9. Error message appears below the file field:
    The specified file PDF Sample6.pdf could not be uploaded.
    The file private://document/file/PDF Sample 6.pdf already exists. Enter a unique file URI.
  10. The error comes back with the image field AJAX response.
  11. The image field is not removed. Not quite data loss, but there's also no message announcing that the image file was not removed
  12. The newly uploaded file persists in the file field. It is the only file entity created, and is marked as Temporary
  13. Hitting Save will save the node with the new file field, and without removing the image field. New file is flagged as Permanent

What I think is bad UX is the image removal silently failing. If the image field is not required, it would be easy for a content manager to assume the file was removed and overlook that it's still displayed on the node edit form.

Since the image removal did not work, I tested again to see what happens with a new node and uploading 2 files at once. The same error occurred and duplicate files were prevented, and there was no data loss or unintended results with the image field. Testing steps & results:

  1. With the same content type from above, create a new node
  2. Upload a new file
  3. Immediately upload a new image file. Note that the "Choose file" fades out/is disabled for a fraction of a second
  4. Wait the 5 or 10 seconds for the AJAX requests to complete
  5. Error message appears below the file field
  6. Both file and image file uploads persisted in the form
  7. Saving the node (after filling in image alt text) correctly saves the 1 file and the 1 image file

dpagini’s picture

Status: Needs review » Reviewed & tested by the community

Based on the previous comment, I thought it was appropriate to push this to RTBC status and potentially get the next round of reviews.

I'm having a little bit of a hard time following the problem here, but what's confusing to me is that the latest "test only" patch has code updates to the file.module file. I'm wondering if that was an oversight, because I see the other test-only patches only had a test method...?

Krzysztof Domański’s picture

1. The assertResponse() and assertRaw() calls pass 2 parameters, but the method signatures only take 1.

-    $this->assertRaw($message, 'Cannot upload a duplicate file.');
+    $this->assertRaw($message);

protected function assertRaw($raw) {
  $this->assertSession()->responseContains($raw);
}

2. The code can be shorter.

-  if (count($violations) > 0) {
-    foreach ($violations as $violation) {
-      $errors[] = $violation->getMessage();
-    }
+  foreach ($violations as $violation) {
+    $errors[] = $violation->getMessage();

3. Fixed the test-only patch.

what's confusing to me is that the latest "test only" patch has code updates to the file.module file.

@dpagini Good point!

The last submitted patch, 61: 2869855_test_only.patch, failed testing. View results

Krzysztof Domański’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
885 bytes
2 KB

Uploaded a new test-only patch. Commented out the assertion for a message that does not exist yet.

     $message = t('The file %file already exists. Enter a unique file URI.', ['%file' => $file1->getFileUri()]);
-    $this->assertRaw($message);
+    // $this->assertRaw($message);

Status: Needs review » Needs work

The last submitted patch, 63: 2869855-63-test-only.patch, failed testing. View results

Krzysztof Domański’s picture

Cleaning patches and comments.

1. It has been reviewed in #60, so moving back to RTBC. Small changes - see interdiff-49-65.

2. I updated the test-only patch, because the one in #49 included code changes to file.module.

what's confusing to me is that the latest "test only" patch has code updates to the file.module file.

The test-only patch has the commented line below. That is because this message does not exist yet (it will be added in a new patch).

     $message = t('The file %file already exists. Enter a unique file URI.', ['%file' => $file1->getFileUri()]);
-    $this->assertRaw($message);
+    // $this->assertRaw($message);

The last submitted patch, 65: 2869855-65-test-only.patch, failed testing. View results

klonos’s picture

Since this has been confirmed to exist in D7 too, should this be tagged for backport?

catch’s picture

@klonos for 7.x backport it's best to create a separate issue against 7.x, linking back to this one.

Wim Leers’s picture

  1. +++ b/core/modules/file/file.module
    @@ -1124,6 +1127,28 @@ function _file_save_upload_single(\SplFileInfo $file_info, $form_field_name, $va
    +  $violations = $file->validate();
    +  foreach ($violations as $violation) {
    +    $errors[] = $violation->getMessage();
    +  }
    
    +++ b/core/modules/file/tests/src/Functional/SaveUploadTest.php
    @@ -135,6 +135,36 @@ public function testNormal() {
    +  public function testDuplicate() {
    

    This seems to be bringing the test coverage that we have for REST/JSON:API file uploads (\Drupal\Tests\rest\Functional\FileUploadResourceTestBase::testPostFileUploadDuplicateFile() and \Drupal\Tests\jsonapi\Functional\FileUploadTest::testPostFileUploadDuplicateFile() respectively) over to form-uploaded files.

    Does this race condition then also exist there?

    Both REST and JSON:API already do perform entity validation. I think that means they're safe from this?

  2. +++ b/core/modules/file/tests/src/Functional/SaveUploadTest.php
    @@ -135,6 +135,36 @@ public function testNormal() {
    +    // Simulate a race condition where two files are uploaded at almost the same
    +    // time, by removing the first uploaded file from disk (leaving the entry in
    +    // the file_managed table) before trying to upload another file with the
    +    // same name.
    +    unlink(\Drupal::service('file_system')->realpath($file1->getFileUri()));
    

    Should we expand the REST/JSON:API test coverage to also simulate this race condition?

mcdruid’s picture

@klonos see #17 - D7 is protected by the unique constraint on the file uri in the schema, so although the code may still try, the db won't allow two rows in the file_managed table with the same value for uri. Perhaps there's room for improvement there, but I'm not sure a backport for D7 would be prioritised given that the schema prevents the race condition from causing the data loss issue outlined here.
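For reference, the D7 protection comes from the schema declaring a unique key on uri, so the database itself rejects a duplicate row. An abridged excerpt along these lines (field details trimmed; see the actual D7 file_managed schema definition for the full version):

```php
// Drupal 7 file_managed schema (abridged sketch): the 'unique keys' entry on
// uri is what makes the database reject two rows with the same uri.
$schema = [];
$schema['file_managed'] = [
  'fields' => [
    'uri' => [
      'type' => 'varchar',
      'length' => 255,
      'not null' => TRUE,
      'default' => '',
    ],
    // ... other fields trimmed ...
  ],
  'unique keys' => [
    'uri' => ['uri'],
  ],
];
```

Under utf8mb4 (per #1314214) this key no longer fits within lowest-common-denominator MySQL index length limits, which is why D8 dropped it in favour of the application-level FileUriUnique constraint.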

@Wim Leers thanks for the review; I'll try to update the IS and have a look at REST/JSON:API unless anyone beats me to it.

mcdruid’s picture

Issue summary: View changes
mcdruid’s picture

Issue summary: View changes
mcdruid’s picture

Hopefully the IS is in better shape now.

mcdruid’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
3.57 KB
6.08 KB
7.65 KB

Confirmed that REST and JSON:API already do entity validation on file uploads and therefore enforce the unique constraint.

Added test coverage for both:

  • \Drupal\Tests\jsonapi\Functional\FileUploadTest::testPostFileUploadDuplicateFileRaceCondition()
  • \Drupal\Tests\rest\Functional\FileUploadResourceTestBase::testPostFileUploadDuplicateFileRaceCondition()

...which should hopefully verify this - I've added these to the test_only patch, but these should pass where we expect \Drupal\Tests\file\Functional\SaveUploadTest::testDuplicate() to fail.

I'd hope to put this back to RTBC if the only failure is that one in the test_only patch.

The last submitted patch, 78: 2869855-78_test_only.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Status: Needs review » Needs work

The last submitted patch, 78: 2869855-78.patch, failed testing. View results

mcdruid’s picture

Oops, I see what I did there. In the REST and JSON:API tests, we're working directly with a $response object so calling $this->assertResourceErrorResponse(422, ... doesn't work.

Changed that to simply checking for the 422 status code in order to hopefully see these passing. We could add more assertions to check for e.g. the error text "Unprocessable Entity: file validation failed." (and the rest).

I'm not sure what the best way of doing that would be in these tests, other than something like:

$this->assertContains($expected_error_text, $response->getContent());

Suggestions for how to do that better welcome!

The last submitted patch, 81: 2869855-81_test_only.patch, failed testing. View results

dpagini’s picture

Status: Needs review » Reviewed & tested by the community

This looks good to me, and other than the added test changes, there really hasn't been much change from the previous patches, so I think it's appropriate to move this back to RTBC.

dpagini’s picture

Is it worth mentioning/consideration on any front-end regression from #59?

mcdruid’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
1.45 KB
5.95 KB
7.64 KB

Ok, good that we got the expected result that time.

However, I misunderstood the errors in #78 and I think we can use $this->assertResourceErrorResponse(422, ... instead of just checking for the status codes.

It caught me out that the function signature's slightly different for assertResourceErrorResponse() in rest and jsonapi.

With any luck these should work, and they test the full validation error messages when the file URI already exists.

Interdiff is with #78 as I'm trying to fix those failures. We know #81 worked when only checking status code.

[edit: cross-posted with 83 and 84... yes, thanks - I agree, and will put this back to RTBC if/when the tests pass as expected]

The last submitted patch, 85: 2869855-85_test_only.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 85: 2869855-85.patch, failed testing. View results

mcdruid’s picture

Nearly there, hopefully.

The error message in the rest module actually seems to come from \Drupal\rest\Plugin\rest\resource\EntityResourceValidationTrait which means it's very slightly different to what I was expecting.

It also seems to include a trailing newline \n which we can see being added in the foreach() in that trait's validate() method, but that's one for a followup if we want to address it.

Let's try again...

mcdruid’s picture

Follow-up for the trailing newline is #3082130: \Drupal\rest\Plugin\rest\resource\EntityResourceValidationTrait produces error messages with a trailing newline.

If that were to be committed first, we'd have to remove the newline from the $message in the REST module test here.

Status: Needs review » Needs work

The last submitted patch, 88: 2869855-88.patch, failed testing. View results

mcdruid’s picture

Oh dear, just some escaping I shouldn't have left in the expected error message this time I think...

The last submitted patch, 91: 2869855-91_test_only.patch, failed testing. View results

mcdruid’s picture

Status: Needs review » Reviewed & tested by the community

Got there in the end!

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Crediting @manarth for filing the issue, @dpagini, @Wim Leers for code review and @cilefen, @catch and @xjm for issue triage.

Committed e103770 and pushed to 8.8.x. Thanks!

I've committed only to 8.8.x for now just in case triggering the validation throws up some unexpected issues. That said this should be safe for backport because we should not be saving invalid entities in the database. Also 8.8.x is just around the corner.

One possible follow-up is that the message

+++ b/core/modules/file/tests/src/Functional/SaveUploadTest.php
@@ -134,6 +134,35 @@ public function testNormal() {
+    $message = t('The file %file already exists. Enter a unique file URI.', ['%file' => $file1->getFileUri()]);
+    $this->assertRaw($message);

is not the most user-friendly. I didn't push back on this because:

  • It's from a race condition that has become very hard to reproduce in supported versions of Drupal 8
  • It's the current message used by JSON:API / REST, where it makes more sense.

  • alexpott committed e103770 on 8.8.x
    Issue #2869855 by mcdruid, Krzysztof Domański, wturrell, rjg, manarth,...
Wim Leers’s picture

Issue tags: +API-First Initiative

#78+++++++++++

Thank you so much for adding REST + JSON:API test coverage for this too! 👏

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Gábor Hojtsy’s picture

Issue tags: +8.8.0 release notes
pameeela’s picture

Status: Closed (fixed) » Needs work
Issue tags: +Needs release note

If there's a possible disruption as a result of this can we please add a release note snippet to the IS?

If no disruption, we can just untag it as 8.8.0 release notes and include it as a critical bug fix.

catch’s picture

Status: Needs work » Fixed
Issue tags: -8.8.0 release notes, -Needs release note

There shouldn't be any disruption here, it's 100% bug fix and something the vast majority of sites won't run into.

Wim Leers’s picture

Issue tags: +8.8.0 release notes

Restoring the tag that @Gábor Hojtsy added and I think @catch accidentally removed :)

pameeela’s picture

Issue tags: -8.8.0 release notes

Thanks @Wim Leers - we’re using the release notes tag for issues that should be called out as disruptive, so it should be removed - but this will still be included in the list of fixes.

Wim Leers’s picture

Ah! My bad :) Thanks for clarifying that, @pameeela! 👍

alexpott’s picture

Turns out there is disruption here :) see #3089697: Regression due to File::setValidationRequired in _file_save_upload_single() however I think we should fix this before 8.8.0 is released because I'm not sure there's any need for the disruption.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.