Problem/Motivation

Images referenced in text fields using <img src="…" data-entity-type="file" data-entity-uuid="…" /> have their usage tracked by the editor_file_reference filter plugin, and the accompanying hook_entity_insert() + hook_entity_update() + hook_entity_delete() + hook_entity_revision_delete() hooks.

Unfortunately, hook_entity_update() does not take translations into account.

Therefore, images referenced in a text field are lost when a new image is being added while creating a new translation without creating a new revision (because file usage incorrectly reaches zero, causing the permanent file to be marked temporary, and then eventually being deleted — after 6 hours by default).

There's likely also other bugs, such as when changing the default language — see #2810355: $entity->isDefaultTranslation() behaves incorrectly when changing default translation, causing file/image field usage to be set to zero, causing files to be deleted for that. Or when deleting a translation.

Steps to reproduce

  1. install the language module
  2. create 2 languages: X & Y
  3. install the content_translation
  4. mark the article node type's body field as translatable
  5. create an article in language X, using the Basic HTML format for the body field, with an image inserted into the body field via the text editor's image dialog
  6. verify that the uploaded file has its status column in the file_managed table set to 1,
    and it has 1 usage in the file_usage table
  7. create a new translation in language Y for that node, uncheck "Create new revision", then save (without changing any fields)
    • expected result: the uploaded file has 2 uses in the file_usage table
    • actual result: the uploaded file has 1 use in the file_usage table
  8. go to either translation and remove the <img> from the body
    • expected result: the uploaded file still has its status column in the file_managed table set to 1, and it has 1 use in the file_usage table
    • actual result: the uploaded file now has its status column in the file_managed table set to 0, and it has 0 uses in the file_usage table
  9. run cron 6 hours later (this is the default value of the temporary_maximum_age setting in system.file)
    • expected result: the uploaded file is still present (because usage > 0)
    • actual result: the uploaded file is absent (because usage = 0)

Proposed resolution

  1. Extend test coverage to ensure all potential edge cases wrt translations & revisions are tested.
  2. Fix logic to make tests pass.

To prevent this class of bugs in the future, Entity Field API should provide an API to iterate over the values of a field across all translations & revisions.

Remaining tasks

  1. Have >=1 Entity Field API maintainer either provide code or review until the code is doing what they deem is the correct way to iterate over the values for a certain field across all revisions and all translations.
  2. Decide whether this logic should live in a helper function of editor.module or whether it belongs in Entity API, and whether we should block this issue on that.
  3. Expand test coverage to cover all edge cases for translations & revisions. We likely need input from an Entity Field API maintainer to indicate what all those edge cases are.
  4. #2810355: $entity->isDefaultTranslation() behaves incorrectly when changing default translation, causing file/image field usage to be set to zero, causing files to be deleted must land before the expanded test coverage can pass.
  5. (soft requirement) Fix the other bug surfaced here by @Leksat (in #17.1 + #18.1 + #22 + #26) — we have #2892368: editor.module's hook_entity_update() does not count file usages correctly when adding/removing images without creating new revisions for that. Ideally it'd already land since it'd simplify this patch, and that in itself does not cause data loss, so there is no need for an update path, which makes that one simpler.

User interface changes

None.

API changes

TBD

Data model changes

None.

CommentFileSizeAuthor
#49 interdiff.txt656 bytesdawehner
#49 2708411-49.patch21.16 KBdawehner
#44 interdiff.txt10.76 KBdawehner
#44 2708411-44.patch21.11 KBdawehner
#38 2708411-29v8.2-2.patch20.33 KBquiron
#37 2708411-29v8.2.patch20.05 KBquiron
#29 2708411-interdiff-20-29.txt11.56 KBLeksat
#29 2708411-29.patch19.87 KBLeksat
#20 2708411-20.patch12.81 KBLeksat
#20 2708411-interdiff-14-20.txt5.8 KBLeksat
#14 2708411-14.patch13.14 KBLeksat
#14 2708411-interdiff-13-14.txt2.9 KBLeksat
#13 2708411-13.patch10.25 KBLeksat
#13 2708411-interdiff-11-13.txt6.7 KBLeksat
#11 2708411-11.patch9.66 KBLeksat
#11 2708411-11-test_only.patch5.97 KBLeksat
#9 file_usage_for_translations-test-2708411-9.patch2.53 KBLuxian
#4 file_usage_not_incremented-2708411-4.patch790 bytesLuxian
drupal-8--file-usage-new-translations.patch956 bytesLuxian
Members fund testing for the Drupal project. Drupal Association Learn more

Comments

Luxian created an issue. See original summary.

Luxian’s picture

PS: patch was created using 8.0.x branch, but apparently nothing change in 8.1.x. I hope it will apply :)

Status: Needs review » Needs work

The last submitted patch, drupal-8--file-usage-new-translations.patch, failed testing.

Wim Leers’s picture

Priority: Normal » Major
Status: Needs review » Needs work
Issue tags: +D8MI, +Needs tests

Great catch! :) Patch looks great, this now just needs test coverage. \Drupal\Tests\editor\Kernel\EditorFileUsageTest already has test coverage, this patch just needs to expand it.

Luxian’s picture

I just realized that an update hook will also be welcomed for content that was already created :(

Wim Leers’s picture

#6 Hm… you are probably right. That should not be too difficult fortunately. But first, let's focus on test coverage.

Luxian’s picture

Status: Needs work » Needs review

Ok, this patch contains only the test, let's see how it fails. Complete patch will follow.

Luxian’s picture

Status: Needs review » Needs work

The last submitted patch, 9: file_usage_for_translations-test-2708411-9.patch, failed testing.

Leksat’s picture

Based on Luxian's work.
Overall:
- added editor_entity_translation_insert() and hook_entity_translation_delete()
- editor_entity_update() updated to iterate over all changed translations

The last submitted patch, 11: 2708411-11-test_only.patch, failed testing.

Leksat’s picture

FileSize
6.7 KB
10.25 KB

I think I found a better way to do the same. I reverted the previous changes to the editor.module and updated _editor_get_file_uuids_by_field() to gather field text from all field translations.

Leksat’s picture

FileSize
2.9 KB
13.14 KB

Added an update hook to recalculate file usages.

The last submitted patch, 13: 2708411-13.patch, failed testing.

Leksat’s picture

A test failure in #13 seems to be unrelated.

The patch looks good from my side. Ready for reviews!

Wim Leers’s picture

  1. +++ b/core/lib/Drupal/Component/Utility/DiffArray.php
    @@ -47,4 +47,35 @@ public static function diffAssocRecursive(array $array1, array $array2) {
    +   * ¶
    

    I'm not sure this is actually better. Can you explain why this is better?

  2. +++ b/core/modules/editor/editor.install
    @@ -20,3 +20,26 @@ function editor_update_8001() {
    +  $file_usage->clear('editor');
    
    +++ b/core/modules/file/src/FileUsage/DatabaseFileUsageBackend.php
    @@ -108,4 +108,13 @@ public function listUsage(FileInterface $file) {
    +  public function clear($module) {
    
    +++ b/core/modules/file/src/FileUsage/FileUsageBase.php
    @@ -33,4 +33,11 @@ public function delete(FileInterface $file, $module, $type = NULL, $id = NULL, $
    +  public function clear($module) {
    
    +++ b/core/modules/file/src/FileUsage/FileUsageInterface.php
    @@ -67,4 +67,15 @@ public function delete(FileInterface $file, $module, $type = NULL, $id = NULL, $
    +  public function clear($module);
    

    I don't think this is a good idea. It's an API change. And I don't think this is even necessary.

    It's possible to use ->delete() instead.

  3. +++ b/core/modules/editor/editor.install
    @@ -20,3 +20,26 @@ function editor_update_8001() {
    +      // Load 10 entities at once to avoid memory usage issues.
    +      $result = \Drupal::entityQuery($entity_type_id)->execute();
    +      while ($ids = array_splice($result, 0, 10)) {
    

    This won't work. What if you have 10K, 100K, 1M entities?

    This must use the batch API. Although I'm not sure how to do that in updates. I think there are already some update hooks that do this though.

Leksat’s picture

Status: Needs review » Needs work

@Wim Leers, thanks for the review!

1. I noticed that during entity insertion, the editor module counts all file usages. I mean, if a single image is used twice in a field, the usage will be 2.
However, during entity update, array_diff() is used. It removed duplicate elements from the resulting array so that file usages may be updated incorrectly.

For example:
- we create an entity referencing two identical images [1, 1] => 2 usages of the file
- we update entity removing 1 image => array_diff([1, 1], [1]) => [] => file usage stays the same
- we update entity removing the last image => image usage is decreased by 1 => 1 "lost" file usage remains

I was able to reproduce this on simplytest.me

Probably this also needs a test coverage.

2. Yeah, this is not nice. Will try to find a better way.

3. Even if we have 1M entities, memory usage should be fine. But, yeah, I forgot about the time limit. Will check for batch examples.

Wim Leers’s picture

  1. I mean, if a single image is used twice in a field, the usage will be 2.

    We don't have explicit test coverage for that, but I don't think that's wrong. If you have two file fields on an entity, and they both reference the same file, you'd also count it twice.

  2. Cool :)
  3. Awesome!

Great work here, thank you for pushing it forward!

Leksat’s picture

Status: Needs work » Needs review
FileSize
5.8 KB
12.81 KB

Reverted API changes. The update function now uses batch.

Wim Leers’s picture

Status: Needs review » Needs work

#20 addressed #17.2 and #17.3, but not #17.1. See #19.1 for why I think the entire diffOnce() thing is wrong/unwanted.

AWESOME work on the Batch-based update hook! Did you test it? Does it work?

Leksat’s picture

Status: Needs work » Needs review

@Wim Leers,

For #17.1:
I think there is a misunderstanding. The diffOnce() thingy does not change the current editor behavior (if one image is used 2 times in a node, it will be counted twice). Instead, it fixes an additional bug described in #18.1. Can you please check that comment again?

I tested batch-update manually:
- generated 150 nodes and 150 files with devel
- updated some nodes to have some images (including one-image-several-times case)
- applied patch, ran update
- step-debugged important places of the update function
-- ensured that usages were first cleared and then recalculated
-- ensured that batch works
- ensured that new usages are correct

Thanks for the review!

Wim Leers’s picture

Issue tags: -Needs upgrade path

I tested batch-update manually:
- generated 150 nodes and 150 files with devel
- updated some nodes to have some images (including one-image-several-times case)
- applied patch, ran update
- step-debugged important places of the update function
-- ensured that usages were first cleared and then recalculated
-- ensured that batch works
- ensured that new usages are correct

This is great! It means I can remove the Needs upgrade path tag. But I'm afraid I can't remove the Needs upgrade path tests tag: your manual testing needs to be proven by an automated test too.

Leksat’s picture

Status: Needs review » Needs work

Uhh... Okay. Never did this before, but I hope there are some examples in the core.

swentel’s picture

Also wondering out loud, do we need a test also that file usage is /not/ incremented when the body itself is not translatable (so synced accross translations) ? Granted, that scenario is probably not something that will happen a lot in the real world, but it's a use case though.

Wim Leers’s picture

I think there is a misunderstanding. The diffOnce() thingy does not change the current editor behavior (if one image is used 2 times in a node, it will be counted twice). Instead, it fixes an additional bug described in #18.1. Can you please check that comment again?

Oh! Sorry, I indeed misread. Thanks!

I'm afraid you're 100% correct :) We totally failed to add test coverage for this case in #1932652: Add image uploading to WYSIWYGs through editor.module.

That means this patch is close, and on the right path, but just needs a bit more refinement:

  1. +++ b/core/lib/Drupal/Component/Utility/DiffArray.php
    @@ -47,4 +47,35 @@ public static function diffAssocRecursive(array $array1, array $array2) {
    +  public static function diffOnce(array $array1, array $array2, $strict = FALSE) {
    

    This needs unit test coverage in \Drupal\Tests\Core\Common\DiffArrayTest.

  2. +++ b/core/modules/editor/editor.install
    @@ -20,3 +20,68 @@ function editor_update_8001() {
    +  ¶
    

    Trailing space.

  3. +++ b/core/modules/editor/editor.install
    @@ -20,3 +20,68 @@ function editor_update_8001() {
    +    foreach ($entities as $entity) {
    +      editor_entity_insert($entity);
    +    }
    

    This is NOT iterating over all revisions! That seems wrong.

  4. +++ b/core/modules/editor/editor.module
    @@ -456,21 +459,38 @@ function _editor_delete_file_usage(array $uuids, EntityInterface $entity, $count
    - * @param EntityInterface $entity
    + * @param FieldableEntityInterface $entity
    

    Needs FQCN. (Yes, this was already wrong before.)

Wim Leers’s picture

I was going to remove the Needs tests tag, because we do have "regular" tests (just not yet upgrade path tests), but @swentel just suggested another edge case that doesn't have test coverage yet.

Wim Leers’s picture

Leksat’s picture

Status: Needs work » Needs review
FileSize
19.87 KB
11.56 KB

In this patch:
- added unit test for DiffArray::diffOnce() (that was a really good idea! I caught a fatal error :))
- update now processes revisions as well
- added test for the update

Important: the update test uses Entity API prior to $this->runUpdates() call. That works, but I'm not sure if that's how it should be. If API usage is not allowed before updates are executed, I guess the only way would be to provide a new database dump for this update.

Gábor Hojtsy’s picture

Issue tags: +Media Initiative
Gábor Hojtsy’s picture

Issue tags: -Media Initiative +D8Media

Using the proper media tag now.

Version: 8.1.x-dev » 8.2.x-dev

Drupal 8.1.9 was released on September 7 and is the final bugfix release for the Drupal 8.1.x series. Drupal 8.1.x will not receive any further development aside from security fixes. Drupal 8.2.0-rc1 is now available and sites should prepare to upgrade to 8.2.0.

Bug reports should be targeted against the 8.2.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

dawehner’s picture

Priority: Major » Critical

I'm we are loosing files, I think this is by no way not a critical issue.

Status: Needs review » Needs work

The last submitted patch, 29: 2708411-29.patch, failed testing.

dawehner’s picture

+++ b/core/modules/editor/editor.module
@@ -456,21 +459,38 @@ function _editor_delete_file_usage(array $uuids, EntityInterface $entity, $count
 
+  $field_definitions = $entity->getFieldDefinitions();
   $formatted_text_fields = _editor_get_formatted_text_fields($entity);
   foreach ($formatted_text_fields as $formatted_text_field) {
+
+    // In case of a translatable field, iterate over all its translations.
+    if ($field_definitions[$formatted_text_field]->isTranslatable() && $entity instanceof TranslatableInterface) {
+      $langcodes = array_keys($entity->getTranslationLanguages());
+    }
+    else {
+      $langcodes = [LanguageInterface::LANGCODE_NOT_APPLICABLE];
...
-      $text .= $field_item->value;
+    foreach ($langcodes as $langcode) {
+      if ($langcode == LanguageInterface::LANGCODE_NOT_APPLICABLE) {
+        $field_items = $entity->get($formatted_text_field);

I'm seriously wondering whether we should have form of iterating over translations and revisions or so. Having custom behaviour here feels really weird.

Wim Leers’s picture

I'm seriously wondering whether we should have form of iterating over translations and revisions or so. Having custom behaviour here feels really weird.

I very much agree. Back when this was added, every reviewer understood:

  1. hook_entity_(insert|update|…)() to be operating on/called for every revision
  2. a translation of an entity to also be a revision

This is clearly not true.

Entity API's hooks are extremely confusing in that regard. It's likely lots of contrib modules will have the same problems. In fact, that's apparently why a bunch of new hooks were added in #2382675: hook_entity_create() affects the data of new translations of existing entities in unexpected and undocumented ways. This was committed in November 2015, shortly before D8's release. This logic in editor module was written more than two years before that.
Furthermore, the entity translation "insert" and "delete" hooks were added around the same time (#1810370: Entity Translation API improvements) as this code was written. The "entity translation create/delete" hooks are about "translation storage". There is no corresponding "update" hook. This seems to further confirm that just looking at revisions should be okay, but apparently it is not.

So, at this point, I have no idea what is actually the correct way. We really need better documentation. Should we move this to the entity system component for feedback from the Entity API maintainers?


It's also not clear to me why the much simpler approach in #4 is not sufficient. #11 changed the implementation significantly.

quiron’s picture

FileSize
20.05 KB

Hi all,

I needed the patch #29 but for 8.2 there was an error on the patch apply, so I have recreated it for 8.2.

hope be usefull.

quiron’s picture

FileSize
20.33 KB

Sorry, there was a typo error :S

this is the correct one

Wim Leers’s picture

Gábor Hojtsy’s picture

A translation is not a revision and a revision is not necessarily a translation. Revisions and translations are the two axes on the "spreadsheet" of an entity. If you use the UI AND have revisions enabled, then a new translation change would create a new revision (with copy of all data for other languages in that revision). If you don't use revisions or you use the API, then you can change multiple translations within one revision. Basically the revisions are columns on the spreadsheet and translations are rows. Does that help? :)

Wim Leers’s picture

@Gábor: Yes, that's clear now. But the thing is that it's very unclear in Entity API itself, which is why nobody noticed this during the review process back when this functionality was added. It's not clear from Entity API documentation, and it's not clear by reading Entity API's hooks. Grep entity.api.php for translation. I think it'd be worth adding #40's explanation to entity.api.php.

dawehner’s picture

@Gábor: Yes, that's clear now. But the thing is that it's very unclear in Entity API itself, which is why nobody noticed this during the review process back when this functionality was added. It's not clear from Entity API documentation, and it's not clear by reading Entity API's hooks. Grep entity.api.php for translation. I think it'd be worth adding #40's explanation to entity.api.php.

Maybe we could open up an issue to either document that or even provide some API helpers to deal with that.

  1. +++ b/modules/editor/editor.install
    @@ -20,3 +20,86 @@ function editor_update_8001() {
    +function editor_update_8002(&$sandbox) {
    ...
    +    foreach (\Drupal::entityTypeManager()->getDefinitions() as $entity_type) {
    ...
    +    $files = \Drupal::entityTypeManager()
    +      ->getStorage('file')
    +      ->loadMultiple(array_splice($sandbox['data']['file_ids'], 0, $entity_load_limit));
    +    foreach ($files as $file) {
    

    Note: Using the entity API in a hook_update_N is dangerous. I could imagine that this is safe to be done in a hook_post_update_N

  2. +++ b/tests/Drupal/Tests/Core/Common/DiffArrayTest.php
    @@ -66,7 +48,55 @@ class DiffArrayTest extends UnitTestCase {
    +  public function testDiffOnce() {
    +    $object1 = new \stdClass();
    +    $object2 = new \stdClass();
    +    $sets = [
    +      [
    +        'arrays' => [
    +          [1, 1, 1, 1, 1],
    +          [1, 1, 1],
    

    Note: We could use a dataprovider here.

dawehner’s picture

Assigned: Unassigned » dawehner

Working on it for a while

dawehner’s picture

Assigned: dawehner » Unassigned
Status: Needs work » Needs review
FileSize
21.11 KB
10.76 KB
  • Discussed with @alexpott and we agreed that we should use a hook_post_update_N()
  • We also discussed that we should not deal with entity IDs directly. This allows the update to scale better.

Status: Needs review » Needs work

The last submitted patch, 44: 2708411-44.patch, failed testing.

Leksat’s picture

... whether we should have form of iterating over translations ...

I don't really remember, but I guess I did it this way because it's theoretically possible to programmatically create a new entity having several translations from the beginning. I doubt if Drupal will fire translation hooks for each translation in this case.

alexpott’s picture

Issue tags: +Triaged D8 critical

Recently @catch, @webchick, @xjm, @cottser, @cilefen and I discussed #2801777: Give users the option to prevent drupal from automatically marking unused files as temporary , #2810355: $entity->isDefaultTranslation() behaves incorrectly when changing default translation, causing file/image field usage to be set to zero, causing files to be deleted and #2708411: [PP-1] editor.module's editor_file_reference filter not tracking file usage correctly for translations. We all agreed that the current file usage behaviour that is resulting in files being deleted when they are being used by a site is unacceptable and a critical issue. We've created #2821423: Dealing with unexpected file deletion due to incorrect file usage to plan for how to fix and mitigate the related issues.

dawehner’s picture

Status: Needs work » Needs review
FileSize
21.16 KB
656 bytes

This could fix a bunch of those errors ...

Status: Needs review » Needs work

The last submitted patch, 49: 2708411-49.patch, failed testing.

Berdir’s picture

Sounds like you had a loop in the update function?

  1. +++ b/core/modules/editor/editor.module
    @@ -543,23 +546,39 @@ function editor_file_download($uri) {
    +
    +    // In case of a translatable field, iterate over all its translations.
    +    if ($field_definitions[$formatted_text_field]->isTranslatable() && $entity instanceof TranslatableInterface) {
    +      $langcodes = array_keys($entity->getTranslationLanguages());
    +    }
    +    else {
    +      $langcodes = [LanguageInterface::LANGCODE_NOT_APPLICABLE];
    +    }
    ...
    +      if ($langcode == LanguageInterface::LANGCODE_NOT_APPLICABLE) {
    +        $field_items = $entity->get($formatted_text_field);
    +      }
    

    This is a bit strange.

    LANGCODE_NOT_APPLICABLE is an actual language that an entity can have, here we use it to indicate that we should not get a specific translation but pick the one we got.

    It would make a lot more sense to merge those two things together, and only loop in the if case.

    if (is translatable) {
    loop over languages, merge the text together)
    else {
    just do it once with the current translation, as it doesn't matter through which translation we access it.

    We will duplicate 2-3 lines of code with that, but I think the result will make more sense.

  2. +++ b/core/modules/editor/editor.post_update.php
    @@ -19,3 +21,112 @@ function editor_post_update_clear_cache_for_file_reference_filter() {
    +        if ($entity_type->isRevisionable()) {
    +          $query->allRevisions();
    +        }
    

    this isn't quite so easy and might be the reason for your loop?

    here you care about revisions, but you still store the result as entity_ids in either case.

    also $result is weird when you actually have the count?

  3. +++ b/core/modules/editor/editor.post_update.php
    @@ -19,3 +21,112 @@ function editor_post_update_clear_cache_for_file_reference_filter() {
    +    $ids = \Drupal::entityQuery($entity_type_id)
    +      ->condition($entity_type->getKey('id'), $sandbox['data']['last_entity_id'][$entity_type_id], '>')
    +      ->pager($entity_load_limit)
    +      ->sort($entity_type->getKey('id'), 'ASC')
    +      ->execute();
    +    if ($entity_type->isRevisionable()) {
    +      foreach ($ids as $revision_id) {
    +        $entity = \Drupal::entityTypeManager()
    

    here you don't seem to care about revisions anymore, so you will never reach the total.

    Also, pager() doesn't make sense, pager() relies on GET query arguments. You need to page yourself, by doing the proper range().

    One approach for dealing with revisions would be to query over entity IDs and then load all revisions for that if it supports revisions. And hope that there aren't millions of revisions ;)

    You could also have a look at paragraphs_post_update_set_paragraphs_parent_fields(), that's a "loop over all revisions or ids" post update function that we wrote in paragraphs and it *seems* to work :)

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.6 was released on February 1, 2017 and is the final full bugfix release for the Drupal 8.2.x series. Drupal 8.2.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.3.0 on April 5, 2017. (Drupal 8.3.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.3.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

tstoeckler’s picture

+++ b/core/modules/editor/editor.module
@@ -543,23 +546,39 @@ function editor_file_download($uri) {
-      $text .= $field_item->value;

Seems this not being re-added, unless I'm missing something.

Wim Leers’s picture

Title: File usage not incremented when adding new translation » [PP-1] editor.module's editor_file_reference filter not tracking file usage for new translations
Issue summary: View changes
Status: Needs work » Postponed
Related issues: +#2892368: editor.module's hook_entity_update() does not count file usages correctly when adding/removing images without creating new revisions, +#2892377: Document relationship of entities, entity revisions and entity translations

I went through the entire issue.

The following comments/discussions were especially important:

  • @Luxian provided a patch in #4 that fixes the case of "is new translation", and did so by updating editor.module's hook_entity_update() implementation. In #9, he also added test coverage.
  • @Leksat then changed this in #11: he reverted @Luxian's change in hook_entity_update(), but instead uses hook_entity_translation_insert(). He also implemented hook_entity_translation_delete(). And expanded the test coverage.
  • In #13 + #14, he implements the same fix in a different way. This removed the hook_entity_translation_insert() + hook_entity_translation_delete() implementations. So this goes back to only relying on hook_entity_update() to also detect the "change events" that hook_entity_translation_insert() + hook_entity_translation_delete() are specifically designed for.
  • Then in #14, @Leksat also adds an update path to recalculate all file usages for text fields, tracked by the editor_file_reference filter plugin.
  • In #17, I post a review, questioning some of the significant changes that the patch introduces. @Leksat and I go back and forth for a while, until #27. In #17.1 + #18.1 + #22 + #26, @Lekset convinces me that there is another bug in editor.module's hook_entity_update(): not only does it fail in case of new/removed translations, it also fails when adding/removing images without creating a new revision (i.e. when editing a particular revision): if images are added/removed, then the file usage is never incremented/decremented accordingly. It's probably a good idea to split this off to a separate issue. Opened #2892368: editor.module's hook_entity_update() does not count file usages correctly when adding/removing images without creating new revisions for this.
  • @swentel suggested yet another edge case that probably needs tests, in #25 (when dealing with translations, but the body field itself is not translatable, meaning that we shouldn't count the same file usage in N translations N times, because the text field itself is not translated).
  • @dawehner indicated in #35 that this module should not be introducing custom iteration logic; it's Entity/Field API that should provide an API to iterate over all revisions & translations of a field.
  • @Gábor Hojtsy posted a very clear way of looking at how revisions & translations of entities relate to each other, in response to my comment in #36. dawehner (#42) and I (#41) suggested that adding this documentation would be worthwhile. Opened issue for that: #2892377: Document relationship of entities, entity revisions and entity translations.
  • @dawehner discussed this with @alexpott and they agreed that the update path should not use hook_update_N(), but hook_post_update_N(). He made that happen in #44 + #49.
  • Entity Field API maintainer @Berdir pointed out some flaws in the current patch in #51 which further underscore the importance of @dawehner's suggestion in #35.

Having looked at this again, I now disagree with my own analysis in #47. It's a similar problem space to #2708411: [PP-1] editor.module's editor_file_reference filter not tracking file usage correctly for translations, but it's not the same. The difference:

  1. for #2708411 (this issue), the root cause is editor.module's use of hook_entity_insert() + hook_entity_update() + hook_entity_delete() + editor_entity_revision_delete() that were introduced in #1932652: Add image uploading to WYSIWYGs through editor.module. It looks like it should also use hook_entity_translation_insert() + hook_entity_translation_delete(), but those were added after #1932652 had already been started, in #1810370: Entity Translation API improvements. But it also turns out that we actually don't even want to use those hooks (as the last several patches demonstrate) — what we want is for Entity Field API to provide a standardized way to iterate over the values for a field across all revisions and all translations.
  2. for #2810355, the root cause is \Drupal\file\Plugin\Field\FieldType\FileFieldItemList::delete() removing file usages, and it relying on $entity->isDefaultTranslation() (i.e. \Drupal\Core\TypedData\TranslatableInterface::isDefaultTranslation()) to determine whether a translation is being deleted or not, and apparently $entity->isDefaultTranslation() behaves incorrectly in case the default translation is being changed

Finally, a patch review.

I think this patch is in a fairly good place. But there are certain things that still need to happen:

  1. I'm not convinced the current test coverage is sufficient. What about the correct updating of file usage when deleting a particular translation, for example?
  2. Furthermore, once translations are taken into account, then changing the default translation cannot but run into the critical bug described in #2810355: $entity->isDefaultTranslation() behaves incorrectly when changing default translation, causing file/image field usage to be set to zero, causing files to be deleted. Therefore this issue is blocked on #2810355. Marking Postponed and prefixing title with [PP-1].
  3. The first and foremost thing that needs to happen is that we need some generic iteration logic in a helper function in editor.module that @Berdir can then review & approve. Then we'll have a working, green patch, with all the logic sound. Then either as a blocker or in a follow-up issue, we can turn that into a proper API. Perhaps on ContentEntityBase. Perhaps as a trait. Something like getFieldValuePerRevisionAndPerTranslation($field_name) which returns an array with each item in the array containing ['value' => …, 'revision id' => …, 'langcode' => …]. Let's discuss the details.
  4. The other bug surfaced here now has a separate issue: #2892368: editor.module's hook_entity_update() does not count file usages correctly when adding/removing images without creating new revisions. Ideally we can land that one first (and soon), because it'd simplify this patch.
Wim Leers’s picture

Wim Leers’s picture

Issue summary: View changes

Fix HTML in IS.

Wim Leers’s picture

Issue summary: View changes

Fix HTML in IS.

Wim Leers’s picture

Issue summary: View changes

One final clarification in the proposed resolution.

Wim Leers’s picture

Title: [PP-1] editor.module's editor_file_reference filter not tracking file usage for new translations » [PP-1] editor.module's editor_file_reference filter not tracking file usage correctly for translations

This is actually not limited to new translations. It's that there's no translation handling at all.

catch’s picture

Priority: Critical » Major
Issue tags: -Triaged D8 critical

Now that #2801777: Give users the option to prevent drupal from automatically marking unused files as temporary is in, this no longer results in unexpected file deletion. #2821423: Dealing with unexpected file deletion due to incorrect file usage is open as a critical meta-issue to discuss the broader approach on this. So I’m going to downgrade this to major after discussion with @xjm and @cilefen, since we feel effort would be better spent on replacing the file_usage system at this point. We can still fix this bug of course as a major.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.6 was released on August 2, 2017 and is the final full bugfix release for the Drupal 8.3.x series. Drupal 8.3.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.4.0 on October 4, 2017. (Drupal 8.4.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.4.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.4 was released on January 3, 2018 and is the final full bugfix release for the Drupal 8.4.x series. Drupal 8.4.x will not receive any further development aside from critical and security fixes. Sites should prepare to update to 8.5.0 on March 7, 2018. (Drupal 8.5.0-alpha1 is available for testing.)

Bug reports should be targeted against the 8.5.x-dev branch from now on, and new development or disruptive changes should be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.