Problem/Motivation

Field purge can purge data from fields that have been recreated with the same name.

Proposed resolution

Remaining tasks

User interface changes

tba

API changes

tba

Data model changes

tba

Original issue summary

When developing a module .install you often have to uninstall/reinstall. When you uninstall the module, the fields are marked for deletion (see key_value table at the row with collection to state and field.field.deleted and field.storage.deleted). If you reinstall your module without running the cron manually, your fields values will be deleted at the next cron pass.

see http://drupal.stackexchange.com/questions/210824/field-values-added-prog...

REPRO STEPS

- install a module that programmatically inserts data in taxonomy term custom fields
- uninstall the module
- reinstall the module
- run cron

RESULT:
The values of the custom fields are deleted

EXPECTED:
The values are not deleted

class schedule_block_class{

  const schedule_block_channels_terms = array(
    array(
      'name' => 'Brava',
      'field_channel_id' => 'BRA001'
    ),
    array(
      'name' => 'DJazz',
      'field_channel_id' => 'DJA001'
    ));
  const schedule_block_channels_vid = 'schedule_channels';
}

function schedule_block_install(){
  $schedule_block_terms = schedule_block_class::schedule_block_channels_terms;
  $schedule_block_vid = schedule_block_class::schedule_block_channels_vid;
  foreach($schedule_block_terms as $schedule_block_term){
    $term = Term::create(array(
      'name' => $schedule_block_term['name'],
      'field_channel_id' => $schedule_block_term['field_channel_id'],
      'description' => '',
      'parent' => array(0),
      'vid' => $schedule_block_vid
    ));
    $term->save();
  }
}

function schedule_block_uninstall(){

  $schedule_block_terms = schedule_block_class::schedule_block_channels_terms;
  $schedule_block_vid = schedule_block_class::schedule_block_channels_vid;
  $terms = array();
  foreach($schedule_block_terms as $schedule_block_term){
    if($terms_name = taxonomy_term_load_multiple_by_name($schedule_block_term['name'], $schedule_block_vid)){
      $terms = array_merge($terms, $terms_name);
    }
  }
  foreach($terms as $term){
    $term->delete();
  }
  $taxo = \Drupal\taxonomy\Entity\Vocabulary::load(schedule_block_class::schedule_block_channels_vid);
  if(!is_null($taxo)){
    $taxo->delete();
  }
}

Comments

reblutus created an issue. See original summary.

reblutus’s picture

Issue summary: View changes
catch’s picture

Title: Field values dispear after a certain time after reinstalling a module » Uninstalling a module can leave field values still to be purged confusing when the module is re-enabled

Field module should prevent uninstall if there are fields to purge, it could show a message telling you to run cron.

https://api.drupal.org/api/drupal/core%21modules%21file%21src%21Plugin%2...

catch’s picture

Version: 8.1.7 » 8.2.x-dev
swentel’s picture

Hmm, could this also mean that we have data loss when you'd add a field through the UI, add values, then delete the field and recreate it with the same name ? That would be pretty hilarious (and probably would be the same in D7). We could write a test for that to find out quickly.

mallezie’s picture

@swentel that doesn't count as dataloss according to me, you're deleting the field?

swentel’s picture

@mallezie sure, but if you recreate the field with the same name, and there is a bug in the purging (apparently) where it picks up the wrong table name ( the 'live' one instead of the 'deleted/renamed' table) you have data loss on your new field. And that's not good :)

swentel’s picture

Status: Active » Needs review
StatusFileSize
new2.86 KB

Rough test that proves the problem - create a field, add values, delete it, recreate, add values, call purge, values are deleted from the main table (for some reason I couldn't get it to replicate it using entity query, so it's a direct db query)

Status: Needs review » Needs work

The last submitted patch, 8: 2782009-8.patch, failed testing.

alexpott’s picture

Title: Uninstalling a module can leave field values still to be purged confusing when the module is re-enabled » Create a field with the same name as one being purged results in data destruction of the new field
Issue summary: View changes
Issue tags: +Triaged D8 critical

Discussed with @xjm, @effulgentsia, @webchick and @Cottser. We agreed that this was super critical because unexpected field delete is really bad.

mile23’s picture

field.purge.inc says this in a few places:

    // We cannot purge anything if the entity type is unknown (e.g. the
    // providing module was uninstalled).
    // @todo Revisit after https://www.drupal.org/node/2080823.

#2080823: Create API to discover config entities' soft dependencies and use this to present a confirm form on module uninstall is closed, so time to revisit. :-)

That issue has this follow-up: #2343517: Cleanup @todo referring to the config dependencies API issue which deals with the @todo mentioned above.

alexpott’s picture

So this bug is happening because \Drupal\Core\Entity\Sql\SqlContentEntityStorage::readFieldItemsToPurge() works using the column name of the storage. I think this means we need to rename the field when it is deleted - that is the only way around this. Or we don't permit fields to be created when we have deleted fields which will end up with the same column name which will break module re-install.

alexpott’s picture

Status: Needs work » Needs review
StatusFileSize
new4.7 KB

Ah I see - there's a bug in \Drupal\Core\Entity\Sql\SqlContentEntityStorage::storageDefinitionIsDeleted() - the way this answers this question is wrong for configurable fields which can be recreated with the same name.

I've also made some of the queries safer by adding a condition on the deleted column which we set when we delete fields.

swentel’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
    @@ -1543,6 +1543,7 @@ protected function readFieldItemsToPurge(FieldDefinitionInterface $field_definit
    +        ->condition('deleted', 1)
    

    So with the fix underneath, this shouldn't be needed anymore normally right ? But it probably can't hurt to add these.

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
    @@ -1684,7 +1687,7 @@ public function countFieldData($storage_definition, $as_bool = FALSE) {
    +    return ($storage_definition instanceOf FieldStorageConfigInterface && $storage_definition->isDeleted()) || !array_key_exists($storage_definition->getName(), $this->entityManager->getLastInstalledFieldStorageDefinitions($this->entityTypeId));
    

    Nice catch, my brain hurt a bit at first because of the OR condition, but it's the right check. We should probably document this on the the interface docblock.

alexpott’s picture

StatusFileSize
new1.22 KB
new4.75 KB

@swentel yep the delete checks are not strictly necessary as far as I know BUT they can't hurt and added an element of safety since ensuring we don't delete things which haven't been marked deleted seems sane.

swentel’s picture

Oh, nice change in the method, way friendlier to read!

Guess we need a nice test comment of course.
I'm fairly confident about the test, don't really think we need more. The only thing that I'm not sure of is the last part which doesn't use the entity query factory as the other tests do in that class (in general, I feel this class could use some cleanup, but oh well).

Leaving on NR though to get more potential eyeballs on this one, the more the better as this is really tricky stuff.

swentel’s picture

don't really think we need more

Actually, to answer myself on that, we might explicitly test that the data is removed from the deleted field table, so we're really sure that we're ok, even though we have other tests in the same class that test that behavior.

catch’s picture

So the patch in itself looks good, but do we have test coverage for things like updating a node with deleted field values to add new data etc. - i.e. are there any more queries missing deleted = 1?

Or we don't permit fields to be created when we have deleted fields which will end up with the same column name which will break module re-install.

This seems safer to me.

alexpott’s picture

I considered

Or we don't permit fields to be created when we have deleted fields which will end up with the same column name which will break module re-install.

But ended up dismissing this as way too complex re module uninstall and re-install.

alexpott’s picture

alexpott’s picture

+++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
@@ -1684,6 +1687,12 @@ public function countFieldData($storage_definition, $as_bool = FALSE) {
   protected function storageDefinitionIsDeleted(FieldStorageDefinitionInterface $storage_definition) {
+    // Configurable fields are marked for deletion.
+    if ($storage_definition instanceOf FieldStorageConfigInterface) {
+      return $storage_definition->isDeleted();
+    }
+    // For non configurable fields check whether they are still in the last
+    // installed schema repository.
     return !array_key_exists($storage_definition->getName(), $this->entityManager->getLastInstalledFieldStorageDefinitions($this->entityTypeId));
   }

This is the most important part of the change - since the bug was not in the field being deleted but the field storage being deleted. The other conditions added here are for safety and can not do any harm.

alexpott’s picture

StatusFileSize
new9.78 KB
new11.31 KB

Looking at the code some more I think being able to pass the $is_deleted flag into TableMapping is wrong headed. This thing has the field storages injected - it should be checking itself. Atm I'm checking it's stored field definitions - that might not be right though as the check in \Drupal\Core\Entity\Sql\SqlContentEntityStorage::storageDefinitionIsDeleted is using the last installed schema repository but the fact that \Drupal\Core\Entity\Sql\DefaultTableMapping::getDedicatedDataTableName and \Drupal\Core\Entity\Sql\DefaultTableMapping::getDedicatedRevisionTableName are not checking the field storages they have is kinda scary. Let's see what breaks.

Improved the test.

alexpott’s picture

StatusFileSize
new11.28 KB

Let's just see if anything breaks if we don't rely on $is_deleted at all - this would give us some idea if we can remove all usages in core.

swentel’s picture

+++ b/core/modules/field/tests/src/Kernel/BulkDeleteTest.php
@@ -212,6 +212,100 @@ function testDeleteField() {
+      ->select('entity_test__' . $field_name, 'f')

should we make use of $active_table_name here ?

alexpott’s picture

StatusFileSize
new8.21 KB
new18.18 KB

Here's what I think would be necessary for a proper deprecation of the $is_deleted parameter. Hopefully #23 fails.

alexpott’s picture

re #24 @swentel I think it is ok hard coding that expectation there.

The last submitted patch, 23: 2782009-23.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 25: 2782009-24.patch, failed testing.

The last submitted patch, 25: 2782009-24.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
StatusFileSize
new1.17 KB
new18.2 KB

Fixed the last fail.

Status: Needs review » Needs work

The last submitted patch, 30: 2782009-30.patch, failed testing.

alexpott’s picture

Hmmm...

  protected function deleteDedicatedTableSchema(FieldStorageDefinitionInterface $storage_definition) {
    // When switching from dedicated to shared field table layout we need need
    // to delete the field tables with their regular names. When this happens
    // original definitions will be defined.
    $deleted = !$this->originalDefinitions;
    $table_mapping = $this->storage->getTableMapping();
    $table_name = $table_mapping->getDedicatedDataTableName($storage_definition, $deleted);
    $this->database->schema()->dropTable($table_name);
    if ($this->entityType->isRevisionable()) {
      $revision_name = $table_mapping->getDedicatedRevisionTableName($storage_definition, $deleted);
      $this->database->schema()->dropTable($revision_name);
    }
    $this->deleteFieldSchemaData($storage_definition);
  }

The $deleted = !$this->originalDefinitions; is ugly here because it is depending on something outside of the field storage...

alexpott’s picture

Status: Needs work » Needs review
StatusFileSize
new4.07 KB
new6.18 KB

Given the issues with #32 I think we should back-pedal and do #13 with more tests.

I think we should open a followup about deprecating the $is_deleted flag from getDedicatedDataTableName() and just have an explicit getDeletedDedicatedDataTableName() and make the calling code responsible for deciding which one to call.

swentel’s picture

Status: Needs review » Reviewed & tested by the community

Makes sense to me. #32 is tricky, and I haven't got full clue yet either what's going on there, or why it's doing the check like that. Last patch fixes the bug as is first and we can improve in 8.3.x further.

swentel’s picture

Hmm, is #2763097: Field purge eats live data a duplicate then, or is there a subtle difference ?

alexpott’s picture

I think it is a dupe - just another way of ending p with a deleted filed with the same name.

  • catch committed b6a64fc on 8.3.x
    Issue #2782009 by alexpott, swentel, reblutus: Create a field with the...

  • catch committed c39da96 on 8.2.x
    Issue #2782009 by alexpott, swentel, reblutus: Create a field with the...
catch’s picture

Status: Reviewed & tested by the community » Fixed

Committed/pushed to all three 8.x branches, thanks!

  • catch committed 8ed5d65 on 8.1.x
    Issue #2782009 by alexpott, swentel, reblutus: Create a field with the...

Status: Fixed » Needs work

The last submitted patch, 33: 2782009-2-33.patch, failed testing.

chx’s picture

Status: Needs work » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.