Problem/Motivation

In #2705389: All content entities should extend RevisionableContentEntityBase we want to enable revisions everywhere. But in order for this to be committed in 8.x we need a data upgrade/migration path.

What currently stops us is the fact that we need to change the entity type definition to enable revisions. Why is this a problem? That's because in \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::onEntityTypeUpdate we throw a EntityStorageException if \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::requiresEntityDataMigration returns true (which it will for existing sites).

Proposed resolution

A service that can be called from within an update hook.
Pass in the entity type object and it will create / remove tables and columns, update the installed entity and field definition, and migrate and data.

This issue will also add tests for the service to test switching an entity_test entity from non-revisionable to revisionable.

Files: 

Comments

timmillwood created an issue. See original summary.

dawehner’s picture

What blocks us from writing individual update functions which moves the data over from one table storage to anotherone?

timmillwood’s picture

As mentioned to @dawehner in person I don't think using update hooks is scalable.

Depending on how much we want to be revisionable this will need to work in different ways.

Personally I would love to see core enforce all content entities are revisionable (even if they don't use revisions). Therefore the migration will have to migrate all content entity schemas.

I still think that doing this migration through \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::onEntityTypeUpdate using Migrate module.

dixon_’s picture

Issue summary: View changes
Priority: Normal » Major
Parent issue: » #2721129: Workflow Initiative

I'm starting to like the idea of update hooks the more I think about it. I have updated the OP with two possible approaches. I'm also bumping this to Major as this is a big blocker for the first phase of the now-official Workflow initiative.

dixon_’s picture

Issue tags: +Workflow Initiative
dawehner’s picture

I personally think that the initial additional of revisions for certain entity types can be totally done in specific update hooks, given its basically a table copy and column copy plus removal of non revisionable fields. The scaling aspect could be probably solved by some clever SQL queries.

On the other hand it would be super nice to have automatic migration from one schema to the other using the migrate module, but given the possible operations, they can't be always automated

timmillwood’s picture

Category: Feature request » Plan

If we decide to go with the update hook approach this will have to ship in the same patch that we enable the revisionable entities in. #2705389: All content entities should extend RevisionableContentEntityBase

timmillwood’s picture

The update hooks might simply call \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::onEntityTypeCreate, then loop through the non-revision tables, and copy data to the revision tables.

dixon_’s picture

timmillwood’s picture

Issue summary: View changes
Status: Active » Closed (outdated)
timmillwood’s picture

Title: Migrate entities between entity type schemas » Upgrade path between revisionable / non-revisionable entities.
Issue summary: View changes
Status: Closed (outdated) » Needs work
timmillwood’s picture

Assigned: Unassigned » timmillwood
Status: Needs work » Needs review
FileSize
22.36 KB

Adding a starting point for this.

timmillwood’s picture

This includes a database dump of a standard site with entity_test module enabled, although entity_test_mulrev was made non-revisionable before creating the dump, so the the test, tests making it revisionable again.

Status: Needs review » Needs work

The last submitted patch, 13: upgrade_path_between-2721313-13.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
175.45 KB
175.99 KB

Status: Needs review » Needs work

The last submitted patch, 15: upgrade_path_between-2721313-14.patch, failed testing.

timmillwood’s picture

It seems making this testable with entity_test is a little more complex than I expected.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
10.12 KB
9.77 KB

Using shortcut entity as an initial test for this upgrade path. This patch therefore makes shortcut entity revisionable. We can then make all other entities revisionable in #2705389: All content entities should extend RevisionableContentEntityBase.

Status: Needs review » Needs work

The last submitted patch, 18: upgrade_path_between-2721313-18.patch, failed testing.

timmillwood’s picture

This should fix some of the failing test, but others looks to be due to the update hooks only making schema changes and not migrating any data.
TODO: Make sure revision tables and fields are populated

Status: Needs review » Needs work

The last submitted patch, 20: upgrade_path_between-2721313-20.patch, failed testing.

timmillwood’s picture

Status: Needs review » Needs work

The last submitted patch, 22: upgrade_path_between-2721313-22.patch, failed testing.

timmillwood’s picture

Need to work out why the langcode field in shortcut_field_revision is being created with a primary key.

shortcut module

Update #8001

  • Failed: Drupal\Core\Database\IntegrityConstraintViolationException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'en' for key 'PRIMARY': INSERT INTO {shortcut_field_revision} (id, revision_id, langcode, title, link__uri, link__options, default_langcode) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6); Array
    (
    [:db_insert_placeholder_0] => 2
    [:db_insert_placeholder_1] => 2
    [:db_insert_placeholder_2] => en
    [:db_insert_placeholder_3] => All content
    [:db_insert_placeholder_4] => internal:/admin/content
    [:db_insert_placeholder_5] => a:0:{}
    [:db_insert_placeholder_6] => 1
    )
    in Drupal\Core\Database\Connection->handleQueryException() (line 668 of /home/timmillwood/Code/drupal1/core/lib/Drupal/Core/Database/Connection.php).
timmillwood’s picture

So it looks like langcode is getting added as a primary key in \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::initializeDataTable and \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::initializeRevisionDataTable, not sure I understand why yet.

Any insight anyone?

amateescu’s picture

@timmillwood, the 'langcode' field is added as a primary key in most entity table for (entity query) performance purposes. Not sure why that would affect the data copy operation though..

dawehner’s picture

diff --git a/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
index 107495b..6caf20b 100644
--- a/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
+++ b/core/lib/Drupal/Core/Database/Driver/mysql/Schema.php
@@ -413,7 +413,10 @@ public function addField($table, $field, $spec, $keys_new = array()) {
     }
     if ($fixnull) {
       $spec['not null'] = TRUE;
-      $this->changeField($table, $field, $field, $spec);
+      $count = $this->connection->query('SELECT COUNT(*) AS count FROM {' . $table . '}')->fetchAssoc();
+      if ($count['count'] == 0) {
+        $this->changeField($table, $field, $field, $spec);
+      };
     }
   }

Just a quick happy reminder. This seems to be quite an assumption which is maybe better added at some higher level of the stack.

timmillwood’s picture

@amateescu - I assumed the primary key on 'langcode' is causing the issue outlined in #24. Maybe it's a deeper issue.

@dawehner - I have been thinking that the part of the patch you mention should go into it's own issue. The problem I am having is when I add the 'revision_id' field to an existing table, if the table has content (such as the Shortcut module adds on install) then it causes issues if the new field can't be null. Maybe there can be something to pass into installFieldStorageDefinition() to either add some data, or not add the NOT NULL attribute, this could then be added later.

timmillwood’s picture

So it looks like this issue with langcode is because revision_id isn't getting a primary key set.

mysql> describe simpletest436003shortcut_field_revision;
+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| id                   | int(10) unsigned | NO   | MUL | NULL    |       |
| langcode             | varchar(12)      | NO   | PRI | NULL    |       |
| title                | varchar(255)     | YES  |     | NULL    |       |
| link__uri            | varchar(2048)    | YES  | MUL | NULL    |       |
| link__title          | varchar(255)     | YES  |     | NULL    |       |
| link__options        | longblob         | YES  |     | NULL    |       |
| default_langcode     | tinyint(4)       | NO   |     | NULL    |       |
| revision_id          | int(10) unsigned | NO   |     | NULL    |       |
| revision_created     | int(11)          | YES  |     | NULL    |       |
| revision_user        | int(10) unsigned | YES  | MUL | NULL    |       |
| revision_log_message | longtext         | YES  |     | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+
11 rows in set (0.00 sec)
timmillwood’s picture

timmillwood’s picture

This is failing in SQLite and PostgreSQL because the NOT NULL "fix" has only been done for MySQL. I'm trying to work on a proper fix for this in #2746279: Adding not null fields to table which have data.

amateescu’s picture

amateescu’s picture

Issue tags: +DevDaysMilan

Status: Needs review » Needs work

The last submitted patch, 34: upgrade_path_between-2721313-34.patch, failed testing.

The last submitted patch, 34: upgrade_path_between-2721313-34.patch, failed testing.

timmillwood’s picture

Can't seem to replicate these failures locally.

jibran’s picture

Issue tags: +shortcut

Tagging it so that I can keep track of it.

jeqq’s picture

Status: Needs work » Needs review
jeqq’s picture

Some tests are failing because Drupal project tests are failing and they are not related to this patch.

Bojhan’s picture

Category: Plan » Task

Status: Needs review » Needs work

The last submitted patch, 39: upgrade_path_between-2721313-39.patch, failed testing.

timmillwood’s picture

Yay! All tests passing again!

Please review so we can get this committed and get the ball rolling on the parent issue.

jibran’s picture

I think @effulgentsia and @plach should review 'system.entity_schema_updater' changes.

Berdir’s picture

Status: Needs review » Needs work

First review.

  1. +++ b/core/modules/shortcut/tests/src/Kernel/ShortcutSevenIntegrationTest.php
    @@ -21,6 +21,7 @@ public function testInstallUninstall() {
     
    +    \Drupal::service('module_installer')->install(['user']);
         \Drupal::service('module_installer')->install(['shortcut']);
         $this->assertTrue($this->config('seven.settings')->get('third_party_settings.shortcut.module_link'), 'The shortcut module_link setting is in seven.settings.');
    

    what makes this necessary exactly? Ah, the revision user field? A bit weird to have that on shortcuts, especially given that they don't have a user?

  2. +++ b/core/modules/system/src/EntitySchemaUpdater.php
    @@ -0,0 +1,186 @@
    +
    +namespace Drupal\system;
    +
    

    why is this in system.module?

  3. +++ b/core/modules/system/src/EntitySchemaUpdater.php
    @@ -0,0 +1,186 @@
    +  /**
    +   * {@inheritdoc}
    +   */
    +  public function updateEntityType(EntityTypeInterface $entity_type) {
    +    $last_entity_type = $this->lastInstalledSchemaRepository->getLastInstalledDefinition($entity_type->id());
    +    $keys = $last_entity_type->getKeys();
    +    $keys['revision'] = 'revision_id';
    +    $last_entity_type->set('entity_keys', $keys);
    +    $last_entity_type->set('revision_table', $entity_type->getRevisionTable());
    +    $last_entity_type->set('revision_data_table', $entity_type->getRevisionDataTable());
    +    $this->lastInstalledSchemaRepository->setLastInstalledDefinition($last_entity_type);
    +  }
    

    This seems to be doing something very specific but the method name is generic. Either it should have a better method name or it should just be in the update function directly.

    For example, we might want to think about the order in which we do things. The idea is to get the currently installed type, then call the method that make the changes and only when that was successful update the type.

  4. +++ b/core/modules/system/src/EntitySchemaUpdater.php
    @@ -0,0 +1,186 @@
    +  public function createTables(EntityTypeInterface $entity_type) {
    +    /** @var \Drupal\Core\Entity\EntityTypeListenerInterface $storage */
    +    $storage = $this->entityTypeManager->getStorage($entity_type->id());
    +    if ($storage instanceof EntityTypeListenerInterface) {
    +      $storage->onEntityTypeCreate($entity_type);
    +    }
    

    This also seems a bit strange, calling on EntityTypeCreate() on an existing entity type?

    I think that only works because the storage currently silently skips existing tables, which can cause weird bugs/behavior (see https://www.drupal.org/node/2751363). If we change that, this will break.

  5. +++ b/core/modules/system/src/EntitySchemaUpdater.php
    @@ -0,0 +1,186 @@
    +   */
    +  public function copyData(EntityTypeInterface $entity_type) {
    +    /** @var \Drupal\Core\Entity\Sql\TableMappingInterface $table_mapping */
    +    $table_mapping = $this->entityTypeManager->getStorage($entity_type->id())->getTableMapping();
    +    $table_names = $table_mapping->getTableNames();
    +    $data = [];
    +    foreach ($table_names as $table_name) {
    +      $column_names = $table_mapping->getAllColumns($table_name);
    +      $results = $this->database->select($table_name, 't')->fields('t')->execute()->fetchAll();
    +      foreach ($results as $key => $result) {
    +        foreach ($column_names as $column_name) {
    +          if (!empty($result->{$column_name})) {
    +            $data[$key][$column_name] = $result->{$column_name};
    +          }
    +        }
    +      }
    +    }
    

    I know the example basically says that, but that really isn't going to work if you have tens or hundreds of thousands of entities. Sure, not the case for shortcuts (I hope...) but for terms? absolutely possible.

    This needs to use batch API.

    From what I understand, this is basically an attempt to make the update code for multiple entity types re-usable. Updates tend to be complicated and it is likely we'll need entit type specific customizations. But if we do keep this, then we should at least give this very specific class/service/method names. And possibly have fewer public methods, that are more closely tied to update/batch API.

    Meaning, this would then accept a $sandbox argument and implement something like the example in https://api.drupal.org/api/drupal/modules%21system%21system.api.php/func...

timmillwood’s picture

jhedstrom’s picture

Some initial review. I think the biggest question is around using a filled vs bare database for the tests in #1. The rest are mostly style/nits.

  1. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,40 @@
    +  /**
    +   * @inheritDoc
    +   */
    +  protected function setDatabaseDumpFiles() {
    +    $this->databaseDumpFiles = [__DIR__ . '/../../../../../modules/system/tests/fixtures/update/drupal-8.bare.standard.php.gz'];
    +  }
    

    Shouldn't this test check with some content (eg, the drupal-8.filled.standard.php.gz starting db)?

    Also a nit, the docblock should be {@inheritdoc}

  2. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +      $sandbox['max'] = $this->database->select($base_table)
    +          ->countQuery()
    +          ->execute()
    +          ->fetchField();
    

    Indentation nit.

  3. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +        $exists = $this->database->select($table_name)
    +          ->fields($table_name, [$id])
    +          ->condition($id, $record[$id])
    +          ->execute()
    +          ->fetchAll();
    +        if (isset($exists) && isset($exists[0]) && ($exists[0]->{$id} == $record[$id])) {
    

    $exists will always be set, so this could just be if ($exists && ...).

  4. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverterInterface.php
    @@ -0,0 +1,22 @@
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    ...
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    +   * @param array $sandbox
    

    Some nits: these need parameter descriptions.

timmillwood’s picture

Addressing issues in #49.

Status: Needs review » Needs work

The last submitted patch, 50: upgrade_path_between-2721313-1.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
691 bytes
17.09 KB

Looks like the filled db has 4 shortcut entities.

jhedstrom’s picture

This is looking good. Are there any outstanding tasks left to do here?

+++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
@@ -0,0 +1,40 @@
+    $storage = \Drupal::entityTypeManager()->getStorage($entity_type->id());
+    $entities = $storage->loadMultiple();
+    $this->assertEqual(count($entities), 4, "Four Shortcut entities found");
+

We're testing shortcut here, and then presumably this test would be updated over in #2705389: All content entities should extend RevisionableContentEntityBase to check other entity types?

timmillwood’s picture

@jhedstrom - I don't think there are any remaining tasks.
Yes, #2705389: All content entities should extend RevisionableContentEntityBase is where we will update other content types to be revisionable, add update hooks for them, and update the test.

jhedstrom’s picture

I think this is ready for RTBC, but as @jibran mentions in #45, it would be good for additional reviews of the new service. However, that was 10 days ago, and it would also be nice for this to move forward. Neither of those folks are on IRC at the moment either.

jhedstrom’s picture

Here's a deeper review of the new service to try and move this along.

  1. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +   * EntitySchemaUpdater constructor.
    +   * @param \Drupal\Core\Entity\EntityLastInstalledSchemaRepositoryInterface $last_installed_schema_repository
    +   * @param \Drupal\Core\Entity\EntityTypeManagerInterface $entity_type_manager
    +   * @param \Drupal\Core\Entity\EntityDefinitionUpdateManagerInterface $entity_definition_update_manager
    +   * @param \Drupal\Core\Entity\EntityFieldManagerInterface $entity_field_manager
    +   * @param \Drupal\Core\Database\Connection $database
    

    Nit: needs a space between the description and the @param statements. Also need variable descriptions for each param.

  2. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +  public function copyData(EntityTypeInterface $entity_type, array &$sandbox) {
    

    There's a lot going on in this method. Can you add inline code comments at each section to explain the process for easier potential debugging in the future?

  3. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +        ->range(0,5)
    

    I had to dig a bit do figure out we were looping through 5 at a time. An explicit comment here would be good.

  4. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    

    Nit: variable description.

  5. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +
    

    Whitespace.

  6. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,223 @@
    +    $revision_created = BaseFieldDefinition::create('created')
    +      ->setLabel(t('Revision create time'))
    +      ->setDescription(t('The time that the current revision was created.'))
    +      ->setRevisionable(TRUE);
    +    $this->entityDefinitionUpdateManager
    +      ->installFieldStorageDefinition("revision_created", $entity_type->id(), $entity_type->id(), $revision_created);
    +
    +    $revision_user = BaseFieldDefinition::create('entity_reference')
    +      ->setLabel(t('Revision user'))
    +      ->setDescription(t('The user ID of the author of the current revision.'))
    +      ->setSetting('target_type', 'user')
    +      ->setRevisionable(TRUE);
    +    $this->entityDefinitionUpdateManager
    +      ->installFieldStorageDefinition("revision_user", $entity_type->id(), $entity_type->id(), $revision_user);
    ...
    +    $revision_log_message = BaseFieldDefinition::create('string_long')
    +      ->setLabel(t('Revision log message'))
    +      ->setDescription(t('Briefly describe the changes you have made.'))
    +      ->setRevisionable(TRUE)
    +      ->setDefaultValue('')
    +      ->setDisplayOptions('form', [
    +        'type' => 'string_textarea',
    +        'weight' => 25,
    +        'settings' => [
    +          'rows' => 4,
    +        ],
    +      ]);
    

    Rather than hard-code these here, can we re-use the RevisionLogEntityTrait::revisionLogBaseFieldDefinitions() method so labels and definitions are always in sync?

Berdir’s picture

6: No, we can't. This is an update function, just like a schema definition, it must not change, otherwise it will get confusing for sites that already get an updated definition and try to do another update on top of it. We have to keep the original definition here.

jhedstrom’s picture

We have to keep the original definition here.

But this is a re-usable update-helper-service. In theory, it could be used on one entity type now, and another in 6 months. If for some reason entity revisions have another column added, or column schema change, or a label change, etc, shouldn't the update hook run with the latest definition? It isn't changing existing fields, it's adding new ones.

Berdir’s picture

Well, we'll see how re-usable it exactly is :) Still not 100% sure about where it should live exactly and how it should be named.

Update functions and re-usable usually don't go well with each other. Update functions are always for very specific versions and must be guaranteed to work any combination of old and new core version (as far as supported), so they must *never* change.

timmillwood’s picture

Fixing nits from #56.
Agree with #57 and #59.

amateescu’s picture

Status: Needs review » Needs work

First review pass.. it looks pretty good already :)

  1. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +        ->range(0,5)
    

    Missing space before '5'.

  2. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +        $exists = $this->database->select($table_name)
    +          ->fields($table_name, [$id])
    +          ->condition($id, $record[$id])
    +          ->execute()
    +          ->fetchAll();
    +        if ($exists && isset($exists[0]) && ($exists[0]->{$id} == $record[$id])) {
    +          $this->database
    +            ->update($table_name)
    +            ->fields($values)
    +            ->condition($id, $record[$id])
    +            ->execute();
    +        }
    +        else {
    +          $this->database
    +            ->insert($table_name)
    +            ->fields($values)
    +            ->execute();
    +        }
    

    Wouldn't it be simpler to write this as a merge query?

    Or, even better, if it depends only on a single lookup key, it can be an upsert query.

  3. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +    $keys['revision'] = 'revision_id';
    

    We shouldn't hardcode the name of the 'revision' entity key here, it needs to be an optional parameter.

  4. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +    $last_entity_type->set('revision_table', $entity_type->getRevisionTable());
    +    $last_entity_type->set('revision_data_table', $entity_type->getRevisionDataTable());
    

    Same for the revision table names. They might change in the future so the only way for an update function to perform the exact same task at any point in time is to not depend on any live/changeable state.

  5. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +      $storage->onEntityTypeCreate($entity_type);
    

    @Berdir's feedback was not addressed here...

  6. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +   * Installs new fields.
    +   *
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    +   */
    +  protected function installFields(EntityTypeInterface $entity_type) {
    

    This method seems to deal only with revisionable fields, so it should be named a bit better.

  7. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +    if ($entity_type->hasKey('revision')) {
    +      $revision_id = BaseFieldDefinition::create('integer')
    +        ->setLabel(new TranslatableMarkup('Revision ID'))
    +        ->setReadOnly(TRUE)
    +        ->setSetting('unsigned', TRUE);
    +      $this->entityDefinitionUpdateManager
    +        ->installFieldStorageDefinition($entity_type->getKey('revision'), $entity_type->id(), $entity_type->id(), $revision_id);
    +    }
    

    Why is only this field conditioned by the 'revision' entity key?

    All the other fields from this method don't make sense without a 'revision' key either.

  8. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +      ->installFieldStorageDefinition("revision_created", $entity_type->id(), $entity_type->id(), $revision_created);
    ...
    +      ->installFieldStorageDefinition("revision_user", $entity_type->id(), $entity_type->id(), $revision_user);
    ...
    +      ->installFieldStorageDefinition("revision_log_message", $entity_type->id(), $entity_type->id(), $revision_log_message);
    

    Like above, we shouldn't hardcode the names of the base fields.

  9. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,227 @@
    +    $revision_data_table = $entity_type->getRevisionDataTable();
    +    $schema = $this->database->schema();
    +    $schema->dropPrimaryKey($revision_data_table);
    +    $schema->addPrimaryKey($revision_data_table, [$entity_type->getKey('revision'), $entity_type->getKey('langcode')]);
    

    Why are we updating the indexes only for the revision data table? The base, data and revision tables need the same treatment I think, no? :)

  10. +++ b/core/modules/shortcut/shortcut.install
    @@ -67,3 +67,19 @@ function shortcut_uninstall() {
    +function shortcut_update_8001() {
    ...
    +function shortcut_update_8002(&$sandbox) {
    

    Is there a reason for having two update functions?

  11. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,40 @@
    + * Class EntitySchemaUpdaterTest
    + * @package Drupal\system\Tests\Update
    + * @group entity
    

    Missing empty lines and a slightly more descriptive text for this test class :)

  12. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,40 @@
    +class RevisionableSchemaConverterTest extends UpdatePathTestBase {
    +  /**
    

    Missing empty line.

  13. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,40 @@
    +  public function testMakeRevisionable() {
    

    Missing docblock for this method.

  14. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,40 @@
    +    $this->assertTrue($post_last_installed->isRevisionable());
    ...
    +    $this->assertEqual(count($entities), 4, "Four Shortcut entities found");
    

    We should also test loading a revision of this entity type in order to be sure that the revision tables were created and populated properly.

timmillwood’s picture

Addressing items in #61:
1. Done
2. Done
3. Done
4. The table names are being fetched from the $entity_type, all we're hardcoding here is the property.
5. Not sure how to address this yet.
6. Done
7. Done
8. Where can we fetch them from?
9. Revision data table was the only one that caused an issue.
10. yes, @Berdir's suggestion, the first doesn't loop, the second does.
11. Done
12. Done
13. Done

jibran’s picture

Some more review points and some suggestion. Nice work @timmillwood.

  1. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,208 @@
    +   * @var \Drupal\Core\Entity\EntityLastInstalledSchemaRepositoryInterface
    ...
    +   * @var \Drupal\Core\Entity\EntityTypeManagerInterface
    ...
    +   * @var \Drupal\Core\Entity\EntityDefinitionUpdateManagerInterface
    ...
    +   * @var \Drupal\Core\Entity\EntityFieldManagerInterface
    ...
    +   * @var \Drupal\Core\Database\Connection
    ...
    +   * @param \Drupal\Core\Entity\EntityLastInstalledSchemaRepositoryInterface $last_installed_schema_repository
    +   * @param \Drupal\Core\Entity\EntityTypeManagerInterface $entity_type_manager
    +   * @param \Drupal\Core\Entity\EntityDefinitionUpdateManagerInterface $entity_definition_update_manager
    +   * @param \Drupal\Core\Entity\EntityFieldManagerInterface $entity_field_manager
    +   * @param \Drupal\Core\Database\Connection $database
    ...
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    ...
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    ...
    +   * @param \Drupal\Core\Entity\EntityTypeInterface $entity_type
    

    param description missing.

  2. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,50 @@
    + * Testing RevisionableSchemaConverter correctly updates Shortcut entity to be
    + * revisionable.
    ...
    +   * Runs the updates then test the entity type is revisionable and all
    +   * shortcut entities are accessible.
    

    This should be a single line.

  3. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,50 @@
    +   * @inheritDoc
    

    This is not correct.

  4. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,50 @@
    +    $entity_type = \Drupal::entityTypeManager()->getStorage('shortcut')->getEntityType();
    ...
    +    /** @var ContentEntityInterface $last_installed */
    +    $pre_last_installed = \Drupal::service('entity.last_installed_schema.repository')->getLastInstalledDefinition($entity_type->id());
    +    $this->assertFalse($pre_last_installed->isRevisionable());
    

    Not needed at all.

  5. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,50 @@
    +    $post_last_installed = \Drupal::service('entity.last_installed_schema.repository')->getLastInstalledDefinition($entity_type->id());
    ...
    +    $storage = \Drupal::entityTypeManager()->getStorage($entity_type->id());
    

    I think $this->container->get() is preferred instead of \Drupal::service() in tests.

  6. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,50 @@
    +    $entities = $storage->loadMultiple();
    

    We should create and save new revision of a shortcut here and assert some values.

jibran’s picture

Status: Needs work » Needs review
amateescu’s picture

Status: Needs review » Needs work

For #62.4 and .8: That's my point exactly, we shouldn't fetch them from $entity_type because that might change in the future but rather add them as method parameters and the let the update function pass them in.

#62.9: That doesn't mean all the other tables don't need this as well :)

#62.5: Copying over the code seems to be the only "good" solution here..

Berdir’s picture

For #62.4 and .8: That's my point exactly, we shouldn't fetch them from $entity_type because that might change in the future but rather add them as method parameters and the let the update function pass them in.

If you get the entity type definition from the entity definitions update manager, then no, it won't change. The problem will be that not all places (like the storage handler) will do that if you just go through that for some part of the code.

amateescu’s picture

+++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
@@ -0,0 +1,208 @@
+  protected function updateEntityType(EntityTypeInterface $entity_type) {
+    $last_entity_type = $this->lastInstalledSchemaRepository->getLastInstalledDefinition($entity_type->id());
+    $keys = $last_entity_type->getKeys();
+    $keys['revision'] = $entity_type->getKey('revision');
+    $last_entity_type->set('entity_keys', $keys);
+    $last_entity_type->set('revision_table', $entity_type->getRevisionTable());
+    $last_entity_type->set('revision_data_table', $entity_type->getRevisionDataTable());
+    $this->lastInstalledSchemaRepository->setLastInstalledDefinition($last_entity_type);
+  }

+++ b/core/modules/shortcut/shortcut.install
@@ -67,3 +67,19 @@ function shortcut_uninstall() {
+  $entity_type = \Drupal::entityTypeManager()->getStorage('shortcut')->getEntityType();
+  \Drupal::service('entity.revisionable_schema_converter')->convertSchema($entity_type);

@Berdir, that's true, but right now we're not getting the entity type correctly in the update function.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
6.4 KB
17.71 KB

Fixing items from #63 before I refactor all the things.

timmillwood’s picture

Working in the open.

Status: Needs review » Needs work

The last submitted patch, 69: upgrade_path_between-2721313-69.patch, failed testing.

jibran’s picture

Thanks for the quick feedback fixes @timmillwood.

+++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
@@ -16,35 +17,35 @@
+    $new_shortcut = Shortcut::create(['shortcut_set' => 'default']);
+    $new_shortcut->save();
+    $this->assertEqual(5, $new_shortcut->id(), "New shortcut created");

Sorry, I meant new revision of an existing shortcut.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
20.38 KB
10.94 KB

Progress update.

Status: Needs review » Needs work

The last submitted patch, 72: upgrade_path_between-2721313-72.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
20.65 KB
1.59 KB

Making some progress.

Status: Needs review » Needs work

The last submitted patch, 74: upgrade_path_between-2721313-74.patch, failed testing.

timmillwood’s picture

Added a change record - https://www.drupal.org/node/2774203

timmillwood’s picture

Status: Needs work » Needs review
FileSize
21.37 KB
8.02 KB

I think this direction would need a lot of sign off, but it works.

Status: Needs review » Needs work

The last submitted patch, 77: upgrade_path_between-2721313-77.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
21.83 KB
2.28 KB

Adding the exception back in.

Status: Needs review » Needs work

The last submitted patch, 79: upgrade_path_between-2721313-79.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
21.42 KB
3.82 KB

Should fix some of the tests, but not sure how to fix \Drupal\Tests\Core\Entity\Sql\SqlContentEntityStorageSchemaTest::setUpStorageDefinition.

Status: Needs review » Needs work

The last submitted patch, 81: upgrade_path_between-2721313-81.patch, failed testing.

timmillwood’s picture

FileSize
21.55 KB
3.14 KB

Converting the RevisionableSchemaConverter service into just a standard class and marking it as @internal.

Not fixed any of the two failing tests ;(

timmillwood’s picture

Status: Needs work » Needs review
FileSize
20.14 KB
5.16 KB

Looking at duplicating onEntityTypeCreate, calling it onEntityTypeCreateMissingTables.

Status: Needs review » Needs work

The last submitted patch, 84: upgrade_path_between-2721313-84.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
21.06 KB
1.24 KB

Looks like in #84 I accidentally removed the initial_from_field.

timmillwood’s picture

Updating #86 with a change of method name, and interface addition.

Starting to feel happy with this solution.

Status: Needs review » Needs work

The last submitted patch, 87: upgrade_path_between-2721313-87.patch, failed testing.

timmillwood’s picture

Status: Needs work » Needs review
FileSize
22.46 KB
7.11 KB

It helps when changing a method name to change it everywhere.

Version: 8.2.x-dev » 8.3.x-dev

Drupal 8.2.0-beta1 was released on August 3, 2016, which means new developments and disruptive changes should now be targeted against the 8.3.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

dawehner’s picture

There are a couple of bugs with the current patch in regards of configurable fields:

  • It uses the wrong column, when selecting data
  • Configurable fields require a delta column

See https://www.drupal.org/files/issues/interdiff_22477.txt for fixes around that.

dawehner’s picture

A couple of more bugs:

  • We always assume that the entity_Id was unsigned, so should be the revision ID. It throws an exception because, it potentially tries to insert a 11 byte value (signed) into a 10 byte column (10 bytes)
  • When copying the data, we need to choose a different algorithm as the current one:
    1. Does not take into account optional rows in (dedicated) tables. They don't return anything for some fields, so rows with higher entity IDs are merged in
    2. Does not take into account language when fetching the data
    3. Does not take into account multiple deltas for dedicated tables
    4. The revision user could be set based upon the owner of the entity
dawehner’s picture

Here is a patch which should fix all points from above.

dawehner’s picture

FileSize
6.13 KB

Some additional problems:

* *_field_revision table currently just gets one copy per language
* The data is not returned for multiple languages
* We cannot longer use upsert, as the keyspace needs to include at least also the langcode

Here is an interdiff, sorry I don't have more time right now, but it should be doable to apply this interdiff onto the patch, sorry.

dawehner’s picture

@amateescu
Do you mind commenting the stuff you worked on / found out regarding bugs over the last week? People are using this patch for actual sites, so continues progress is kinda helpful :)

amateescu’s picture

Assigned: timmillwood » amateescu
Status: Needs review » Needs work
Issue tags: -shortcut +Dublin2016

@dawehner, sure, here's the list of major bugs that I found in the current patch:

1.

+++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
@@ -0,0 +1,294 @@
+    /** @var \Drupal\Core\Field\FieldDefinitionInterface[] $field_definitions */
+    $field_definitions = $this->entityFieldManager->getBaseFieldDefinitions($entity_type_id);
+    foreach ($field_definitions as $field_definition) {
+      $this->entityDefinitionUpdateManager->updateFieldStorageDefinition($field_definition);
+    }

Updating all base fields unconditionally like this leads to complete data loss for every base field which uses a dedicated table (i.e. multi-valued base fields).

2.

+++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
@@ -0,0 +1,294 @@
+  protected function updateEntityType($entity_type_id, $options) {
+    $last_entity_type = $this->lastInstalledSchemaRepository->getLastInstalledDefinition($entity_type_id);
+    $keys = $last_entity_type->getKeys();
+    $keys['revision'] = $options['revision'];
+    $last_entity_type->set('entity_keys', $keys);
+    $last_entity_type->set('revision_table', $options['revision_table']);
+    $last_entity_type->set('revision_data_table', $options['revision_data_table']);
+    $this->lastInstalledSchemaRepository->setLastInstalledDefinition($last_entity_type);
+  }

Sidetracking (or not calling) \Drupal\Core\Entity\EntityDefinitionUpdateManagerInterface::updateEntityType() at all means that entity type events are not fired by \Drupal\Core\Entity\EntityTypeListener, which in turn means that, for example, \Drupal\views\EventSubscriber\ViewsEntitySchemaSubscriber does not have a chance to run so all the views for the entity type that's being updated will be broken because of the base (data) table changes.

3. An additional problem found by @Berdir is that we don't clean up the base table columns for fields that are now stored in the revision data table.

The second point is the most painful one, because it means that the data migration process needs to run within the boundaries of \Drupal\Core\Entity\EntityDefinitionUpdateManager::updateEntityType(), which poses a problem as I couldn't yet find a clean backwards-compatible way to pass over the $sandbox parameter from a hook_update_N() implementation.

I'm still working on updating the patch to fix all of the above, so I'm going to assign the issue to myself for now.

dawehner’s picture

@amateescu++

plach’s picture

Quick review, while I try to get familiar with this issue:

  1. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,294 @@
    +class RevisionableSchemaConverter implements RevisionableSchemaConverterInterface {
    

    Just a note: this implementation will be tied to hook_update_N() functions, so I guess we will have to "version" it if we need to perform changes in the future so we get a predictable/reproducable update behavior. That is we will have to provide a RevisionableSchemaConverterV2 (actual name tbd) possibly extending the previous one, if we need to perform any change to the inner logic after that the update functions using the previous version are officially released.

  2. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverter.php
    @@ -0,0 +1,294 @@
    +  protected function getTableData($table_name, array $conditions, $id_key) {
    

    Not sure whether this is a new policy, but there are a few PHP docs missing here and below.

  3. +++ b/core/lib/Drupal/Core/Entity/RevisionableSchemaConverterInterface.php
    @@ -0,0 +1,29 @@
    +interface RevisionableSchemaConverterInterface {
    

    This interface and its implementation look pretty SQL-specific, so they should live in the Drupal\Core\Entity\Sql namespace.

amateescu’s picture

Title: Upgrade path between revisionable / non-revisionable entities. » Upgrade path between revisionable / non-revisionable entities

In addition to #98, I discussed the issue with @plach and his idea is to rename all the existing tables, then call updateEntityType() in order to create the tables/schema from scratch and after that copy the data over from the original tables. I've already started experimenting with this idea.

jibran’s picture

+1 to #99. As per my experience with DER, it's better to create them from scratch then altering the existing one.

Berdir’s picture

I like that idea as well. It's kind of like the migrate-based updates dream that we have except doing it by hand.

We could keep the old backup tables around and start using some kind of naming pattern for them, although we need to be careful to keep it short, as we could accidently rename them to something that's too long. Maybe combined with a manual clean-up tasks that reports on admin/reports/status about those tables, that could also be a follow-up.

What would be neat is if we could create the new tables with a different prefix and then rename when the date is successfully migrated. Then we wouldn't leave the system in an entirely broken state if something goes wrong. But I guess that won't be easy..

plach’s picture

I thought a bit more about this and I'm concerned that the approach I discussed with @amateescu in #99 may lead to an update workflow that's not predictable/reproducable (this is similar to the first comment I had in #98). If we generate the schema then it will rely on whatever runtime version of the entity schema handler class is being executed. It would suck but I'm wondering whether we need to hardcode the schema definition for each entity type in the various update functions.

amateescu’s picture

#102 is correct in assuming that the approach discussed in #99 would make updates unpredictable, because of this code from \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::getEntitySchema() (which is called by onEntityTypeCreate() through onEntityTypeUpdate()):

    if (!isset($this->schema[$entity_type_id]) || $reset) {
      // Back up the storage definition and replace it with the passed one.
      // @todo Instead of switching the wrapped entity type, we should be able
      //   to instantiate a new table mapping for each entity type definition.
      //   See https://www.drupal.org/node/2274017.
      $actual_definition = $this->entityManager->getDefinition($entity_type_id);
      $this->storage->setEntityType($entity_type);

But that doesn't necessarily mean we have to hardcode the schema in the update functions, instead we should finally fix #2274017: Make SqlContentEntityStorage table mapping agnostic , which would allow \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::getEntitySchema() to return the schema for the passed in $entity_type definition, and not use the actual/current one available in code, right?

plach’s picture

Sorry, but wouldn't that be still tied to the "version" of the schema handler class?

amateescu’s picture

Right, we're tied to whatever SqlContentEntityStorageSchema decides to return at the time when the update function runs, but isn't that the whole point of dynamically generated entity schemas instead of hardcoding them in hook_schema() implementations? :)

catch’s picture

The trouble with #105 is if there are conflicting updates/contrib modules. If a contrib update is written on a core version where this patch has been applied, but then runs on a site where it hasn't (alongside a core update). If we hard-code the schema in the update, then whatever the state of the site you always get the same outcome.

It would be really great to not worry about this, but don't think we can stop yet.

amateescu’s picture

Status: Needs work » Needs review
FileSize
23.22 KB
254.81 KB
278.03 KB
301.41 KB

Ok, here we go. Since we can't rely on generating the schema automatically, the best we can do in this patch is to provide a helper for getting the schema that we want to upgrade to, and, of course, the data copying and entity definition update operations.

In order to get the hard-coded schema for an update function, this code can be used:

    $entity_type = \Drupal::entityDefinitionUpdateManager()->getEntityType('entity_test_to_rev_conversion');
    $keys = $entity_type->getKeys();
    $keys['revision'] = 'revision_id';
    $entity_type->set('entity_keys', $keys);
    $entity_type->set('revision_table', 'entity_test_to_rev_conversion_revision');
    $entity_type->set('revision_data_table', 'entity_test_to_rev_conversion_field_revision');

    $storage_definitions = \Drupal::service('entity_field.manager')->getFieldStorageDefinitions('entity_test_to_rev_conversion');
    $storage_definitions['test_single_property']->setRevisionable(TRUE);
    $storage_definitions['test_multiple_properties']->setRevisionable(TRUE);
    $storage_definitions['test_single_property_multiple_values']->setRevisionable(TRUE);
    $storage_definitions['test_multiple_properties_multiple_values']->setRevisionable(TRUE);

    $schema = RevisionableSchemaConverter::getFullEntitySchema($entity_type, $storage_definitions);
    drupal_set_message(RevisionableSchemaConverter::varExportShortArraySyntax($schema));

An interdiff to #93/#94 is provided but due to the extensive changes that I went through in the last two weeks, it's pretty much useless and it's just reverting the previous code :/

There's only thing that I still need to figure out: how to get the schema for configurable fields, because \Drupal\Core\Entity\EntityFieldManagerInterface::getFieldStorageDefinitions() only returns the storage definitions for base fields. But until then, this patch is ready for review.

amateescu’s picture

I just realized this morning that we cannot use a fully hard-coded schema definition in the update functions because that won't contain the schema for custom base fields added via hook_entity_base_field_info().

So we're back to square 1 I guess.. :(

catch’s picture

So the closest we can get to fixed schema looks like this after talking with amateescu in irc:

1. Use the fixed known schema from core

2. Invoke hook_entity_base_field_info() + _alter() on it to pick up contrib alterations.

3. Collect schema from configurable fields (they'll be the same before and after the specific update runs)

#2 can change over time and depending on which modules are installed, however:

- if the core is updated in a separate deployment from contrib, then the result is predictable

- if contrib adds a dependency on the core update so that they run first, then the result is predictable

- if neither of those happen, then the core update might pre-empt the contrib update and make the same change, this may or may not be OK.

- if we wanted to force it, we could add a new hook in-between hook_update_N() and hook_post_update_N() (I facetiously suggested hook_pre_post_update_N()) so that core's guaranteed to run last.

amateescu mentioned using migrate instead - that also gives us fixed start and finish points (or removes the need to care about them), and that seems worth looking at.

I don't think we can revert to a dynamic update and hope it works - no matter how difficult this is to get right, debugging real-life data-integrity issues and update conflicts on tens of thousands of sites is worse.

amateescu’s picture

I thought about this a lot during the weekend and I think I have another viable alternative that eliminates all the risks involved with the fixed + dynamic schema approach discussed in #109.

The main point is that the current design of the storage schema was written with the migrate use-case in mind, so all the previous patches are kinda working against it. So why not "go with the flow" and let the storage + schema handle it for us in a post_update hook by loading the unrevisionable entities and saving them as revisionable.

There are two ways in which this can accomplished:

1. leave the old tables in place and load the entities using the last installed entity type definition (which is unrevisionable at this point), inject the revisionable entity type and a "temporary" table mapping in the storage and save the entities, which will write the final (revisionable) data in temporary entity tables; at the end of the process, remove the old tables and put the temporary ones in place.

2. rename the old tables to temporary names, create the schema from scratch with the new (revisionable) entity type definition; load the old data by using the last installed entity definition and a table mapping that returns the temporary tables, save the data using the current (revisionable) entity type definition; at the end of the process, remove the temporary tables with the old data.

Approach #1 has one drawback: as soon as the annotation for an entity type is updated with a revision entity key (i.e. when the code is updated), the storage is not able to load anything anymore, so the site will be in a broken state until the post_update batch is done and the new tables are put into place.

This would have happened with all the previous patches in this issue as well!

Approach #2 is better in this regard because the new entity tables are created at the beginning of the process, so the site will not throw exceptions, but its drawback is that the content will initially "disappear" and it will be brought back gradually as the post_update batch runs.

Given that #1 (and all the previous patches from this issue) leaves the site in a state of "throw exceptions all over the place", I'm leaning towards approach #2 which gives us a less and less broken state while the batch is running.

amateescu’s picture

Here's a patch which implements the approach from #110.2. The entire logic and code from RevisionableSchemaConverter is now *much* simpler and easy to follow.

There are couple of @todo's left in the patch, but it's definitely ready for a good review :)

jibran’s picture

but it's definitely ready for a good review :)

Good review, I don't know but it is a review nonetheless. No one has reviewed it in last 10 days so here it goes. Mostly nits and docs suggestions. I agree with @amateescu. The entire logic and code from RevisionableSchemaConverter is now *much* simpler and easy to follow.. I attached the interdiff for minor stuff. Feel free to add that to the patch or ignore it. It is a good idea to not convert any core entity in this issue i.e. shortcuts. And thank you! it is looking quite good now.

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +/**
    + * Helpers for converting an entity type with existing data to be revisionable.
    + */
    +class RevisionableSchemaConverter {
    

    I think we should document whole #110.2 here.

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +    // @todo Put the site in maintenance mode at the beginning of this process.
    

    Do we want to fix it here or in the follow-up?

  3. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +      $sandbox['temporary_table_mapping'] = $this->getTemporaryTableMapping($temporary_entity_type, $original_storage_definitions);
    

    I'm confused by reading this why can't we wrap this into

          $storage->setEntityType($temporary_entity_type);
    
          /** @var \Drupal\Core\Entity\Sql\DefaultTableMapping $table_mapping */
          $table_mapping = $storage->getTableMapping($original_storage_definitions);

    ? A comment to state the reason will help a lot to understand this better.

  4. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +      // @todo Remove the temporary tables.
    

    Oh! yeah, let's do it now.

  5. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +    $base_table = $temporary_entity_type->getBaseTable();
    

    Let's rename $base_table to $temporary_base_table for better understanding.

  6. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +    // Now inject the updated entity type definition in the storage and re-save
    +    // the entities. This will also reset the internal table mapping.
    +    $storage->setEntityType($entity_type);
    

    Very helpful comment. Thank you!

  7. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,332 @@
    +      // Manually set the original entity since it cannot be read from the
    +      // temporary storage anymore.
    +      $entity->original = $entity;
    

    Nice!

  8. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
    @@ -267,6 +267,18 @@ public function setEntityType(EntityTypeInterface $entity_type) {
    +    $this->tableMapping = $table_mapping;
    

    Why are we not calling initTableLayout() here?

  9. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -121,6 +121,52 @@ protected function installedStorageSchema() {
    +  public function getFullEntitySchema(ContentEntityTypeInterface $entity_type, array $storage_definitions) {
    ...
    +  public function updateFieldSchemaData(array $storage_definitions) {
    

    Should we move these to the interface?

  10. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -121,6 +121,52 @@ protected function installedStorageSchema() {
    +    $schema = $this->getEntitySchema($entity_type, TRUE);
    

    We should update the docs of getEntitySchema() to mention the difference about both and why can't we update the getEntitySchema() to accommodate this with additional default parameter?

  11. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1138,6 +1190,12 @@ protected function createSharedTableSchema(FieldStorageDefinitionInterface $stor
    +              if($created_field_name == $this->entityType->getKey('revision')) {
    

    Space missing after if.

  12. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1185,10 +1243,14 @@ protected function deleteDedicatedTableSchema(FieldStorageDefinitionInterface $s
           $revision_name = $table_mapping->getDedicatedRevisionTableName($storage_definition, $deleted);
    

    Let's rename to $revision_name to $revision_table_name.

  13. +++ b/core/modules/system/tests/fixtures/update/drupal-8.2.0-filled.standard.entity_test_to_rev_conversion.php.gz
    --- /dev/null
    +++ b/core/modules/system/tests/fixtures/update/drupal-8.entity-test-to-rev-conversion-enabled.php
    

    I'm going to copy this and create a revert schema Drush command out of it.

  14. +++ b/core/modules/system/tests/modules/entity_test_revisionable_schema_converter/entity_test_revisionable_schema_converter.module
    @@ -0,0 +1,6 @@
    +<?php
    +
    +/**
    + * @file
    + * Provides an entity type for testing the revisionable schema converter.
    + */
    

    This file can be removed.

  15. +++ b/core/modules/system/tests/modules/entity_test_revisionable_schema_converter/entity_test_revisionable_schema_converter.post_update.php
    @@ -0,0 +1,32 @@
    +    ['test_single_property', 'test_multiple_properties', 'test_single_property_multiple_values', 'test_multiple_properties_multiple_values']);
    

    Let's convert to multi-line array.

  16. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,78 @@
    +      $translation = $revision->getTranslation('ro');
    

    Nice @amateescu++. :D

amateescu’s picture

Assigned: amateescu » Unassigned
FileSize
268.34 KB
469.78 KB

Ok, here's the 'final' version of the patch, with no more @todos and complete test coverage.

I've updated the test with a base field added through hook_entity_base_field_info() and also a configurable field.

Re #112:

1: Documented what we're doing in the doc block of \Drupal\Core\Entity\Sql\RevisionableSchemaConverter::convertToRevisionable().
2: That's actually already done by \Drupal\system\Controller\DbUpdateController::triggerBatch(), removed the @todo.
3: Didn't understand that comment.
4: Done.
5: 11: 12: 14: 15: Already done in your interdiff.
8: Because we don't need to?
9: 10: Nope, they were leftovers from a previous approach, now removed from the patch.

amateescu’s picture

And here's a version of the patch without the database dump, just to make reviews a bit easier :)

jibran’s picture

Thanks for addressing the feedback. Let me eloborate point #3. Other then that this ready for me.

+++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
@@ -0,0 +1,338 @@
+      $temporary_entity_type = clone $original_entity_type;
+      $temporary_entity_type->set('base_table', TemporaryTableMapping::getTempTableName($original_entity_type->getBaseTable()));
+      if ($temporary_entity_type->isTranslatable()) {
+        $temporary_entity_type->set('data_table', TemporaryTableMapping::getTempTableName($original_entity_type->getDataTable()));
...
+      $sandbox['temporary_table_mapping'] = $this->getTemporaryTableMapping($temporary_entity_type, $original_storage_definitions);

At this point, we already have changed the tables. Why can't we use storage table mapping instead of replicating the code?

amateescu’s picture

Because the storage creates a DefaultTableMapping class, and we need a TemporaryTableMapping which has a 'special' generateFieldTableName() method.

jibran’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Needs subsystem maintainer review, +Needs framework manager review

Got it, thanks for the explanation. As I said in #115 patch is ready. It is pretty self-explanatory. The documentation is there and makes a lot of sense. Update path test is verifying all the possible scenarios. As per #95 people are using this functionality so I think we should add a change notice for this as well.
The only thing I can think of is sub-system maintainer review i.e. entity system maintainers and framework manager review i.e. @effulgentsia or @alexpott. So, tagging the issue and moving it to RTBC to get more eyes on this.

catch’s picture

Status: Reviewed & tested by the community » Needs review
  1. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,338 @@
    +   * The last installed schemea repository service.
    

    schemea.

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,338 @@
    +   *   - load the old data by using the original last install entity definition
    

    last installed

  3. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,338 @@
    +
    +    // Get the next 10 entity IDs to migrate.
    +    $entity_ids = $this->database->select($temporary_base_table)
    +      ->fields($temporary_base_table, [$id])
    +      ->condition($id, $sandbox['current_id'], '>')
    +      ->range(0, 10)
    +      ->execute()
    +      ->fetchAllKeyed(0, 0);
    +
    

    Shouldn't this order by ID ASC to make it explicit?

    Also is 10 all we can manage here? Would think 100 would still be safe for memory limits while saving a lot of batch overhead. And/or we could allow the batch size to be set in $settings too.

  4. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,338 @@
    +      $storage->save($entity);
    

    There's no error handling here. Should we be trying to catch an exception and replace the old storage back where it was? If we could use transactions that'd be great, but of course impossible with batch/UI updates.

    The other thing going on here is that during the update any requests to the site are going to be missing entities, maintenance mode possibly makes this OK, but could also mean sites in maintenance mode for hours.

    There's one possible way to prevent that issue, although I have a feeling we discussed it in irc and it wasn't possible for some reason, but would be good to confirm here:

    1. Create the new entity schema with temporary table names
    2. Load entities from the current schema, save them into the new one
    3. Rename the old entity tables to temp names, rename the new ones to the proper names
    4. Drop the old names and update the schema definition

    That would at least in theory allow the site to operate (if only read only), and the advantage is that if it fails somewhere, you'd still have a working site just with some redundant temporary tables up until the final switch around.

amateescu’s picture

Fixed #118.1, 2 and 3.

For #118.4, it's true that we should have a fallback mechanism in case an exception is thrown at any time during the batch requests. I'll implement it shortly :)

The other thing going on here is that during the update any requests to the site are going to be missing entities, maintenance mode possibly makes this OK, but could also mean sites in maintenance mode for hours.

There's one possible way to prevent that issue, although I have a feeling we discussed it in irc and it wasn't possible for some reason, but would be good to confirm here:

That was not only an IRC discussion, it's also documented in #110 (5th paragraph) why I chose to implement it "the other way around" :)

catch’s picture

OK I knew it was somewhere, #110 is the one.

I think we can get around the 'instant change' issue by using hook_entity_type_alter().

i.e.

- add a new, (possibly hidden?) module, which alters the entity definition
- make the module required so it's always on for new sites
- when we update, do the full data migration, and enable the module at the end.

This way we explicitly change the entity definition only when the data is already migrated. If the migration fails, then you have some tables to delete, but your site still works otherwise.

Alternatively put the alter in an existing module as use a state flag in the update to change it, but that seems a bit more fragile.

amateescu’s picture

So #120 is one possibility which would allow the site to 'function' (in maintenance mode at least) until the update process is done, but maybe we can also try to do #2554235: Make the Entity Manager return the last installed definitions instead of the ones living in code, or just a small part of it and use the last installed definitions in the storage handler.

This is a lot of food for thought so I'm going to wait on some feedback from the entity (storage) maintainers before working in either direction :)

catch’s picture

#121 also sounds like a plan, so yeah let's get more opinions.

My main concern here isn't so much leaving the site in working state during the update (although that's nice too), but ensuring the smallest possible window to leave the site in an unrecoverable state if something goes wrong. This is a non-optional upgrade path that people could end up doing as part of an update to a security release, so if we can make it so that a failure (or the most likely failure of a bug in a TaxonomyTerm::save() somewhere, I've seen network requests inside entity saving before) leaves the site working with cruft and a pending update, that's better than fatals due to schema mis-matches and having to restore a backup and insecure code base.

amateescu’s picture

Since there are no other opinions in over a month, I've rewritten this patch for the third time to fully account for #118.4, which means that error handling is now as robust as it can get, and also fully tested :)

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

dawehner’s picture

Recreating the tables and then rename tables seems to be a really good approach here!

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +      $this->entityTypeManager->useCaches(FALSE);
    

    It would be great to document why this is needed.

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +    if ($sandbox['#finished'] === 1) {
    

    I'm a bit nervous about this line (strict comparison). 100/100 is indeed 1, but note: 1.0 !== 1

  3. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +      foreach ($sandbox['original_table_mapping']->getTableNames() as $table_name) {
    +        $this->database->schema()->dropTable($table_name);
    +      }
    

    Isn't it a bit dangerous to drop the old tables first and then get in the new ones? What about renaming the old ones so we could recover somehow and then at the end of the process drop them? So I'm wondering whether a try/catch around the next lines and then a rename in case of an exception could reduce some problems during it.

  4. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +        $this->database->schema()->renameTable($temp_table_name, $new_table_name);
    

    I was wondering whether database table operations can actually be wrapped inside a transaction, but yeah https://stackoverflow.com/questions/4692690/is-it-possible-to-roll-back-... says no, at least for mysql.

  5. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +    // Get the next 10 entity IDs to migrate.
    ...
    +      ->range(0, 100)
    

    This does no longer match.

  6. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +
    +      // Treat the entity as new in order to make the storage do an INSERT
    +      // rather than an UPDATE.
    +      $entity->enforceIsNew(TRUE);
    +
    

    For some better test coverage, can we check that the IDs stay the same? Otherwise there is no guarantee that existing entity reference swill continue to work.

  7. +++ b/core/lib/Drupal/Core/Entity/Sql/RevisionableSchemaConverter.php
    @@ -0,0 +1,376 @@
    +      try {
    +        // Finally, save the entity in the temporary storage.
    +        $storage->save($entity);
    +      }
    +      catch (\Exception $e) {
    

    I am wondering whether the scope of the try statement should be increased, like the entire copy method.

  8. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php
    @@ -279,8 +311,10 @@ public function getTableMapping(array $storage_definitions = NULL) {
    +      $table_mapping_class = !$this->isTemporary ? DefaultTableMapping::class : TemporaryTableMapping::class;
    

    Nitpick: For slightly better readability I would have negated the condition.

  9. +++ b/core/modules/system/tests/fixtures/update/drupal-8.entity-test-to-rev-conversion-enabled.php
    @@ -0,0 +1,30 @@
    +if ($key = array_search('entity_test_revisionable_schema_converter_post_update_make_revisionable', $existing_updates)) {
    +  unset($existing_updates[$key]);
    +}
    

    Why does the dump contains this update function in the first place?

  10. +++ b/core/tests/Drupal/Tests/Core/Entity/RevisionableSchemaConverterTest.php
    @@ -0,0 +1,335 @@
    +    for ($i = 1; $i <= 101; $i++) {
    

    We should check IDs as well, as written above.
    For sanity reason I would also save the entities to ensure this works as well.

amateescu’s picture

@dawehner, thanks for reviewing! :)

Re #125:
1. That's because we want to be sure we always get the latest entity definition from code. Added a comment.
2. Ok, let's use '==' then :)
3. Done, we rename the initial tables, use a try/catch for the whole code block below and bring back the original tables if needed.
4. Bummer :/
5. FIxed that and also added a setting for the update step size.
6. 10. Added assertions for entity IDs as well.
7. I think it's nice to be able to be tell which entity failed exactly, so I just expanded the scope of the try/catch to the scope of the foreach statement.
8. Done :)
9. Because I installed the module with the post update function in place, but there's no real reason for it to be there. I removed it and updated the dump. Note that that dump change is not in the interdiff because it would be useless.