Problem/Motivation

Found in #2542132: Drupal\field\Tests\Number\NumberFieldTest fails on SQLite. As part of running update.php we currently run automatic entity updates as the last step. This was added to update.php by #2333113: Add an EntityDefinitionUpdateManager so that entity handlers can respond (e.g., by updating db schema) to code updates in a controlled way (e.g., from update.php) and moved to be the last update in #2535082: Allow hook_update_N() implementations to run before the automated entity updates. However we have a problem that many automatic entity schema updates only works for operations that don't change data. In that case the current behavior is to throw an exception (some silent failures were fixed in #2544954: SqlContentEntityStorageSchema does not detect column-level schema changes in non-base-fields).

The problem is that our update process with hook_update_N() expects atomic operations with a known outcome whereas the automatic entity updater takes us from the current database state to the new state as defined by code. In one go, and the tools added by #2535082: Allow hook_update_N() implementations to run before the automated entity updates to operate on the field and entity level don't resolve this. This means that if entity updates and hook_update_N()'s conflict there is no way to resolve this with hook_update_dependencies(). Further, since entity schema update can not handle data updates they are not very useful for real live sites with content, because in that cse you need to manually migrate data in an update function or run a migration.

Proposed resolution

  1. Stop automatic entity updates from running during update.php. This means the modules that change entity types need to provide a hook_update_N(). This will also resolve the dependency issues discussed in #2535082: Allow hook_update_N() implementations to run before the automated entity updates.
  2. Decide if we need to postpone this on #2346013: Improve DX of manually applying entity/field storage definition updates, since applying an update in hook_update_N() should be simple - see below for how this should work.
  3. Remove \Drupal::service('entity.definition_update_manager')->applyUpdates(); from install_install_profile().
  4. Ensure that schema additions (indexes and fields) are automatic on module install.
  5. Update documentation to hook_update_N() to show how to run entity schema updates.
  6. File PR to add command to Drush to run all available updates (see https://github.com/drush-ops/drush/pull/1521).
  7. For each hook_update_N() that changes field definitions:
    1. Get the current field definitions (matching the database) from state.
    2. The hook_update_N() takes those definitions and applies the specific changes to it - same as it would in the alter. (This is conceptually similar to copying a field definition array to use as the argument to db_change_field() in a hook_update_N() for 7.x).
    3. At this point, the definitions match the code as if only this module was updated in isolation to this specific hook_update_N() point.
    4. Run the auto-updater to resolve the differences between the original and modified definitions.
    5. Save the modified definition back to state.
    6. The next update gets the new state entry, which should once again match what's in the database in reality, and we run through steps 1-5 again.

This will result in a better UX, since users will have to fiddle less with the Update UI and the DX regression will be vastly compensated by the availability of dedicated drush commands allowing to apply updates in bulk as previously done in the UI.

Remaining tasks

  • Review
  • Update change records
  • Commit

User interface changes

Introduce a separate status report item if there are entity updates, being displayed only when no update functions are pending anymore:

API changes

  • Removed update_entity_definitions(&$context).
  • DbUpdateController::__construct() does not take an $entity_definition_update_manager anymore (was second to last argument). New signature is __construct($root, KeyValueExpirableFactoryInterface $key_value_expirable_factory, CacheBackendInterface $cache, StateInterface $state, ModuleHandlerInterface $module_handler, AccountInterface $account, BareHtmlPageRendererInterface $bare_html_page_renderer)

Data model changes

No.

Docs changes

We need to delete the entire “Content entity and field changes” section from https://www.drupal.org/node/2535316, and replace with:

<h3 id="overview-content">Content entity and field changes</h3>

Changes to your content entity and field changes include:
<ul>
<li>Adding, removing, or renaming a field.</li>
<li>Modifying a field property, such as an index name or required value.</li>
<li>Modifying an entity definition, such as if it is revisionable.</li>
<li>Etc.</li>
</ul>

For example, if you had a product entity type without a price base field, and then later added that field to a later version of your code, you will need an update function in order for sites built on the older code to get the new field.

———-

Then this page https://www.drupal.org/node/2535476 needs:

All references to automated updates need to be removed.
Change the existing example to an “Advanced Example”
"General notes" should be updated with the text from the issue summary:


For each hook_update_N() that changes field definitions:

<ol>
<li>From entity manager, get the most recently installed field definitions (matching the current database).</li>
<li>The hook_update_N() takes those definitions and applies the specific changes to it.</li>
<li>At this point, the definitions match the code as if only this change was updated in isolation to this specific hook_update_N() point.</li>
<li>Run the auto-updater (e.g. < code >$entity_manager->onFieldStorageDefinitionCreate()</ code >) to resolve the differences between the original and modified definitions. This will update the database to the new definition.</li>
<li>The next update gets the new state entry, which should once again match what's in the database in reality, and we run through steps 1-4 again.</li>
</ol>

And add a much simpler example:

<h2>Example: Add new field to entity</h2>

<?php
function example_update_8001() {
  // Obtain the most recent field definitions for node.
  $entity_manager = \Drupal::entityManager();
  $field_storage_definitions = $entity_manager->getLastInstalledFieldStorageDefinitions('node');

  // Check if the field is already there somehow from an earlier attempt.
  if (!isset($field_storage_definitions[‘new_field'])) {
    // Define the new field. Copy/paste from the entity class’s BaseFieldDefinition.
    // DO NOT call it directly, because the schema may have other changes.
    $field_storage_definition = BaseFieldDefinition::create(‘boolean')
      ->setLabel(t(‘New field’))
      ->setDescription(t(‘New checkbox field.'))
      ->setReadOnly(TRUE)
      ->setRevisionable(TRUE)
      ->setTranslatable(TRUE)

      // Specify these values by hand because EntityManager::buildBaseFieldDefinitions()
      // is bypassed when we create/update a single field at a time.
      ->setProvider('node')
      ->setName('revision_translation_affected')
      ->setTargetEntityTypeId(‘node')
      ->setTargetBundle(NULL);

    // Now that we have a field definition in a known state, let the Entity Manager know about it. It will then delegate to all the necessary handlers to do things like create the database tables and whatever else is needed.
   $entity_manager->onFieldStorageDefinitionCreate($field_storage_definition);
  }
}
?>

CommentFileSizeAuthor
#180 entity-rm_auto_updates-2542748-180.patch12.25 KBplach
#171 automatic_entity-2542748-171.patch81.5 KBmpdonadio
#161 entity-rm_auto_updates-2542748-161.patch81.68 KBplach
#161 entity-rm_auto_updates-2542748-161.interdiff.txt11.97 KBplach
#155 entity-rm_auto_updates-2542748-155.patch75.2 KBplach
#155 entity-rm_auto_updates-2542748-155.interdiff.txt29.65 KBplach
#145 entity-rm_auto_updates-2542748-145.patch66.98 KBplach
#145 entity-rm_auto_updates-2542748-145.interdiff.txt24.39 KBplach
#144 interdiff.txt12.33 KBeffulgentsia
#144 2542748-144.patch60.61 KBeffulgentsia
#143 interdiff.txt863 byteswebchick
#143 2542748-143.patch59.44 KBwebchick
#138 Status report | My site 2015-08-20 15-00-59.png130.38 KBgábor hojtsy
#137 interdiff.txt834 bytesgábor hojtsy
#137 2542748-137.patch58.59 KBgábor hojtsy
#133 2542748-133.patch58.72 KBgábor hojtsy
#123 entity-rm_auto_updates-2542748-123.patch58.87 KBplach
#123 entity-rm_auto_updates-2542748-123.interdiff.txt7.23 KBplach
#122 entity-rm_auto_updates-2542748-122.patch57.08 KBplach
#122 entity-rm_auto_updates-2542748-122.interdiff.txt5.83 KBplach
#120 2542748-2551341-combo.patch869.31 KBwebchick
#117 Status report | drupal 8.0.x 2015-08-19 11-03-42.png178.43 KBgábor hojtsy
#113 entity-rm_auto_updates-2542748-113.patch56.64 KBplach
#113 entity-rm_auto_updates-2542748-113.interdiff.txt15.2 KBplach
#108 entity-rm_auto_updates-2542748-108.patch51.39 KBplach
#108 entity-rm_auto_updates-2542748-108.interdiff.txt7.61 KBplach
#106 entity-rm_auto_updates-2542748-106.patch55.67 KBplach
#106 entity-rm_auto_updates-2542748-106.interdiff.txt10.82 KBplach
#105 entity-rm_auto_updates-2542748-105.interdiff.txt11.22 KBplach
#105 entity-rm_auto_updates-2542748-105.patch50.68 KBplach
#102 entity-rm_auto_updates-2542748-102.patch45.91 KBplach
#102 entity-rm_auto_updates-2542748-102.interdiff.txt10.59 KBplach
#95 entity-rm_auto_updates-2542748-95.interdiff.txt1.66 KBplach
#95 entity-rm_auto_updates-2542748-95.patch35.32 KBplach
#87 Status_report___Drupal_8_x_-_Dev.png262.07 KBplach
#86 entity-rm_auto_updates-2542748-86.patch35.26 KBplach
#86 entity-rm_auto_updates-2542748-86.interdiff.txt4.18 KBplach
#85 entity-rm_auto_updates-2542748-85.patch33.73 KBplach
#85 entity-rm_auto_updates-2542748-85.interdiff.txt6.52 KBplach
#80 entity-rm_auto_updates-2542748-80.interdiff.txt4.07 KBplach
#80 entity-rm_auto_updates-2542748-80.patch28 KBplach
#58 interdiff.txt2.71 KBeffulgentsia
#58 auto-updates-2542748-58.patch23.93 KBeffulgentsia
#43 interdiff.txt18.23 KBeffulgentsia
#43 auto-updates-2542748-43.patch22.78 KBeffulgentsia
#37 interdiff.txt5.71 KBeffulgentsia
#37 auto-updates-2542748-37.patch12.07 KBeffulgentsia
#35 interdiff.txt4.78 KBeffulgentsia
#35 auto-updates-2542748-35.patch5.72 KBeffulgentsia
#33 auto-updates-2542748-33.patch11.24 KBjhedstrom
#33 interdiff.txt455 bytesjhedstrom
#32 auto-updates-2542748-32.patch10.99 KBjhedstrom
#32 interdiff.txt949 bytesjhedstrom
#29 auto-updates-2542748-29.patch10.06 KBjhedstrom
#29 interdiff.txt3.37 KBjhedstrom
#16 2542748-16.patch668 bytesjhedstrom
#13 Screen Shot 2015-07-30 at 10.13.13 AM.png96.65 KBjhedstrom
#4 2542748.3.patch6.69 KBalexpott

Comments

alexpott’s picture

Discussed in IRC with @plach and @catch. The outcome is something like this:

  • Automatic entity updates should not run from update.php - the uncertainty wrt to what happens if a field has data and the issue with update dependencies means that this is just too hard
  • It should be easy to call something to add field or index in hook_update_N() see #2346013: Improve DX of manually applying entity/field storage definition updates
  • On module install additions should be automatic
  • The catch all update in the installer should be removed
  • drush can have a command to apply updates to make developing nice and easy
alexpott’s picture

@catch, @plach and I feel this is critical because this is a significant change to updates and the current behaviour is broken.

catch’s picture

Title: Automatic entity updates can appear successful but actually they are not. » Automatic entity updates are not safe to run on update.php by default

Re-titling

alexpott’s picture

Status: Active » Needs review
StatusFileSize
new6.69 KB

Here's a patch that breaks everything.

plach’s picture

Here's a patch that breaks everything.

lol, good strategy ;)

Status: Needs review » Needs work

The last submitted patch, 4: 2542748.3.patch, failed testing.

jhedstrom’s picture

If step 3 is omitted the schema update is successful.

Step 3 adds content, and once there's content, the automatic updates refuse to run--isn't that by design? My understanding is that the module, or the patch mentioned in the IS, would need to provide an update hook to address this before automatic entity updates are successful.

alexpott’s picture

@jhedstrom exactly so what's the point of automated thing if once there is data you have to provide a hook_update_N()? Since contrib and core with always have to assume there is data and write a hook_update_N().

catch’s picture

Crossposted with alexpott saying more or less the same thing..

Step 3 adds content, and once there's content, the automatic updates refuse to run--isn't that by design? My understanding is that the module, or the patch mentioned in the IS, would need to provide an update hook to address this before automatic entity updates are successful.

For contrib modules and core, there is no way to know whether the site the update will run on has content or not. So the answer is always to provide an update hook with a migration.

For custom modules or alpha where there is no known install base with content, the update hook could probably trigger the automated entity update itself.

It makes sense to me to make either of these an explicit choice though.

jhedstrom’s picture

so what's the point of automated thing if once there is data you have to provide a hook_update_N()

Well, the ones that still *can* run automatically with data (column additions, adding indexes, others?) are still valuable. However, I see the point--once there are ones that fail to run due to content, no further ones will run (see #2530214: Implement more granular exception handling in update entity definitions processing). Thus, at that juncture, an update hook is required to unblock the ones that will succeed.

alexpott’s picture

@jhedstrom exactly and then what happens if the net update is a security update?

catch’s picture

For the updates that fail here in the test coverage, we know they've already run on every site that has got to beta13, so we don't need to retrospectively go back and add manual updates for them. This is true if we require that people only update sequentially between beta releases though - no beta-9 to beta-13. We will need to update the upgrade path test dumps to be based on a later hash though.

However, we have no such restrictions available with how this interacts with contrib module updates:

Contrib changes to field/entity schema between module versions will have been relying on this behaviour.

So we have

8.0.0-beta13 (auto-updates and silently fails when it can't)

8.0.0-beta14 (does not auto-update at all)

contrib-8.x-beta1 (initial entity schema)

contrib-8.x-beta2 (changes schema).

When a site updates from contrib-8.x-beta1 to contrib-8.x-beta2 using 8.0.0-beta13, then as long as they have no content in the field, the schema update happens.

Then a different site updates to 8.0.0-beta14 first, and goes from contrib-8.x-beta1 to contrib-8.x-beta2 - the schema update will not happen, because the module hasn't made any changes. It will only happen if they install contrib-8.x-beta3 that adds the update hook.

We could possibly split this issue into two - one that documents the requirement to add update hooks with the requisite change notice, one that removes the auto-updating from update.php - that gives contrib a chance to catch up and put out new releases if they need to.

I don't know how much this is an issue with the contrib modules out there at the moment.

jhedstrom’s picture

Issue summary: View changes
StatusFileSize
new96.65 KB

I had missed the bit about this being a silent failure. I can reproduce this using the steps in the IS. However, I cannot reproduce that if I change a base field (say in the User entity, change langcode to an int after install). So more than anything, there appears to be a bug in the apply updates, and I would advocate for fixing that rather than remove *all* automatic updates.

jhedstrom’s picture

This logic in SqlContentEntityStorageSchema::updateDedicatedTableSchema():

      if ($storage_definition->getColumns() != $original->getColumns()) {
        throw new FieldStorageDefinitionUpdateForbiddenException("The SQL storage cannot change the schema for an existing field with data.");
      }

isn't detecting the change the way the shared table example (user entity base field change) does. The really odd thing is $original has the updated schema when this check is run.

jhedstrom’s picture

Confirmed that $field_definition->schema is *not* being stored for non-base fields. So when the check in #14 is performed, $original->schema is recalculated in FieldStorageConfig::getSchema():

  public function getSchema() {
    if (!isset($this->schema)) {
      // Get the schema from the field item class.
      $class = $this->getFieldItemClass();
      $schema = $class::schema($this);
jhedstrom’s picture

Status: Needs work » Needs review
StatusFileSize
new668 bytes

Well, that explains why it wasn't stored :(

jhedstrom’s picture

I forgot to mention, with that patch, the steps in the IS properly throw the exception during applyUpdates().

jhedstrom’s picture

If we need an update hook to eventually resolve this missing data, we have the information in the key_value table in the entity.storage_schema.sql collection.

jhedstrom’s picture

An alternative to storing/serializing $schema (since it was removed for performance reasons in #2313053: Field storage and instance call toArray() in __wakeup() is very slow, remove it?) would be to use the logic that is used in SqlContentEntityStorageSchema::requiresFieldStorageSchemaChanges:

    if ($table_mapping->requiresDedicatedTableStorage($storage_definition)) {
      return $this->getDedicatedTableSchema($storage_definition) != $this->loadFieldSchemaData($original);
    }

this is how the update process is detecting that an update is needed in the first place.

plach’s picture

So more than anything, there appears to be a bug in the apply updates, and I would advocate for fixing that rather than remove *all* automatic updates.

Personally I think we need to do both. Entity updates are in no way db updates and should not be handled the same way or should not be assumed to be able to behave the same way. They are a generalization of the schema handling logic we used to have in the D7 Field API, which is triggered via the Field UI, which in turn refuses to proceed when changes would require a data migration.

When working on the initial patches, I proposed a separate UI for entity updates, then I was convinced that exploiting the same UI we have for DB updates was a good idea, because they perform similar operations from a user POV. However they are not the same and they were never meant to be the same. The fact that we are trying to make entity updates work with HEAD 2 HEAD and support an upgrade path simply terrifies me, because the system was not designed with these use cases in mind.

The initial stated goals were quite simple: support schema updates if you have no data, i.e. when setting up your data model, or only additive updates. Otherwise you need a migration.

I think the proposed solution is sane because it brings back predictability to the db update system, since applying entity updates in bulk removes every possibility to transition from a know state to another known state, which is basically the only way to have a working db update system.

I think H2H as any other contrib module should just perform manual db updates as usual. If the Entity Storage gets in the way, we simply need a mechanism to bypass it. Trying to force it to do something it was not designed to do is not going to work, I'm afraid.

jhedstrom’s picture

The fact that we are trying to make entity updates work with HEAD 2 HEAD and support an upgrade path simply terrifies me

It terrified me initially too. I'm conflicted on this, since once I understood update hooks would need to match the entity storage perfectly, these automatic update served as a built-in testing framework for ensuring updates were complete. If we remove the automated system, site owners and module developers may continue on blissfully unaware of drifting out of sync with their code base and actual storage.

I don't like to use head 2 head as an example, since it exists solely for the fact that it is unsupported, but I don't think the issues encountered therein are necessarily unique.

I think we should probably decouple the automated system as suggested, but not hide it too deeply.

xjm’s picture

plach’s picture

If we remove the automated system, site owners and module developers may continue on blissfully unaware of drifting out of sync with their code base and actual storage.

Yeah, that's a good point. However it's not much more different from the regular case when you add a column to a table via db updates: you need to replicate the schema both in hook_schema() and hook_update_N() and you need to ensure both are in-sync. I see that entity updates are problematic in this case because you don't have a schema definition to copy around, so the chances to get something wrong are higher.

So, I guess probably #2533282: Make schema methods' visibility public in SqlContentEntityStorage could alleviate this problem.

catch’s picture

Discussed this with alexpott and on the EuroCriticals call this morning, here's a proposed resolution:

We need to remove the automatic running of entity updates from update.php.

However, in many cases individual updates will work using that API, and we don't want to lose the entity storage agnosticism, nor require people to write manual database manipulations for things that can be handled via the entity storage API.

This means the ideal situation is that we use the auto-updating API, but it has to be triggered for specific changes from hook_update_N() - that should be both predictable and the least amount of work for module developers.

alexpott pointed out than when doing that, we need to resolve the following case:

Module 1 adds an index to a field
Module 2 adds a different index to a different field
Both modules add a hook_update_N() which triggers entity auto-update on the field.

If we only allow people to specify a field in hook_update_N(), not a specific change to that field, then the following can happen:

When a site updates both module releases at the same time, the first hook_update_N() wins, and updates are still not atomic. Sometimes this might still succeed, but it's easy to run into conflicts.

However I think there's a way to deal with this that should work:

For each hook_update_N() that changes field definitions:

1. Get the current field definitions (matching the database) from state.
2. The hook_update_N() takes those definitions and applies the specific changes to it - same as it would in the alter. (This is conceptually similar to copying a field definition array to use as the argument to db_change_field() in a hook_update_N() for 7.x).
3. At this point, the definitions match the code as if only this module was updated in isolation to this specific hook_update_N() point.
4. Run the auto-updater to resolve the differences between the original and modified definitions.
5. Save the modified definition back to state.
6. The next update gets the new state entry, which should once again match what's in the database in reality, and we run through steps 1-5 again.

This means:
- the state entry always matches what should be in the database after each individual update handler runs so it's always possible to do the comparison of exactly what's before and after.
- we never execute any code to figure out field definitions during the update handlers - we just use the information in state vs state + update handler modifications
- all changes are made atomically - so that two hook_update_N() in the same module in the same field can run sequentially either in the same update run or weeks apart, and end up with the same schema at the end having gone through the same sequential states.

This doesn't prevent people trying to make changes that can't be done or conflicts between different modules, but that's something we just have to throw exceptions for and people will have to resolve.

On the call, alexpott also mentioned that the change for example from float to double (or an even simpler one like increasing the length of a varchar) is something that while currently we'd block it if there's any data in the field, would actually complete successfully. So as well as state + additions, we can also allow hook_update_N() to force the auto-schema handling to run even when tables/columns have data in. This will allow the majority of definition changes to 'migrate' data without having to write a manual migration path for it - which means a much broader category of changes don't have to do raw database queries - so we keep entity storage agnosticism right up until it's absolutely not possible to do so.

alexpott’s picture

Issue summary: View changes

Started the issue summary update.

catch’s picture

Issue summary: View changes
jhedstrom’s picture

Issue summary: View changes

Adding step to fix the silent failures to the IS.

jhedstrom’s picture

Issue summary: View changes

We should add docs to hook_update_N() for how to run these perhaps as well.

jhedstrom’s picture

Issue tags: +Needs tests
StatusFileSize
new3.37 KB
new10.06 KB

This adds a fix for the silent failures. I tested this *without* the patch from #4, since without the automatic updates, this code won't run. Thus, adding a needs tests tag. Interdiff is against #4. This approach, unlike #16 won't require an update hook to go back and populate the stored schema definitions.

To fix the 2 failing tests, should we manually call entity schema updates from within the tests?

Status: Needs review » Needs work

The last submitted patch, 29: auto-updates-2542748-29.patch, failed testing.

jhedstrom’s picture

Issue summary: View changes
jhedstrom’s picture

Status: Needs work » Needs review
StatusFileSize
new949 bytes
new10.99 KB

This is one way to address the failing tests. It also shouldn't be a problem on existing sites, since these have been running via update.php all along. It will serve as an example update hook too, perhaps linked from hook_update_N() docs.

jhedstrom’s picture

StatusFileSize
new455 bytes
new11.24 KB

Forgot to add a use statement.

dawehner’s picture

+++ b/core/modules/system/system.install
@@ -1203,3 +1204,21 @@ function system_update_8001(&$sandbox = NULL) {
+    $ret['#abort'] = array('success' => FALSE, 'query' => t('%type: !message in %function (line %line of %file).', $variables));

Do we really need t() here? IMHO the other places of exceptions we don't translate it

effulgentsia’s picture

StatusFileSize
new5.72 KB
new4.78 KB
  1. +++ b/core/includes/install.core.inc
    @@ -1656,12 +1656,6 @@ function install_profile_themes(&$install_state) {
    -  // Now that all modules are installed, make sure the entity storage and other
    -  // handlers are up to date with the current entity and field definitions. For
    -  // example, Path module adds a base field to nodes and taxonomy terms after
    -  // those modules are already installed.
    -  \Drupal::service('entity.definition_update_manager')->applyUpdates();
    

    The IS says to do this, but it seems to me to be out of scope for this issue's title and problem/motivation.

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1263,7 +1263,7 @@ protected function updateDedicatedTableSchema(FieldStorageDefinitionInterface $s
    -      if ($storage_definition->getColumns() != $original->getColumns()) {
    +      if ($this->hasColumnChanges($storage_definition, $original)) {
    

    For this and related changes in this class, this does look like a necessary bug fix for use cases like #2542132: Drupal\field\Tests\Number\NumberFieldTest fails on SQLite, but I think that could use its own issue, rather than being done here.

  3. +++ b/core/modules/system/system.install
    @@ -1203,3 +1204,21 @@ function system_update_8001(&$sandbox = NULL) {
    +function system_update_8002() {
    

    I don't think we want this. Looks like it was added to fix the test failure in #29, but I think we should find a different way to fix that.

So, uploading a patch with all that reverted to see which tests fail.

Status: Needs review » Needs work

The last submitted patch, 35: auto-updates-2542748-35.patch, failed testing.

effulgentsia’s picture

Status: Needs work » Needs review
StatusFileSize
new12.07 KB
new5.71 KB

How about this as a way to deal with the legacy, pre-beta-13, update tests? Note that per #2341575: [meta] Provide a beta to beta/rc upgrade path still being an open issue, there was never a claim that beta-12 to beta-13 updates would work: we just got lucky that they mostly do (unless you have a contrib/custom module installed that breaks it). And the approach in this patch doesn't stop beta-12 to beta-13 from working, it only stops a direct beta-12 to beta-14 from working (i.e., you need to go to beta-13 first).

alexpott’s picture

Status: Needs review » Needs work
+++ b/core/modules/system/src/Tests/Update/UpdatePathTestBase.php
@@ -200,6 +200,26 @@ protected function runUpdates() {
+  protected function fixPreBeta13Schema() {
+    \Drupal::service('entity.definition_update_manager')->applyUpdates();
+  }

This suffers from the problem that once we do more things that will update the entity schema it will do them to. This needs to only do the beta 12 to 13 entity updates. This kind of illustrates the problem in a nutshell. The control we have over the entity update system does not have atomicity we need to provide predictable and repeatable updates.

berdir’s picture

#38: Yes, hat is exactly the problem.

What I don't understand it why we think that manually triggering those updates in update_N() hooks will improve anything. As far as I understand this, triggering it manually from a specific update has a high chance of making the situation worse, not better since future updates will mess it up and later updates will have to support all kinds of source and target destinations.

The only really secure way to ensure that updates would run in a "predictable and repeatable"* way would be to remove this system completely. And I'd like to not do that, now that we spent so much time on it ;)

* Assuming that people would write perfect update functions, which we all know isn't going to happen anyway ;)

I still think that the two changes that would help to most are what I mentioned in #2535082-41: Allow hook_update_N() implementations to run before the automated entity updates. The event/callback would be conceptually much closer to the entity updates, as it just needs to support from possibly multiple old states to now.

I just had to expand the redirect.module upgrade path today for a beta11 -> beta13 update today in https://github.com/md-systems/redirect/pull/37/files. the first update perfectly showed the problem with using entity API in update functions and the second would be an example that would be much easier to deal with in such a callback, I'd just get the old state, can check that it has the wrong length and update according. It will get more complicated with multiple old states, multiple changes (possibly overlapping, e.g. a base field change and a field type change) but I still think that's doable.

More later, train arriving.

tstoeckler’s picture

I didn't read the entire issue, but can someone explain, why this is not postponed on #2346013: Improve DX of manually applying entity/field storage definition updates ?

Currently in core with Content Translation or Path module (basically any module with dynamic field definitions) you can configure your site into a site where the field definitions in code do not match the database schema. The only way to resolve this and avoid missing column SQL exceptions is running update.php. With this patch that possibility would be gone so without manually executing PHP code you cannot recover from this and it would basically make Content Translation unusable (for people who cannot run drush ev "\Drupal::service('entity.definition_update_manager')->applyUpdates();" in the shell or do not figure out they have to do this).

jhedstrom’s picture

re #40 there are actually 2 issues here. The truly critical one IMO, is the silent failure of non-base-field updates since their schema isn't stored. See #14 and #15. The second issue is whether update.php should or should not attempt to run these. Perhaps we could split the issue, and postpone the later one on #2346013: Improve DX of manually applying entity/field storage definition updates.

jhedstrom’s picture

I added #2544954: SqlContentEntityStorageSchema does not detect column-level schema changes in non-base-fields for the fix originally posted in #29 that has been suggested it needs its own issue.

effulgentsia’s picture

Status: Needs work » Needs review
StatusFileSize
new22.78 KB
new18.23 KB

Re #42: thanks for opening that issue.

Re #38: I tracked down all the entity schema changes that had been automated since drupal-8.bare.standard.php.gz was committed, which was around beta 11.5 (#1314214-268: MySQL driver does not support full UTF-8 (emojis, asian symbols, mathematical symbols)), and wrote atomic update functions for them. Even though beta-to-beta updates aren't "supported" from that early, these turned out to be interesting use cases for evaluating our existing atomic APIs for this. I think we'll still need #2346013: Improve DX of manually applying entity/field storage definition updates to handle additional use cases not covered here, but perhaps we can proceed without that for these use cases?

catch’s picture

@berdir the suggestion in #24 is not to just trigger auto-updates from hook_update_N(), it's to apply specific changes to the definitions each time.

You'd still run into issues if there are multiple updates later, but if each change specifies exactly what it is, then that doesn't seem different to having multiple hook_schema()/db_*() changes.

plach’s picture

Currently in core with Content Translation or Path module (basically any module with dynamic field definitions) you can configure your site into a site where the field definitions in code do not match the database schema. The only way to resolve this and avoid missing column SQL exceptions is running update.php. With this patch that possibility would be gone so without manually executing PHP code you cannot recover from this and it would basically make Content Translation unusable

That's not completely accurate: CT leverages the Entity Update API to manually trigger updates, when translatability is enabled for a certain entity type. This avoids requiring user to manually run update.php to get a working system, which would be a horrible UX. This perfectly exemplifies why I'm +1 on removing entity updates from update.php: developers should use the API to reconcile the schema with the definitions as soon as possible, to avoid requiring users to manually fix things.

In D7 if your module adds a column to a table you'd perform the schema update in hook_install(), precisely for that reason. You'd definitely rely on db updates for subsequent updates, but we have seen multiple times now why applying entity updates in bulk is problematic in update.php.

tstoeckler’s picture

That's not completely accurate: CT leverages the Entity Update API to manually trigger updates, when translatability is enabled for a certain entity type. This avoids requiring user to manually run update.php to get a working system, which would be a horrible UX.

OK, if that is the case, then there are some bugs in that code, because I've regularly had to hit update.php on sites during development. That's not related to this then, though. Will try to find some time to investigate.

effulgentsia’s picture

One example is path.module has a path_entity_base_field_info() that adds a field, but there's no path_install() that invokes $entity_manager->onFieldStorageDefinitionCreate() on it, or a path_uninstall() that invokes $entity_manager->onFieldStorageDefinitionDelete(). But, should that be up to each field-adding module to do themselves, or should the system automate the creations/deletions (but not updates) via system_modules_installed()/system_modules_uninstalled() (or some other central place)?

Status: Needs review » Needs work

The last submitted patch, 43: auto-updates-2542748-43.patch, failed testing.

effulgentsia’s picture

I haven't looked into it yet, but I suspect those new failures are due to the schema change in #2542132: Drupal\field\Tests\Number\NumberFieldTest fails on SQLite landing today, and if so, I don't know yet if that means this issue needs to be postponed on #2544954: SqlContentEntityStorageSchema does not detect column-level schema changes in non-base-fields or if there's a way to proceed without that.

catch’s picture

One example is path.module has a path_entity_base_field_info() that adds a field, but there's no path_install() that invokes $entity_manager->onFieldStorageDefinitionCreate() on it, or a path_uninstall() that invokes $entity_manager->onFieldStorageDefinitionDelete(). But, should that be up to each field-adding module to do themselves, or should the system automate the creations/deletions (but not updates) via system_modules_installed()/system_modules_uninstalled() (or some other central place)?

A hook_update_N() could enable path module - this is very, very common for custom 7.x hook_update_N() to enable modules. Going to be less common in 8.x with config sync but I expect we'll still see that sometimes.

Since the hook_install() runs within the updates, that could run into the same problem with automatically running all entity changes - i.e. it could trigger unrelated changes made in other modules.

Even two modules that add different fields: given that hook_modules_installed() runs on all modules that got enabled, you'd get two on* calls vs. one depending if they're enabled together or separately.

So using the same API as for updates seems right here, although the inconvenience is a bit more compared to hook_update_N() I think.

This is a similar limitation to hook_schema_alter() in 7.x where you had to call db_add_field() in hook_install() as well as implementing the alter hook - just the alter hook wouldn't do anything by itself to the actual schema, always had to do both.

catch’s picture

#2542132: Drupal\field\Tests\Number\NumberFieldTest fails on SQLite was a test-only change, not a schema change.

jhedstrom’s picture

Issue summary: View changes

I'm of the opinion that if #2544954: SqlContentEntityStorageSchema does not detect column-level schema changes in non-base-fields is fixed, this issue is no longer critical. Core can continue to attempt automated entity schema updates, and if they don't apply, exceptions will be properly caught. If we get #2542422: Improve DX of exceptions thrown in SqlContentEntityStorageSchema committed, the information displayed will guide site owners and/or developers to the entity types that require manual schema changes. Core itself currently can apply all of its automated updates (from Beta 12 onward), and will continue to provide update hooks to address the ones it can't.

webchick’s picture

Well that sounds amazing to me, if others agree.

alexpott’s picture

I don't see how #53 addresses the point made in #38 (which @Berdir agreed with in #39).

jhedstrom’s picture

For #38 and #39, I think I agree that we shouldn't attempt to run these from update hooks at all, since there is no knowing if they will succeed. Once an update hook fails, no further update hooks from that module will run, so if modules start calling the automatic updates, that could get very messy very quickly. Rather, the current system allows all update hooks to run, and attempts to do the automated updates at the end. If the automated updates fail, another round of update hooks can run, and on and on until the automated ones succeed.

A possible alternative would be to completely decouple them, and add a hook_requirements to display the status of the entity schema (and a link to run the schema updates) on the site status report.

catch’s picture

Once an update hook fails, no further update hooks from that module will run, so if modules start calling the automatic updates, that could get very messy very quickly.

So I still think the answer to this is #24 - we don't run automatic updates automatically [sic], but we use the (storage-agnostic) plumbing that's already built to apply specific changes to field definitions in hook_update_N().

If the automated updates fail, another round of update hooks can run, and on and on until the automated ones succeed.

For some changes this might just about work, but for others a failed auto-update could leave the site inoperable.

It's OK if the auto-update fails for a developer making changes to their module (and we can have a drush helper to apply those to save people writing hook_update_N() each time), but not for a regular site admin updating from contrib.

If a hook_update_N() fails you know exactly which update it was from which module and can post the bug report. The very nature of automated entity updates means there's not a direct link between the failed auto-update and any one module.

effulgentsia’s picture

Status: Needs work » Needs review
StatusFileSize
new23.93 KB
new2.71 KB
+++ b/core/modules/system/system.install
@@ -1203,3 +1203,43 @@ function system_update_8001(&$sandbox = NULL) {
+          $db_schema->addIndex($table_name, $index_name, $index_spec);

The errors from #49's retest were due to #2537928: MySQL index normalization only works on table create landing which added a new required parameter to this function. Here's a fix for that.

dawehner’s picture

+++ b/core/modules/node/node.install
@@ -153,3 +151,119 @@ function node_uninstall() {
+      $field_storage_definition = $field_storage_definitions[$field_name];
+      // @todo This throws an exception if there are existing nodes. Figure out
+      //   how to fix that.
+      $entity_manager->onFieldStorageDefinitionUpdate($field_storage_definition, $field_storage_definition);

Does that mean that the exception throwing should be handled in a more upper level than the changing of the table structure?

dawehner’s picture

... doublepost ...

gábor hojtsy’s picture

Status: Needs review » Needs work

The remaining tasks are not documented on the summary. Are they limited to the two @todos in the patch? Are they supposed to be solved in this issue or in followups? Either way, needs work for them.

gábor hojtsy’s picture

Issue summary: View changes

Looks like these are the remaining tasks based on the list of proposed resolution?

  • Decide if we need to postpone this on #2346013: Improve DX of manually applying entity/field storage definition updates since applying an update in hook_update_N() should be simple - see below for how this should work
  • Remove \Drupal::service('entity.definition_update_manager')->applyUpdates(); from install_install_profile().
  • Ensure that schema additions (indexes and fields) are automatic on module install
  • Update documentation to hook_update_N() to show how to run entity schema updates

Updated the summary.

dawehner’s picture

Ensure that schema additions (indexes and fields) are automatic on module install

Doesn't that already works perfectly?

gábor hojtsy’s picture

Sorry I am not aware of how that works, but the issue summary mentioned it as part of the proposed resolution without an external issue and the patch does not seem to touch anything in that area. So it is either NOT a preexisting problem, was fixed already elsewhere, in the scope of another issue to resolve or works well as-is :)

effulgentsia’s picture

I think there are some things in the current (and previous) IS that are still debatable. But I think before we get to debating them, my opinion is that the key next step for this issue is to address this part of the problem statement:

However we have a problem that the automatic entity schema update only works for operations that don't change data.

#53 says:

Core itself currently can apply all of its automated updates (from Beta 12 onward)

But, that's not true if you have nodes.

Install beta-12, add a node, then try updating to HEAD. You will need to run update.php twice, because there is both an entity definition update and field-level changes that are needed, and EntityDefinitionUpdateManager::getChangeList() doesn't support doing both at the same time, per its 2nd @todo. After running the 2nd update, you will get:

core module

Update #update_entity_definitions

Failed: Drupal\Core\Entity\Exception\FieldStorageDefinitionUpdateForbiddenException: The SQL storage cannot change the schema for an existing field (uid in node entity) with data. in Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema->updateSharedTableSchema() (line 1377 of /core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php).

If you run through the same scenario with the patch, then you only need to run update.php once, and instead, you get this failure:

node module

Update #8002

Failed: Drupal\Core\Entity\Exception\FieldStorageDefinitionUpdateForbiddenException: The SQL storage cannot change the schema for an existing field (status in node entity) with data. in Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema->updateSharedTableSchema() (line 1377 of /core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php).

In other words, same error, but from an atomic update function. Note that this corresponds to the 2nd @todo in the patch's node_update_8002().

The error is due to needing to update for #2498919: Node::isPublished() and Node::getOwnerId() are expensive, which changed uid and status to entity keys, which means adding a NOT NULL constraint, and SqlContentEntityStorageSchema::updateSharedTableSchema() throws that error for any column-level schema change if there's existing records, even if only adding a NOT NULL constraint.

So I think the next steps are:

  • Add an update path test that includes a fixture for an existing node and tests that after applying updates, the expected NOT NULL constraints are there (for both node_field_data and node_field_revision).
  • Decide how to fix this: should SqlContentEntityStorageSchema::updateSharedTableSchema() be smarter and call $schema_handler->changeField() for the subset of changes for which that is safe? Or should it not do that, and therefore, make it the responsibility of node_update_8002() to do that?
  • Make the test pass by implementing the above decision.

I don't know when my next availability to work on this will be, but will assign the issue to me when I have it. Until then, if someone else wants to push it along, please do.

effulgentsia’s picture

Potentially in parallel with #66:

Decide if we need to postpone this on #2346013: Improve DX of manually applying entity/field storage definition updates since applying an update in hook_update_N() should be simple - see below for how this should work

Any reviews on the patch's update functions so far? What is the expected simplification that #2346013: Improve DX of manually applying entity/field storage definition updates will allow?

gábor hojtsy’s picture

@effulgentsia: I reviewed the update functions and they make total sense especially with the steps written out in the issue summary on how they are formulated. I wanted to go ahead to write the update hook docs for them, but if we are not yet sure that this is the best approach (although I don't see anything questionable with it), then it may not be the best idea :/

fago’s picture

Reading through the issue summary, this makes a lot of sense. If you have a module update, involving data schema changes and manual data migration, you'll have to run them into the right order, else it won't be possible to apply them again later on.
e.g.
schema A
data schema change 1 -> schema B
data migration/update1: A -> B
data schema change 2 -> schema C
data migration/update2: B -> C

So, with happening data schema changes automatic on a later update, it would update from schema A to C, skipping. Thus there is no way to run data migration 1 and 2, as the schema B will never be active. I agree, that to make this work we'll need to have a manual way of triggering the individual schema changes, what requires us to copy the target field storage definitions into the update hook.
However, if there are no update hooks triggering the schema change AND there is no data preventing schema changes, it still makes sense to me to run entity schema changes automatically once all other updates are applied.

+++ b/core/modules/block_content/block_content.install
@@ -0,0 +1,48 @@
+    $entity_manager->onFieldStorageDefinitionCreate($field_storage_definition);

That's a very intuitive method to call, is it? Wouldn't it better to introduce some official API for a manual trigger of the update process, e.g. $entity_manager->applyFieldStorageDefinition() which wraps the existing "event handler" ?

Any reviews on the patch's update functions so far? What is the expected simplification that #2346013: Improve DX of manually applying entity/field storage definition updates will allow?

I'd curious about that also. However, I think adding the simplified API should happen as part of prerequisite for this issue to avoid the number of API changes / new APIs to learn for people.

gábor hojtsy’s picture

However, if there are no update hooks triggering the schema change AND there is no data preventing schema changes, it still makes sense to me to run entity schema changes automatically once all other updates are applied.

How would we do that? Set flags if the entity schema changed in these update functions? What if data updates are also needed later on? Add the update then before the data update?

fago’s picture

How would we do that? Set flags if the entity schema changed in these update functions? What if data updates are also needed later on? Add the update then before the data update?

Couldn't we just continue to have the automatic updates run as last step? If there were manual updates before, they should be able to pick up from where those stopped. If not, the automatic update can run and work in some cases. Remaining cases requiring data migration will have to tell and notify the user still.

gábor hojtsy’s picture

I think that is not very good for *educational* purposes. If developers find it works, then they will not think and be forced to write the update function and will only get to hear about it after the fact when its too late via some issue from an angry site developer somewhere.

fago’s picture

Note sure. The manual update is not better as long as there no one taking care of the data migration as part of it - we cannot enforce that in any situation.

But I figured the issue summary already states the reason why we have to require a manual update anyway:

This means that if entity updates and hook_update_N's conflict there is no way to resolve this with hook_update_dependencies().

We cannot know beforehand, whether some other module needs to write an update relying on update_N involving schema changes. Thus, for that being possible we have to indeed enforce every schema change to have an hook_update_N implementation.

gábor hojtsy’s picture

That's a much better reason yeah :D

effulgentsia’s picture

Couldn't we just continue to have the automatic updates run as last step?

What about the suggestion in #56: to not run it as the last update.php step, but to allow running it explicitly as a separate link from /admin/reports/status? What are the pros/cons of last update.php step vs. separate link?

effulgentsia’s picture

Wouldn't it better to introduce some official API for a manual trigger of the update process, e.g. $entity_manager->applyFieldStorageDefinition() which wraps the existing "event handler" ?

During a config deployment, FieldStorageConfig::preSaveNew() can be called, which calls $entity_manager->onFieldStorageDefinitionCreate(). What makes a code deployment invoking update functions different enough to warrant a different API than that? Or are you suggesting we make both places use whatever new wrapper API we invent for this?

gábor hojtsy’s picture

@effulgentsia: re @fago

Wouldn't it better to introduce some official API for a manual trigger of the update process, e.g. $entity_manager->applyFieldStorageDefinition() which wraps the existing "event handler" ?

I think he meant we should not call an event handler directly, but instead wrap it in a public API that looks sane, so eg. if we need to do the same thing with another event, etc. then our public API for invoking it would not change even if the event changes.

catch’s picture

  1. +++ b/core/modules/node/node.install
    @@ -153,3 +151,119 @@ function node_uninstall() {
    +  // The 'node__default_langcode' and 'node_field__langcode' indexes were
    +  // removed from \Drupal\node\NodeStorageSchema in
    +  // https://www.drupal.org/node/2261669, but this update function wasn't
    +  // added until https://www.drupal.org/node/2542748. In between, sites could
    +  // have performed interim updates, which would have included automated entity
    +  // schema updates prior to that being removed (see that issue for details).
    +  // Depending on what contributed or custom modules are installed, those
    +  // interm updates could have also customized the schema handler. Therefore,
    +  // we check for whether the schema data, tables, and indexes exist before
    +  // modifying them. If any of it doesn't, we assume the interim updates or
    +  // the customizing modules took care of the necessary updates.
    

    Dropping an index that doesn't exist shouldn't throw an error so I'm not sure why we have to check first?

    I think this should use the API to remove the index via the field definitions rather than the manual SQL schema manipulation and assumptions.

  2. +++ b/core/modules/system/system.install
    @@ -1203,3 +1203,63 @@ function system_update_8001(&$sandbox = NULL) {
    +  // \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema was changed in
    +  // https://www.drupal.org/node/2261669 to include a (id, default_langcode,
    +  // langcode) compound index, but this update function wasn't added until
    +  // https://www.drupal.org/node/2542748. In between, sites could have
    +  // performed interim updates, which would have included automated entity
    +  // schema updates prior to that being removed (see that issue for details).
    +  // For sites that did not perform those interim updates, update the existing
    +  // schema to match that change. Note that if these indexes were added as an
    +  // implementation of a changed entity type or field storage definition, we
    +  // would invoke the higher-level $entity_manager->onEntityTypeUpdate() or
    +  // $entity_manager->onFieldStorageDefinitionUpdate() functions and let the
    +  // schema handler handle the database details, but since the indexes were
    +  // added by a code change of SqlContentEntityStorageSchema itself, this
    +  // update function needs to operate at this lower level.
    

    If the automated updates have already been applied, then an update that uses the API in the issue summary to apply specific changes to the field definitions in state would result in absolutely no change and nothing being applied - the system will compare two identical definitions and find no changes.

    Then if no update got applied, it'll find it fine and make the changes.

    So again I don't think the manual work done here should be necessary at all?

plach’s picture

Assigned: Unassigned » plach
Issue summary: View changes

Update IS, working on this...

About

Decide if we need to postpone this on #2346013: Improve DX of manually applying entity/field storage definition updates, since applying an update in hook_update_N() should be simple - see below for how this should work.

The two main goals of that issue were:

Bullet 7 of the proposed solution has the potential to make the need for the second item here way less common, so I don't think we need to postpone this on #2346013: Improve DX of manually applying entity/field storage definition updates, although probably also implementing the callback system proposed by @berdir would be an useful addition.

plach’s picture

Status: Needs work » Needs review
StatusFileSize
new28 KB
new4.07 KB

This implements bullet 3 and 4 of the proposed solution, settings NR for the bot.

It would be nice to address:

Wouldn't it better to introduce some official API for a manual trigger of the update process, e.g. $entity_manager->applyFieldStorageDefinition() which wraps the existing "event handler" ?

Actually we already introduced such an API in #2535082: Allow hook_update_N() implementations to run before the automated entity updates: EntityDefinitionUpdateManagerInterface::applyEntityUpdate() and EntityDefinitionUpdateManagerInterface::applyFieldUpdate(). The main difference is that those check whether updates need to be run. So we should probably use them in the module installer code.

plach’s picture

What about the suggestion in #56: to not run it as the last update.php step, but to allow running it explicitly as a separate link from /admin/reports/status? What are the pros/cons of last update.php step vs. separate link?

Not sure about that: once this lands, such an error condition would be only caused by wrong code being installed (i.e. not caring about its entity updates). Trying to apply updates in this case would be not ideal because either we would have a lazy developer relying on the user to apply updates he should have taken care of, or we would have an error condition.

I think that is not very good for *educational* purposes. If developers find it works, then they will not think and be forced to write the update function and will only get to hear about it after the fact when its too late via some issue from an angry site developer somewhere.

This looks to me like a very sane argument for not having a UI to manually run updates in bulk. For developers the drush command should be quite enough.

If we don't provide such a UI, I'm wondering whether it would make sense to split the entity updates status check into a separate item on the status report, which would make it clear it cannot be fixed by running update.php. This approach would also obsolete #2337921: Improve the UX of update.php's handling of entity type updates, I guess.

Status: Needs review » Needs work

The last submitted patch, 80: entity-rm_auto_updates-2542748-80.patch, failed testing.

plach’s picture

Assigned: plach » Unassigned

I will get back to this tomorrow morning, feel free to work on it meanwhile.

plach’s picture

Assigned: Unassigned » plach
plach’s picture

Status: Needs work » Needs review
StatusFileSize
new6.52 KB
new33.73 KB

This should fix the test failure.

plach’s picture

StatusFileSize
new4.18 KB
new35.26 KB

This separates the entity update status report requirement into its own item, that is displayed only if there are no pending db updates.

plach’s picture

Assigned: plach » Unassigned
StatusFileSize
new262.07 KB

Done for today

Status: Needs review » Needs work

The last submitted patch, 86: entity-rm_auto_updates-2542748-86.patch, failed testing.

effulgentsia’s picture

@catch: re #78, can you clarify what you mean? Both bullet points refer to "use the API in the issue summary to apply specific changes to the field definitions", but the problem is that the code change being updated for is what was done in #2261669: Slow query in NodeRevisionAccessCheck, which did not involve any changes to field storage definitions. It involved changes to table schema only. I.e., it was an "internal" change to the schema handlers, not any "public" change to the entity type definitions or field storage definitions. So I'm confused on what your proposal is for dealing with that.

effulgentsia’s picture

Title: Automatic entity updates are not safe to run on update.php by default » EntityDefinitionUpdateManager::applyUpdates() can fail when there's existing content, leaving the site's schema in an unpredictable state, so should not be called during update.php
Issue tags: +Triaged D8 critical

Discussed this with @alexpott, @catch, @xjm, and @webchick on our last triage call, and we agreed that there's some scope here that's critical, so tagging. Retitling to what I understood as that scope.

I'm not convinced that proposed resolution item #3 falls within that scope, but its related item #4 might have an interesting intersection with this issue as explained in #51. The code for that (in the #80 and #85 interdiffs) looks like a great start, but I think it might make sense to split that into a separate issue anyway for easier reviewability and prioritization.

The last submitted patch, 86: entity-rm_auto_updates-2542748-86.patch, failed testing.

catch’s picture

@catch: re #78, can you clarify what you mean? Both bullet points refer to "use the API in the issue summary to apply specific changes to the field definitions", but the problem is that the code change being updated for is what was done in #2261669: Slow query in NodeRevisionAccessCheck, which did not involve any changes to field storage definitions. It involved changes to table schema only. I.e., it was an "internal" change to the schema handlers, not any "public" change to the entity type definitions or field storage definitions.

That's because I completely missed that point, sorry.

plach’s picture

Assigned: Unassigned » plach
Issue tags: +D8 Accelerate

I'm not convinced that proposed resolution item #3 falls within that scope, but its related item #4 might have an interesting intersection with this issue as explained in #51. The code for that (in the #80 and #85 interdiffs) looks like a great start, but I think it might make sense to split that into a separate issue anyway for easier reviewability and prioritization.

The reason why I'm proposing to address these bullets together is that this should hopefully be the issue finally setting the way to deal with entity updates for developers. If we address #3 and #4 in a separate issue we'll have a double API change: first developers will be in charge of manually applying all entity updates, even those that's completely sensible to expect the system to be able to apply automatically, and then we will have another change that fixes that and so would force developers to change their code and expectations again.

plach’s picture

StatusFileSize
new35.32 KB
new1.66 KB

This should fix the failures...

plach’s picture

Status: Needs work » Needs review
catch’s picture

  1. +++ b/core/modules/node/node.install
    @@ -153,3 +151,119 @@ function node_uninstall() {
    +function node_update_8002() {
    +  $entity_manager = \Drupal::entityManager();
    +  $old_entity_type = $entity_manager->getLastInstalledDefinition('node');
    +  $entity_keys = $old_entity_type->getKeys();
    +
    +  // The 'status' and 'uid' fields were added to the 'entity_keys' annotation
    +  // of \Drupal\node\Entity\Node in https://www.drupal.org/node/2498919, but
    +  // this update function wasn't added until
    +  // https://www.drupal.org/node/2542748. In between, sites could have
    +  // performed interim updates, which would have included automated entity
    +  // schema updates prior to that being removed (see that issue for details).
    +  // Therefore, we check for whether the keys have already been installed.
    +  if (!isset($entity_keys['uid']) || !isset($entity_keys['status'])) {
    +    $entity_keys['status'] = 'status';
    +    $entity_keys['uid'] = 'uid';
    +    $new_entity_type = clone $old_entity_type;
    +    $new_entity_type->set('entity_keys', $entity_keys);
    +    $entity_manager->onEntityTypeUpdate($new_entity_type, $old_entity_type);
    +
    +    // @todo The above should be enough, since that is the only definition that
    +    //   changed. But \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema
    +    //   varies field schema by whether a field is an entity key, so invoke
    +    //   onFieldStorageDefinitionUpdate() with an unmodified
    +    //   $field_storage_definition to trigger the necessary changes.
    +    //   SqlContentEntityStorageSchema::onEntityTypeUpdate() should be fixed
    +    //   to automatically handle this.
    +    $field_storage_definitions = $entity_manager->getLastInstalledFieldStorageDefinitions('node');
    +    foreach (array('status', 'uid') as $field_name) {
    +      $field_storage_definition = $field_storage_definitions[$field_name];
    +      // @todo This throws an exception if there are existing nodes. Figure out
    +      //   how to fix that.
    +      $entity_manager->onFieldStorageDefinitionUpdate($field_storage_definition, $field_storage_definition);
    +    }
    +  }
    +}
    

    I'm not sure we need the check for the existing keys - if $old_entity_type already has the keys, and $new_entity_type gets them, then $entity_manager->onEntityTypeUpdate($new_entity_type, $old_entity_type) should be a no-op

    If that's right, then it also seems generally handy for development sites where people run the drush helper to auto-apply then write an update - the update should be fine as long as the schema is either the old state or identical to the new one.

    Overall during beta I'd really like to avoid any updates that try to deal with two previous states of the database - we ran into a lot of trouble with that in the 7.x cycle. If a site had a problem with this update with that check removed, they could always add an early return or tweak schema version to miss it.

    Otherwise except for the @todo this is just how I thought this would end up looking, is pretty encouraging.

  2. +++ b/core/modules/node/node.install
    @@ -153,3 +151,119 @@ function node_uninstall() {
    +function node_update_8003() {
    

    So I know effulgentsia pointed out that this isn't a field definition change. But given automated entity updates added the indexes, can it really not remove them too?

    I'm not clear whether there's a way to alter the index definitions for the SQL schema specifically without subclassing, if there is then the load/modify/apply approach would be good to use here too.

effulgentsia’s picture

+++ b/core/modules/block_content/block_content.install
@@ -0,0 +1,48 @@
+/**
+ * Add 'revision_translation_affected' field to 'block_content' entities.
+ */
+function block_content_update_8001() {
...
+++ b/core/modules/node/node.install
@@ -153,3 +151,119 @@ function node_uninstall() {
+/**
+ * Add 'revision_translation_affected' field to 'node' entities.
+ */
+function node_update_8001() {

FYI: #2550615: Update the DB dump to have a leading slash for the frontpage. just updated the db dump used by update tests to include #2509300: Path alias UI allows node/1 and /node/1 as system path then fatals which was the commit that followed #2453153: Node revisions cannot be reverted per translation. I don't know if that means that the dump now also includes that latter issue, but if it doesn't, it probably should, since the dump should be from a known commit hash, not cherry picked (see #2550615-9: Update the DB dump to have a leading slash for the frontpage.). Once the dump is known to include the revision_translation_affected fields, we can remove these update functions from this issue.

dawehner’s picture

No in depth review

  1. +++ b/core/lib/Drupal/Core/Extension/ModuleInstaller.php
    @@ -212,15 +211,34 @@ public function install(array $module_list, $enable_dependencies = TRUE) {
    +                  $entity_manager->onFieldStorageDefinitionCreate($storage_definition);
    

    Urgs that API method sounds kinda wrong, but well this is how head works

  2. +++ b/core/lib/Drupal/Core/Extension/ModuleInstaller.php
    @@ -212,15 +211,34 @@ public function install(array $module_list, $enable_dependencies = TRUE) {
    +                catch (EntityStorageException $e) {
    +                  watchdog_exception('system', $e, 'An error occurred while notifying the creation of the @name field storage definition: "!message" in %function (line %line of %file).', ['@name' => $storage_definition->getName()]);
    +                }
    

    Do we really want to hide the exception entirely? I would have thought that we better throw it and let the module install fail?

  3. +++ b/core/modules/block_content/block_content.install
    @@ -0,0 +1,48 @@
    +    $field_storage_definition = BaseFieldDefinition::create('boolean')
    +      ->setLabel(t('Revision translation affected'))
    
    +++ b/core/modules/node/node.install
    @@ -153,3 +151,119 @@ function node_uninstall() {
    +    $field_storage_definition = BaseFieldDefinition::create('boolean')
    +      ->setLabel(t('Revision translation affected'))
    +      ->setDescription(t('Indicates if the last edit of a translation belongs to current revision.'))
    

    I'm curious whether we should take the base field definition from the entity itself so something like $field_storage_definition = BlockContent::baseFieldDefinitions()['revision_translation_affected']

jibran’s picture

Status: Needs review » Needs work

NW for pending feedbacks.

plach’s picture

Issue summary: View changes
Status: Needs work » Needs review
Issue tags: -Needs issue summary update, -Needs tests

Here is additional test coverage, not sure whether we should also emulate a "broken" entity update to have a failing test on HEAD.

I think the main issue still to be discussed is the DX of applying individual updates, see the new test code, which brings us back to #69 and #99.

Reviews still to be addressed.

plach’s picture

StatusFileSize
new10.59 KB
new45.91 KB

And now also with patch!

plach’s picture

Title: EntityDefinitionUpdateManager::applyUpdates() can fail when there's existing content, leaving the site's schema in an unpredictable state, so should not be called during update.php » Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state

Let's describe only the issue in the title.

plach’s picture

@effulgentsia:

I tried to implement #98, but I'm getting some failures, see #2496337-36: [plach] Testing issue. Any idea?

plach’s picture

StatusFileSize
new50.68 KB
new11.22 KB

This adds further test coverage, I'm still working on the reviews above.

plach’s picture

StatusFileSize
new10.82 KB
new55.67 KB

This addresses #97.1, bullet 2 still to do. We probably need additional test coverage now.

It also switches the content of node_update_8002() and node_update_8003() to match the order in which mismatches are detected by the updates manager.

A few answers to @dawehner:

Urgs that API method sounds kinda wrong, but well this is how head works

Yep, we definitely need to improve DX, I will work on that soon.

Do we really want to hide the exception entirely? I would have thought that we better throw it and let the module install fail?

Yep, I think so: it's a problematic condition in the schema handler, but it's expected in the installer, which knows in most cases things will just work. For the rare exceptions, we are actually logging the problem. I think it's a fair compromise.

I'm curious whether we should take the base field definition from the entity itself

Nope, think of regular schema updates: you are supposed to copy the schema definition form hook_schema() not to call hook_schema() and use the result, otherwise any change in hook_schema() would make change also the update function.

plach’s picture

+++ b/core/modules/system/system.install
@@ -591,6 +591,7 @@ function system_requirements($phase) {
     // Verify that no entity updates are pending after running every DB update.
     if (!isset($requirements['update']['severity']) && \Drupal::service('entity.definition_update_manager')->needsUpdates()) {
+      debug(\Drupal::service('entity.definition_update_manager')->getChangeSummary());
       $requirements['entity_update'] = array(

This makes me wonder whether we need at least a page summarizing the mismatches.

plach’s picture

StatusFileSize
new7.61 KB
new51.39 KB

This addresses #97.2: as @catch noted, simply running the update with the original definition allows to regenerate the schema even when data is present (I manually tested this case) and thus get rid of obsolete indexes and create new ones. Any interim update would be kept, if the related code is still deployed, because we are only regenerating the schema as currently defined according to the definitions.

I'm now working on additional test coverage for the NOT NULL handling, then I will propose some DX improvements now that we have some code examples to look at.

Meanwhile, it would be good to discuss whether we want to have a separate route to apply updates in bulk, as suggested by @effulgentsia in #75 or whether the current solution (dedicated status report item) is acceptable. If we go with the latter, should we provide a page summarizing the mismatches? Or relying on drush also for this would be enough?

catch’s picture

Interdiff on #108 is great!

wim leers’s picture

#86: Seeing "Entity/field definitions" in the status report sounds very abstract, and thus quite frightening. Can we give it a different label, and just mention "entity/field definitions" in the description?

Which means I think this needs the Usability tag… and quite possibly also Needs usability review?

plach’s picture

Issue summary: View changes
Issue tags: +Usability, +Needs usability review

Seeing "Entity/field definitions" in the status report sounds very abstract, and thus quite frightening. Can we give it a different label, and just mention "entity/field definitions" in the description?

Yeah, I already read similar arguments before we decided to unify the db updates and the entity updates UIs.

However is that really more scary than PHP DOMDocument class, PHP open_basedir restriction, or Twig C extension, just to name a few? At least most site builders should be familiar with the concepts of field and (possibly) entity type.

webchick’s picture

Can you post a screenshot? That makes it much easier for UX people to review.

plach’s picture

StatusFileSize
new15.2 KB
new56.64 KB

Now test coverage should be complete.

@webchick:

The screenshot is in the issue summary :)

webchick’s picture

Ugh, sorry, it's still early over here! :)

So if this only appears in the Report section, I think that the label is fine. Better to be descriptive than not, IMO. If it shows up every time you run update.php, we probably need to talk about it more.

The bigger problem with that error message though is it does not give the user any sort of clue how to go about resolving the error. Questions I would have:

1) Where was the mismatch in installed entity/field_definitions?
2) How do I tell what module(s) are problematic?

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -95,13 +95,13 @@ public function getChangeSummary() {
    -    $change_list = $this->getChangeList();
    -    if ($change_list) {
    +    $complete_change_list = $this->getChangeList();
    +    if ($complete_change_list) {
           // self::getChangeList() only disables the cache and does not invalidate.
           // In case there are changes, explicitly invalidate caches.
           $this->entityManager->clearCachedDefinitions();
         }
    -    foreach ($change_list as $entity_type_id => $change_list) {
    +    foreach ($complete_change_list as $entity_type_id => $change_list) {
    

    I'm curious, did the semantics changed here or did you just renamed it, because you wanted to do? Not sure I can really judge what is more useful ...

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1391,6 +1392,20 @@ protected function updateSharedTableSchema(FieldStorageDefinitionInterface $stor
    +                  throw new EntityStorageException(SafeMarkup::format('The "@name" column cannot have NOT NULL constraints as it holds NULL values.', ['@name' => $column_name]));
    

    Note: We no longer use SafeMarkup to create those. Just use " ... $column_name ..." instead

  3. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1423,6 +1438,26 @@ protected function updateSharedTableSchema(FieldStorageDefinitionInterface $stor
    +   * @param $table_name
    ...
    +   * @param $column_name
    

    Nitpick: Let's put 'string' onto the docs.

  4. +++ b/core/modules/system/system.install
    @@ -588,10 +588,15 @@ function system_requirements($phase) {
    +        'description' => t('A mismatch in the installed entity/field definitions was detected. If new modules were installed, the problematic one(s) should be uninstalled. If new module versions were installed, the problematic one(s) should be reverted to the previous version.'),
    

    This text mentions "new modules" twice. I think the second bit should talk about updates though

  5. +++ b/core/modules/system/tests/modules/entity_test/update/entity_definition_updates_8001.inc
    @@ -0,0 +1,43 @@
    +  // Remove data from the storage.
    +  $connection->update('entity_test')
    +    ->fields(['user_id' => NULL])
    +    ->execute();
    ...
    +  $insert_query = $connection->insert('entity_test__user_id')
    +    ->fields(['bundle', 'deleted', 'entity_id', 'revision_id', 'langcode', 'delta', 'user_id_target_id']);
    

    It is a bit odd that we insert new entries instead of updating the existing ones, but yeah this is just in a test, but I guess this is due to multiple updates functions. What about adding a quick comment about it?

  6. +++ b/core/modules/system/tests/modules/entity_test/update/entity_definition_updates_8001.inc
    @@ -0,0 +1,43 @@
    +  $storage_definition = clone $original;
    ...
    +  $entity_manager->onFieldStorageDefinitionUpdate($storage_definition, $original);
    
    +++ b/core/modules/system/tests/modules/entity_test/update/entity_definition_updates_8002.inc
    @@ -0,0 +1,41 @@
    +  $storage_definition = clone $original;
    ...
    +  $entity_manager->onFieldStorageDefinitionUpdate($storage_definition, $original);
    

    There are multiple places which need to clone the storage definition first. Does that mean we maybe should do that internally and return the new one?

gábor hojtsy’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1423,6 +1438,26 @@ protected function updateSharedTableSchema(FieldStorageDefinitionInterface $stor
    +   * @param $table_name
    +   *   The name of the table to inspect.
    +   * @param $column_name
    +   *   The name of the column holding the field property data.
    

    Missing data type docs.

  2. +++ b/core/modules/node/node.install
    @@ -153,3 +151,96 @@ function node_uninstall() {
    +  // @todo The above should be enough, since that is the only definition that
    +  //   changed. But \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema varies
    +  //   field schema by whether a field is an entity key, so invoke
    +  //   onFieldStorageDefinitionUpdate() with an unmodified
    +  //   $field_storage_definition to trigger the necessary changes.
    +  //   SqlContentEntityStorageSchema::onEntityTypeUpdate() should be fixed to
    +  //   automatically handle this.
    

    Planned to be resolved in this issue?

  3. +++ b/core/modules/system/tests/modules/entity_test/entity_test.module
    @@ -119,14 +120,25 @@ function entity_test_entity_base_field_info(EntityTypeInterface $entity_type) {
    +  // @todo Remove this if we end up using state definitions at runtime.
    +  //   See TODO.
    

    Todo, todo, todo, todo, todo, todo... Looks like we should have an issue open :) Do we care so much about the niceness of the test approach?

gábor hojtsy’s picture

Issue summary: View changes
StatusFileSize
new178.43 KB

Updated API changes and UI changes sections. It also looks bad that the requirements error shows up on a freshly installed version of D8 with the patch?! It is not supposed have entity updates to run, does it? Screenshot when visiting the status screen on a brand new D8 site with the patch that I did not do anything else on yet:

That should not happen, should it?

plach’s picture

That should not happen, should it?

Nope, I'll investigate, thanks :)

gábor hojtsy’s picture

I started on a change notice for developers at https://www.drupal.org/node/2554097 and a change notice for site builders at https://www.drupal.org/node/2554101. I found it tricky to title them appropriately, but I think I got it as good as I could :) While writing the dev change notice, the question of using the event handlers directly still looks odd to me. However I noticed that we use both the on...Create and on...Update, should we add proxy methods for both? Why not on...Delete then? (This was both raised by @fago and @dawehner).

webchick’s picture

Status: Needs work » Needs review
StatusFileSize
new869.31 KB

For testing purposes (please don't credit me), here's both this patch and #2551341: Update test database dump should be based on beta 12 and contain content together just to see what happens.

plach’s picture

Working on the reviews.

@webchick:

If it shows up every time you run update.php, we probably need to talk about it more.

Not necessary, the message shows up only as a status report item :)

The bigger problem with that error message though is it does not give the user any sort of clue how to go about resolving the error.

That's a bit unfair :)
The message does tell you what do, it's "just" lacking important details. The main reason for that is only developers can actually fix such a situation, what a site builder can do is reverting the installation of any new or updated module.

1) Where was the mismatch in installed entity/field_definitions?

We could fix this by displaying the change summary the updates manager is already able to generate. Since it can get quite long, I'm wondering whether it would make sense to provide a separate page where displaying it, and linking it from the error message as we do for db updates.

2) How do I tell what module(s) are problematic?

That's more complex: actually a mismatch may be caused by the module providing a definition, but also by a module altering a definition. Unfortunately we don't have the latter information, so providing the former would just be misleading. OTOH in the 99% of cases this error will pop up if you just installed new code, so combine that with the change summary and you should have some clue of which module could be the culprit.

plach’s picture

StatusFileSize
new5.83 KB
new57.08 KB

Glad to see #120 green :)

The attached patch should address #115 and #116. Working on #117 now.

A few replies:

@dawehner:

1: I renamed it because while debugging that part of code I realized it was just wrong to reuse the variable that way. @alexpott already pointed that out somewhere else. Semantic is unchanged.
5: Well, that code is actually moving data from a shared table to a dedicated table, so it needs to insert the collected data. I added a comment to clarify that.
6: We need both the original and the current definition, so returning a clone wouldn't help. However it's a good point, and another thing to consider while thinking to DX improvements for the Entiy Update API.

@Gábor Hojtsy:

2: Created #2554245: Updating the entity type definition should be enough when adding new fields as entity keys.
3: Well, it's not just to clean up the test, we may actually have very good reasons to do that, see #2554235: Make the content entity storage and entity query use the last installed definitions instead of the ones living in code.

plach’s picture

StatusFileSize
new7.23 KB
new58.87 KB

This should address #117.

Tomorrow I will post a proposal for some DX improvements, taking into account the various suggestions made so far, if no one beats me to it. To be done here, I think we are only missing those, documentation updates, and finalizing the UI changes.

gábor hojtsy’s picture

+++ b/core/modules/system/tests/modules/entity_test/update/entity_definition_updates_8001.inc
@@ -14,16 +14,16 @@ function entity_test_update_8001() {
-  $connection = \Drupal::database();
+  $database = \Drupal::database();
 
   // Retrieve existing field data.
-  $user_ids = $connection->select('entity_test', 'et')
+  $user_ids = $database->select('entity_test', 'et')

etc, etc, etc... not sure including these unrelated changes is a good idea for getting reviews faster.

plach’s picture

Not sure what you mean: those lines are added by the patch, so they would need to be reviewed anyway, sooner or later... and providing two separate interdiffs sounds a bit overkill, honestly

gábor hojtsy’s picture

+++ b/core/modules/system/system.install
@@ -594,10 +594,15 @@ function system_requirements($phase) {
+        'description' => t('A mismatch in the installed entity/field definitions was detected. If new modules were installed, the problematic one(s) should be uninstalled. If any modules were updated, the problematic one(s) should be reverted to the previous version.'),

So what happens is more something like "Entity and field definitions do not match what is currently expected by modules. New or updated modules may cause that if the developer did not provide an update. Until the module code is fixed, uninstall / downgrade the module."

I am not saying to use that text, but it may inspire some more user friendly rewrite. :) Not that I am in favor of holding this fix for textual questions here. It does not matter that much IMHO, we can refine the text however long we want later on.

gábor hojtsy’s picture

@plach: re #124 sorry, I only looked at the interdiff.

catch’s picture

Agreed with #126. The idea here is that developers are responsible for writing updates when they change code, so the hook_requirements() message should most of the time be a friendly reminder to developers, and never show up to site builders unless there's a buggy module (or they haven't run update.php yet, but then they'll get the update.php reminder too).

plach’s picture

I agree with #126 and #128.

@webchick:

Would you please confirm #126 is ok or provide a suitable rewrite? :)

gábor hojtsy’s picture

Status: Needs review » Needs work

The last submitted patch, 123: entity-rm_auto_updates-2542748-123.patch, failed testing.

gábor hojtsy’s picture

Status: Needs work » Needs review
StatusFileSize
new58.72 KB

Rerolled.

webchick’s picture

Talked about this with @plach.

One of the big problems with the existing/proposed error message is it literally tells site builders to start indiscriminately destroying data (we no longer have module disable, and module uninstall is destructive) in order to make the error go away. That arguably introduces another critical bug (data loss), although I'll concede that's a bit of a stretch. :)

The other big problem though is that it does not actually tell even someone who is capable to solve their own problems what do about it. I understand why we can't tie the problem back to a particular module, but we don't even tell them which entities and what metadata is out of sync. Case in point: Gábor in #117 sees that the error's appearing on install, but there's no info on where to start fixing it.

We didn't have such a warning in D7 and below; instead you'd just run into a fatal error, revert back to the old version of the module, file a bug with the maintainer, they'd go "Oh, crap! I need to add an update function!" and issue a new release with the fix. So my preference is really to just to kick this entire UI stuff to a separate, non-critical issue and sort it out there (it's completely non-invasive so could totally be committed during RC). It's eaten up a chunk of time of this nuclear critical patch, and it really doesn't need to, since it's new.

However, plach pointed out that the message is helpful for developers who happen upon it, because it reminds them that they missed something. So from his POV, it would be good to get something in.

I asked if we could do something like:

A mismatch in the following installed entity/field definitions detected:

- Page: field_image (required attribute not found)
- Article: field_tags (maxlength mismatch)

Revert your modules to a prior working version.

...or whatever. While this list could become ridiculously huge, in the general case it's going to be 3-5 things, according to plach, and this would let a developer instantly know "'Ah-ha!' I know exactly what that is."

It sounded like that was going to be hard(er) (and would need more fleshing out/thinking, hence my desire for a sub-issue :)), so plach proposed:

"Entity/field definitions" -> "Mismatch detected"

Which is probably good enough to get us out of the way of a critical, since it does not tell site builders to do destructive actions. It does still leave them without a clue of what to do, but at least they have something to Google for, I guess. Committing interim fixes at ~10 issues away from RC is definitely not my idea of a fun time, but I won't block it.

catch’s picture

fwiw I'd be OK with either a short message or no message to get this resolved. Even if we wanted to make the message a release blocker, I don't think it should hold up this issue getting committed since there's a lot more moving parts here to get right that need careful review.

gábor hojtsy’s picture

Well, my definition of a googleable things is not satisfied by "Mismatch detected" :D Maybe "Mismatched entity and/or field definitions." (which yes is repetitive, but is a unique enough string sequence to google for).

gábor hojtsy’s picture

gábor hojtsy’s picture

Updated screenshot. Will use this in an updated change notice as well.

gábor hojtsy’s picture

Issue summary: View changes
gábor hojtsy’s picture

So what is left is:

- DX improvements (not calling the onX methods directly),
- documentation updates for hook_update_N()

gábor hojtsy’s picture

Issue summary: View changes
Status: Needs review » Needs work
gábor hojtsy’s picture

Issue summary: View changes
Status: Needs review » Needs work
webchick’s picture

Issue summary: View changes
StatusFileSize
new59.44 KB
new863 bytes

Here's just a small update to fix the API docs. In addition to that, we'll need to update two handbook pages. Added the proposed text to the issue summary, which has been co-written by @Gábor.

effulgentsia’s picture

Status: Needs work » Needs review
StatusFileSize
new60.61 KB
new12.33 KB

Several comments in the issue (rightly) complained about update functions calling $entity_manager->on*() functions directly. So, here's a proposal for how to wrap that.

plach’s picture

StatusFileSize
new24.39 KB
new66.98 KB

Ugh, I just cross-posted with #144, so now we have two competing proposals (my interdiff is with #143) :(

Initially I wanted to leverage the new methods introduced in #2535082: Allow hook_update_N() implementations to run before the automated entity updates to apply individual updates. However I soon realized those are not perfectly suitable for the kind of needs we have here, so I added new methods on the update manager "competing" with them. I didn't do that yet, but I'd like to deprecate EntityDefinitionUpdateManagerInterface::applyFieldUpdates() and EntityDefinitionUpdateManagerInterface::applyEntityUpdates() or even remove them, if we are still allowed to do that. It's a BC break, but probably it won't affect many developers, since they could rely on auto updates so far.

If this approach is accepted, then we need a couple quick test coverage additions and we need to update documentation accordingly.

gábor hojtsy’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -166,6 +166,60 @@
    +  public function updateEntityType(EntityTypeInterface $entity_type) {
    +    $original = $this->entityManager->getLastInstalledDefinition($entity_type->id());
    +    $this->entityManager->onEntityTypeUpdate($entity_type, $original);
    +  }
    

    Why not call touchEntityType() here? Why not name it ...Definition()? Like with field storage? Would be good to be consistent.

  2. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -166,6 +166,60 @@
    +  public function touchEntityType($entity_type_id) {
    

    Same here for ....Definition()

  3. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -166,6 +166,60 @@
    +      // @todo What to do if a definition is already installed (e.g., from a
    +      //   partially run update function): nothing? update the definition
    +      //   to $definition? or throw an exception?
    

    Nothing. We did nothing in the manual versions too :)

  4. +++ b/core/modules/block_content/block_content.install
    @@ -13,36 +13,20 @@
    -      // @see \Drupal\Core\Entity\EntityManager::buildBaseFieldDefinitions()
    -      ->setProvider('block_content')
    -      ->setName('revision_translation_affected')
    -      ->setTargetEntityTypeId('block_content')
    -      ->setTargetBundle(NULL);
    
    +++ b/core/modules/node/node.install
    @@ -156,39 +156,24 @@
    -      // @see \Drupal\Core\Entity\EntityManager::buildBaseFieldDefinitions()
    -      ->setProvider('node')
    -      ->setName('revision_translation_affected')
    -      ->setTargetEntityTypeId('node')
    -      ->setTargetBundle(NULL);
    

    Why are these not needed anymore?

plach’s picture

Well, It seems they are very similar, so merging them should be easy, whew :D

plach’s picture

Assigned: plach » Unassigned

I'm unassigning now, just to avoid further confusion...

gábor hojtsy’s picture

Yeah they are very similar. I don't think we can continue working on this tonight. Would be great if there would be a discussion about this on the D8 criticals call, especially if fago, berdir could attend, but even if they don't. This is so close, so I hope we can obsess only a little over the API ;)

effulgentsia’s picture

Agreed. Please don't wait on me. Run with whatever you all come up with on the EuroCriticals call.

Status: Needs review » Needs work

The last submitted patch, 145: entity-rm_auto_updates-2542748-145.patch, failed testing.

catch’s picture

#146-3: if we don't do anything when the field is already installed, then how is this different from the update version? I think it is probably not and we could live with 'touch' and 'update', or just 'update'.

larowlan’s picture

  1. +++ b/core/lib/Drupal/Core/Extension/ModuleInstaller.php
    @@ -362,9 +381,25 @@ public function uninstall(array $module_list, $uninstall_dependents = TRUE) {
    +      $update_manager = \Drupal::entityDefinitionUpdateManager();
    

    Can this be injected?

  2. +++ b/core/modules/block_content/block_content.install
    @@ -0,0 +1,44 @@
    +      ->installFieldStorageDefinition($name, 'node', 'node', $storage_definition);
    

    Why is this using node if we're dealing with 'block_content'?

dawehner’s picture

Can this be injected?

Well we don't do at the moment for all the other services as well, but once #2380293: Properly inject services into ModuleInstaller we would have it injected. Note: The other issue just needs to be committed *hint* *hint*

plach’s picture

Status: Needs work » Needs review
StatusFileSize
new29.65 KB
new75.2 KB
7f4eac7 Moved core entity type installation before module installation.
3600bbe Removed obsolete methods.
3345786 Merged 144
7da10eb Code clean-up
cb8274b Updated documentation.

Will post longer notes later.

plach’s picture

#155 is not including the touch* methods, but I'm fine with introducing them if people think they are helpful. I personally find them a bit confusing, although I can see how they make things easier.

@catch:

The difference between install and update is that you have to specify the field name, entity type and provider, which would normally be populated by the entity manager.

@larowlan:

I didn't inject that because we have #2350111: Introduce an event to decouple the module installer from the entity manager that will move that code in a separate subscriber. Not injecting the update manager and the entity manager will avoid API changes when we do that.

catch’s picture

There's a combined patch on #2554151-24: Test content/configuration in update database dump.

That shows this fixes most of the upgrade path test failures which are exposed when our implicit testing actually runs (see explanation in that issue for why it doesn't at the moment).

We won't be able to fix that issue until this is fixed, but this issue benefits from the additional test coverage there. Discussed this on the EuroCriticals call and we thought it's easiest to:

- get this issue in first (which means the test coverage is blocked on this issue)

- post combined patches on that issue so we know if anything else fails - and whether it fails because of a bug/omission here or something else wrong in the upgrade path. That way we don't miss out on the extra test coverage too much.

- post new critical issues for any upgrade path bugs that aren't related to this which come out of the testing issue.

- get the corrected database dump and implicit test coverage committed asap once that can be done without HEAD test failures

Something like that anyway.

plach’s picture

We should be ready for reviews here.

plach’s picture

Issue summary: View changes

Probably CRs need to be updated now, I'll do it later if none beats me to it.

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManagerInterface.php
    @@ -84,67 +89,84 @@ public function getChangeSummary();
    +   * Returns a field storage definition ready to be manipulated.
    +   *
    ...
    +  public function getFieldStorageDefinition($name, $entity_type_id);
    

    Its not really clear for me what this is used for. Maybe we could document when exactly we want to call out to that method?

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1391,6 +1391,20 @@ protected function updateSharedTableSchema(FieldStorageDefinitionInterface $stor
    +                  throw new EntityStorageException('The "' . $column_name . '" column cannot have NOT NULL constraints as it holds NULL values.');
    

    Nitpick: you could just use double quotes.

  3. +++ b/core/lib/Drupal/Core/Extension/ModuleInstaller.php
    @@ -212,14 +211,34 @@ public function install(array $module_list, $enable_dependencies = TRUE) {
    +          elseif ($entity_type->isSubclassOf('\Drupal\Core\Entity\FieldableEntityInterface')) {
    

    Just for future refernce: We can now use FieldableInterface::CLASS with php 5.5

plach’s picture

StatusFileSize
new11.97 KB
new81.68 KB

I discussed with @dawehner some improvements to the documentation. This also restores some test coverage that was inadvertently removed in previous versions.

Nitpick: you could just use double quotes.

We should really introduce a coding standard about this. Personally I prefer to concatenate strings because that way variables are easier to spot when syntax highlighting is not available...

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Thank you for addressing the feedback.

jhedstrom’s picture

I'm a bit concerned about this going in prior to #2555183: Fix the filled update tests, they are broken, since those are our only tests with content.

dawehner’s picture

I'm a bit concerned about this going in prior to #2555183: Fix the filled update tests, they are broken, since those are our only tests with content.

Well, we sorta have a circular dependency ... dont' we?

stefan.r’s picture

@dawehner that issue ought to be merely about fixing the existing filled DB tests, I agree it with @jhedstrom it would be nice if that could go in before this... We just have 2 or 3 test failures to resolve there anymore.

catch’s picture

#2554151-24: Test content/configuration in update database dump made a big difference to that issue, is that because the new beta12 dump we have in core isn't tested at all both before and after #2542748: Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state and we'll only get that testing once #2554151: Test content/configuration in update database dump is done?

plach’s picture

I'm fine to postpone this on #2555183: Fix the filled update tests, they are broken, but it would be good to get some final reviews on the new DX proposed here just to be sure the work on this is actually complete. That would also allow me to update the CRs :)

catch’s picture

Confirmed that this issue fixes test failures on #2555183: Fix the filled update tests, they are broken. Given that, we should definitely not block this issue on that one because we'll either have to duplicate work or we'll have a circular dependency.

I need to catch up on the interdiffs from #155 onwards before I can feel comfortable committing this.

Although if another committer does that and commits I wouldn't mind - was very happy up until then and doubt they've made it worse.

jhedstrom’s picture

Agreed we can get this in now, rather than have a circular dependency :)

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 161: entity-rm_auto_updates-2542748-161.patch, failed testing.

mpdonadio’s picture

Status: Needs work » Needs review
StatusFileSize
new81.5 KB

Reroll b/c #2557519: Remove many usages SafeMarkup::checkPlain() and replace with Html::escape()

$ git checkout 8540c2f1c5bc86cd6db45b86ec0cc367c1b1fa65
$ git apply --index entity-rm_auto_updates-2542748-161.patch
$ git rebase 8.0.x
First, rewinding head to replay your work on top of it...
Applying: entity-rm_auto_updates-2542748-161
Using index info to reconstruct a base tree...
M	core/includes/update.inc
Falling back to patching base and 3-way merge...
Auto-merging core/includes/update.inc
CONFLICT (content): Merge conflict in core/includes/update.inc
Failed to merge in the changes.
Patch failed at 0001 entity-rm_auto_updates-2542748-161
The copy of the patch that failed is found in:
   /Users/mpdonadio/Documents/PhpStorm/drupal-8.0.x/.git/rebase-apply/patch

When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

Simple conflict in the use section. Chose HEAD. Not seeing SafeMarkup in the file after the reroll.

webchick’s picture

Status: Needs review » Reviewed & tested by the community

Testbot approved, back to RTBC.

effulgentsia’s picture

Patch looks great to me. I'm about to commit it. First, checking boxes to add credit to @catch and @dawehner for their multiple reviews.

effulgentsia’s picture

In #120, @webchick asked to not be credited, so unchecking that box.

effulgentsia’s picture

Hm. d.o. seems to not want to allow uncrediting webchick. Perhaps it knows best :)

  • effulgentsia committed ba95bbb on 8.0.x
    Issue #2542748 by plach, effulgentsia, jhedstrom, Gábor Hojtsy, alexpott...
effulgentsia’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed to 8.0.x. Thanks everyone, and especially @plach, for the great work on this.

Here's some feedback for noncritical followup material.

  1. +++ b/core/modules/node/node.install
    @@ -153,3 +151,73 @@ function node_uninstall() {
    +  // https://www.drupal.org/node/2542748. In between, sites could have
    +  // performed interim updates, which would have included automated entity
    +  // schema updates prior to that being removed (see that issue for details).
    +  // Therefore, we check for whether the keys have already been installed.
    

    We no longer do this check (at least not explicitly here), so some of this comment can be removed.

  2. +++ b/core/modules/node/node.install
    @@ -153,3 +151,73 @@ function node_uninstall() {
    +  //   field schema by whether a field is an entity key, so invoke
    +  //   onFieldStorageDefinitionUpdate() with an unmodified
    

    We're not invoking onFieldStorageDefinitionUpdate() (at least not directly).

  3. +++ b/core/modules/system/src/Tests/Entity/EntityDefinitionUpdateTest.php
    @@ -610,21 +610,86 @@ public function testEntityTypeSchemaUpdateAndRevisionableBaseFieldCreateWithoutD
    +    // Ensure that installing an existing entity type is a no-op.
    +    $this->entityDefinitionUpdateManager->installFieldStorageDefinition('new_base_field', 'entity_test_update', 'entity_test', $storage_definition);
    +    $this->assertTrue($db_schema->fieldExists('entity_test_update', 'new_base_field'), 'Installing an existing entity type is a no-op');
    

    Great test but wrong comment and assertion message.

  4. +++ b/core/modules/system/src/Tests/Entity/Update/SqlContentEntityStorageSchemaIndexTest.php
    @@ -19,11 +19,6 @@ class SqlContentEntityStorageSchemaIndexTest extends UpdatePathTestBase {
    -  protected static $modules = ['update_order_test'];
    

    Nothing is using update_order_test any more, so let's remove the module entirely.

  5. +++ b/core/modules/system/src/Tests/Update/DbUpdatesTrait.php
    @@ -0,0 +1,57 @@
    +  protected function runUpdates() {
    

    It's a little weird to have a DbUpdatesTrait::runUpdates() and a UpdatePathTestBase::runUpdates(), and for them to have different implementations. Let's either clean that up or document why they're different.

plach’s picture

Thank you all!

I will post a patch addressing #177 and update CRs later today.

gábor hojtsy’s picture

@plach: updated and published the coder change notice at https://www.drupal.org/node/2554097/revisions/view/8797233/8821371, updated and published the user facing change notice at https://www.drupal.org/node/2554101/revisions/view/8801563/8821351. Hope I got them right.

plach’s picture

Priority: Critical » Normal
Status: Fixed » Needs review
StatusFileSize
new12.25 KB

I updated the CRs with some more details and examples. Here's a patch addressing #177 and performing some additional clean-up (I'll restore the issue priority once it's committed).

plach’s picture

Issue tags: +entity storage
effulgentsia’s picture

#180 addresses 90% of #177, so thank you. But #177.5 is still not fully addressed:

+++ b/core/modules/system/src/Tests/Update/DbUpdatesTrait.php
@@ -32,9 +35,9 @@ protected function enableUpdates($module, $group, $index) {
-   * Runs DB updates.
+   * Applies any pending DB updates through the Update UI.
    */
-  protected function runUpdates() {
+  protected function applyUpdates() {

This is a step in the right direction, but it still doesn't say why a test should call this instead of UpdatePathTestBase::runUpdates(). Really, I'm left with three questions here:

  1. Why/when should a test call applyUpdates() instead of runUpdates()? One reason I can think of is if you don't want runUpdates() asserting that entity schema is up to date, if for example, your test is by design testing the case where it's not. Any other reasons? Let's make sure to get all the important reasons into the docs.
  2. Why does UpdateApiEntityDefinitionUpdateTest not subclass UpdatePathTestBase?
  3. If it did, could/should we move applyUpdates() into that base class so that it's near runUpdates()? At which point, should we rename applyUpdates() to doRunUpdates() and have runUpdates() call it instead of duplicating its implementation?
effulgentsia’s picture

I'd also be ok with any or all of #182 get punted to a new issue.

plach’s picture

@effulgentsia:

UpdatePathTestBase requires a DB dump to work, while UpdateApiEntityDefinitionUpdateTest has no need for that: the effort of creating a sample entity is way less than providing a filled DB dump, at least at the time I wrote the test. Also, while usually the goal of tests extending UpdatePathTestBase is testing some specific, real, update function, the goal of UpdateApiEntityDefinitionUpdateTest is ensuring that the API used in a couple of test update functions behaves correctly. This implies executing updates multiple times instead of performing a single run and comparing that the status after the update is the expected one.

Actually ::runUpdates() and ::applyUpdates() have more or less the same meaning, although the latter does less, i.e. just triggers the updates execution, as it does not need to assert anything. Much like the ::reloadEntity() function that can be found implemented in a few different tests just because it makes the test code more readable. I decided to rename them to avoid confusion, but they could be very well named the same.

I don't see much value in trying to move ::applyUpdates() in UpdatePathTestBase as then using it would mean inheriting from that base class, which is not something one may always wish to do.

Unrelated, I just posted a very rough proposal in #2554911-2: Inform users/developers about entity/field mismatch details if it happens.

plach’s picture

Assigned: Unassigned » effulgentsia

@effulgentsia:

Should I open a separate task to clean this up?

effulgentsia’s picture

Title: Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state » Follow-up to Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state
Assigned: effulgentsia » Unassigned
Status: Needs review » Reviewed & tested by the community

#180 looks like a great improvement to me, so RTBC. #182 / #184 can be a new issue.

alexpott’s picture

Priority: Normal » Critical
Status: Reviewed & tested by the community » Fixed

Committed 7253263 and pushed to 8.0.x. Thanks!

I wish that #180 had been posted on a new issue since on issue followups are bad for so many reasons - for example, it messes with the credit system and it unnecessarily changes issue priority and messes with issue statistics.

  • alexpott committed 7253263 on 8.0.x
    Issue #2542748 by plach: Follow-up to Automatic entity updates can fail...
plach’s picture

Title: Follow-up to Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state » Automatic entity updates can fail when there is existing content, leaving the site's schema in an unpredictable state

Sorry, next time I will open a follow-up

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.