Problem/Motivation

There is no way for a hook_update_N to come before the automated updates done by the entity manager when an entity type changes.

See #2464427-96: Replace CacheablePluginInterface with CacheableDependencyInterface and #2533282-8: Make schema methods' visibility public in SqlContentEntityStorage.

An example problem:

  1. Updates to both views schema and entity type
  2. The views hook_update_N needs to run before the entity type updates because ViewsEntitySchemaSubscriber::onEntityTypeUpdate() saves all views when an entity type changes.

Proposed resolution

Add ability to make updates come before the automatic entity type updates using hook_update_dependencies(). It could return something like:
$dependencies['core']['update_entity_definitions'] = ['views' => 8001];

The current solution makes the automatic entity updates just like a regular update if an update needs to ensure that it comes before or after it then it needs to implement hook_update_dependencies. With this patch if there are no dependencies the order is essential random - it's up to how php manages arrays.

Remaining tasks

  • Agree approach
  • Write patch
  • Review Commit

User interface changes

None

API changes

None

Data model changes

None

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

alexpott’s picture

Status: Active » Needs review
FileSize
10.1 KB

This patch makes the automatic entity update more like a regular update that can be part of hoo_update_dependencies(). It means we have to do some special casing of the update - but I can't see anyway of avoiding that if we want to use the dependency graph to do this.

alexpott’s picture

We don't have to change update_do_one() - we can just call a different operation - like we do in HEAD.

alexpott’s picture

I think we might have to force everything that does not have to come before the automatic entity updates to come after - otherwise the order will be what ever PHP arrays and our graph implementation comes up with.

The last submitted patch, 1: 2535082.1.patch, failed testing.

jibran’s picture

Drupal\system\Tests\Update\DependencyOrderingTest 4 passes 1 fails Test needs an update.

Status: Needs review » Needs work

The last submitted patch, 2: 2535082.2.patch, failed testing.

alexpott’s picture

Status: Needs work » Needs review
FileSize
2.37 KB
11.6 KB

Fixing tests. All installs need the \Drupal::service('entity.definition_update_manager')->applyUpdates(); - not just standard.

jibran’s picture

#7 seems nasty bug. I don't know about update.inc changes but rest seems good to me including tests and it made #2464427: Replace CacheablePluginInterface with CacheableDependencyInterface green. Nice work finding and fixing it @alexpott.

alexpott’s picture

Issue summary: View changes
FileSize
1.21 KB
12.81 KB

Adding some tests for the change in #7

alexpott’s picture

Let's use the core namespace here instead of the system namespace.

alexpott’s picture

Issue summary: View changes
webchick’s picture

Issue tags: +beta target

Since we're aiming to get the beta-to-beta upgrade path supported by the next beta, and since this is a blocker to that happening, tagging "beta target."

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Needs change record

This is pretty nice overall.

Do we need a change record?

Even if it just affects drush I think knowing the new update dependency is important.

alexpott’s picture

I've created https://www.drupal.org/node/2535396 and also added to the issues that introduced the automatic entity storage update code.

Fabianx’s picture

Issue tags: -Needs change record

Wonderful change record!

jhedstrom’s picture

I haven't had a chance to test this yet, but looking at the docs, can a module specify more than one update hook to run first? In head2head, we'll probably want to run *all* of them first.

+++ b/core/lib/Drupal/Core/Extension/module.api.php
@@ -596,6 +596,20 @@ function hook_update_dependencies() {
+  $dependencies['core']['update_entity_definitions'] = array(
+    'mymodule' => 8007
+  );

.

alexpott’s picture

@jhedstrom it just needs to specify a dependency on the latest head2head update as all the others will be guaranteed to run before it. This is how the system works.

jhedstrom’s picture

@alexpott ah, that makes total sense--I was too focused on the details :)

jhedstrom’s picture

Status: Reviewed & tested by the community » Needs review

So I was going to test this in combination with #2530940-12: Provide beta7 update tests, but that test went green with this patch applied to core before I implemented the update dependencies hook. So something is slightly off here, since update hooks are now running before the automated schema updates regardless of hook_update_dependencies.

jhedstrom’s picture

It appears to be due to module weight in update_resolve_dependencies(). The order is fine until

  uasort($graph, array('Drupal\Component\Utility\SortArray', 'sortByWeightElement'));

and then the beta2beta hooks are suddenly on top (having not implemented hook_update_dependencies()).

If I implement hook_update_dependencies and specify that the entity schema updates should run first, that works.

jhedstrom’s picture

Having said all that, does it matter if update_entity_definitions runs in between other update hooks (in a somewhat random order until update hooks start to declare dependencies on it)?

alexpott’s picture

@jhedstrom I documented this behaviour in the IS

The current solution makes the automatic entity updates just like a regular update if an update needs to ensure that it comes before or after it then it needs to implement hook_update_dependencies. With this patch if there are no dependencies the order is essential random - it's up to how php manages arrays.

alexpott’s picture

Status: Needs review » Needs work

The point is now you can control whether an update comes before or after an entity update. However I think we have a problem because of a situation like this.

  • A module provides an update hook 8001 that has to run after the auto update service.
  • The same module then provides an update hook 8002 that has to run before the the auto update service.

The module is then upgraded on a site which is at version 8000. This would result in an uncalculable dependency tree :(

jhedstrom’s picture

re #22 sorry I missed that bit.

Regarding the incalculable dependencies, perhaps a way to also entirely disable the automatic updates for the really tricky situations?

plach’s picture

FileSize
61.59 KB

I discussed #23 with @alexpott and @berdir, you can find the full log attached.

The idea we came up with is assigning an identifier (an hash) to each single entity update, e.g. "add FOO field" or "delete FOO field" and add the ability to specify a dependency for a single entity update.

There are tricky edge cases though (surprise). Consider the following scenario where pseudo-hashes in the Update ID column denote entity updates, and the row order implies update dependencies:

Release Update ID Description
Bar 1.1 8001 Install the Foo submodule that defines the foo1 field.
Bar 1.1 #addf001 Add the foo1 field schema.
Bar 1.2 #addf002 Add the foo2 field schema.
Bar 1.2 8002 Move data from foo1 to foo2.
Bar 1.2 #de1f001 Delete the foo1 field schema.
Bar 1.3 #addf001 Add the foo1 field schema again, with a slightly different purpose.
Bar 1.3 8003 Populate foo1 by deriving its data from foo2 values.

If we apply the updates incrementally everything will work fine (hopefully :), but if we go straight from Bar 1.0 to Bar 1.3 we will have a problem: since the foo1 and foo2 field definitions, would be both defined in the Foo submodule codebase, #addf001 will happen just once, while #de1f001 will never be detected. This will result in unmet dependencies, since #addf001 will be marked as depending on #de1f001 (which is not defined). Ideally the following sequence would work:

  1. 8001
  2. [#de1f001] (ignored)
  3. #addf001
  4. #addf002
  5. 8002
  6. 8003

but only if we ignored the unmet dependency, which may be fair, since blocking the update process on an non-existing update may not make sense. Another possibility could be not adding the dependency, since for incremental updates it wouldn't be needed and when updating from 1.0 to 1.3 it would only be problematic.

Overall I'm not sure whether we should block this bug fix on such an edge case, but maybe someone can come up with a more common use case.

If we decide to implement individual entity updates, we will need to key the change list by update id and add an ::applyUpdate($id) method to the entity update manager. Btw, this will also let modules apply their own updates on install instead of requiring users to apply updates manually after installing them, which feels quite counter intuitive. This would basically address parts of #2346013: Improve DX of manually applying entity/field storage definition updates

jhedstrom’s picture

re #25 this seems overly complex to me.

e.g. "add FOO field" or "delete FOO field" and add the ability to specify a dependency for a single entity update

The problems thus-far encountered weren't around adding or deleting fields in the entity storage, but more simplistic schema changes (eg, the type varchar_ascii change), or unrelated changes that threw exceptions such as the removal of a views plugin.

I think the patch in #10 is probably good as-is (for instance, it allows one to go from beta7 directly to beta12+ within a single run of updates). The alternative being pursued in the head2head module was to set a state that disabled auto-updates (#2532356: Provide a wrapper to core entity.definition_update_manager to temporarily disable updates), and this also worked fine, for a very complex update path (7 to 12), the only difference being that one would have to manually disable auto-updates, run updates, then re-enable, and run updates again.

jhedstrom’s picture

I also meant to add, that for the unresolvable dependency issue noted above, the second update hook could invoke the same updates as the auto-updates by simply including:


  try {
    \Drupal::service('entity.definition_update_manager')->applyUpdates();
  }
  ...

or directly calling update_entity_definitions(), rather than declare those as an update dependency.

Fabianx’s picture

#27: Unfortunately that problem still exists.

- Schema adds new [something]
- Update A_8001 updates the content with a default value for [something]

--

- Schema changes from [something] to [some_bar]
- Update A_8002 wants to fix the data to change from something top some_bar.

=> Per the solution here:

Update A_8002 needs to run first, because else the schema is invalid when it tries to load the data to fix the data.

So:

- Schema adds new [something]
- Update A_8001 updates the content with a default value for [something]

--

- Update A_8002 wants to fix the data to change from something to some_bar
- Schema changes from [something] to [some_bar]

--

However schema updates are not versioned, so when you come from 8000 then you end up with either:

- Schema adds new [something]
- Schema changes from [something] to [some_bar]

- Update A_8001 updates the content with a default value for [something] - FAIL
- Update A_8002 wants to fix the existing data to change from something to some_bar. - FAIL

--

or

- Update A_8001 updates the content with a default value for [something] - FAIL
- Update A_8002 wants to fix the data to change from something to some_bar. - FAIL

- Schema adds new [something]
- Schema changes from [something] to [some_bar]

--

=> Probably entity schema updates need to be versioned and old schema's be preserved.

e.g. a possible solution could theoretically be:

- Schema adds new [something]
- Schema changes from [something] to [some_bar]

- Update A_8001 updates the content with a default value for [something], but has a stored state for schema that is overridden and loaded - OK.
- Update A_8002 wants to fix the existing data to change from something to some_bar, but has a stored state for schema that is overridden and loaded. - OK

=> My proposal:

When you write a data-fixing hook_update_N(), store the current schema state in some dump file and override the entity schema before your update starts, then remove the override again.

That could also be a closure for sanity, e.g.:

executeWithEntitySchemaDump(__DIR__ . "/dumps/8001_schema.dump", function() {
  // Do data updates with old schema.
});

etc.

That would also fix the problem in the IS and entity updates could happily come first again independently.

plach’s picture

@jhedstrom:

The problems thus-far encountered weren't around adding or deleting fields in the entity storage, but more simplistic schema changes (eg, the type varchar_ascii change)

Well, I'm not sure that's a good argument to ignore #23, in fact disabling entity updates altogether might not be a solution, if we end-up with different update functions with competing dependencies on them. However being able to mark an individual update as "fixed" before it runs, and then manually applying equivalent changes in an regular update function should work. And that's one of the proposals in #2346013: Improve DX of manually applying entity/field storage definition updates.

@Fabianx:

Not sure whether a full entity schema dump would help. Regular update functions need to transition between known states. Entity updates are currently a black box, because they are applied in bulk and cannot be identified individually.

All the examples above rely on the implicit assumption of knowing which is the current state, which may be true only if the module is relying on entity updates concerning items the module itself, or a related known module (i.e. with a soft/hard dependency relating the two), is a provider for. IMO a dump would be useful only if it were possible to dump individual changes.

plach’s picture

If targeting individual updates is deemed too complex, also targeting all the updates for a certain provider could work. Mark all the Foo module updates as fixed, and re-create them manually via a regular update function.

jhedstrom’s picture

@plach, sorry, I didn't mean to imply that either #23 or #25 should be ignored. I meant to suggest a workaround for #23 (by manually calling the entity updates from within the second hook). For #25 and #28, perhaps we could do a follow-up major issue, since this patch addresses what will most likely be very common scenarios.

alexpott’s picture

The problem is with the current patch the ordering is not reliable. If you use hook_update_dependencies() to say that update_test_order_update_8002 should come after the entity updates but update_test_order_update_8003 should come before what actually happens is this...

  • If the schema version of update_test_order is not set or 8001 then BOTH update_test_order_update_8002 and update_test_order_update_8003 will run AFTER the entity updates.
  • If the schema version of update_test_order is not set or 8002 then (of course) update_test_order_update_8002 will not run BUT update_test_order_update_8003 will run BEFORE the entity updates.

This behaviour can't be right - but I have no idea what the actual behaviour should be.

alexpott’s picture

Status: Needs work » Needs review
FileSize
24.3 KB

Discussed with @plach and @catch on IRC. @plach and I have concluded that perhaps the best approach is this:

  1. to run all the automatic updates after the hook_update_N since it has become obvious that by running them first we can't make anything consistent
  2. modules should be able to run specific updates early
  3. @plach is also recommending that the post update event is fired after all the hook_update_N's have been fired

The patch attached does 1 and 2 but I can't see how to do 3. Further discussion of that is needed. This patch also calls into question that entity API's are reliable during hook_update_N. (Well actually that is called into question by the issues we're having in HEAD2HEAD and #2464427: Replace CacheablePluginInterface with CacheableDependencyInterface).

No interdiff because it is a new approach it's almost the size of the patch.

plach’s picture

Quick review

  1. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -127,25 +116,100 @@ public function applyUpdates() {
    +  public function entityUpdate($op, $entity_type_id) {
    +    $change_list = $this->getChangeList();
    +    if (!isset($change_list[$entity_type_id]) || $change_list[$entity_type_id]['entity_type'] !== $op) {
    +      return FALSE;
    ...
    +  public function fieldUpdate($op, $entity_type_id, $field_name) {
    +    $change_list = $this->getChangeList();
    +    if (!isset($change_list[$entity_type_id]['field_storage_definitions']) || $change_list[$entity_type_id]['field_storage_definitions'][$field_name] !== $op) {
    +      return FALSE;
    

    Didn't we agree to throw an exception here?

    Also, I'd find applyFieldUpdate and applyEntityUpdate more consistent as names.

  2. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -127,25 +116,100 @@ public function applyUpdates() {
    +    // getChangeList() only disables the cache and does not invalidate.
    ...
    +    // getChangeList() only disables the cache and does not invalidate.
    

    Shouldn't we use ::getChangeList() here?

  3. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -127,25 +116,100 @@ public function applyUpdates() {
    +      case static::DEFINITION_CREATED:
    +        $this->entityManager->onEntityTypeCreate($entity_type);
    +
    +      case static::DEFINITION_UPDATED:
    +        $original = $this->entityManager->getLastInstalledDefinition($entity_type_id);
    +        $this->entityManager->onEntityTypeUpdate($entity_type, $original);
    

    Missing breaks

alexpott’s picture

Thanks @plach

  1. Changed the names. The reason I didn't throw an exception was because I think that update code should have the option of checking the return value and doing different things depending.
  2. Sure - fixed although this is a c&p :)
  3. Fixed
plach’s picture

Thanks @alex, I'll perform a more thorough review later tonight. Meanwhile it would be good to hear what @jhedstrom thinks about this new approach.

Changed the names. The reason I didn't throw an exception was because I think that update code should have the option of checking the return value and doing different things depending.

Couldn't it just catch the exception?

jhedstrom’s picture

The approach outlined in #33 makes perfect sense to me. I've done manual testing with head2head of the patch in #35, and it is working as expected.

+1 for RTBC.

Berdir’s picture

  1. +++ b/core/includes/install.core.inc
    @@ -1656,6 +1656,12 @@ function install_profile_themes(&$install_state) {
     function install_install_profile(&$install_state) {
    +  // Now that all modules are installed, make sure the entity storage and other
    +  // handlers are up to date with the current entity and field definitions. For
    +  // example, Path module adds a base field to nodes and taxonomy terms after
    +  // those modules are already installed.
    +  \Drupal::service('entity.definition_update_manager')->applyUpdates();
    

    Nice to see that install profiles don't have to call this themself anymore.

  2. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -127,25 +116,102 @@ public function applyUpdates() {
    +    // self::getChangeList() only disables the cache and does not invalidate.
    +    // In case there are changes, explicitly invalidate caches.
    +    $this->entityManager->clearCachedDefinitions();
    

    Wondering if we can optimize those calls and do it only once?

    I've seen very long list of field updates, when e.g. a certain field type schema changes. I guess we do a lot of cache clearing right now. That said, if we can check that there are no calls in between those updates usually (unless one is called specifically and I guess that's exactly when we need this call?) then that is also fine. The cache clear itself is fast, we just re-empty a bunch of properties and invalidate tags, which are optimized to be ignored on repeated calls.

Looks good to me a well otherwise. I'll leave it to @plach to do his final review but I'm fine to RTBC this as soon as he is happy with it.

There are still some important things left to do, like the ability to tell the entity manager that you did some manual schema updates that he couldn't handle. But we can tackle that in non-critical issues as they should be API additions (eg. the ability to update last installed field definitions).

plach’s picture

Looks great, only a couple of minor issues, RTBC otherwise.

  1. +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManager.php
    @@ -127,25 +116,102 @@ public function applyUpdates() {
    +   * @param string $op
    +   *   The operation to perform - either static::DEFINITION_CREATED or
    +   *   static::DEFINITION_UPDATED
    
    +++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManagerInterface.php
    @@ -83,4 +83,55 @@ public function getChangeSummary();
    +   * @param string $op
    +   *   The operation to perform - either static::DEFINITION_CREATED or
    +   *   static::DEFINITION_UPDATED
    ...
    +   * @param string $op
    +   *   The operation to perform - either static::DEFINITION_CREATED or
    +   *   static::DEFINITION_UPDATED
    

    Missing trailing dot.

  2. +++ b/core/modules/system/src/Tests/Entity/EntityDefinitionUpdateTest.php
    @@ -604,4 +605,23 @@ public function testEntityTypeSchemaUpdateAndRevisionableBaseFieldCreateWithoutD
    +    // Create a new base field.
    +    $this->addRevisionableBaseField();
    +    $this->assertTrue($this->entityDefinitionUpdateManager->applyFieldUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_CREATED, 'entity_test_update', 'new_base_field'), 'Calling applyFieldUpdate() correctly returns TRUE.');
    +    // Make the entity type revisionable.
    +    $this->updateEntityTypeToRevisionable();
    +    $this->assertTrue($this->entityDefinitionUpdateManager->applyEntityUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_UPDATED, 'entity_test_update'), 'Calling applyEntityUpdate() correctly returns TRUE.');
    

    Shouldn't we assert also that the schema was actually updated correctly?

There are still some important things left to do, like the ability to tell the entity manager that you did some manual schema updates that he couldn't handle. But we can tackle that in non-critical issues as they should be API additions (eg. the ability to update last installed field definitions).

+1

I think #2346013: Improve DX of manually applying entity/field storage definition updates is the right issue for that.

effulgentsia’s picture

This patch looks great, and I agree is a step forward, so +1 for RTBC and commit once Berdir, plach, and alexpott are satisfied.

I don't think the use cases from #28 have been addressed though. For example, let's say path.module has a 8001 that removes its custom storage and lets regular field storage be used, and includes its own update function to migrate the data from the old custom storage to the regular field storage. Then, let's say path.module has a 8002 that renames the pid property to path_id (core wouldn't do such a thing, cause that would be an API change, but a contrib field type module might, especially during its beta cycle).

So, how would such a module do its update if a site goes straight from the pre-8001 codebase to the 8002 codebase? Because path_update_8001(), if it calls \Drupal::service('entity.definition_update_manager')->applyFieldUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_UPDATED, 'node', 'path'), with the most recent codebase, would end up with a schema whose column is already path_id, but the rest of the path_update_8001() function assumes pid.

Possible solutions:

  • Make the new apply*Update() functions take optional definition objects, not just names, so a hook_update_N() function can put a schema into a fixed state.
  • Have cases like this go back and retroactively modify their old hook_update_N() functions to match the new codebase.
  • Other ideas?

I don't think we need to solve that in this issue, but wondering if there's any sense on if/how we'll solve it and what to tell developers in the meantime.

Berdir’s picture

I don't think that specific example could work since I very much doubt we can automatically update a schema that already exists and is a new entity type, so I guess that would need to be worked out manually anyway. But I get what you mean.

As discussed above, I think that #2346013: Improve DX of manually applying entity/field storage definition updates will need to add public API to set the last installed field storage definitions/schema manually. That might require quite a bit of code to initialize them but should be able to deal with some cases.

Other ideas:
* Introduce some sort of callback system for the entity type/field type providers. Where the entity manager can call out to those and tell them that he has a set of old/changed/new/deleted field storage definitions (entity type provider) or an old/new single storage definition (field type) that he can't handle. Then the can do whatever they need to get from the old state to the new. Including things like renames if it actually is a rename. Those callbacks could get very complicated as well if they have to support a set of different changes over time in every combination but that might be better suited to how entity schema updates work? For example, they only every have to worry about getting to the final state from different originating states and can be updated over time.

* Push the schema introspection API, so that the entity update manager doesn't have to rely on stored schemas that he *thinks* is what in the database but just ask the database about the current schema. That could avoid the thing above that requires us to manually build and store the current state.

Fabianx’s picture

#41 sounds very sensible to me.

1. The callbacks idea sounds nice and feels like a proper step - even if that can get complicated, too - it allows hooking into the updates to do things, when they are done, so schema and DB are always in sync.

2. I think we should do that anyway. Is that currently still on the table for 8.0.x or was it postponed to 8.1.x? (Or in other words: Do you have an issue link?)

catch’s picture

#2 there is #1119094: Add a test to verify the schema matches after a database update (which I've just re-opened as it's relevant now again). I can't find an issue for the schema introspection API itself yet.

alexpott’s picture

Addressed @Berdir's review in #38 and @plach's review in #39.

It seems we have all the follow-ups in place.

I still have a concern about firing the event and views listener as the only reason this fixes #2464427: Replace CacheablePluginInterface with CacheableDependencyInterface is that there are no updates which call applyEntityUpdate or applyFieldUpdate. That said I guess I've answered this already if that happens then we just need to add a hook_update_dependencies() for that update and say it has to run after the views update in #2464427: Replace CacheablePluginInterface with CacheableDependencyInterface. This makes me concerned about the number of hook_update_dependencies() implementations we're going to have in 8.

effulgentsia’s picture

Status: Needs review » Reviewed & tested by the community

#38 and #39 said they were +1 on RTBC once those nits were addressed, and #44 addresses them, so RTBC. Great work here!

dawehner’s picture

NIce work!

xjm’s picture

Overall this approach makes sense to me and the docs/tests/etc. look great.

+++ b/core/lib/Drupal/Core/Entity/EntityDefinitionUpdateManagerInterface.php
@@ -83,4 +83,65 @@ public function getChangeSummary();
+   * This method should be used from hook_update_N functions to process entity
+   * definition updates as part of the update function. This is only necessary
+   * if the hook_update_N implementation relies on the entity definition update.
...
+   * This method should be used from hook_update_N functions to process field
+   * storage definition updates as part of the update function. This is only
+   * necessary if the hook_update_N implementation relies on the field storage
+   * definition update.

Doesn't need to block the patch, but I think it would be good to explicitly state here that the entity updates will happen automatically at the end if this is not used. (Also, hook_update_N() should have parens so api.d.o can automatically link it.)

(Still reviewing the patch.)

jhedstrom’s picture

I can't find an issue for the schema introspection API itself yet

There's #301038: Add a cross-compatible database schema introspection API, which was the @todo linked from #2497323: Create a php script that can dump a database for testing update hooks.

jhedstrom’s picture

xjm’s picture

Assigned: Unassigned » xjm
  1. +++ b/core/includes/install.core.inc
    @@ -1656,6 +1656,12 @@ function install_profile_themes(&$install_state) {
     function install_install_profile(&$install_state) {
    +  // Now that all modules are installed, make sure the entity storage and other
    +  // handlers are up to date with the current entity and field definitions. For
    +  // example, Path module adds a base field to nodes and taxonomy terms after
    +  // those modules are already installed.
    +  \Drupal::service('entity.definition_update_manager')->applyUpdates();
    
    +++ b/core/profiles/standard/standard.install
    @@ -16,12 +16,6 @@
    -  // Now that all modules are installed, make sure the entity storage and other
    -  // handlers are up to date with the current entity and field definitions. For
    -  // example, Path module adds a base field to nodes and taxonomy terms after
    -  // those modules are already installed.
    -  \Drupal::service('entity.definition_update_manager')->applyUpdates();
    

    Well THIS is interesting... uh. So yes, profiles other than standard should get the entity updates that are necessary based on things done during installation...

    I started to try to understand the big picture for install_install_profile by looking at install_tasks(), and regretted it.

  2. +++ b/core/includes/update.inc
    @@ -224,14 +224,10 @@ function update_do_one($module, $number, $dependency_map, &$context) {
    - * @param $module
    - *   The module whose update will be run.
    - * @param $number
    - *   The update number to run.
    ...
    -function update_entity_definitions($module, $number, &$context) {
    +function update_entity_definitions(&$context) {
    
    +++ b/core/modules/system/src/Controller/DbUpdateController.php
    @@ -578,7 +571,7 @@ protected function triggerBatch(Request $request) {
    -    foreach ($updates as $update) {
    +    foreach ($updates as $function => $update) {
    
    @@ -587,11 +580,18 @@ protected function triggerBatch(Request $request) {
    -        $function = $update['module'] . '_update_' . $update['number'];
    

    I looked at this awhile and talked to @alexpott to convince myself that change didn't mean that entity update tasks items would overwrite each other, but @alexpott pointed out that there aren't ever multiple entity update batch items--this one batch item runs all of them, no matter how many there are. In the one use of the callback in DbUpdateController::triggerBatch() where it sets the batch, these parameters are actually hardcoded for that one use ever. And, as far as I can see, there is no way ever to alter it. Unless, like, someone altered the route to override the batch, but that is cloud cuckoo land. (Pronounced "coo-coo".)

    I also asked myself whether it was possible any other code anywhere was calling the batch operation directly such that we would need to document this BC break, but it seems extremely unlikely, so I don't think we should create a CR. Which is probably obvious but I just wanted to make sure we'd thought about it.

  3. +++ b/core/modules/system/src/Controller/DbUpdateController.php
    @@ -587,11 +580,18 @@ protected function triggerBatch(Request $request) {
    +    // Lastly, perform entity definition updates, which will update storage
    +    // schema if needed. If module update functions need to work with specific
    +    // entity schema they should call the entity update service for the specific
    +    // update themselves.
    

    "Call the entity update service" is vague -- an @see to the fully namespaced relevant methods would be helpful.

  4. +++ b/core/modules/system/src/Tests/Entity/EntityDefinitionUpdateTest.php
    @@ -604,4 +605,25 @@ public function testEntityTypeSchemaUpdateAndRevisionableBaseFieldCreateWithoutD
    +    $this->assertFalse($this->entityDefinitionUpdateManager->applyEntityUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_CREATED, 'foo'), 'Calling applyEntityUpdate() with an non-existent entity returns FALSE.');
    

    Minor: All five of these assertions say "an non-existent".

  5. +++ b/core/modules/system/src/Tests/Entity/EntityDefinitionUpdateTest.php
    @@ -604,4 +605,25 @@ public function testEntityTypeSchemaUpdateAndRevisionableBaseFieldCreateWithoutD
    +    $this->assertFalse($this->entityDefinitionUpdateManager->applyEntityUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_CREATED, 'entity_test_update'), 'Calling applyEntityUpdate() with an nonsense $op returns FALSE.');
    +    $this->assertFalse($this->entityDefinitionUpdateManager->applyFieldUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_DELETED, 'entity_test_update', 'new_base_field'), 'Calling applyFieldUpdate() with an nonsense $op returns FALSE.');
    

    It's unclear just by reading the test why these are "nonsense ops". I am guessing because the entity was not created in the pending update and the field was not deleted?

  6. +++ b/core/modules/system/src/Tests/Entity/Update/SqlContentEntityStorageSchemaIndexTest.php
    @@ -30,13 +35,30 @@ public function setUp() {
         $this->assertTrue(db_index_exists('node_field_data', 'node__default_langcode'), 'Index node__default_langcode exists prior to running updates.');
    ...
    +    $this->assertTrue(\Drupal::state()->get('update_order_test_update_8001', FALSE), 'update_order_test_update_8001 runs before entity type updates.');
    
    +++ b/core/modules/system/tests/modules/update_order_test/update_order_test.install
    @@ -0,0 +1,40 @@
    +    \Drupal::state()->set('update_order_test_update_8001', db_index_exists('node_field_data', 'node__default_langcode'));
    

    This state flag and the assertion message with it seem kind of confusing and maybe could use an inline comment or something. I think part of the problem is that the state key name does not describe what its contents actually are. See the quoted context line for an assertion message that actually explains what's going on.

  7. +++ b/core/modules/system/src/Tests/Entity/Update/SqlContentEntityStorageSchemaIndexTest.php
    @@ -30,13 +35,30 @@ public function setUp() {
    +    // Ensure that hook_update_N implementations are in the expected order.
    ...
    +    // Node update has been run during update_order_test_update_8002.
    ...
    +    // Ensure that the base field created by update_order_test_update_8002 is
    ...
    +    // User update has not run during update_order_test_update_8002.
    

    Minor: more missing parens.

  8. +++ b/core/modules/system/tests/modules/update_order_test/update_order_test.install
    @@ -0,0 +1,40 @@
    +/**
    + * Only declare the update hooks once the test is running.
    + */
    +if (\Drupal::state()->get('update_order_test', FALSE)) {
    
    +++ b/core/modules/system/tests/modules/update_order_test/update_order_test.module
    @@ -0,0 +1,27 @@
    + * Only declare the new entity base field once the test is running.
    + */
    +if (\Drupal::state()->get('update_order_test', FALSE)) {
    

    This is some magical s@#$ with conditionally declaring the hook implementations, but it seems okay to make the test work.

  9. +++ b/core/modules/system/tests/modules/update_order_test/update_order_test.install
    @@ -0,0 +1,40 @@
    +  function update_order_test_update_8002() {
    +    \Drupal::state()->set('update_order_test_update_8002_update_order_test_before', db_field_exists('node_field_data', 'update_order_test'));
    +    if (\Drupal::service('entity.definition_update_manager')->applyFieldUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_CREATED, 'node', 'update_order_test')) {
    +      \Drupal::state()->set('update_order_test_update_8002_update_order_test_after', db_field_exists('node_field_data', 'update_order_test'));
    +    }
    +
    +    if (\Drupal::service('entity.definition_update_manager')->applyEntityUpdate(EntityDefinitionUpdateManagerInterface::DEFINITION_UPDATED, 'node')) {
    +      // Node update has run.
    +      \Drupal::state()->set('update_order_test_update_8002_node__default_langcode', db_index_exists('node_field_data', 'node__default_langcode'));
    +      \Drupal::state()->set('update_order_test_update_8002_node__id__default_langcode__langcode', db_index_exists('node_field_data', 'node__id__default_langcode__langcode'));
    +      // User update has not run.
    +      \Drupal::state()->set('update_order_test_update_8002_user__id__default_langcode__langcode', db_index_exists('users_field_data', 'user__id__default_langcode__langcode'));
    

    This one could actually use some inline comments clarifying as well; it took me awhile to puzzle out.

  10. +++ b/core/profiles/minimal/src/Tests/MinimalTest.php
    @@ -34,5 +34,10 @@ function testMinimal() {
    +
    +    // Ensure that there are no pending updates after installation.
    +    $this->drupalLogin($this->rootUser);
    +    $this->drupalGet('update.php/selection');
    +    $this->assertText('No pending updates.');
    
    +++ b/core/profiles/standard/src/Tests/StandardTest.php
    @@ -152,6 +152,11 @@ function testStandard() {
    +    // Ensure that there are no pending updates after installation.
    +    $this->drupalLogin($this->rootUser);
    +    $this->drupalGet('update.php/selection');
    +    $this->assertText('No pending updates.');
    

    +1 for adding these assertions.

I'll try to improve some of the above.

Also, I was initially confused by the CR: https://www.drupal.org/node/2535396 since the title doesn't really seem to describe the change introduced by this patch. It's amazing that we never had a CR at all for the original beta blocker, but I searched for awhile and couldn't find one. The only one is the original CR for the addition of automatic schemata for entities: https://www.drupal.org/node/2259243 I'll see if I can add anything to the CR for this issue to spare others the confusion of "huh, that changed a long time ago". (Edit: I didn't do that last bit.)

xjm’s picture

Assigned: xjm » Unassigned
FileSize
28.49 KB
15.24 KB

Okay, that was unexpectedly brain-bending. I've added some detailed documentation to the test that should hopefully aid future generations.

SqlContentEntityStorageSchemaIndexTest is probably not named correctly anymore; it should be something like SqlContentEntityStorageSchemaUpdateTest. But it didn't seem worth adding noise to the patch for that.

Leaving at RTBC because it's just docs and assertion text updates.

xjm’s picture

Title: Allow hook_update_N implementations to run before the automated entity updates » Allow hook_update_N() implementations to run before the automated entity updates

cuz

xjm’s picture

Shouldn't the CR https://www.drupal.org/node/2535396 include examples of the API added by this patch as well? I.e. the "during" case (I think).

effulgentsia’s picture

Re #53, yeah, the CR was written in #14, but the patch completely changed approach in #33, so let's get that updated. Keeping at RTBC though for maximum exposure of the patch prior to it being committed.

alexpott’s picture

Nice updates @xjm. I've updated the CR

Re #50.2 emma disagrees

plach’s picture

Assigned: Unassigned » catch

This looks ready to me, RTBC +1. Assigning to @catch since this was discussed with him at length.

Can be fixed on commit:

+++ b/core/modules/system/src/Tests/Entity/Update/SqlContentEntityStorageSchemaIndexTest.php
@@ -32,32 +32,50 @@ public function setUp() {
+    // 4. update_order_test_update_8002() explicitly applies the updates for
+    //    the update_order_test_field storage. See update_order_test.module.
+    // 5. update_order_test_update_8002() explicitly applies the updates for
+    //    the node entity type indices listed above.

+++ b/core/modules/system/tests/modules/update_order_test/update_order_test.install
@@ -23,16 +27,25 @@ function update_order_test_update_8001() {
+
+      // User updates have not yet run. Check and store whether the updated
+      // user indices now exist.

These do not wrap at column 80 correctly.

catch’s picture

Status: Reviewed & tested by the community » Fixed

I looked at plach's comment length to fix on commit, but I think it might have been missing a space because I got to 81 chars, so left those as is.

I think we have more to do here, but apart from #2464427: Replace CacheablePluginInterface with CacheableDependencyInterface we're missing concrete use-cases, so this feels as far as we can go for now.

Committed/pushed to 8.0.x, thanks!

  • catch committed 4f60dd5 on 8.0.x
    Issue #2535082 by alexpott, jhedstrom, xjm, plach, Fabianx, effulgentsia...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

divined’s picture

The same error on 8.2:

  $ids = [
    'contact_message.field_1',
    'contact_message.field_2',
  ];

  if (!$field_storage_configs = \Drupal::entityManager()->getStorage('field_storage_config')->loadMultiple($ids)) {
    return;
  }

  foreach ($field_storage_configs as $field_storage) {
    $new_field_storage = $field_storage->toArray();
    $new_field_storage['settings']['max_length'] = 50;
    $new_field_storage = FieldStorageConfig::create($new_field_storage);
    $new_field_storage->original = $new_field_storage;
    $new_field_storage->enforceIsNew(FALSE);

    $new_field_storage->save();
  }

Failed: The SQL storage cannot change the schema for an existing field (field_1 in contact_message entity) with data.