Problem/Motivation

When updating the storage definition of an identifier field (e.g. 'id' or 'revision_id'), the primary keys are not updated properly in the SQL schema.

This was the original scope of #2841291: Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field but that has expanded and changed scope significantly so that this now splits it into a separate issue (again).

Proposed resolution

Add primary keys when adding a field that is part of the primary key.

CommentFileSizeAuthor
#71 2938951-71.patch12.39 KBtstoeckler
#71 2938951-65-71-interdiff.txt7.05 KBtstoeckler
#65 2938951-65.patch12.17 KBtstoeckler
#65 2938951-61-65-interdiff.txt1.81 KBtstoeckler
#61 2938951-61.patch12.1 KBtstoeckler
#61 2938951-61--tests-only.patch7.99 KBtstoeckler
#61 2938951-57-61-interdiff.txt864 byteststoeckler
#57 2938951-57--for-testbot.patch31.23 KBtstoeckler
#57 2938951-57--tests-only.patch28.04 KBtstoeckler
#57 2938951-57--for-review.patch11.94 KBtstoeckler
#57 2938951-56-57--interdiff.txt2.21 KBtstoeckler
#56 2938951-56--for-humans.patch11.8 KBtstoeckler
#56 2938951-54-56--interdiff.txt7.54 KBtstoeckler
#54 2938951-54.patch17.79 KBtstoeckler
#54 2938951-53-54-interdiff.txt1.06 KBtstoeckler
#53 2938951-53.patch17.78 KBtstoeckler
#53 2938951-52-53-interdiff.txt1.32 KBtstoeckler
#52 2938951-52.patch16.45 KBtstoeckler
#52 2938951-45-52-interdiff.txt2.95 KBtstoeckler
#45 interdiff-45.txt2.57 KBamateescu
#45 2938951-45.patch16.24 KBamateescu
#43 interdiff-43.txt1.79 KBamateescu
#43 2938951-43.patch16.42 KBamateescu
#40 interdiff-40.txt10.18 KBamateescu
#40 2938951-40.patch14.94 KBamateescu
#40 2938951-40-test-only.patch12.13 KBamateescu
#37 2938951-37.patch20.15 KBtstoeckler
#33 2938951-33.patch20.14 KBtstoeckler
#33 2938951-31-33-interdiff.txt1.76 KBtstoeckler
#30 2938951-31.patch19.13 KBtstoeckler
#30 2938951-24-30-interdiff.txt4.76 KBtstoeckler
#24 2938951-20.patch20.73 KBtstoeckler
#23 2938951-17-20-interdiff-next-try.patch1.35 KBtstoeckler
#20 2938951-20.patch20.73 KBtstoeckler
#20 2938951-17-20-interdiff.txt978 byteststoeckler
#17 2938951-17.patch20.79 KBtstoeckler
#17 2938951-17-update-fixes-interdiff.txt5.25 KBtstoeckler
#17 2938951-17-more-tests-interdiff.txt3.07 KBtstoeckler
#17 2938951-17-fixes-7-interdiff.txt2.02 KBtstoeckler
#13 interdiff-13.txt4.15 KBamateescu
#13 2938951-13.patch17.19 KBamateescu
#12 2938951-11.patch15.53 KBtstoeckler
#12 2938951-8-11-interdiff.txt1.88 KBtstoeckler
#8 2938951-7.patch13.97 KBtstoeckler
#8 2938951-7--contact-fail.patch12.79 KBtstoeckler
#8 2938951-0-7-interdiff.txt6.49 KBtstoeckler
entity-primary-key-3.patch9.65 KBtstoeckler
entity-primary-key-2--tests-only.patch6.55 KBtstoeckler
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tstoeckler created an issue. See original summary.

The last submitted patch, entity-primary-key-2--tests-only.patch, failed testing. View results

amateescu’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Field/FieldStorageDefinitionListener.php
    @@ -90,6 +90,14 @@ public function onFieldStorageDefinitionCreate(FieldStorageDefinitionInterface $
    +      // If the field being created is the ID field make sure the ID key is
    +      // registered correctly.
    +      $id_key = $this->entityTypeManager->getDefinition($entity_type_id)->getKey('id');
    +      if (!$last_installed_entity_type->hasKey('id') && ($storage_definition->getName() === $id_key)) {
    +        $keys = $last_installed_entity_type->getKeys();
    +        $keys['id'] = $id_key;
    +        $last_installed_entity_type->set('entity_keys', $keys);
    +      }
    

    Why is this needed? I ran the new test coverage manually and it passes without this change.

  2. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
    @@ -109,6 +113,32 @@ public function testEntitySchemaUpdate() {
    +  public function testPrimaryKeyUpdate() {
    

    We should also test all the other shared tables of a translatable and revisionable entity type, so we have to cover the id and revision entity keys on the:

    - base_table
    - data_table
    - revision_table
    - revision_data_table

  3. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
    @@ -109,6 +113,32 @@ public function testEntitySchemaUpdate() {
    +    $field = BaseFieldDefinition::create('string')
    

    Is there any reason for creating this ID field as string instead of integer? I think it is important to create it as integer because that maps to a serial (auto-incremented) field in the database.

tstoeckler’s picture

Yay, thanks for taking a look at this. I assume this means you are not opposed to tackling this in a separate issue, which is great news!

1.

See #2488258-89: [ignore] Patch testing issue where I checked whether splitting this is feasible. What happens is that various ContactStorageTests install an ID field to the contact message entity which generally doesn't have an ID and, thus, also do not have an id entity key set. And since we explicitly use the last installed entity type during the update calling $entity_type->getKey('id') when building the schema fails.

Note that the patch I linked to there still has the "drop-primary-key" part which I thought was necessary but turns out not to be necessary after all. I don't think that's related to this, though. I can try out another patch with that removed and the hunk you mentioned removed, as well, if you want to be 100% sure.

2.

Sure, that makes total sense. Will tackle that in my next round here (but could be a while, maybe on the weekend...).

3.

Good point! My original intention for the text was to not to uninstall and re-install the field storage definition explicitly but to update it (which then ends up being the same thing, if there's no data, as you know) and then I would have needed some way to assert that the update actually happened, which I wanted to do by checking if the column type was changed from serial to varchar. That turned to be both rather complex and (I think ?!) unnecessary, which is why I then wrote the test as it is now.

So unless I'm missing something else, you're absolutely right, and we should switch (back) to the standard integer ID field.

Thanks for the awesome review.

amateescu’s picture

Oh, yes, I'm definitely in favor of splitting that monster from #2841291: Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field into smaller chunks that actually have a chance at proper reviews :)

For #4.1: yup, let's try a patch without that hunk because I don't think the code which creates a new field storage definition should have anything to do with setting an entity key. At best, it's those contact tests which should be fixed to also add the entity key if they're adding an ID field.

tstoeckler’s picture

Re #5: Awesome! ;-)

Regarding the contact stuff: The tests themselves arent't actually doing anything, it's actually contact_storage_test_entity_base_field_info() that's adding the field. And surely enough contact_storage_test_entity_type_alter() does add the ID key (I guess that should be done in *_build(), not *_alter() - strictly speaking - but ignoring that for now). The problem is that we use the last installed installed entity type at this point and that doesn't have that key - and rightly so. In any case, will upload a patch without that to see who's right ;-)

hchonov’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Database/AssertSchemaTrait.php
@@ -0,0 +1,54 @@
+    $db_type = Database::getConnection()->databaseType();
...
+        $result = Database::getConnection()->query("SHOW KEYS FROM {" . $table_name . "} WHERE Key_name = 'PRIMARY'")->fetchAllAssoc('Column_name');
...
+        $result = Database::getConnection()->query("SELECT a.attname, format_type(a.atttypid, a.atttypmod) AS data_type
...
+        $schema_object = Database::getConnection()->schema();

Not really important, but the connection object could be stored in a variable, as it is retrieved 4 times.

tstoeckler’s picture

OK, here we go. This fixes 2. and 3. and I am providing a patch to (hopefully) prove that the hunk mentioned in 1. is actually necessary.

The last submitted patch, 8: 2938951-7--contact-fail.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 8: 2938951-7.patch, failed testing. View results

amateescu’s picture

Ok, so the contact stuff does fail without that change, but I hope you agree that we need to fix that in contact_storage_test_entity_type_alter() rather than \Drupal\Core\Field\FieldStorageDefinitionListener::onFieldStorageDefinitionCreate() :)

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
1.88 KB
15.53 KB

Hah, thanks for asking for that test coverage. The test was green for me locally, but before creating the patch I noticed I had the "drop-primary-key" part still applied (which I then removed before creating the patch). So now I know why I introduced that in the first place! We now have dedicated test coverage proving that it - or at least something like it - is necessary. Yay! That said, feel free to come up with something less ugly to make the tests pass. I have to admit that I don't fully grok that part of the problem, including why the hunk fixes it, off the top of my head. I know that I spent quite some time debugging that, but I don't remember the exact details.

Also, the additional contact storage failures showed up for the first patch in #8 like I expected proving that hunk to be necessary.

So here's a patch with the test coverage from #8 and the FieldStorageDefinitionListener hunk + the "drop-primary-key" hunk that I cooked up somewhere in the depth of #2841291: Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field.

Oops:
X-post with #7 and #11.

Re #7: Thanks. This is only copied, but I do think it makes sense to fix these things, since we are effectively introducing new code here. Will fix in the next iteration

Re #11: Can you elaborate on what you think the fix should be? I tried to explain in #6 why I think the fix is correct, so as of now, I in fact do not agree ;-) There's a good chance you're right (in fact in this area of the code I'd bet on you rather than me, if I had to put money on it ;-)), but I don't really understand what you're getting at specifically.

amateescu’s picture

What I was getting at is that FieldStorageDefinitionListener::onFieldStorageDefinitionCreate() does not (and should not) have anything to do with entity keys, it's the responsibility of the module which alters an entity type definition to inform the definition update manager of its changes.

Here's what I think the correct fix should be :)

tstoeckler’s picture

Wow, didn't know that you could implement hook_module_preinstall() for your own module, very nice.

I'm actually not sure I agree with you conceptually. For reasons that are unknown to me we have chosen to handle creation of module-provided fields automatically in ModuleInstaller. So if we have made that "just work" for normal fields, I think it should also "just work" for ID fields. So while introducing knowledge about entity keys to FieldStorageDefinitionListener is not super nice, it's "fairer" IMO than making Contact Storage Test account for this on its own. On the other hand, I find it rather bizarre that we support entity types without an ID key in the first place, so I'm not too excited about the whole Contact Storage Test thing in the first place. At the end of the day, I don't really feel strongly either way. So I'm fine with this, assuming it's green.

tstoeckler’s picture

Awesome. So @amateescu do you have any thoughts on the "drop-primary-key" part? Other than that I think this should be ready to fly.

amateescu’s picture

Sorry, took me a while to re-wrap my head around this code, but now that I managed to visualize it again, here's my thoughts on the "drop-primary-key" part:

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1439,7 +1452,24 @@ protected function deleteSharedTableSchema(FieldStorageDefinitionInterface $stor
    -          // Drop indexes and unique keys first.
    +          // Drop the primary key, indexes and unique keys first.
    

    I don't think this is the right place to do this. It works in the current patch because in the new test you are testing an entity type without any data in its base tables, so you are basically testing the first if branch from \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::updateSharedTableSchema().

    We need the same tests with a "filled" state, which would also go through the top-level else branch of \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::updateSharedTableSchema(), which, in turn, leads to the fun scenario of having to deal with getSharedTableFieldSchema() :)

    I guess this basically means we can not split this patch from #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes, unless we only want to deal with empty data tables in this issue.. which is not a very realistic scenario since most (all?) updates on these fields will be done on non-empty tables..

  2. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
    @@ -110,6 +115,145 @@ public function testEntitySchemaUpdate() {
    +    // Then uninstall the field and make sure all primary keys that the field
    +    // is part of have been removed.
    

    At first read, this comment sounded really wrong, but after giving it more than a minute's thought it actually makes sense, since a composite primary key does not make sense anymore once you remove one of its fields.

tstoeckler’s picture

Thanks for the input!!!

Will explain a bit more later, but right now just pressing Save before my battery dies...

tstoeckler’s picture

OK, here we go.

I agree totally with your assessment that we should test and fix both cases. Thanks for pointing that out. In all the mess of these different issues, I had completely lost track of that not-so-minor detail.

In the test I explicitly used the explicit ->delete*() and ->install*() calls to be able to test that the primary key is gone in between. When calling just ->update*() we can only assert that the primary keys are there after the update and we have to assume that they are there again after having been deleted. At least I couldn't think of any way to assert that.

But, you are 100% correct, that we should test that, so I amended the test to also test the ->update* code path now. The first interdiff then only introduces that without adding any data, because that actually revealed that the test code was using an incorrect field name for the revision ID field. Therefore I also split that out into a separate interdiff (as I didn't know at the time whether the "combined" patch would be green).

Then I added test code that saves an entity and then performs the update again. Surely enough, I hit #2928906: The field schema incorrectly stores serial fields as int so I had to manully add entity IDs and revision IDs. I'd really like to not re-merge those, though. And like you anticipated I hit #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes again. I worked around the dependency, though, similarly to what I did here and in #2841291: Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field. Once this is RTBC I will tackle #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes next, though, so hopefully those workarounds won't be that long-lived. By the way, can I take your mention of that issue as a sign that you generally agree with the direction of that issue? That would be rather sweet.

So yeah. I'm pretty happy that the patch is green, so for me personally this would RTBC material, as it does fix a nasty bug (proven by the tests), but I won't fight you if you would like me to merge one of the other issues back in.

amateescu’s picture

I just read through the patch from #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes and it's pretty clear for me now why you prefer to tackle them separately. If we would merge them, the result is going to end up with a very hard to review patch, much like how #2841291: Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field ended up back in the day :)

So, yes, I agree with doing this as a first step. Just a couple of minor nits before RTBC:

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1439,7 +1452,8 @@ protected function deleteSharedTableSchema(FieldStorageDefinitionInterface $stor
    +          $this->dropPrimaryKey($table_name, $column_names);
               if (!empty($schema['indexes'])) {
    

    Needs an empty line here :)

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1625,7 +1639,10 @@ protected function updateSharedTableSchema(FieldStorageDefinitionInterface $stor
    +            // Drop the primary key, indexes and unique keys first.
    +            $this->dropPrimaryKey($table_name, $column_names);
    +
    +            // Drop original primary keys, indexes and unique keys.
    

    The comment is duplicated at this point, so we need to get rid of the second one, and transfer the "original" part to the first comment.

tstoeckler’s picture

Title: The primary key is not correctly re-created when updating the storage definition of an identifier field » The primary key is not correctly re-created when updating the storage definition of an identifier field
FileSize
978 bytes
20.73 KB

So, yes, I agree with doing this as a first step.

That is great news! I do still feel that with all these issues I am treading in less shallow waters than I am generally comfortable with, so your validation really of the issues' direction and how they play together does mean a lot. Yay!

Fixed the comments in #19.

tstoeckler’s picture

Title: The primary key is not correctly re-created when updating the storage definition of an identifier field » The primary key is not correctly re-created when updating the storage definition of an identifier field

Status: Needs review » Needs work

The last submitted patch, 20: 2938951-20.patch, failed testing. View results

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
1.35 KB

Interdiffs are hard.

tstoeckler’s picture

Ahhhhh. No, don't test it. OMG, this is embarassing...

Wanted to retest it anyway...

The last submitted patch, 23: 2938951-17-20-interdiff-next-try.patch, failed testing. View results

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

Let's do this! :)

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/modules/contact/tests/modules/contact_storage_test/contact_storage_test.install
@@ -6,18 +6,24 @@
+function contact_storage_test_module_preinstall($module) {
+  // This test module alters the definition of the 'contact_message' entity type
+  // so we need to inform the system about our changes before additional field
+  // storage definitions are added by the module installer.
   // @see contact_storage_test_entity_type_alter()
-  $original = clone $entity_type;
-  $original->setStorageClass('Drupal\Core\Entity\ContentEntityNullStorage');
+  // @see contact_storage_test_entity_base_field_info
+  if ($module === 'contact_storage_test') {

I did know know that module's could react to their own pre-installation. Funny and kinda cool. Is it not possible that this change will require similar changes in contrib modules that do similar things to contact_storage_test? (For example, https://www.drupal.org/project/contact_storage). Will they break? The contact_storage module's test fail after this change on core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php:1268 with Undefined offset: 0

tstoeckler’s picture

Re #27: Can you take a look at #13 and #14? I think that @amateescu's point would be that contact_storage currently alters the entity type incorrectly which is exposed by this issue and it should be updated in the same way that contact_storage_test is in this patch.

Alternatively, like I argue in #14 we could consider it a bug that it's currently impossible to provide an ID field for a different entity type without the workaround employed in #13. Reverse-applying the interdiff in #13 and then also #2476637: Remove obsolete setting of provider on the message ID field the Contact Storage test passes.

Like I said before, I don't feel strongly either way, because all of this is pretty weird to me. If we didn't have to account for BC, I would be arguing that having an entity type without an ID field is a bug in the first place, but that's not possible at this point anymore...

alexpott’s picture

Yeah but we do have to account for BC so we shouldn't be breaking modules like contact_storage.

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
4.76 KB
19.13 KB

OK, here's a proposal: This is fully backwards-compatible (as far as I know, at least Contact Storage's (contrib) and Contact Storage Test's (core) tests pass with this) and it also doesn't introduce knowledge about ID keys to FieldStorageDefinitionListener. It's not necessarily pretty; it adds a workaround for the symptom of this problem in SqlContentEntityStorageSchema - along with a lengthy comment. Please feel free to shorten/improve the comment.

Thoughts?

tstoeckler’s picture

Ha, since this workaround is only needed in getEntitySchema() (and not when building the field schema), I think we should actually be able to remove this workaround in #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes as we then shouldn't be needing to build the entity schema during a field update/install/delete. If that is the case, I think that's a pretty strong case for #30. I will verify that - assuming #30 is green - the assumption is correct by working on a patch for #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes that builds on #30 here.

Status: Needs review » Needs work

The last submitted patch, 30: 2938951-31.patch, failed testing. View results

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
1.76 KB
20.14 KB

OK, that's encouraging. Fixed the unit test and added a @todo about removing the workaround per #31. Will try out if that's actually true.

tstoeckler’s picture

OK, so #21 over there proves my theory: Because processIdentifierSchema() is only called to build the entity schema, not the field schema and that issue alleviates the need for the entity schema in various places, we no longer need that workaround. That's encouraging...

...except I just realized it's actually not encouraging at all because the fact that processIdentifierSchema() is only called to build the entity schema is a bug in its own right, which is fixed in #2928906: The field schema incorrectly stores serial fields as int. So that will force us to keep that workaround around.

So yeah, would love some more input by @amateescu and @alexpott at this point. I'm fine to go with whatever approach you two can agree on ;-)

Berdir’s picture

Yeah, I would also prefer to fix this in a way that does not require changes in contact_storage. Agree that it would be a more correct and cleaner fix and it's yet again in the grey zone between incorrectly implementing API's, bugfixes and internal changes. contact_storage has 15k installs and even though new installations slowed down since webform became what it is now, it's still a pretty common module.

We're already introducing too many BC breaks that we don't know about and getting plenty of complaints and backlash about that, lets do our best to avoid them when we can :)

I'm not sure what that exactly means, my understanding is that the approach in the last two patches would just work until we tackle the next issue in the long chain/pool of storage issues, so I guess that would mean to do what @tstoeckler wrote in #28?

tstoeckler’s picture

my understanding is that the approach in the last two patches would just work until we tackle the next issue in the long chain/pool of storage issues

No, actually it will work fine. I just thought that we can remove the ugly workaround in ::processIdentifierSchema() when we tackle the next set of issues, but then I realized that that's not true, so presumably we have to keep this around forever.

So maybe @amateescu can comment on the latest patch. As far as I can tell both @alexpott and @Berdir agree with the approach in the latest patch because it is more backwards-compatible.

tstoeckler’s picture

Status: Needs review » Needs work

The last submitted patch, 37: 2938951-37.patch, failed testing. View results

tstoeckler’s picture

Version: 8.5.x-dev » 8.6.x-dev
Status: Needs work » Needs review
amateescu’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
@@ -109,6 +114,163 @@ public function testEntitySchemaUpdate() {
+    // Then uninstall the field and make sure all primary keys that the field
+    // is part of have been removed.

This part of the patch has been bugging me since my initial review from #16, point 2, so today I had another look at my latest patch from #2841291: Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field, which didn't need the "drop primary key" part added here, and I finally figured it out:

The test scenarios you wrote which assert that the primary key is dropped if one if its fields is uninstalled can not happen in practice! If you are uninstalling any of the id / revision / langcode fields without the intention of adding them back in the same update function, this can only mean three things:

- removing the 'id' field: your entity type is not stored in the database anymore, so it won't have any tables
- removing the 'revision_id' field: your entity type is not revisionable anymore, so it won't have any revision or revision_data tables (which were exactly the ones with an "empty" primary key in your test)
- removing the 'langcode' field: your entity type is not translatable anymore, so it won't have any data or revision_data tables, same as above

On the other hand, if you are updating one of those three fields, that means you are un-installing and re-installing them, which should work properly. It's just the in-between state that doesn't make sense, therefore it can not be tested.

Since I had to prove my theory above by updating the test and then the code, here's an updated patch as well as a test-only one.

The last submitted patch, 40: 2938951-40-test-only.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 40: 2938951-40.patch, failed testing. View results

amateescu’s picture

Status: Needs work » Needs review
FileSize
16.42 KB
1.79 KB

We need to bring back the check in SqlContentEntityStorageSchema::processBaseTable() in order to fix the contact storage test.

I also included the lengthy comment, but I'm not really sure it's accurate. The change is pretty straightforward: if an entity type doesn't have an id key, there's nothing to process, but I'd rather wait for @tstoeckler's approval before removing it :)

tstoeckler’s picture

Thanks for working on this, it's great to not only have some eyes but also some helping hands as this issue is still a fair bit above my paygrade... ;-) I have to admit that I don't remember all the details that brought me to write the patch in the way that I did, but ...

On the other hand, if you are updating one of those three fields, that means you are un-installing and re-installing them, which should work properly.

... I'm fairly sure that the reason I introduced those primary key shenannigans was that there was a test failure in exactly that code path: There was an SQL exception when updating a field about an illegal primary key. It's absolutely possible, however, that "that means we have to delete and recreate the primary key" was an incorrect conclusion from that.

In fact, the tests seem to be on your side as there is no such SQL exception among the fails.

The only fail is the contact issue which seems to be exactly what was discussed above and seems to be distinct from the primary key issue. So hopefully we can get that issue resolved in some way as well and then this will finally be ready.

Thanks for your persistence on removing that code!!!

Crosspost with #43: Awesome that you agree with that. Yeah sure, let's remove that comment. I think I still had "No ID key is a bug" in my mind when I wrote that, and while that's still my theoretical opinion, it's not actually reality unfortunately ;-)

amateescu’s picture

Yeah.. I also had to dig pretty deep down the memory lane to remember the problems I encountered when writing the original patch :)

Ok then, let's replace that comment with something less wordy. I also made the same change to processRevisionTable() to save some poor soul banging their head on the desk over the same discussion that we had here. And fixed the documentation of those two methods while I was there.. :/

tstoeckler’s picture

Yay, looks perfect! Thanks - again - a lot!! This certainly has my blessings but I guess I can't RTBC - even though most of the code that I contributed on top of your original code just got ripped out again ;-)

tstoeckler’s picture

Hmm... with #45 and the patch from #2871036: [PP-1] Fix Content Moderation and Workspace tests and mark the 'workflow', 'moderation_state' and 'workspace' base fields as required again I now get the error again that I was talking about in #44:

SQLSTATE[42000]: Syntax error or access violation: 1072 Key column 'id' doesn't exist in table: ALTER TABLE {block_content_field_data} DROP `id`; Array ( )

This is in the update tests, when the ID field of custom blocks is being updated by being deleted and then re-added again. So I guess there is still some weirdness that at the very least I haven't grokked.

However, since the patch in its current form allows 2 of 3 issues it blocks to proceed and comes with some nice test coverage I would still endorse getting this in as first step to unblock further progress in this area.

tstoeckler’s picture

Status: Needs review » Reviewed & tested by the community

OK, finally pinpointed the issue in #47 to #2974722: Fix dropping of columns in a composite primary key on MariaDB >=10.2.8 and make it consistent across database drivers. So we're good here.

Actually going ahead and marking this RTBC now. Pretty much the only code I actually wrote here is the test coverage and that has been approved by @amateescu above.

tstoeckler’s picture

Status: Reviewed & tested by the community » Needs work

#2974722: Fix dropping of columns in a composite primary key on MariaDB >=10.2.8 and make it consistent across database drivers revealed that we should add more assertions about the primary keys after fields have been removed. Will work on that.

amateescu’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
@@ -214,33 +222,28 @@ protected function assertPrimaryKeys(EntityTypeInterface $entity_type, $removed_
-        $this->assertPrimaryKeyColumns($table, []);

@tstoeckler, as explained in #40, the only assertion that was removed was actually invalid.

tstoeckler’s picture

Well it was valid before #40 because we were explicitly dropping the primary key. But instead of dropping the assertion completely we should add an assertion that actually asserts the correct primary keys after removing a field.

tstoeckler’s picture

OK, this brings back the assertions. This is green locally together with the patch I'm about to upload in #2974722: Fix dropping of columns in a composite primary key on MariaDB >=10.2.8 and make it consistent across database drivers. So I can only assume this will be green on Drupal.org.

This also adds some sorting to AssertSchemaTrait::assertPrimaryKeyColumns() to (hopefully) make the tests pass on PostgreSQL and SQLite.

I now think, though, that we may want to postpone this on #2881522: Add a Schema::findPrimaryKeyColumns method to remove database specific logic from test as it will alleviate the need for introducing AssertSchemaTrait in the first place.

tstoeckler’s picture

So apparently PostgreSQL handles the primary keys differently: it just drops them. Here's a patch which puts a compatbility layer in place to recreate them.

Notes:

  1. I'm pretty sure we don't want to fix issues in the database layer here. Not sure if we want a separate issue or if I should roll that into #2974722: Fix dropping of columns in a composite primary key on MariaDB >=10.2.8 and make it consistent across database drivers. In any case we will probably want dedicated test coverage and we also probably want to wait for #2881522: Add a Schema::findPrimaryKeyColumns method to remove database specific logic from test with that one, if only for the test coverage.
  2. I think the MySQL behavior is actually the odd one here. It's not at all guaranteed that changing the primary key constraint to apply to a subset of columns leaves the table in a valid state. We currently don't support that, but if I have two nodes with primary keys (1, en) and (2, en) and I want to recreate the ID field then the primary key will just be on langcode in the meantime and will be broken. So I think dropping the key (like PostgreSQL does automatically and MariaDB requires you to do manually) is actually more sensible. So maybe we should adapt the MySQL driver to drop the key (and not recreate it) instead?
tstoeckler’s picture

Title: The primary key is not correctly re-created when updating the storage definition of an identifier field » [PP-2] The primary key is not correctly re-created when updating the storage definition of an identifier field
Status: Needs review » Needs work

Will re-roll this, after I am done with #2974722: Fix dropping of columns in a composite primary key on MariaDB >=10.2.8 and make it consistent across database drivers and officially marking postponed on that.

I really tried hard to keep this unpostponed, but my luck failed me this time... ;-)

tstoeckler’s picture

tstoeckler’s picture

Alright, it took a couple more rounds over there than I had hoped, but I think we're pretty close to an RTBC over there so doing a full patch-set here now. Because we changed the approach with regards to primary key in accordance with #53.2, I had to update EntitySchemaTest::testPrimaryKeyUpdate() that is being introduced here.

Will send for a run on all three database drivers just to make sure, but the driver-specific stuff should no longer be relevant here, so we should be seeing the same three fails on all three drivers in the test-only patch. The test also passes locally on MariaDB >10.2.8.

The last submitted patch, 57: 2938951-57--tests-only.patch, failed testing. View results

amateescu’s picture

Status: Needs review » Needs work

Like I mentioned before, I don't really agree with the interdiff from #52 because it's bringing back the situations described in #40, which are impossible to replicate in real-world scenarios.

For example, if you remove the id key/field from an entity type, that means the entity type is not stored anymore so it doesn't have any tables anymore, let alone primary keys on those tables.

However, if you remove it with the intention of adding back in the same update function, e.g. when you are updating the field storage definition, it makes perfect sense to test that the primary keys are correct at the end of the process.

amateescu’s picture

Title: [PP-2] The primary key is not correctly re-created when updating the storage definition of an identifier field » The primary key is not correctly re-created when updating the storage definition of an identifier field
tstoeckler’s picture

Here's a re-roll for the change in the test to account for findPrimaryKeyColumns() being protected now. Thanks everyone for the awesome recent progress!

(Only sending the tests-only patch for a test on MySQL, as there really isn't anything driver-specific here anymore. The full patch I'm sending for all three runs just to be extra-extra sure not to break anything even though we've consistently verified above that it really shouldn't be necessary, strictly speaking.)

amateescu’s picture

Sorry if you're already typing a reply for it, but.. don't forget about #59 :)

The last submitted patch, 61: 2938951-61--tests-only.patch, failed testing. View results

tstoeckler’s picture

No, actually missed that, thanks for pointing it out.

Well, I feel pretty strongly, that we should add those assertions, as they were in fact necessary to trigger all the bugs/inconsistencies over in #2974722: Fix dropping of columns in a composite primary key on MariaDB >=10.2.8 and make it consistent across database drivers. Those are now explicitly tested with the tests over there, but it still feels unnecessarily incomplete. While you are correct that there are no practical use-cases that will result in e.g. a data table without a primary key and without an ID field, we are still running that code so I see no reason why we shouldn't test that it has the expected side effects - especially when it's quite trivial to do so.

So, escalating this to the Framework Manager level. The concrete question here is:

The current patch includes test coverage for the primary keys at the "in-between" state of updating a field storage definition after it has been deleted and before it has been recreated. This state will never be an actual "end" state after an update, as e.g. there will never be a data table if there is no ID field. @amateescu finds testing for this behavior incorrect precisely because it does not reflect reality and @tstoeckler finds it correct to test this for completeness.

(@amateescu feel free to clarify, of course, in case the statement above does not represent your views accurately)

tstoeckler’s picture

Alright, I still disagree with this, but I would really love to get this in at some point. So here's an interdiff realtive to #61 to remove the test assertions that @amateescu finds problematic plus the resulting patch. That way the patch that @amateescu would like to see go in is the last one, so that it can be RTBCed. If the committers instead want to side with me I will re-upload #61.

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

Thanks, @tstoeckler! I was also wondering about the status of this issue a few days ago and I was sad when I saw that it was still blocked on our disagreement :/

Anyway, if you're fine with proceeding with #65, let's see what core committers think about it.

FWIW, the second to last paragraph from #64 is an accurate description of my concerns.

alexpott’s picture

In reviewing this I tried to work out what

+++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
@@ -1333,11 +1333,23 @@ protected function createSharedTableSchema(FieldStorageDefinitionInterface $stor
               // Check if the field exists because it might already have been
               // created as part of the earlier entity type update event.

might mean for this change. If this happened do we still need to check the primary key?

Also is \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::updateSharedTableSchema() dead code?
Nope it's called by $this->{$operation . 'SharedTableSchema'}($storage_definition, $original);

tstoeckler’s picture

Re #67: Very good question indeed! git blame shows that this was introduced in #2398689: Follow-up: Applying entity schema updates still fails when both field and entity type definitions changes. It can only happen when using EntityDefinitionUpdateManager::applyUpdates() to e.g. make an entity type revisionable because in that case when SqlContentEntityStorageSchema::onEntityTypeUpdate() re-creates the entity type it already uses the "live" field storage definitions so that it will already include e.g. the revision ID field when it creates the shared table schema. Then when SqlContentEntityStorageSchema::onFieldStorageDefinitionCreate() wants to create the revision ID field, the schema already exists in the database. In this case, the entity schema will already include the primary key, though, so in terms of this issue we are good, I guess.

If we do get around to removing EntityDefinitionUpdateManager::applyUpdates() we might be able to remove this again, although over in #2976035: Entity type CRUD operations must use the last installed entity type and field storage definitions we found out that we in fact will need to add the capability to add a field storage definition and update the entity type "in one go", namely when adding a field that is an entity key (or revision metadata key). @amateescu: if you're latest patches in this area are any indication I would bet that you probably will find a way to remove this ugliness in the end ;-)

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
@@ -109,6 +111,171 @@ public function testEntitySchemaUpdate() {
+    $update_manager->uninstallFieldStorageDefinition($field);
...
+    // Now test updating a field with data.
...
+   *   (optional) The name of a field that has been removed. If passed, any
+   *   primary key that would otherwise have contained this field will be
+   *   checked to have been dropped.

I think this comment is now incorrect. There's no checking going on.

I'm unsure how I feel about the dropped assertions. I think asserting that the primary key no longer existing after dropping the field is useful to prove what we're testing. Ie. the recreation of the primary key. But OTOH there's no database that'll let you have a primary key on a field that doesn't exist so /shrug. I guess we have to hope the uninstallFieldStorageDefinition definitely works - but I guess there must be other tests for this.

I think what might be a better test is something like this:
1. Get all the primary keys before the uninstall.
2. Uninstall the field
3. Check that the primary keys no do not equal the original primary keys.
4. Re-install the field.
5. Check the primary keys now equals the original value.

One thing that's good about this approach is that it gets rid of the conditionality in assertPrimaryKeys() as it would become something that gets the primary keys for all the entity tables.

@amateescu what do you think about that?

amateescu’s picture

@alexpott, I think that's a good idea! Asserting the primary keys for all tables as a whole instead of asserting them per-table is something I can get behind :)

tstoeckler’s picture

Alright, so if I understood the proposal correctly, then this is what we are talking about? If so, then this is fine by me, sounds like a fair compromise.

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

That looks great to me, a nice compromise indeed :)

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 71: 2938951-71.patch, failed testing. View results

jibran’s picture

Status: Needs work » Reviewed & tested by the community
alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed 383c6ee and pushed to 8.7.x. Thanks!

  • alexpott committed 383c6ee on 8.7.x
    Issue #2938951 by tstoeckler, amateescu, alexpott: The primary key is...
jibran’s picture

What is the reason we are not backporting this bug fix?

tstoeckler’s picture

Yay, thanks so much!! This is great, and unblocks lots of progress elsewhere.

alexpott’s picture

@jibran I've asked @catch and @larowlan for a second opinion cause deep underlying entity storage changes need as many eyes as possible. I'm +1 on backporting.

catch’s picture

This looks backportable to me.

  • alexpott committed ed5f441 on 8.6.x
    Issue #2938951 by tstoeckler, amateescu, alexpott: The primary key is...
alexpott’s picture

Version: 8.7.x-dev » 8.6.x-dev

Backported to 8.6.x

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.