Problem/Motivation

When updating the storage definition of an identifier field (e.g. 'id' or 'revision_id'), the primary keys are not updated properly in the SQL schema.

This was the original scope of #2841291: [PP-1] Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field but that has expanded and changed scope significantly so that this now splits it into a separate issue (again).

Proposed resolution

Add primary keys when adding a field that is part of the primary key.

Members fund testing for the Drupal project. Drupal Association Learn more

Comments

tstoeckler created an issue. See original summary.

The last submitted patch, entity-primary-key-2--tests-only.patch, failed testing. View results

amateescu’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Field/FieldStorageDefinitionListener.php
    @@ -90,6 +90,14 @@ public function onFieldStorageDefinitionCreate(FieldStorageDefinitionInterface $
    +      // If the field being created is the ID field make sure the ID key is
    +      // registered correctly.
    +      $id_key = $this->entityTypeManager->getDefinition($entity_type_id)->getKey('id');
    +      if (!$last_installed_entity_type->hasKey('id') && ($storage_definition->getName() === $id_key)) {
    +        $keys = $last_installed_entity_type->getKeys();
    +        $keys['id'] = $id_key;
    +        $last_installed_entity_type->set('entity_keys', $keys);
    +      }
    

    Why is this needed? I ran the new test coverage manually and it passes without this change.

  2. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
    @@ -109,6 +113,32 @@ public function testEntitySchemaUpdate() {
    +  public function testPrimaryKeyUpdate() {
    

    We should also test all the other shared tables of a translatable and revisionable entity type, so we have to cover the id and revision entity keys on the:

    - base_table
    - data_table
    - revision_table
    - revision_data_table

  3. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
    @@ -109,6 +113,32 @@ public function testEntitySchemaUpdate() {
    +    $field = BaseFieldDefinition::create('string')
    

    Is there any reason for creating this ID field as string instead of integer? I think it is important to create it as integer because that maps to a serial (auto-incremented) field in the database.

tstoeckler’s picture

Yay, thanks for taking a look at this. I assume this means you are not opposed to tackling this in a separate issue, which is great news!

1.

See #2488258-89: Ignore: Patch testing issue where I checked whether splitting this is feasible. What happens is that various ContactStorageTests install an ID field to the contact message entity which generally doesn't have an ID and, thus, also do not have an id entity key set. And since we explicitly use the last installed entity type during the update calling $entity_type->getKey('id') when building the schema fails.

Note that the patch I linked to there still has the "drop-primary-key" part which I thought was necessary but turns out not to be necessary after all. I don't think that's related to this, though. I can try out another patch with that removed and the hunk you mentioned removed, as well, if you want to be 100% sure.

2.

Sure, that makes total sense. Will tackle that in my next round here (but could be a while, maybe on the weekend...).

3.

Good point! My original intention for the text was to not to uninstall and re-install the field storage definition explicitly but to update it (which then ends up being the same thing, if there's no data, as you know) and then I would have needed some way to assert that the update actually happened, which I wanted to do by checking if the column type was changed from serial to varchar. That turned to be both rather complex and (I think ?!) unnecessary, which is why I then wrote the test as it is now.

So unless I'm missing something else, you're absolutely right, and we should switch (back) to the standard integer ID field.

Thanks for the awesome review.

amateescu’s picture

Oh, yes, I'm definitely in favor of splitting that monster from #2841291: [PP-1] Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field into smaller chunks that actually have a chance at proper reviews :)

For #4.1: yup, let's try a patch without that hunk because I don't think the code which creates a new field storage definition should have anything to do with setting an entity key. At best, it's those contact tests which should be fixed to also add the entity key if they're adding an ID field.

tstoeckler’s picture

Re #5: Awesome! ;-)

Regarding the contact stuff: The tests themselves arent't actually doing anything, it's actually contact_storage_test_entity_base_field_info() that's adding the field. And surely enough contact_storage_test_entity_type_alter() does add the ID key (I guess that should be done in *_build(), not *_alter() - strictly speaking - but ignoring that for now). The problem is that we use the last installed installed entity type at this point and that doesn't have that key - and rightly so. In any case, will upload a patch without that to see who's right ;-)

hchonov’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Database/AssertSchemaTrait.php
@@ -0,0 +1,54 @@
+    $db_type = Database::getConnection()->databaseType();
...
+        $result = Database::getConnection()->query("SHOW KEYS FROM {" . $table_name . "} WHERE Key_name = 'PRIMARY'")->fetchAllAssoc('Column_name');
...
+        $result = Database::getConnection()->query("SELECT a.attname, format_type(a.atttypid, a.atttypmod) AS data_type
...
+        $schema_object = Database::getConnection()->schema();

Not really important, but the connection object could be stored in a variable, as it is retrieved 4 times.

tstoeckler’s picture

OK, here we go. This fixes 2. and 3. and I am providing a patch to (hopefully) prove that the hunk mentioned in 1. is actually necessary.

The last submitted patch, 8: 2938951-7--contact-fail.patch, failed testing. View results

Status: Needs review » Needs work

The last submitted patch, 8: 2938951-7.patch, failed testing. View results

amateescu’s picture

Ok, so the contact stuff does fail without that change, but I hope you agree that we need to fix that in contact_storage_test_entity_type_alter() rather than \Drupal\Core\Field\FieldStorageDefinitionListener::onFieldStorageDefinitionCreate() :)

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
1.88 KB
15.53 KB

Hah, thanks for asking for that test coverage. The test was green for me locally, but before creating the patch I noticed I had the "drop-primary-key" part still applied (which I then removed before creating the patch). So now I know why I introduced that in the first place! We now have dedicated test coverage proving that it - or at least something like it - is necessary. Yay! That said, feel free to come up with something less ugly to make the tests pass. I have to admit that I don't fully grok that part of the problem, including why the hunk fixes it, off the top of my head. I know that I spent quite some time debugging that, but I don't remember the exact details.

Also, the additional contact storage failures showed up for the first patch in #8 like I expected proving that hunk to be necessary.

So here's a patch with the test coverage from #8 and the FieldStorageDefinitionListener hunk + the "drop-primary-key" hunk that I cooked up somewhere in the depth of #2841291: [PP-1] Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field.

Oops:
X-post with #7 and #11.

Re #7: Thanks. This is only copied, but I do think it makes sense to fix these things, since we are effectively introducing new code here. Will fix in the next iteration

Re #11: Can you elaborate on what you think the fix should be? I tried to explain in #6 why I think the fix is correct, so as of now, I in fact do not agree ;-) There's a good chance you're right (in fact in this area of the code I'd bet on you rather than me, if I had to put money on it ;-)), but I don't really understand what you're getting at specifically.

amateescu’s picture

What I was getting at is that FieldStorageDefinitionListener::onFieldStorageDefinitionCreate() does not (and should not) have anything to do with entity keys, it's the responsibility of the module which alters an entity type definition to inform the definition update manager of its changes.

Here's what I think the correct fix should be :)

tstoeckler’s picture

Wow, didn't know that you could implement hook_module_preinstall() for your own module, very nice.

I'm actually not sure I agree with you conceptually. For reasons that are unknown to me we have chosen to handle creation of module-provided fields automatically in ModuleInstaller. So if we have made that "just work" for normal fields, I think it should also "just work" for ID fields. So while introducing knowledge about entity keys to FieldStorageDefinitionListener is not super nice, it's "fairer" IMO than making Contact Storage Test account for this on its own. On the other hand, I find it rather bizarre that we support entity types without an ID key in the first place, so I'm not too excited about the whole Contact Storage Test thing in the first place. At the end of the day, I don't really feel strongly either way. So I'm fine with this, assuming it's green.

tstoeckler’s picture

Awesome. So @amateescu do you have any thoughts on the "drop-primary-key" part? Other than that I think this should be ready to fly.

amateescu’s picture

Sorry, took me a while to re-wrap my head around this code, but now that I managed to visualize it again, here's my thoughts on the "drop-primary-key" part:

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1439,7 +1452,24 @@ protected function deleteSharedTableSchema(FieldStorageDefinitionInterface $stor
    -          // Drop indexes and unique keys first.
    +          // Drop the primary key, indexes and unique keys first.
    

    I don't think this is the right place to do this. It works in the current patch because in the new test you are testing an entity type without any data in its base tables, so you are basically testing the first if branch from \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::updateSharedTableSchema().

    We need the same tests with a "filled" state, which would also go through the top-level else branch of \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::updateSharedTableSchema(), which, in turn, leads to the fun scenario of having to deal with getSharedTableFieldSchema() :)

    I guess this basically means we can not split this patch from #2929120: [PP-1] The field schema is missing field-level primary keys, unique keys and indexes, unless we only want to deal with empty data tables in this issue.. which is not a very realistic scenario since most (all?) updates on these fields will be done on non-empty tables..

  2. +++ b/core/tests/Drupal/KernelTests/Core/Entity/EntitySchemaTest.php
    @@ -110,6 +115,145 @@ public function testEntitySchemaUpdate() {
    +    // Then uninstall the field and make sure all primary keys that the field
    +    // is part of have been removed.
    

    At first read, this comment sounded really wrong, but after giving it more than a minute's thought it actually makes sense, since a composite primary key does not make sense anymore once you remove one of its fields.

tstoeckler’s picture

Thanks for the input!!!

Will explain a bit more later, but right now just pressing Save before my battery dies...

tstoeckler’s picture

OK, here we go.

I agree totally with your assessment that we should test and fix both cases. Thanks for pointing that out. In all the mess of these different issues, I had completely lost track of that not-so-minor detail.

In the test I explicitly used the explicit ->delete*() and ->install*() calls to be able to test that the primary key is gone in between. When calling just ->update*() we can only assert that the primary keys are there after the update and we have to assume that they are there again after having been deleted. At least I couldn't think of any way to assert that.

But, you are 100% correct, that we should test that, so I amended the test to also test the ->update* code path now. The first interdiff then only introduces that without adding any data, because that actually revealed that the test code was using an incorrect field name for the revision ID field. Therefore I also split that out into a separate interdiff (as I didn't know at the time whether the "combined" patch would be green).

Then I added test code that saves an entity and then performs the update again. Surely enough, I hit #2928906: [PP-1] The field schema incorrectly stores serial fields as int so I had to manully add entity IDs and revision IDs. I'd really like to not re-merge those, though. And like you anticipated I hit #2929120: [PP-1] The field schema is missing field-level primary keys, unique keys and indexes again. I worked around the dependency, though, similarly to what I did here and in #2841291: [PP-1] Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field. Once this is RTBC I will tackle #2929120: [PP-1] The field schema is missing field-level primary keys, unique keys and indexes next, though, so hopefully those workarounds won't be that long-lived. By the way, can I take your mention of that issue as a sign that you generally agree with the direction of that issue? That would be rather sweet.

So yeah. I'm pretty happy that the patch is green, so for me personally this would RTBC material, as it does fix a nasty bug (proven by the tests), but I won't fight you if you would like me to merge one of the other issues back in.

amateescu’s picture

I just read through the patch from #2929120: [PP-1] The field schema is missing field-level primary keys, unique keys and indexes and it's pretty clear for me now why you prefer to tackle them separately. If we would merge them, the result is going to end up with a very hard to review patch, much like how #2841291: [PP-1] Fix NOT NULL handling in the entity storage and 'primary key' changes when updating the storage definition of an identifier field ended up back in the day :)

So, yes, I agree with doing this as a first step. Just a couple of minor nits before RTBC:

  1. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1439,7 +1452,8 @@ protected function deleteSharedTableSchema(FieldStorageDefinitionInterface $stor
    +          $this->dropPrimaryKey($table_name, $column_names);
               if (!empty($schema['indexes'])) {
    

    Needs an empty line here :)

  2. +++ b/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php
    @@ -1625,7 +1639,10 @@ protected function updateSharedTableSchema(FieldStorageDefinitionInterface $stor
    +            // Drop the primary key, indexes and unique keys first.
    +            $this->dropPrimaryKey($table_name, $column_names);
    +
    +            // Drop original primary keys, indexes and unique keys.
    

    The comment is duplicated at this point, so we need to get rid of the second one, and transfer the "original" part to the first comment.

tstoeckler’s picture

Title: The primary key is not correctly re-created when updating the storage definition of an identifier field » The primary key is not correctly re-created when updating the storage definition of an identifier field
FileSize
978 bytes
20.73 KB

So, yes, I agree with doing this as a first step.

That is great news! I do still feel that with all these issues I am treading in less shallow waters than I am generally comfortable with, so your validation really of the issues' direction and how they play together does mean a lot. Yay!

Fixed the comments in #19.

tstoeckler’s picture

Title: The primary key is not correctly re-created when updating the storage definition of an identifier field » The primary key is not correctly re-created when updating the storage definition of an identifier field

Status: Needs review » Needs work

The last submitted patch, 20: 2938951-20.patch, failed testing. View results

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
1.35 KB

Interdiffs are hard.

tstoeckler’s picture

Ahhhhh. No, don't test it. OMG, this is embarassing...

Wanted to retest it anyway...

The last submitted patch, 23: 2938951-17-20-interdiff-next-try.patch, failed testing. View results

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

Let's do this! :)

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/modules/contact/tests/modules/contact_storage_test/contact_storage_test.install
@@ -6,18 +6,24 @@
+function contact_storage_test_module_preinstall($module) {
+  // This test module alters the definition of the 'contact_message' entity type
+  // so we need to inform the system about our changes before additional field
+  // storage definitions are added by the module installer.
   // @see contact_storage_test_entity_type_alter()
-  $original = clone $entity_type;
-  $original->setStorageClass('Drupal\Core\Entity\ContentEntityNullStorage');
+  // @see contact_storage_test_entity_base_field_info
+  if ($module === 'contact_storage_test') {

I did know know that module's could react to their own pre-installation. Funny and kinda cool. Is it not possible that this change will require similar changes in contrib modules that do similar things to contact_storage_test? (For example, https://www.drupal.org/project/contact_storage). Will they break? The contact_storage module's test fail after this change on core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorageSchema.php:1268 with Undefined offset: 0

tstoeckler’s picture

Re #27: Can you take a look at #13 and #14? I think that @amateescu's point would be that contact_storage currently alters the entity type incorrectly which is exposed by this issue and it should be updated in the same way that contact_storage_test is in this patch.

Alternatively, like I argue in #14 we could consider it a bug that it's currently impossible to provide an ID field for a different entity type without the workaround employed in #13. Reverse-applying the interdiff in #13 and then also #2476637: Remove obsolete setting of provider on the message ID field the Contact Storage test passes.

Like I said before, I don't feel strongly either way, because all of this is pretty weird to me. If we didn't have to account for BC, I would be arguing that having an entity type without an ID field is a bug in the first place, but that's not possible at this point anymore...

alexpott’s picture

Yeah but we do have to account for BC so we shouldn't be breaking modules like contact_storage.

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
4.76 KB
19.13 KB

OK, here's a proposal: This is fully backwards-compatible (as far as I know, at least Contact Storage's (contrib) and Contact Storage Test's (core) tests pass with this) and it also doesn't introduce knowledge about ID keys to FieldStorageDefinitionListener. It's not necessarily pretty; it adds a workaround for the symptom of this problem in SqlContentEntityStorageSchema - along with a lengthy comment. Please feel free to shorten/improve the comment.

Thoughts?

tstoeckler’s picture

Ha, since this workaround is only needed in getEntitySchema() (and not when building the field schema), I think we should actually be able to remove this workaround in #2929120: [PP-1] The field schema is missing field-level primary keys, unique keys and indexes as we then shouldn't be needing to build the entity schema during a field update/install/delete. If that is the case, I think that's a pretty strong case for #30. I will verify that - assuming #30 is green - the assumption is correct by working on a patch for #2929120: [PP-1] The field schema is missing field-level primary keys, unique keys and indexes that builds on #30 here.

Status: Needs review » Needs work

The last submitted patch, 30: 2938951-31.patch, failed testing. View results

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
1.76 KB
20.14 KB

OK, that's encouraging. Fixed the unit test and added a @todo about removing the workaround per #31. Will try out if that's actually true.

tstoeckler’s picture

OK, so #21 over there proves my theory: Because processIdentifierSchema() is only called to build the entity schema, not the field schema and that issue alleviates the need for the entity schema in various places, we no longer need that workaround. That's encouraging...

...except I just realized it's actually not encouraging at all because the fact that processIdentifierSchema() is only called to build the entity schema is a bug in its own right, which is fixed in #2928906: [PP-1] The field schema incorrectly stores serial fields as int. So that will force us to keep that workaround around.

So yeah, would love some more input by @amateescu and @alexpott at this point. I'm fine to go with whatever approach you two can agree on ;-)