Problem/Motivation
Currently, only a few entity keys are automatically marked as NOT NULL, which is a problem because it gives the impression that entity keys are somehow special and leads to bugs like #2824851: EntityResource::patch() makes an incorrect assumption about entity keys, hence results in incorrect behavior.
Also, developers have to know about this limitation of entity API and override \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::getSharedTableFieldSchema()
in order to mark fields as NOT NULL.
An additional (major) problem found is that when we are updating the storage definition of an identifier field (e.g. 'id' or 'revision_id'), the primary keys are not updated properly in the SQL schema.
This patch also uncovered a another bug:
- The entity storage schema stores incorrect field schema data for the identifier fields: both the 'id' and 'revision_id' fields are of type int
in their base tables, when they should be serial
instead
Proposed resolution
Since we can now mark field storages as required (#2390495: Support marking field storage definitions as required) we can also add the NOT NULL database constraint automatically. This will also improve the performance of indexes that are using those required fields.
Pass the new 'primary key' definition when re-adding the identifier field in \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::createSharedTableSchema()
.
Fix the field storage schema data for the identifier fields for all entity types.
Remaining tasks
Review.
User interface changes
Nope.
API changes
The 'not_null' parameter of SqlContentEntityStorageSchema::addSharedTableFieldIndex() will become deprecated.
Instead, you should set \Drupal\Core\Field\BaseFieldDefinition::setStorageRequired() on the respective field.
Data model changes
Nope.
Comment | File | Size | Author |
---|---|---|---|
#238 | 2841291-238-8.9.x.patch | 52.38 KB | dhirendra.mishra |
#238 | interdiff_234-238.txt | 1.52 KB | dhirendra.mishra |
#231 | interdiff-229-231-review.txt | 3.56 KB | gease |
#231 | 2841291-231-review.txt | 52.44 KB | gease |
#231 | 2841291-231-combined.patch | 62.36 KB | gease |
Issue fork drupal-2841291
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFingers crossed :)
Comment #5
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAnd this is where things get ugly: we need to update the last installed schema definition of the 'id' field for all entity types, but
\Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::hasColumnChanges()
stands in our way.Here's one possible solution but I'm open to better ideas :)
Comment #7
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe-rolled for 8.2.x.
Comment #8
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedMissed a newline in system.install.
Comment #11
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOf course, 'revision_id' has the same problem in the revision table.
Comment #12
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSide note: this patch doesn't apply cleanly to the 8.3.x branch at the moment because of #2842480: system_update_8203() is named incorrectly.
Comment #14
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed the last one!
Comment #16
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedComment #17
BerdirBut some additional test coverage for the Schema.php change and also the update function would be great.
Didn't review too closely yet, need to actually apply the patch to better understand those changed/removed method calls.
You mentioned that this is fixing multiple related problems, I'm OK with keeping it together but we should mention all the problems and fixes in the issue summary.
Comment #18
tstoecklerCan you elaborate why this is necessary?
If it is, it seems the process*Table() methods are pointless and should be removed? Or should we just move them further down in getSchema()?
Comment #19
BerdirComment #20
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAdded test coverage for all the bugs that were found in this issue and also added them to the issue summary.
@tstoeckler, that change is needed because the
process*Table()
methods are only called when processing the full entity schema (i.e. what's stored in\Drupal::keyValue('entity.storage_schema.sql')->get('node.entity_schema_data')
), but we also need to process the identifiers when processing the per-field storage schema (i.e. what's stored in\Drupal::keyValue('entity.storage_schema.sql')->get('node.field_schema_data.nid')
).Edit: The test-only patch is also the interdiff ;)
Comment #21
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAnd here's the patch for 8.3.x which just solves a context change in
system.install
.Comment #24
tstoecklerRe #20: Ahh thanks. That makes sense, but that also means that
process*Table()
are fairly pointless and arguably problematic, right? Not your fault though, of course.Comment #25
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedHmm, I wouldn't say pointless, there might be use-cases for processing the shared tables as a whole. And also I don't think we can remove them now, even if they are protected, because some subclasses of
SqlContentEntityStorageSchema
might rely on them..The test fail for #21 was a random one, so back to NR.
Comment #26
tstoecklerYeah, true.
Looked through the changes in
SqlContentEntityStorageSchema
and those look good to me, but didn't review the entire patch and I can't RTBC the DB-layer changes anyway.Comment #27
dawehnerI'm still wondering whether this is the actually the case. It is some kind of limitation which is not strictly needed in SQL itself. Maybe its some limitation we accept for entity API?
Let's use strict equal:
===
. Note: We coulddo (!(... conditionn)
I guess this would be a 8.3.x only change now ...
Looking at the test I'm wondering, don't we need some test that actually saving/loading those entities continues to work? Changing schema is tricky conceptually, isn't it?
In case you are bored, change it to
atLeastOnce()
, as you change this line already. Exacting on 11 calls is just ridicilous :)Comment #28
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks for the review!
#27.1 and .4: These were *very* helpful in uncovering the fact that we were indeed changing the schema incorrectly, because DBTNG did not support updating composite primary keys. Now fixed and tested in the new patch.
#27.2, 3 and 5: Fixed :)
Comment #29
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed PostgreSQL and SQLite. This should be ready for another review now :)
Comment #30
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI think this needs to be tested on a filled database as well. And since it will probably not be committed to 8.2.x, changed the update function to be 8.3.x only.
Comment #31
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedFixed and updated some docs and also explained the third (!) DBTNG bug that's being fixed here.
Comment #32
timmillwoodThis should be 8302
Apart from this nit pick it looks good.
Comment #33
BerdirIn case we care about keeping this < 80 characters, we can easily replace "for every entity type's ID and.." with a simple "for all ID and .."
Other than that and the nitpick above, this looks good to me as well. Great work on all those fixes and the test coverage.
Comment #34
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedDone!
Comment #35
BerdirLets do this! :)
Comment #36
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI just found that this issue can not be separated from #2850686: Handle NOT NULL correctly at the storage level for required fields, because the update function introduced here will blow up once the code changes from that issue are applied. It's weird that the tests didn't fail over there for the combined patch, so I queued a few more to see what's up.
In the meantime, let's wait a bit with this one.
Comment #37
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThe test fail for php 7 from #2850686-2: Handle NOT NULL correctly at the storage level for required fields confirmed my suspicion above, so we have to bring that patch in the scope of this issue.
The reason is that the update function from the current patch here relies on the code from HEAD which makes entity keys NOT NULL in the storage, but the other patch changes that part of the code and only marks required fields as NOT NULL.
Here's the combined patch from #2850686-5: Handle NOT NULL correctly at the storage level for required fields, and the interdiff to #34. I also merged the other issue's summary into this one.
Comment #38
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedForgot to bring the tag as well.
Comment #39
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedComment #40
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSmall self-review, apparently I forget to apply the interdiff from #34 to the combined patch.
Comment #41
hchonov2841291-40.patch system_update_8302 is failing when running on our system for a custom entity type with the Exception:
This is a required text_long base field and in the field_data table there is no column text, but the columns text__format and text__value. I guess there is some logic error trying to retrieve the column text not considering that there is no such column but split into two columns.
Comment #42
cspitzlayThe exception reported in #41 could be solved by reordering our update-hooks so the status field is made part of the entity keys first (like it is done for the comment module via system_update_dependencies()).
Comment #43
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@hchonov, @cspitzlay, do you think it's worth trying to make
system_update_8302()
run last at all times? In that case, maybe it should be a post_update hook instead.Comment #44
catchCan this be removed once the update has run? If so should it have a @todo/followup to remove the logic in 8.5.x?
Should we set a default of 900 somewhere instead of requiring this to be passed in? Seems like it could break any programmatic creation of feeds that's not setting refresh. Or is this something that should never happen outside tests?
Why editing the existing update? This should never run again.
Comment should ensure it runs before I think? That's also an easier example for contrib to copy if it needs to. Also updates need renumbering to 84**
Comment #45
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks for taking a look!
#44.1: That depends whether we want to allow people to upgrade directly from 2 or more prior minor versions, e.g. from 8.3.0 straight to 8.5.0, skipping 8.4.0.
#44.2. Technically this shouldn't happen if the user validates the entity before saving, as that will trigger the 'required' validation error for the
refresh
base field, but not everyone does that :/ We could set a default value of '900' for that field if we want to babysit code that doesn't do any validation before saving.#44.3: Similar to 1), if we want to allow people to update from 8.0.0 to 8.4.0 directly, we need to change the existing update function. Since I haven't seen any real obstacles so far that would prevent us from supporting updates over multiple minor versions, I think we should try to keep it that way. But maybe this needs a bigger policy discussion.
#44.4: Good point, fixed :)
Comment #47
cspitzlayRegarding #43: I will talk to tstoeckler about this next week and
we will get back to you on that.
Comment #48
catch#44.1 - hmm OK but should we add a note to remove in 9.x, possibly trigger_error('...', E_USER_DEPRECATED) in that code path?
So #44.3 is purely a bugfix to the existing update, yeah that's fine to keep. We don't explicitly support going from 8.0.x to 8.3.x, but we don't prevent people trying to either.
#44.2 my guess is that would be the default case for custom code that's creating feeds (install profiles and similar things), so we should probably babysit - but we could also deprecate the babysitting and trigger_error() with E_USER_DEPRECATED maybe?
Comment #49
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAdded a note to remove that code in 9.0.x. I think trigger_error() should only be used when developers actually have a way to fix/update their code in order to not trigger the error anymore, but that's not the case here.
Also added a default value for the 'refresh' field of aggregator feeds, I agree that it makes sense :)
Comment #50
tstoecklerRe #43: Since the update doesn't actually modify any database schema or such I think it would be safe to turn it into a post-update instead and, thus, would also make the update path safer for people with custom entity types. Marking needs work for that.
Comment #51
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe #50: I started to do that locally but then I realized that
hook_post_update_NAME()
does not have a dependency resolution system likehook_update_N()
, and it's very likely that converting the current update function to a post_update one will conflict with post update functions that are converting entity types to be revisionable, because this one needs to run first.So, I'm sorry to say but #43/#50 is not really possible, this needs to remain a hook_update_N() implementation :/
In the meantime, here's a reroll.
Comment #52
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOops, the patch from #51 contains the work started for the post_update change, this should be better.
Comment #54
timmillwoodI can't see any issues with this.
Comment #55
jibranNW for the renaming of node_update_8301.
8401
Do we need a separate test for this?
Do we need a test for this update hook?
Shouldn't it be 8401 and in 8.4.x updates group?
8401 now.
s/8.3.x/8.4.x
Comment #56
jibranIt is a major bug. Is it going to be backported to 8.3.x? If yes then update hooks name should 83xx.
Comment #57
tameeshb CreditAttribution: tameeshb at Google Summer of Code commentedMade changes as recommended in #55
Please review! :)
Comment #58
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI don't understand this interdiff at all.. it changes the same file (node.install) twice?
Anyway, I made the changes for #55 myself:
#55.1, 4, 5, 6: Fixed.
#55.2: I don't think so, it's a very simple change.
#55.3: Nope, not doing this update would make all our update tests fail so it's already tested implicitly many times over.
The interdiff attached is for #52.
--
This issue is similar to #2248983: Define the revision metadata base fields in the entity annotation in order for the storage to create them only in the revision table in terms of "big changes" to the entity storage API, so I don't think it should be committed to 8.3.x at this point.
Comment #59
jibranThanks!
Comment #60
tstoecklerI'm not quite content with #51 TBH. I need to play around with this and our live site that this broke a bit more, though, before I can write a proper response. So setting back for now. If I don't post anything by Monday feel free to re-RTBC.
Comment #61
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRerolled for the short array syntax conversion.
Comment #62
timmillwoodRe-RTBC'ing as @tstoeckler suggested in #60.
Comment #64
jofitz CreditAttribution: jofitz at ComputerMinds commentedRe-rolled.
Comment #65
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@Jo Fitzgerald, why do you think this patch needed a re-roll? The one from #61 applies without any error to 8.4.x.
Also, the patch in #64 has a different file size than #61. What else was changed in there?
Comment #66
jofitz CreditAttribution: jofitz at ComputerMinds commentedSorry, I think that was my mistake. I've been switching between 8.3.x and 8.4.x too much and suspect I rolled that patch against the wrong version. Please ignore it.
Comment #68
jofitz CreditAttribution: jofitz at ComputerMinds commentedRe-setting to RTBC - #63 appears to have been a testbot glitch (and then I further confused things with an incorrect re-roll in #64).
Comment #69
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe-uploading the correct patch so it's the last visible one...
Comment #70
catchJust to say I don't think we'd ever convert entities to revisionable in a post-update? Am I missing something?
Comment #71
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@catch, we do actually in #2721313: Upgrade path between revisionable / non-revisionable entities :) The "schema converter" class introduced there is meant to be run in a post-update hook so it can update the definitions and copy the data in a single place.
Comment #72
tstoecklerSo our local errors were actually caused by a separate bug, so I don't want to hold this up anymore.
I haven't reviewed #2721313: Upgrade path between revisionable / non-revisionable entities yet, but my intuition is the same as @catch in #70 that making entities revisionable in a post-update is a bad idea. So I still don't really buy #51 TBH, but I don't feel that strongly about it.
However, I don't see a change notice for this. A lot of (custom and contrib) modules will want to add status fields to their entities just like comment and they will all potentially hit the same update ordering issue as Comment module, so we should add a change notice recommending
hook_update_dependencies()
.Comment #73
tstoecklerSee #2858184: Adding NOT NULL to base fields with multiple columns is broken for the related error.
Comment #74
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@tstoeckler, that's fair so I wrote a change record, hope it's detailed enough :) https://www.drupal.org/node/2858195
Comment #75
tstoecklerWow, that's awesome, thanks a lot!
Comment #76
alexpottDiscussed with @catch, @cilefen, @cottser, @laurii, and @xjm. We agreed that this is a major issue as the entity API is complex enough without requiring developers to know that they need override specific parts to support NOT NULL fields.
This overlaps with other issues for example #2615496: A serial/primary key field can not be added to an existing table for some databases. The issue summary reads like a list of issues that can and should be addressed separately. Is there any reason why they are not?
Comment #77
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedMostly because I'm currently working with a dependency chain of 6 issues (#2820848: Make BlockContent entities publishable) and it's getting really really hard to keep all of them clean and updated :) But one more won't kill me so I moved the DBTNG parts over to #2615496: A serial/primary key field can not be added to an existing table for some databases.
I also had an attempt to further split out this patch in two but it was unsuccessful, see #36/#37.
No interdiff because I only removed the parts that were posted in #2615496-23: A serial/primary key field can not be added to an existing table for some databases.
Comment #78
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedObviously, this will fail without the DBTNG fixes :)
Comment #79
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRerolled for #2860096: Remove api doc groups for updates eg. updates-8.2.x-to-8.3.x and also posting a combined patch with #2615496: A serial/primary key field can not be added to an existing table for some databases.
Comment #80
lokapujyaIn mysql, the field type is still INT, not SERIAL. Right?
Comment #81
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedYes, 'serial' is just a DBTNG specification for an auto incremented INT.
Comment #82
daffie CreditAttribution: daffie commented#2615496: A serial/primary key field can not be added to an existing table for some databases has landed.
Comment #83
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedNice! Re-uploading the patch from #79 which was RTBC before.
Note that I still think this is not backportable to 8.3.x due to the amount of changes in the entity storage. Contrib and custom code will need a few months to update their tests for required fields.
Comment #84
jibranThis was RTBC in #76 after that @amateescu removed #2615496: A serial/primary key field can not be added to an existing table for some databases code and just the rerolled the patch.
Comment #85
hchonovWhy do we mark the revision field in CEB::baseFieldDefinitions as storage required and then pretend it is not storage required in SqlContentEntityStorageSchema ::getSharedTableFieldSchema?
I am not really sure that I understand this correctly but what will happen if the field storage is required and we have two properties - one required and one not? Then in the first loop we set NOT NULL to TRUE and then in the second to FALSE? I guess we should first check if not already set to TRUE, right?
Comment #86
hchonovProbably out of scope of this issue, but why is it even possible that some update hook 84** will be executed before an update hook 83**? Is this a desired/expected behavior?
Comment #87
Berdir83xx and 84xx is just a convention, the only thing that matters is that updates for a specific module are run in order. And the update system doesn't ensure a specific order across multiple modules, that's when you need that hook.
Comment #88
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedRe #85:
1. The comment from
SqlContentEntityStorageSchema ::getSharedTableFieldSchema()
just above the code you quoted describes in detail why we need to make the revision field nullable on the base table :)2. Each required column of a field will get NOT NULL => TRUE if the field storage is required, the foreach loop doesn't influence the previous or the next column.
Comment #89
alexpottHow come we have to do this in this issue?
Scary. How much contrib / custom going to run into this?
The current rtbc patch is not going to be able to do this. Why do we have to make this change here?
This looks weird. Why did we have to make this change here?
Must be an unnecessary change. Looks out-of-scope.
What about sites that have already had this update run on them? Don't we need to ensure langcode is required on them? Yep this is covered by system_update_8401(). How come we're changing this update and not using hook_update_dependencies()?
Not entirely sure that we should be replacing this method here. An entity key is not quite the same as a required field. Also this is a trait - contrib / custom tests could be using it and the breakage seems unnecessary.
So here is where the node langcode will get updated - still strange to go and "fix" the old update. I'm really concerned about contrib and custom entity updates needing changes or hook_update_dependencies() as a result of this change. Do we think it is worth not making the fix so generic. Ie. having entities opt in to the new behaviour and able to upgrade at there own pace. I.e. an update per entity type rather than this catch all? OTOH it is great that the change record tries to tell people what to do.
Comment #90
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThanks for getting to this one, Alex :)
3. This was necessary in a previous iteration of the patch but I have the feeling that it might not be needed anymore. Let's try it out.
4. Will reply at the end :)
5. Because *a lot* of CM test will have to be updated, which would make this patch huge. I will open a followup to investigate and fix them.
6. That's because the logic there is simply broken and not tested. With this patch, we're providing some indirect test coverage. Do you want a followup issue for adding some dedicated test coverage as well?
7. Yeah, it was just an unused variable that I removed. Anyway, reverted for now.
8. It's true that this could be fixed by using
hook_update_dependencies()
but in this particular case, sites that already ran that update function won't be impacted in any way. The change only affects sites upgrading from 8.0.0 to 8.4.0 directly, and I personally don't think that warrants the use ofhook_update_dependencies()
.9. That method name change should be looked at in the context of its (single) usage in core, which has the following comment:
The purpose of the old method was to trigger the addition of a NOT NULL constraint, and, since we're changing entity keys to not be marked as NOT NULL automatically, contrib or custom code tests that might be using that method won't be testing what they thought they would be testing anymore. That's why I think they need to be explicitly aware of this change and update their code/tests.
10. and 4.: If we make the fix "optional" and let each entity type update on their own, I think some contrib and custom entity types will never do it, which will have a big impact on their future support for basically everything worked on by the Workflow Initiative:
- they will not be able to update their 'id' or 'revision_id' base fields (i.e. what this patch fixes generically)
- they will not be able to use the new "initial values" addition for base fields (#2346019: Handle initial values when creating a new field storage definition)
- they will not be able to be converted to be revisionable and publishable
- they will not be able to participate in any current or future developments in the WI "area": Content Moderation, Trash, Workspaces
Given all that, I personally think we should go ahead with the generic fix approach and help with any problems encountered by contrib and custom code on a case by case basis. This is also why the current issue, while being a major bug, should not go into 8.3.x but straight to 8.4.x instead, so people have at least a few months in advance to spot any potential problems.
Comment #91
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedPutting back to RTBC to get committer feedback.
Comment #92
alexpottRe #90
5. yep a followup would be great.
6. yep a followup would be great too.
8. I'm not sure I agree changing an update should be thing of last resort - hook_update_dependencies() is for this exact case
9. Well the problem here is that writing a test that works against 8.3.x and 8.4.x for contrib will be hard - I think we should prioritise easy test coverage. Why not deprecate the method and make it call the new method?
10. Good points - I'd like catch's thoughts on this one too - will ping him.
Comment #93
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@alexpott #92:
5. here it is: #2871036: [PP-1] Fix Content Moderation and Workspace tests and mark the 'workflow', 'moderation_state' and 'workspace' base fields as required again
6. and another one :) #2871026: Add test coverage for \Drupal\migrate\Plugin\migrate\destination\EntityContentBase::processStubRow()
8. actually, this particular problem with the 'langcode' field was fixed by the extra test coverage you requested in #2615496-26: A serial/primary key field can not be added to an existing table for some databases, which led to fixing even more subtle bugs in DBTNG! So changing the existing update function is no longer necessary.
9. sure thing, we can keep the old method and mark it as deprecated :)
Marking as NR for @catch.
Comment #94
catchI'd prefer per-entity updates here. It means when contrib is incompatible with workflow it'll have to be patched/updated separately, but it makes the upgrade path much more predictable.
We've had several issues with the 8.2.x to 8.3.x upgrade path (and especially 8.1.x to 8.3.x) - only one of these was a core bug, but several interactions with contrib updates and a drush bug that took hours/days to track down. So the less variables in the upgrade path the better.
Comment #95
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThe problem with per-entity updates is that it's not only about splitting the update function in X (the number of content entity types in core) but it requires much more work for both core and contrib/custom code and it also comes with additional complexity to Entity API in general:
- we will need a flag in the entity definition which will be used to determine if an entity type should have its entity keys marked as NOT NULL (the current behavior in HEAD) or all of its required fields (the behavior proposed by this patch)
- the entity storage will have to maintain both branches of the code that deals with adding NOT NULL to a column, and switch between the two behaviors based on the flag mentioned above
- each potential contrib/custom entity storage class will have to take care of this separation as well
- many other places will need to know about this flag and allow or disallow some things based on it. A few of them I already mentioned in #90, for example: disallow setting an initial value for a base field, which in turn means disallow adding a 'published' base field, which then means that an entity type can not be converted to publishable, etc. Note that these are just a few examples that I have fresh in my mind because of the immediate work planned for WI.
And all this added complexity for an update function that simply updates the 'id', 'revision_id', 'uuid', 'langcode' and the 'status' base fields. I'm sorry but I don't think it's worth it...
Comment #96
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOTOH, it is true that contrib and custom modules will need at least some test updates to take care of fields that are marked as required but not populated during tests, but, to answer #89.10, it's not their existing update functions that will need to be changed.
I'm really on the fence with this one. Can we get some more opinions from other entity maintainers?
And just to be clear.. I'm not opposed to implementing all the things mentioned in #95, I just want us to be sure that the additional complexity is worth it :)
Comment #97
dixon_This issue was discussed during the Hard Problems meeting at DrupalCon in Baltimore. See notes and decisions here: https://docs.google.com/document/d/1-5B-Gndhklns20NZpGQul10525rFWuYB2NbT...
Comment #98
tstoecklerI just hit this locally on our project as well.
I think this should be part of the change notice, that if you have an entity type that declares a
uid
entity key, you need to decide whether you want to make it storage required and write an update path (either way).The other entity keys (
id
,uuid
, etc.) are handled automatically for everyone callingContentEntityBase::baseFieldDefinitions()
so I can see how we don't need a change notice for that (although might still be nice to mention it) and bysystem_update_8401()
, but for this one, every contrib/custom module with an entity type with auid
key needs to explicitly handle this with an update hook.Thoughts?
Comment #99
timmillwoodSwitching to "needs work" based on discussions in #97 and points from #98.
Comment #100
tstoecklerJust as another data point, I hit a problem simliar to #98 with the contributed Media Entity module, since we're already running with this patch. That particular module won't be a problem obviously, because it's now in core, but I can see other modules, that have been in development for a long time during the D8 cycle that have not gotten around to do
parent::baseFieldDefinitions()
in their ownbaseFieldDefinitions()
method. So I now feel a bit more strongly that a section should be added to the change notice for entity types that don't use the generic base field defintions (for whichever reasons).Comment #101
tstoecklerJust re-read the change notice and found
I was not aware of
isStorageRequired()
falling back toisRequired()
forBaseFieldDefinition
s. That lead me to think this whole thing through again. And since I was already pretty deep down the rabbit hole I went ahead and restructured and expanded the change notice a bit so that - at least to me as a potential contrib/custom module developer - it is more actionable. @amateescu: It would be awesome if you could have a look at it and revert anything that is not correct, or further improve it in any way.However, I do have some more thoughts on the actual patch after reading through it now. At least 2. is a "needs work" to me, unless of course I'm missing something.
Any specific reason for marking them "just" storage required vs. required?
Since I was not clear about this the change notice is also a bit vague about this, at the moment, by the way.
I am not seeing the bundle field being marked as (storage) required, that should definitely happen, though, right?Ahh,the bundle field is being marked as required already in HEAD. Missed that, sorry.
If the others are storage required (but not required), it seems this should too?
Not used.
Sorry if this was asked before already, but is this in scope of the issue? I don't see any explanation nor test coverage, so it's hard to tell.
Edit: 2 is in fact bogus, updated for that. So would be nice to get a response to the above points but seems pretty close to RTBC then to me.
Comment #102
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented@tstoeckler, the updates to the CR look very good and useful, thanks for taking the time to do that!
Unfortunately, the outcome of the discussion that @dixon_ mentioned in #97 did not make it back into the issue, so here it is: this change was deemed to risky at this point in the 8.x cycle so, if we want to go through with it, we will need to provide a BC layer by making the new NOT NULL behavior opt-in (through an entity type definition key or something).
That does not invalidate the current patch/CR, it's just that it will need a lot more work to get it done :/
Re #98:
It's not only about having the 'uid' field in the entity keys, every field that is specified as an entity key and not marked as required (or storage required) yet will need to be handled in an update function.
Re #101:
1. The difference between storage required and required is that a field that is marked as required also makes its widget required in an entity form, while storage required has no influence over the UI at all.
3. That's true, it can be set to storage required but, since this field is not editable through the UI, I simply chose to mark it as required.
5. That was brought up before, see #90.6 :)
Comment #103
tstoecklerRe #102:
Hmm... that is kind of unfortunate. But fairly understandable given the extent of the change notice and the fact that missing something can lead to fatals that are possibly unrecoverable. Are there more specific plans, though? A flag on the entity type seems sensible to me and I could try to cook something up in that direction but not if anything concrete was already discussed.
Yes, exactly. I tried to make that clear in the change notice. I do think there is value in explicitly mentioning all examples from core, though, as a lot of people both in contrib and custom copy-paste.
So these two together don't really make sense to me. None of the other entity keys except for language are editable in the UI either, but we set those to storage required. Thinking about this some more I actually think the latter makes sense because we do care explicitly about the storage, not the UI. So I think it would make sense to make the default langcode storage required, as well.
Comment #104
tstoecklerHere's a re-roll first. Still waiting for more info before going any further with this, but just want to point out that I could make some time for this if we can agree on a direction.
Comment #105
tstoecklerOK, so this currently conflicts with #2346019: Handle initial values when creating a new field storage definition so here's a patch on top of the latest one (#76) from there, since I needed that for our project and it seems that one is going to land a bit sooner than this.
Comment #107
Wim Leers#2346019: Handle initial values when creating a new field storage definition just landed! Rebased #105.
Comment #108
Wim LeersThis blocks #2871036: [PP-1] Fix Content Moderation and Workspace tests and mark the 'workflow', 'moderation_state' and 'workspace' base fields as required again.
Comment #109
Wim LeersDo we want ===?
Unused.
Why are changes like these necessary? How come these tests used to work fine? Are these examples of areas where code used to be not strict enough?
If so, does that mean this increased strictness may cause BC problems for sites updating from 8.3 to 8.4?
Oh, interesting! I explicitly marked that issue as postponed on this issue, and added it to the related issues.
"only required fields", this says. Shouldn't it say "only storage-required fields"?
This is an interesting update path test. If it's clear to Entity API maintainers, then ignore me. But it looks kinda mysterious/confusing to me :)
I don't think clearing static caches explicitly here is necessary?
NOT NULL
behavior.Unassigning catch, AFAICT it's no longer blocked on him.
Comment #110
Wim LeersBecause of its relationship to #2824851: EntityResource::patch() makes an incorrect assumption about entity keys, hence results in incorrect behavior, also tagging .
Comment #111
tstoecklerWorking on the test failures.
Comment #112
tstoecklerI tried to fix the tests, and did find out problem with the current patch (see
*--interdiff-1.txt
).However, actually resolving the failures would have required me to adapt the field definitions in
EntityTestUpdate
. Currently neitherid
norlangcode
are marked as required, and so SQL complains when creating the data table which has(id, langcode)
as the primary key, but neither of those areNOT NULL
. But updating those field definitions is wrong because they are explicitly copied from the state of 8.0.0-rc1. So this fortifies what we kind of already new (and documented in the change notice): entity types written before this change are broken after this change. I guess this is where the discussion in Baltimore already arrived at regardless of this specific test and I think the only solution to this is what #102 already alludes to: we need to make the new behavior opt-in and provide a BC-layer. Thus, not proceeding with any more test "fixes" and instead waiting for a confirmation by ideally @amateescu that proceeding with an opt-in entity type annotation entry is the way to go.I did find one strange issue where
SqlContentEntityStorageSchemaConverterTest::testMakeRevisionableErrorHandling
(which tests that the schema converter doesn't do anything if the update fails) did slightly alter the stored schema of the entity type even though it should not have. Didn't dive into that too deeply, though, because of the aforementioned issue.Re #109: Thanks for the review, some commends (and see
*--interdiff-2.txt
):I think there is not much we can do here about this that would be in scope.) But no, this should not cause BC problems.
setRequired()
orsetStorageRequired()
are sufficient to getNOT NULL
. I adapted the text to say "only required (or storage-required) fields", I hope that works for you.EntityUpdateTest
and, thus, re-runs its test. It just uses a different database dump as a starting point. That is how many update tests work.uid
key from nodes, but it doesn't really hurt to have entity keys lying around even if they are not used, so I would prefer not to discuss that hereComment #113
Wim LeersThanks for addressing #109, your answers to my points make sense.
And more importantly: hopefully we can continue here again soon, this is all tricky stuff clearly, where we need to be very careful to not break BC. Kudos to all of you for fixing these foundational issues — that's very hard!
Comment #114
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedSorry for replying so late here, I'll take a look later today at what would be best to do for this issue :)
Comment #115
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI looked into the test failure and @tstoeckler is spot on with his investigation in #112. My testing also revealed that we should depend on #2860654: Field storage CRUD operations must use the last installed entity type and field storage definitions if we want to be able to update field storage definitions at the same when changing an entity type from non-revisionable to revisionable.
So, to answer the question about what needs to happen here to move forward, I think the only good solution is to have an opt-in entity type definition key which defaults to keeping the current behavior, then have individual update functions for each core entity type that we want to upgrade to the new behavior. Something that should be discussed is what to do when we install a new entity type (e.g. something that's not yet stored in the last installed entity type definitions), do we default to the new behavior for them?
I will be away for about a week so I won't be able to review patches, but I'll try to comment whenever my input is needed :)
Comment #117
kfritscheJust a re-roll, comments from #115 still open.
Comment #119
tstoecklerTrying to get back to this, first off a self-review. I will provide an updated patch with the fixes. I think, even though we have now agreed that we can't do the update wholesale, I think it makes sense to get the patch green first (including the update!), so that we now it works for all core entity types. We can then split that off into separate update functions and optionally tackle those in dedicated follow-up issues.
Above @amateescu mentioned that it's fairly arbitrary whether to mark this "just" required or storage required. Since IMO this is more similar to ID/revision ID/etc. than e.g. 'published', I would prefer storage required here, for consistency.
Ahem, this has since been committed, so I'm not sure this is still going to fly.
This doesn't read well. Borked re-roll?
Strict equality would be nice here, especially for the second one. I don't even want to think about what happens if someone names their field '0' and doesn't provide a revision key...
All these need to be bumped for 8.5 now
Comment #120
tstoecklerHere we go.
Comment #122
tstoecklerOpened #2923915: Stubbing content entities with required fields with a default value is broken for the migration problem that came up a couple times here. I don't think it would fly to just include this here without any test coverage. So I guess this is now postponed on that one, but that is fairly irrelevant at this point as we first need a green patch and then do the refactoring to make the behavior change opt-in as discussed above. Hopefully that one will have landed, by the time this is finished.
Comment #123
tstoecklerOK, this adds an update path dependency for block_content (and re-introduces the one for comment which I mistakenly deleted before). This should fix at least some of the fails.
Comment #125
tstoecklerFixed a couple of the tests but wrangled quite a bit with
SqlContentEntityStorageSchemaConverterTest
, as that revealed a few issues with the actual (i.e. non-test) code. It's green now but in the course of getting there I made some changes that are more invasive that I have liked. I'm going to keep them for now, until this is green, and then consider whether we can revert some of them, split them off into a separate issue or whether we actually have to keep them as is. In any case, my objective here is a green patch first, to validate that the code actually works correctly in all cases.Since I did change a couple thing uploading this to see what else I now broke.
Comment #127
Wim LeersConveying #122 in the issue title. (Doesn't mean we can't continue the NW/NR patch updating cycle.)
More importantly: impressive progress! 😵👏
Comment #128
tstoecklerThis fixes all the remaining tests. Since I did change some runtime code I may have broken a few new ones, though, so not opening the champaign quite yet ;-)
Also, even if this is green, this is not "final-review" ready, see the earlier comments for the broader plan. But as mentioned there I want this to be green to make sure we don't break anything before refactoring the patch.
Re #127, yes, thanks for that. Since you bring it up both of the following issues have patches including tests so could use a review while we sort things out here:
The first one is not a hard blocker for this one (thus, PP-1 is correct) because we don't have the use-case that fixes in core, but it will trip up a lot of contrib modules and people with custom entities once they start working
NOT NULL
constraints per this issue. So would be nice to get that out of the way.Comment #129
tstoecklerOh, and this is now related to #2619328: A required boolean field behaves differently depending on the widget because making the status field properly required breaks the ability to use a checkbox widget for it per that bug, but that is what we do now in core.
Comment #131
tstoecklerOK, this time I'm legitimately crossing my fingers...
Comment #132
tstoecklerOK, so I was able to revert most of the interdiff in #125 while keeping
SqlContentEntityStorageSchemaConverterTest
green. Let's see what else breaks...I still think the field schema changes in #125 make sense (i.e. having all indexes and keys related to a field part of the field schema) but it would be awesome to be able to this in a separate issue.
In the meantime I extracted two issues from this:
I even tagged the first on as "API-first Initiative" so hoping we can quickly make some progress there. Note that that issue includes test coverage and upgrade path test coverage, so should be pretty ready. The same goes for the issues linked in #128 by the way, still hoping for a review there...
Since the latter issue is not in itself a bug and solely exists to bring down the patch size here, only increasing the "PP-" count by one, not two. If this gets to RTBC without the second one going in before, I will just mark that as duplicate.
Comment #133
tstoecklerSelf-review of the patch in the meantime.
This should be 8501 by now.
Oops, meant to remove this. Should be obsolete.
Comment #134
tstoecklerI was able to revert the "serial -> int" as well, with only a small workaround, so will move that to #2928906: The field schema incorrectly stores serial fields as int which I have opened for that and the keys/indexes issue discussed above.
As far as I can tell, all the changes in the patch are now strictly required and related to the NOT NULL behavior of entity fields (except for the issues that this is postponed on, whose patches are included here. Sorry for not providing a for-review patch, by the way...) So assuming this is green the next step is to actually make this behavior opt-in per #115 and related. Note that I probably won't have time to work on this in the next few days, unfortunately...
Comment #136
tstoecklerContactStorageTest
exposed a failure when creating the ID field, as at that time the ID key is not set in the last installed entity type (as previously there was no ID key). I never liked the idea that contact messages do not have an ID field out of the box, but for now I implemented a workaround that allows to handle this case "properly".EntityUpdate(|Filled)Test
were related to the field schema changes, which I reverted above, so they now rightly fail. Removing them here and will include them in the next roll of #2928906: The field schema incorrectly stores serial fields as int.Comment #138
tstoecklerComment #139
Wim LeersGreat progress, @tstoeckler! 👏
I think this primarily needs reviews from Entity/Field/Typed Data experts.
Comment #140
tstoecklerActually, this still needs work per #115 as we decided against making this behavior change automatically for all entity types but instead have some sort of opt-in mechanism. So - while it doesn't hurt for someone to look at this - it's not yet RTBC-able. Marking accordingly.
Comment #141
tstoecklerComment #142
tstoecklerComment #143
larowlan#2927569: Various tests do not set values for required field when creating entities is in
Comment #144
larowlan#2927563: Aggregator feed "refresh" field should have a default value is in
Comment #145
tstoecklerWow, thanks for the quick reviews and commits!!! That's awesome. Note that this is still postponed on #2923915: Stubbing content entities with required fields with a default value is broken (that is the weird-looking hunk in the
EntityContentBase
migration destination) so hopefully we can get that in soon, as well. Also #2858184: Adding NOT NULL to base fields with multiple columns is broken - while not a blocker - is closely related and still looking for a review.In the meantime, here's a reroll of #136.
Note that in addition to not having super much time for this in the next days I won't pick this one up again before I get at least #2928906: The field schema incorrectly stores serial fields as int and maybe also #2929120: The stored field schema does not contain field-level primary keys, unique keys and indexes in better shape since I extracted @amateescu's fixes to that issue but borked up the tests somehow, which is a bit unfair.
Comment #147
tstoecklerNow also extracted #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field into its own issue, as that is both required here and in #2928906: The field schema incorrectly stores serial fields as int. I only recently discovered that that was in fact the original scope of this issue. It seems that at the time it was necessary to expand the scope here, but - unless I'm missing something - that is no longer the case, as is proven by the passing test over there. Therefore I think it makes sense to get that in separately with dedicated test coverage. As noted multiple times already, this one will still need to be reworked anyway to make entity types opt-in to the new NOT NULL behavior.
Interestingly I don't need the very ugly "drop primary keys" hunk that I included here to make the patch over there pass. So once that patch is in I will revisit here whether that is in fact still needed or not.
Marking PP-2 now.
To recap:
This is now blocked on:
And while we're at it here's another shameless plug for the closely-related (yet not blocking) #2858184: Adding NOT NULL to base fields with multiple columns is broken which is looking for reviews.
Comment #148
tstoecklerAlso marking needs work for now, as it still needs to be reworked. It's important to now that this passes tests, however.
Comment #149
Wim Leers@tstoeckler: I looked at all those patches, but … they're in areas that I don't feel qualified to review. I did want to leave a comment to:
Comment #150
tstoecklerYay! #2923915: Stubbing content entities with required fields with a default value is broken landed. So this is now only postponed on #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field (+ the actual reworking of this patch)
Comment #151
tstoecklerOK, next round. This needed a re-roll. And since #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field is green, I've now properly split the patch into a review patch and a full one which contains the one from over there.
Since I had to redo a bunch of things by hand anyway I also changed a couple things. Most notably I removed everywhere were we were manually setting 'not null' in the entity schema to properly set the fields to storage required. I ran all tests that are touched here locally - including a couple update tests - and they are green, so let's see...
BTW, opened #2973299: Add the label of the entity type to the error message when there are outstanding entity changes in UpdatePathTestBase when debugging the update tests.
Comment #153
tstoecklerSo ContentModerationState and Workspace are doing the same trick that Node is doing: They add their uid field as an entity key to get a NOT NULL constraint. So we need to update that accordingly for them as well. For content_moderation I added an update path for that, as far as I know that's not required for workspace yet.
Workspace brings in another interesting case, though. The target field is a required string field, but the default live workspace is created with '' as the value. Because an empty string is considered an empty value, however, and the entity system wants to avoid saving empty values, this ends up as NULL in the database. When this patch now adds a NOT NULL constraint to the field since it is required things explode. In theory, I think it would work to make the field required and explicitly not storage required, but I'm not sure if that's semantically sensible. In the UI a required string field also prevents you from entering an empty string, but of course in the UI there is no way to distinguish between an empty string and NULL. Not yet sure what to do about this, for now I just commented out the setRequired() call like the rest of the fields. In any case, this strengthens the case that we cannot "just do this" for all entity types, as here we would actually have a NULL value in a column that we would try to add a NOT NULL constraint to.
I based this on top of the latest patch in #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field, but while the tests over there pass this makes the spurious primary key problem that I "fixed" above re-appear. So for now just skipping all update tests because they will all fail and will make the test output very hard to parse. So this patch should hopefully be "green", but in this particular case that doesn't mean it passes all tests ;-) I will have to figure out why this patch is messing up the nice primary key re-creation that we fixed over in that issue. Not today, though...
Comment #154
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedDon't worry about the 'target' field on the workspace entity type, we'll remove that weirdness in #2958752: Refactor workspace content replication plugin system to three services and won't try to store an empty value there :)
Comment #155
tstoecklerOK, so in trying to figure this out today I could - as I think I had before at some point - pinpoint the issue to MySQL not being able to drop a column that is part of a composite primary key. This is the case for the ID and langcode fields in the data table and the revision ID and langcode fields in the revision data table. I remember finding something on StackOverflow or similar about this before, but I couldn't find anything this time around.
However, I was wondering why I was seeing this issue since we now have test coverage of all those code paths over in #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field which is green. So I ran those tests locally and saw the same failure. Even without the patch here, with just the one from over there, I was seeing failures in exactly the cases described above:
So since those evidently pass on Drupal.org, I suspect that this might be something on my environment. I'm pretty sure I saw this on Drupal.org as well at some point - as I don't just go on running update path tests on my machine for fun ;-) - but we'll see.
So just re-enabling the update path tests for now to see what happens.
Comment #158
tstoecklerHere's a work-in-progress patch, not green, yet, though.
The fixes entail:
EntityDefinitionTestTrait::updateEntityTypeToRevisionable()
to account for entity types that are not translatable but have langcode entity keyEntityDefinitionTestTrait::makeBaseFieldRequired()
to not unnecessarily use\Drupal
SqlContentEntityStorageSchemaTest::testOnFieldStorageDefinitionUpdateShared()
from #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field to account for the different NOT NULL logic from here. This also required to mark both properties ofShapeItem
required which I hope doesn't break any other test.The first one in the list above was causing a lot of the update path test failures but fixing that reveals another update path problem which I have not found a solution for:
system_update_8004()
callsEntityDefinitionUpdateManager::updateEntityType()
with the dynamic entity type fromEntityTypeManager::getDefinition()
. This means that at this point every revisionable entity type will have arevision_default
field installed becauseEntityFieldManager::buildBaseFieldDefinitions()
will add it. However, at this point therevision_default
field is not (yet) a revision metadata key becausesystem_update_8501()
has not run yet. ThereforeDefaultTableMapping::create()
will add it to the list of fields to add to the entity data table, which then happily succeeds. Regardless of any other updates it is never actually removed from that table subsequently, but it no longer gets added to the data table in the table mapping aftersystem_update_8501()
has run. When actually saving an entity after having run all the updates,SqlContentEntityStorage::mapToStorageRecord()
will not add an entry for therevision_default
field in the record to save to the data table.What this issue changes is that the field gets added with a
NOT NULL
constraint in the database including when it's initially added to the data table because it is marked storage required. Because of this, saving an entity now throws an SQL exception because there's a required field for which we are not providing a value.I see a number of problems here:
system_update_8004()
using the runtime entity type definition. This is a clear bug, but it's not clear whether and if so how we can fix this, as the update will already have run on most systemsEntityFieldManager::buildBaseFieldDefinitions()
adding the field even if there isn't an accompanying revision metadata key. More than adding a simple check this would requireEntityFieldManager
relying on the passed in entity type to build the field definitions, although the fact that that's not the case when::onEntityTypeUpdate()
is called seems like a clear bug to meIn other words: Where I thought I was beginning to make out some clarity in all the fog, the craziness has really just begun.
Comment #161
tstoecklerAlright, opened #2976034: system_update_8004() accidentally installs the 'revision_default' field if run on a 8.5 codebase for #158.1, #2976035: Entity type CRUD operations must use the last installed entity type and field storage definitions for #158.2 and #2976040: default_revision can be left around in data table due to broken entity type CRUD for #158.3 and .4.
Not sure on which of those this will actually be postponed, we'll see.
But it's certainly PP-3 now because #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field is now PP-2.
Comment #162
BoobaaBumping this to critical per the related [#2951279/41]. IDK which one to mark as a dupe, tho.
Comment #163
tstoecklerActually @Boobaa I think that one is a separate issue. Here I had a
revision_default
column too many, i.e. one leftover in the wrong table, not a missing one.Comment #164
tacituseu CreditAttribution: tacituseu commented@Boobaa: they're separate issues, all I meant is that the mechanism that was described in #158 might also be responsible for #2951279: After 8.5.0 upgrade: Unknown column revision.revision_default .
Comment #165
tstoecklerHere's an up-to-date testbot patch with the latest code from all the dependent issues. No change in the "for-review" portion of the patch. So should still have the same amount of fails.
Comment #167
tstoecklerSo #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field is no longer postponed on anything (yay!), but as far as I can tell this is now postponed on #2976040: default_revision can be left around in data table due to broken entity type CRUD as that causes the test failures.
Comment #169
jibranBlocker is in #2938951: The primary key is not correctly re-created when updating the storage definition of an identifier field.
Comment #170
tstoecklerHere's a rebase, and one including the latest patch (#10) from {#2976040]). The
EntityOwnerTrait
was a pretty big conflict, I hope I did everything correctly, but I wouldn't be surprised if there's some fallout from that. Let's see.Comment #172
tstoeckler🙈 both this and #2976040: default_revision can be left around in data table due to broken entity type CRUD add a system update... ...not uploading a new "for-review" for that...
Comment #174
tstoecklerWell that was the interdiff, not the patch...
Comment #176
mpp CreditAttribution: mpp as a volunteer and at AmeXio for District09 commentedUpdated issue description with the API change ahead.
Comment #177
tstoecklerThanks @mpp! The
setStorageRequired()
method is already pre-existing, however. It just doesn't enforce the required-ness on the DB level yet.Also assigning to me for now. I have started to look at some of the test failures, so it doesn't make sense for anyone else to pick this up at the moment.
Comment #178
tstoecklerLet's see if this is green.
If so, I will extract a bunch of the "optional" changes, i.e. the marking required of various entity fields that are not entity keys to a follow-up issue, to re-reduce this issue's scope. Then, as a second step, we can (finally!) change this patch to not update all entity types automatically but make it opt-in in some way. That will get rid of all the update-hook-dependency madness currently in the patch.
Comment #180
tstoecklerRebased for #3001800: Fix "The revision_user revision metadata key is not set." and friends.
Comment #182
tstoecklerBah, the update issue really is gnarly, hopefully this one will be green.
Comment #183
tstoecklerSorry, here's the correct interdiff.
Comment #185
tstoecklerAlright, making the Content Moderation State fields required again so they don't fail the update path tests. I ran all Content Moderation tests locally, so not sure if anything else will break due to that, if we're lucky, this may be fine. Also fixing the
convertToPublishableTest()
Comment #186
tstoecklerExtracting some stuff to #3003586: [PP-4] Use setStorageRequired() instead of overriding the storage schema to mark fields as NOT NULL in the database to reduce scope / patch size here. Hopefully this is still green in which case we can finally tackle the update problem.
Comment #188
tstoecklerAhh apparently I had some update dependencies mixed up which was hidden by the added update functions that are now gone. Also took the opportunity to clean up some minor stuff.
Comment #189
tstoecklerComment #191
tstoecklerSigh...
Comment #192
tstoecklerOK, here's a first stab at making the implementation opt-in. This allows to remove all the update stuff. I ran all of the tests that are/were touched here and pass with this. That also includes some update tests which makes me reasonably confident that this isn't too terrible. We'll see.
Comment #194
tstoecklerRebased for #3023799: Add @trigger_error() to deprecated EntityManager::clearCachedDefinitions() method.
Comment #196
tstoecklerOK, so I spent a lot of time with this during the Global Contribution Weekend and, unfortunately, I am now convinced that the approach in #192 / #194 will not fly. So I am now taking a step back to give some in-depth reasoning on what that approach was, why it doesn't work and what I think we should do instead.
Issue summary
To re-cap, here are the basic facts of this issue:
Previous approach
So while we want to allow entity types to somehow upgrade their schema and opt-in to the new schema generation mechanism, it is clear that we also need to keep supporting the old way. That is why #192 implemented a flag inside of
SqlContentEntityStorageSchema
depending on which the generated schema would follow either the old mechanism or the new one. My thought was that we could avoid having module maintainers be aware of this subtlety so I attached the following logic to the flag:This ensures backwards-compatibility for any pre-existing entity types for existing sites. Each entity type would then have needed to add an update function where it sets that flag on the entity type and subsequently calls
EntityDefinitionUpdateManager::updateEntityType()
which would then upgrade the schema. From then on the flag would be set and the new schema would be used. I had to fight with #2554235: Make the content entity storage and entity query use the last installed definitions instead of the ones living in code, because with this we need to make sure that when callingEntityTypeManager::getDefinition()
the flag will be set. That was annoying, but solvable.This ensures that newly installed entity types benefit from the new schema generation mechanism without having anything to worry about.
Because with this solution the flag would have been completely internal to the storage schema API, I decided in the patches above to add a new
storage_schema_settings
array, which contained this flag so as to support further possible settings in the future to supportfurther improvements to the schema.
The problem
I did not realize, though, that 3. from the first list is in conflict with 2. in the second list. If a newly installed Drupal site with the patch above would install the Custom Blocks module, the
info
column would get aNOT NULL
constraint. Before this patch, that constraint was not there, though, so there might be code in custom modules, or, for example, default content that relies on the ability to insert NULL into that column. This is just one example, but it's already enough to prove that the logic above is not feasible.New approach
So the only way to move forward here, I think, is to force module authors to explicitly opt-in to the new schema generation as part of the entity type annotation. That way not only previously installed entity types, but also newly installed (but previously coded) entity types will still use the legacy schema generation until the module author explicitly changes it in the annotation and provides an update path accordingly.
We then need to document that module authors creating new entity types are encouraged to opt-in to the new-style schema generation from the get-go. This is not ideal, but, as far as I can tell, unavoidable.
To avoid module authors having to remember or copy-paste verbose and somewhat obscure strings as are used in the last two patches above, I think it would make sense to add an opaque
"storage_schema_version"
key to the entity type annotation. The current (implied/implicit) storage schema version would be 1, and this patch would raise the "preferred" storage schema version to 2. That would still provide a mechanism for further improvements in the future but be less tedious to maintain for module authors than a bunch of separate flags/settings.I will attempt to adapt the patch for this now, but wanted to provide some background information to give others a chance to provide some input here.
Comment #197
tstoecklerOK, here's a first stab at the solution proposed in #196. I ran a bunch of tests locally and it looked alright. This is, however, inherently much less invasive, so passing tests are not necessarily great validation of the patch. I set the schema version on all of the test entity types as those by definition do not have any BC-constraints.
Comment #198
tstoecklerSo nice, this doesn't break anything. I am currently working on updating #3003586: [PP-4] Use setStorageRequired() instead of overriding the storage schema to mark fields as NOT NULL in the database to actually upgrade all core entity types to storage schema version 2 using this patch. That one will be the real validation that this is feasible. Also we probably want some more tests here before this is ready.
Just leaving a note here for now that I forgot to update content translation's
entity_test_translatable_UI_skip
andentity_test_translatable_no_skip
in the last patch, just so I don't forget.Comment #199
tstoecklerAnd also the following:
Comment #201
GaëlGHere's a reroll against 8.8.x. It also applies on 8.7.x and 8.7.1.
I add some conflicts. Most of them where merged automagically, but on one commit, some expectations in tests changed both in patch #197 and meanwhile in core. It was count expectations, like initially the code was
$this->storage->expects($this->exactly(2))
, the patch changed it toexactly(1)
and a commit in core changed it toatLeastOnce()
. In these cases, I kept the less strict expectation that matched both cases (atLeastOnce()
).Comment #202
GaëlGWhoops.
Comment #204
geaseSeems that #2554235: Make the content entity storage and entity query use the last installed definitions instead of the ones living in code and #3025427: Add @trigger_error() to deprecated EntityManager->EntityTypeBundleInfo methods introduced changes that require some refactoring of the patch.
Comment #205
catchComment #206
geaseComment #207
geaseMinor update of comments, compared with the previous patch.
Comment #208
geaseThe patch from #207 fixes two issues with tests, observed in #201.
First one was due to deprecation of entity manager in favour of entity type manager and entity field manager and introduction of entity type bundle info service.
In the second one, because of the cached entity type the entity type manager used to return the old definition containing the langcode entity key and based on that the field for the langcode was still returned. This was the case without the patch and with the previous version. However without the patch the field was defined as "DEFAULT NULL" in the DB schema, because it wasn't an entity key anymore. However with the patch we are setting the field as "storage required" and therefore when present the field is always "NOT NULL". When we clear the cached entity type from the entity type manager, then the langcode field will not be returned at all, which is actually the desired behavior.
Comment #209
geaseComment #210
tstoecklerThanks @GaëlG and @gease, nice work! Still needs work for #199 and also needs a change record, but most importantly, the new approach needs a review from another entity API maintainer before we can proceed. Tagging this accordingly. After looking at this again after quite some time I still think the approach is sound, but I also authored it so just me saying that doesn't really count.
Comment #211
hchonovIn
ContentEntityBase::baseFieldDefinitions()
the patch sets storage required on the entity key fields without checking if the entity storage schema is at least version 2. Wouldn't that create an inconsistency between the last installed field storage schema and the live definitions, which will result into the entity definition update manager listing unapplied updates?Comment #212
hchonovI've just checked and
drush updbst
does not return any pending schema updates.However the following returns FALSE and shows the inconsistency I have in mind:
The question is whether we need an update for this? I would assume that we either need an update for that or we have to check the entity storage schema version in
ContentEntityBase::baseFieldDefinitions()
before defining the entity keys as storage required.The updates in #3003586: [PP-4] Use setStorageRequired() instead of overriding the storage schema to mark fields as NOT NULL in the database already take care of updating the last installed storage definition when updating the entity storage schema for core entities to level 2.
----
Based on that I would say that we do not need an update here, but instead should set the entity keys with storage required in
ContentEntityBase::baseFieldDefinitions()
only if the entity storage schema is>= 2
.Setting needs work for this.
Comment #213
hchonovI finally managed to do a FULL review :). Well, what should I say ... amazing work! I really like the new approach and how the patch looks like.
I have some notes, but I consider them minor stuff, which however should be addressed and when this is done I will RTBC it.
I apologise if some of my questions have been answered already in previous comments, but it is not easy to go through all the 200+ comments.
For all other entity keys we don't set them as required. Why are we then adding the todo for the published entity key here?
Similar to that - why is there the note on the owner entity key that it cannot be required? A side note - the anonymous user is
0
and not NULL and therefore I think that required will be fine, however according to the points above I think this is out of scope.Why is this with the current patch needed?
Let's not add the notion that the version is not there and remove the ternary operator.
NULL
will be converted to 0, which is fine for the check.So we'll not be testing any test entities with the legacy storage schema even if knowing that most installations might keep using it until D9?
I would suggest that we include a legacy entity type with
storage_schema_version=1
.Why are making the properties of the test shape item required?
I don't find any results when I search for the string "required_field_storage_definitions" whether in the patch nor in Drupal 8.8.x. Could you explain why this hook implementation is needed?
I think that we should add the same test and expecting it to return FALSE to the upper test
testOnEntityTypeCreateVersion1()
See my previous comment about this, where I suggest to set storage required only if the storage schema is >=2 - #212.Comment #214
geaseAccording to suggestion from #212 added checks for storage schema version in calls to ::baseFieldDefinitions().
Comment #215
geaseFixed tests with some missing EntityType::get('storage_schema_version') mocks.
Comment #216
geaseEntityFieldManagerTest::testGetBaseFieldDefinitionsTranslatableEntityTypeLangcode() for 8.7 seems to be shorter than for 8.8 so one hunk of the above patch didn't apply automatically. Re-rolled for this purpose for 8.7.x branch.
Comment #217
geaseComment #218
geaseComment #219
hchonovIt looks like that @amateescu was also in favour of a solution where the new NOT NULL behaviour should be OPT IN for BC reasons. From #102:
-----
From #101:
I was not aware of that as well. But that means, that an update to the storage schema v2 would mean, that all required base fields will now get the new
NOT NULL
constraint. This is a problem, because base field field overrides for different bundles might be affected by updating the storage schema to v2, as then if a base field is required for one bundle, but not for the other it will still be storage required and therefore it will be impossible to save entities for the second bundle, where the field isn't required, because the storage schema will already be defined with theNOT NULL
constraint.In order to solve that problem we cannot allow for
BaseFieldDefinitions::isStorageRequired()
to fallback toisRequired()
for entity storage schema >= 2 anymore. This is a behaviour change, but not BC break, because the new behaviour is OPT IN and will remain like this until Drupal 9 even for new installations and entities. This change is needed because even if we call both ->setRequired(FALSE) and->setStorageRequired(FALSE)
on the overridden base field, it still will already be created in the database with the NOT NULL constraint and the only possible solution to prevent that would be to implementhook_entity_base_field_info_alter()
and call->setStorageRequired(FALSE)
for that field - however this not a solution for existing entity types, because the hook implementation should be present before the entity type has been installed, which would prevent custom and contrib modules from overriding required base fields for specific bundles.We need a test for this. I suggest adding a new required base field on EntityTestWithBundle, then adding two bundles and implementing
hook_entity_bundle_field_info()
in theentity_test
module, cloning the required base field there and returning the cloned object with->setRequired(FALSE)
for the second bundle only. Then create two entities with a missing value for the field - one for each bundle, and save them. This test should pass on HEAD and fail with the current version of the patch.Comment #220
geaseImplemented a test as suggested in the last paragraph of #219. The assumption was correct, it passes without the patch and fails with the patch, except that calling
->setRequired(FALSE)
inhook_entity_bundle_field_info()
doesn't seem to have any effect.Comment #224
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis issue was mistakenly bumped to critical in #162 (see the next few comments), but hasn't been brought back to major after that.
Comment #225
geaseA reroll from #215. Also removed one erroneous 'schema_version' from annotation of config entity.
Comment #226
hchonovComment #227
joseph.olstadthis patch also will apply to 8.7.10 (with fuzz)
Comment #228
hchonovWe need the check only for new additions, not for existing ones. The change here is wrong as it will affect existing installations with the old schema, but it shouldn't as they have been already installed with the storage required flag set to TRUE.
non-related changes should not be part of the patch.
Comment #229
geasePer comment #228 removed condition where setStorageRequired was introduced before this patch, and wrapped for convertToRevisionable(), where, instead, setStorageRequired is introduced with this patch.
Comment #230
hchonovThe bundle field needs to be updated with
::setStorageRequired(TRUE)
as well.This comment is unnecessary here as we are dealing with the storage only and not with the form level requirement. Instead it should be taken care of in the referenced issue.
This has been resolved in #2928906: The field schema incorrectly stores serial fields as int which is RTBC. Let's remove it from here and provide two files - one "--combined.patch" for testing and one "--review.txt" for the changes of the current issue only. The "--combined.patch" should be build on top of the referenced issue.
$field_definitions
=>$field_definition
Comment #231
gease#23 addressed.
Comment #232
geaseComment #233
daffie CreditAttribution: daffie commented#2928906: The field schema incorrectly stores serial fields as int has landed.
Comment #234
gnunes CreditAttribution: gnunes at bio.logis Genetic Information Management GmbH commentedI've re-rolled the review patch from #231 (2841291-231-review.txt), because it didn't contain the merges from the merged patch as requested in #230, for both 9.0 and 8.9. I added the interdiffs too.
It needs review please.
Comment #235
tstoecklerJust noticed that this is still assigned to me. Will try to get to review this in the next couple of days but will but for now rectifying the assignment status to match the status quo.
Comment #236
hchonovThis comment does not have anything to do with the current issue and therefore is unnecessary and out of scope. I think it was based on a previous assumption, that setRequired(TRUE) leads to
::isStorageRequired()=TRUE
, which however turned out to be wrong and the behaviour for that will be removed in #3083131: BaseFieldDefinition::isStorageRequired wrongly falls back to ::isRequired.Let us be more explicit in the naming of the tests methods and name them e.g. "testOnEntityTypeCreateWithStorageSchemaVersionN".
I've found only this nitpicks. Beside them I consider this RTBC already. Feel free to set to RTBC directly with uploading the next patch for those nitpicks in order to save on comments count :).
Comment #237
dhirendra.mishra CreditAttribution: dhirendra.mishra at Valuebound for Valuebound commentedComment #238
dhirendra.mishra CreditAttribution: dhirendra.mishra at Valuebound for Valuebound commentedComment #239
alexpottI think this issue needs quite a bit of metadata work.
Needs more documentation about what this is.
Comment #245
MurzI re-rolled the patch from #238 to 9.3.x branch as GitLab MR with some removals: