Problem/Motivation
This is the final beta-blocking child of #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions. The parent issue in general, and #2333113: Add an EntityDefinitionUpdateManager so that entity handlers can respond (e.g., by updating db schema) to code updates in a controlled way (e.g., from update.php) specifically, made it so that when you deploy new code that changes an entity type's definition or its field definitions, update.php can make entity handlers react to that. Which means, it's possible for onFieldStorageDefinitionCreate()
, onFieldStorageDefinitionUpdate()
, and onFieldStorageDefinitionDelete()
to be called on base fields, but the SqlContentEntityStorage
implementation of those methods were initially written to only handle fields whose storage is in their own dedicated table, because that's all that was supported in Drupal 7 (via Field UI).
Proposed resolution
- Improve
DefaultTableMapping
to also support fields that can be stored in the entity's base tables. - Fix
SqlContentEntityStorage
to work with the improved DefaultTableMapping. - Move all field schema generation code from SqlContentEntityStorage to
SqlContentEntityStorageSchema
, where it belongs, and make it work with the improved DefaultTableMapping.
Remaining tasks
Review. Commit.
User interface changes
None.
API changes
- Removes
getReservedColumns()
fromBaseFieldDefinition
andFieldStorageConfig
. That is now the exclusive province ofTableMappingInterface
. - Adds a
isBaseField()
method toFieldStorageDefinitionInterface
. Whether a field is required to exist for every bundle of an entity type can and does impact storage decisions, so it makes sense for FSDI to provide that info. - Adds
allowsSharedTableStorage()
,requiresDedicatedTableStorage()
, andgetDedicatedTableNames()
methods toDefaultTableMapping
. Although public, in the patch they are only called bySqlContentEntityStorage
andSqlContentEntityStorageSchema
. #2326719-25: Move pseudo-private table mapping functions from ContentEntityDatabaseStorage to public API of DefaultTableMapping contains a proposal for a future followup issue to expand TableMappingInterface to provide some of this information in a more generic way, but for now, these are methods specific to the DefaultTableMapping implementation. - Implements
DefaultTableMapping::allowsSharedTableStorage()
as returning true for single-valued base-fields only. Multivalued base fields get a dedicated table, which is a database schema change compared to HEAD, but as a result, now work. The only two core examples of a multi-valued base field areuser.roles
andtaxonomy_term.parent
, and because SqlContentEntityStorage hasn't supported this until now, they've had to implement their own custom storage. This patch doesn't change those two use cases, but #2248977: Complete support for multi-value base fields in ContentEntitySchemaHandler and use it for the user.roles field is an issue for doing so. - Adds a
finalizePurge()
method toFieldableEntityStorageSchemaInterface
that's identical to HEAD'sFieldableEntityStorageInterface::finalizePurge()
. With this patch, both the storage handler and the storage schema handler need to do something when field data is done being purged. - Adds an optional
$storage_definitions
parameter toSqlEntityStorageInterface::getTableMapping()
that defaults to the current ones defined for the entity type. But, during a definition update or delete, this allows a table mapping to be retrieved for the old definitions. - Moves all schema affecting code from
SqlContentEntityStorage::onFieldStorageDefinition(Create|Update|Delete)()
andSqlContentEntityStorage::finalizePurge()
to the same methods onSqlContentEntityStorageSchema
. In the process, makes those methods work for all fields (shared-table and dedicated-table). - Makes
SqlContentEntityStorage::countFieldData()
work for shared-table fields. - Moves pseudo-private
SqlContentEntityStorage::_fieldSqlSchema()
method to protectedSqlContentEntityStorageSchema::getDedicatedTableSchema()
and also adds a protectedSqlContentEntityStorageSchema::getSharedTableFieldSchema()
to handle shared-table fields. - Moves pseudo-private
SqlContentEntityStorage::_fieldIndexName()
method to protectedSqlContentEntityStorageSchema::getFieldIndexName()
. - Requires entity-type-specific subclasses of
SqlContentEntityStorageSchema
(core has several, see patch) to move their field-level overrides fromgetEntitySchema()
togetSharedTableFieldSchema()
. This is because when an updated field definition gets deployed, but there are no entity type definition changes being deployed, only the latter runs.
Comment | File | Size | Author |
---|---|---|---|
#80 | interdiff.txt | 829 bytes | effulgentsia |
#80 | entity-fix-sql-storage-2337927-80.patch | 225.16 KB | effulgentsia |
#78 | interdiff.txt | 28.22 KB | effulgentsia |
#78 | entity-fix-sql-storage-2337927-78.patch | 225.13 KB | effulgentsia |
#74 | interdiff.txt | 8 KB | effulgentsia |
Comments
Comment #1
effulgentsia CreditAttribution: effulgentsia commentedComment #2
effulgentsia CreditAttribution: effulgentsia commentedInterdiff relative to #2298525-97: Test issue for Make ContentEntityDatabaseStorage handle changes in the entity schema definition. What's being removed in this interdiff is going into #2338873: Modules providing non-configurable field storage definitions can be uninstalled, leaving orphaned unpurged data.
Comment #3
effulgentsia CreditAttribution: effulgentsia commentedWe need to resolve this @todo or link it to an issue.
Comment #4
plachI think a separate issue would be better, given the patch size.
Comment #5
effulgentsia CreditAttribution: effulgentsia commented+1 to #4.
Should we remove the getKeys() comparison (as #2 is doing) or not (as #2298525-99: Test issue for Make ContentEntityDatabaseStorage handle changes in the entity schema definition)? Given the other comparisons, including the added getEntitySchemaData(), I don't know if there are any keys that could be different that we're concerned with.
Comment #6
effulgentsia CreditAttribution: effulgentsia commentedUseless assertions. What should they be changed to?
Comment #7
plachWe have to remember to block entity schema changes before the last iteration as some storage classes do not support them yet, AAMOF they hard-code quite some assumptions about table layouts.
Comment #8
plachComment #9
plach+1 on #5
Mmh, it seems I forgot to complete that refactoring: we just need to convert those assertions to expectations for the
::createTable()
method of the db schema service (see::testOnEntityTypeCreate()
). I can fix those tomorrow, bed time now :)Comment #10
effulgentsia CreditAttribution: effulgentsia commentedPer #2333113-105: Add an EntityDefinitionUpdateManager so that entity handlers can respond (e.g., by updating db schema) to code updates in a controlled way (e.g., from update.php), we should change this from state to its own KV collection and remove that key prefix. We should also constructor inject it.
Comment #12
effulgentsia CreditAttribution: effulgentsia commentedThis looks like potentially good code, in terms of allowing some class of schema changes on a non-empty table, but I can't tell under what circumstances it's meant to run. Seems like there needs to be some kind of check for whether indeed only index changes are involved. Without that, we get the failure in #2. However, I wonder if this is an enhancement that can be a post-beta followup. So, this patch reverts it.
Comment #13
effulgentsia CreditAttribution: effulgentsia commentedWhy is this needed? Shouldn't the above delete/create themselves already handle dropping and creating revision tables? Reverting this to see if tests still pass.
Comment #15
effulgentsia CreditAttribution: effulgentsia commentedI see. Ok. Reuploading #12.
Comment #16
effulgentsia CreditAttribution: effulgentsia commentedComment #17
effulgentsia CreditAttribution: effulgentsia commentedThis is the last beta blocking child of the parent issue, so moving the beta blocker tag to it.
Comment #18
effulgentsia CreditAttribution: effulgentsia commentedComment #19
effulgentsia CreditAttribution: effulgentsia commentedComment #20
BerdirCore actually has at least two use cases for this, we just couldn't use it yet. That is a) user.roles and b) taxonomy parents (which are right now lazy-loaded and lazy-updated, but with all the other writes that are now going on and the entity cache, I think we can drop that stuff and save a lot of custom code.
I've started working on this a while ago in #2248977: Complete support for multi-value base fields in ContentEntitySchemaHandler and use it for the user.roles field, will pick it up when this is in.
Will try to review this today.
Comment #21
alexpottAdding deprecated functions to get the beta in kind of sucks.
The $this->tableMapping and $table_mapping logic is really difficult to follow. Basically it seems as though we statically cache when we are not using custom storage definitions. A comment would help somewhere. We have one usage of this functionality - in \Drupal\Core\Entity\Sql\SqlContentEntityStorageSchema::deleteSharedTableSchema() - i wonder if this would be simpler with an additional function for this and then refactor the common code into a protected function.
Fieled -> Field
Comment #22
plachGiven 7, 12 is fine by me. I wrote most of the patch code so I am obviously generally +1 on it :)
Comment #23
plachI agree that's not nice, but I think solving it would require to fix #2274017: Make SqlContentEntityStorage table mapping agnostic , which still needs some thoughts/exploration and that can be addressed mostly as internal refactoring, thus not requiring to be addressed before beta. Since that method is not part of any public API, although being public, I think the problems implied by going this way are more theoretical than practical. @eff suggested a similar approach to address #2326719-25: Move pseudo-private table mapping functions from ContentEntityDatabaseStorage to public API of DefaultTableMapping, that is making some methods public but not part of an interface, which works fine. Addressing this the same way makes sense IMHO.
Comment #24
plachWorking on #21.2 and #21.3, while we wait for Alex's reply.
Comment #25
fagoI reviewed the patch and I think it's ok to proceed as is. The most unfortunate part about it is defnitely the already mentioned setEntityType() part and table mapping instantiation, but cleaning this up is no simple task and would postpone this issue unnecessarily on a non-API breaking clean-up. I do think it's better done in a separate issue where we can focus on it anyway.
@plach: setEntityType() seems to be already a public non-interface method. We discussed moving more logic in the table mapping classes, do we have a separate issue for that or is that supposed to be part of #2274017: Make SqlContentEntityStorage table mapping agnostic ?
Deprecated naming, it's not called schema handler anymore. It's just internal though, so no show-stopper.
I think it would make sense to add an method to the table mapping that can give you the right table for a given fields given some parameters (old-revisions vs new, default language or not).
Looks like this got deleted without replacement?
Update: Covered by testDedicatedTableSchema() of SqlContentEntityStorageSchemaTest now, so that's fine.
Comment #26
fagocross-post
Comment #27
plachThis should fix #21, I'll answer to @fago's review in a separate post as soon as I get a chance. I just boarded a plane :)
Comment #28
plach[double post]
Comment #29
plach+1 on renaming that in a separate issue, it's a totally internal change.
@efff opened an issue for all of this, see https://www.drupal.org/comment/9095763#comment-9095763
Comment #30
yched CreditAttribution: yched commented$this->schemaHandler : is that such a big deal to rename here ? Seems a bit artificial to push this to a followup, and those kind of internal cleanups tend to get easily forgotten. But no showstopper, agreed.
Comment #31
plachWhat we have for now is only that: I think it migth make sense to address moving instantiation logic in the table mapping and refactor the storage to use it together, as I imagine the required changes to be related. Otherwsie we need a separate issue and we need to update the patch @todo references.
Comment #32
effulgentsia CreditAttribution: effulgentsia commentedWhat should we rename
schemaHandler()
to? In docs, we refer to it as the "storage schema handler". Once #2332857: Decide how secondary entity handlers (like storage_schema) should be instantiated gets fixed, you'll be able to get it via$entity_manager->getHandler('storage_schema')
. It's an object that implements the interfaceEntityStorageSchemaInterface
(just as most other interfaces for objects you can get via$entity_manager->getHandler($handler_type)
don't have the*Handler
suffix. Given all that, since the currentschemaHandler()
method is in the storage class, and therefore, astorage
prefix would be unnecessary, should we just name the methodschema()
? That would be consistent withDrupal\Core\Database\Connection::schema()
.Comment #33
fagoad #32: I'd still think that it should be getStorageSchema() and $storageSchema property as this matches the interface. I'd keep the storage prefix also as the dependencies of the storage are not all storage related.
Comment #34
fagoMinor:
Missing space before "for".
Comment #35
effulgentsia CreditAttribution: effulgentsia commentedFixes #33 and #34.
Comment #36
fagoI gave this some manual testing with the following results:
1. Changed a base field without a storage change - all fine, no updates detected.
2. Added a new base field to user entity, all created fine.
3. Tried deleting an existing base field. This generally seems to think it deals with dedicated fields and fails:
Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'd8.comment__homepage' doesn't exist: UPDATE {comment__homepage} SET deleted=:db_update_placeholder_0; Array ( [:db_update_placeholder_0] => 1 ) in Drupal\Core\Entity\Sql\SqlContentEntityStorage->onFieldStorageDefinitionDelete() (line 1508 of /var/www/core/lib/Drupal/Core/Entity/Sql/SqlContentEntityStorage.php).
4. Tried updating field settings influencing storage, i.e. update comment id to be not unsigned. FieldStorageDefinitionUpdateForbiddenException has been correctly thrown.
5. Tried changing the schema provided by a field type. This seems to try to change all fields but fails to detect the data migration. I guess that is as the schema for the before and after comparison is calculated both with the updated way. However, that's something what is officially not supported in d7 either so I think it's fine to not have that case covered for now. That's something we should probably fix for release though -> I'd suggest creating a follow-up, critical issue for this. Thoughts?
Thus, setting needs work for 3.
Comment #37
effulgentsia CreditAttribution: effulgentsia commentedWork in progress for #36.3.
Comment #38
fagocomment exceeds 80chars.
Comment #40
fagoThanks eff, I took a look at fixing the fail / issue. Problem is that currently the table mapping creation relies on the entity manager to determine whether something is a base field, but when the field has been deleted this information is not available any more as we do only keep field storage definitions in state.
However, we should be able to determine the storage by looking at the field storage definitions only - as that's their purpose. Thus, passing in base field definitions into the DefaultTableMapping is problematic in general. As we need to know whether a field is a base field in there I suggest adding FSDI::isBaseField(). Imo that makes sense to have as it's required to determine the storage.
Attached patch contains those test fixes and just adds in the existing but not previously unused logic to drop the columns of a deleted shared field. This seems to work fine - but I did not check whether influences other tests, so we'll see whether it is green.
However, it drops the columns only where there is no data, because else we need to do field purging in order to call delete() on each of the purged field items. That's not implemented yet for base fields and a bigger topic left for #2282119: Make the Entity Field API handle field purging. So in the case of having data we can throw an update-not-possible exception for now. However, I do think that's something we should fix for release, thus bump #2282119: Make the Entity Field API handle field purging to critical.
Comment #41
fagoI've repeated my test for #36.3 and it works as it should now. However, imo it still needs some more test coverage to make sure this won't break people's data after beta:
- EntityDefinitionUpdateTest misses an testFieldCreateDeleteWithData() what makes sure we throw the respective exception when we try to delete a base field with data.
Given that improved coverage and green tests, I think this is ready to go!
Comment #43
effulgentsia CreditAttribution: effulgentsia commentedThanks fago. I'm looking into those failures.
Comment #44
BerdirSo, I'm trying to test this using my file_entity use case that adds a bundle field to the file entity. And unfortunately, I'm still stuck on my 'initial' problem. My attempt in the previous issue with putting it in a setting of the field definition wasn't accepted, which is OK with me :)
However, I've now tried the only other thing I can think of (short of providing a custom field type plugin for this) and that is to override the storage_schema handler and add the initial key there. That doesn't seem to work. Apparently, onFieldStorageDefinitionCreate() doesn't go through getEntitySchema(), so it doesn't pick up whatever changes are made there... That seems problematic? It would mean that any addition there on base fields will be skipped when you call onFieldStorageDefinitionCreate()/Update(), like those not null calls or the uid type override in UserStorageSchema.
Is that a known issue? Should we somehow route the schema definition for base fields through getEntitySchema()? If you think that we can solve this as a follow-up without outside API changes then I'm OK with that :) Just wanted to make sure we're aware of this limitation...
Comment #45
BerdirIf you want to try it out yourself, the code is in https://github.com/md-systems/file_entity/tree/schema_handling, run FileEntityFileTypeClassificationTest, which is testing exactly this scenario.
See https://github.com/md-systems/file_entity/compare/schema_handling for the relevant changes compared the current manual creation of that field.
Comment #46
BerdirOne more thing on this :)
I started re-rolling #2248977: Complete support for multi-value base fields in ContentEntitySchemaHandler and use it for the user.roles field on top of this, to test it.
This, as a generic statement, is not (yet) true. Yes, it has logic to build the schema for those tables, but it doesn't actually create the necessary tables yet when a module is installed.
I added the following to onEntityTypeCreate() to allow that in my issue (no patch posted yet, still working on it):
Feedback is very welcome if there is a better way to do it :) Again, mostly wanted to make sure that the current state is clear and what works and what doesn't.
Comment #47
effulgentsia CreditAttribution: effulgentsia commentedThis fixes the failures from #40 and incorporates #46 (slightly different code, but basically the same approach). I still need to add the test for #41 and respond to #44. @plach: do you have thoughts on #44?
Comment #48
effulgentsia CreditAttribution: effulgentsia commentedSigh. Forgot to set status for bot.
Comment #50
effulgentsia CreditAttribution: effulgentsia commentedAdded the test coverage requested in #41.
Comment #52
effulgentsia CreditAttribution: effulgentsia commentedThis is wrong. Until we solve #2282119: Make the Entity Field API handle field purging, purging doesn't work for any nonconfigurable field, regardless of table layout. This patch changes the assertion to expect an exception and fixes the code accordingly.
Comment #53
effulgentsia CreditAttribution: effulgentsia commentedI made two final changes before calling it a night:
I don't have the steam left to think about #44. I'm hoping you Europeans can finish whatever is left here and get it to RTBC while I sleep :)
Comment #54
effulgentsia CreditAttribution: effulgentsia commentedComment #55
fagoCreated #2339855: Handle changes to the schema provided by a field type for #36.5
Agreed. That means we always have to go through entity schema to derive the schema of a certain field, i.e. we have to make sure we can invoke getEntitySchema() with old field storage definitions also. However, if we add the possiblity to add indexes in a storage acgnostic way I'm not sure the getEntitySchema() override would be still needed? So maybe that would be alternative, cleaner way to address the problem - but that's clearly follow-up.
Imo, we have to fix bailing out to getEntitySchema() for getting the schema of a single field storage definition, i.e. add a possibility to pass in the field storage definitions for which to generate the schema similar getTableMapping() on the storage. I think we need plach's input on this one though :/
But moreover, the changes of #53 make the system to fail to detect changes to this customizations, or to detect changes like #2339855: Handle changes to the schema provided by a field type, thus I'm not sure this is a good idea?
Comment #56
effulgentsia CreditAttribution: effulgentsia commentedI don't think the changes in #53 did that. I think that's a problem both in HEAD and with earlier versions of this patch, since
getEntitySchemaData()
strips out all field schema info anyway. I opened #2340061: Overrides to field schema in *StorageSchema::getEntitySchema() aren't applied when base fields are created/changed later to deal with that problem. I don't think either that or #2339855: Handle changes to the schema provided by a field type is this issue's scope though: this issue is about reacting to changes in field storage definitions. If there are code changes other than definition changes, that's a separate scope, so let's deal with it in the follow ups.Comment #58
plachThis should fix #44 by moving field schema alteration to
::getSharedTableFieldSchema()
implementations. As discussed in Skype with @berdir, @eff anf @fago, this restores entity schema index/key difference tracking, which allows to detect changes also in field index/key definitions, as alterations are not prefixed with'field_'
and thus are stored together with entity index/keys.I opted for allowing
::getSharedTableFieldSchema()
to return a different definition depending on the shared table, since there are edge cases where the same field can have slightly different definitions depending on the table. For instance{node}.vid
is nullable while{node_revision}.vid
is not.Comment #60
effulgentsia CreditAttribution: effulgentsia commentedComment #61
effulgentsia CreditAttribution: effulgentsia commentedRefactored EntityDefinitionUpdateTest to make it easier to add new test methods to it to cover the fixes in #58. It now doesn't rely on installing a module to make a definition change, since that's not the primary purpose of EntityDefinitionUpdateManager. The purpose of EntityDefinitionUpdateManager is to allow for definition changes via updated code, so the test now simulates that via
state()
.Comment #62
effulgentsia CreditAttribution: effulgentsia commentedThat part looks good. Here's an added test.
Comment #63
effulgentsia CreditAttribution: effulgentsia commented#62's patch is correct, but the interdiff is wrong. Here's a better one.
Comment #64
effulgentsia CreditAttribution: effulgentsia commentedThat part is problematic. Attached is a failing test change.
Comment #66
fagothanks d.o. for eating my comment :(
Changes look good in general, although I must say that the DX of getSharedTableFieldSchema() is worse than the single getEntitySchema() before. Having everything in a single array makes it nicer, but moreover we have to make clear when to use which method now. I do think this is solvable with documentation for now, plus we should be able to do away with the per-field optimizations rather easily by adding storage-acnostix indexes support.
That valuable comment got lost.
It's great to have the improved test coverage now, as this covers now all important cases. I did test my manually tests once again and came to the same result: everything's good except the problematic case of adding a field having optimizations. In my test, it detected an entity type change + the field change, thus I'd assume the problem is that the entity type change already re-initialized the schema, so the field change afterwards fails.
Comment #67
fagook, I fixed the tests, re-added above comment and added some docs on when to override which method. Given that, this now takes care of the remaining problems as well and should be ready to go imo.
Comment #68
effulgentsia CreditAttribution: effulgentsia commented#67 looks good to me.
This patch tries to improve it. I think it's about on-par now.
Comment #70
plachThis should fix installation.
Comment #71
effulgentsia CreditAttribution: effulgentsia commentedResolved the only @todo in the patch without an issue link. And changed the only one that was linking to the meta to link to its dedicated followup.
I think the only thing that's left here is fixing #6/#9.
Comment #73
effulgentsia CreditAttribution: effulgentsia commentedFixed the failure in #71.
Comment #74
effulgentsia CreditAttribution: effulgentsia commentedThis fixes #6 by moving the test from SqlContentEntityStorageTest to SqlContentEntityStorageSchemaTest and copying the implementation from testDedicatedTableSchema() to testDedicatedTableSchemaForEntityWithStringIdentifier(), and just changing the mock and expectation from integer to string.
I think all feedback on this issue has now been addressed.
Comment #75
plachRTBC +1 if this is green :)
Comment #76
kylebrowning CreditAttribution: kylebrowning commentedomg. +1 if RTBC
Comment #77
jibranUnable to find any nitpicks so RTBC as per #75.
Comment #78
effulgentsia CreditAttribution: effulgentsia commentedYay!!! Thanks for the RTBC!
Good news: I added some more test coverage: to cover field schema updates, not just creation and deletion. For the update test, I used updating the field type from
string
totext
, which also has the effect of testing a multicolumn type, and updating from single column to multicolumn. And I also added a test for the update to involve a change to a customization in getSharedTableFieldSchema(), which ends up being a similar problem space as #2339855: Handle changes to the schema provided by a field type (in fact, the solution solves that issue).Bad news: Those tests required code fixes to make work. So, with much sadness, I need to un-RTBC this and await a re-RTBC from someone.
Good news: I'm about to close #2339855: Handle changes to the schema provided by a field type as a duplicate. Yay for one less followup.
Comment #80
effulgentsia CreditAttribution: effulgentsia commentedComment #81
effulgentsia CreditAttribution: effulgentsia commentedComment #82
swentel CreditAttribution: swentel commentednitpick - 80 chars, but shouldn't hold up the patch.
same here
Marking RTBC so core committers can have a first look up at it.
Comment #83
alexpottI gave the patch thorough review in #21 and the discussion since has addressed further issues with interdiffs all looking good. Let's do this :)
Committed f27fd1f and pushed to 8.0.x. Thanks!
Comment #85
webchickYEEEEAAAAAHHHHHH!!
Comment #86
yannisc CreditAttribution: yannisc commentedIncredible!
Congratulations and many thanks to all core devs for working on this and other issues making Drupal 8 a step closer for all the rest of us!!!
Comment #87
plachWHO-HOO!
Comment #88
yched CreditAttribution: yched commentedYay ! Congrats, folks ! Sorry I couldn't help much :-/
Comment #89
int CreditAttribution: int commentedCongrats.. There are 0 beta blockers now. Let's wait and see :-D
Comment #90
xjmHooray!
For everyone who noticed that this was the last known beta blocker and is wondering when the beta will come out, see: Today there are zero Drupal 8 beta blockers! Here's what's next.
Comment #92
ekes CreditAttribution: ekes as a volunteer commentedIt seems that in the course of this the code for supporting 'unique keys' got lost. It's still documented, there is code to remove them, but not create them even if they are in the schema array.
I've opened a new issue at #2742103: SqlContentEntityStorageSchema::getDedicatedTableSchema() ignores 'unique keys' from the schema definition