Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Updated: Comment #203
This is being coded in the D8MI - Entity Translation sandbox.
Problem/Motivation
- An entity type's base field definitions are to a large part duplicated in the providing module's
hook_schema()
implementation which is needed to provide the database schema for the entity type. Especially since field items provide their own schema, this has become completely unnecessary - Because the field definitions and the schema are completely separate systems changes to field definitions are not reflected in the schema and thus are only supported if the storage is cared for separately
- As this has storage implications it is not easily possible to change entity types from translatable to non-translatable or vice-versa and from revisionable to non-revisionable or vice-verca
- Even in Drupal 7 it is very cumbersome to properly declare the schema for revisionable entities which is why few contrib entity types are properly revisionable. With Drupal 8 and the notion of entity translation in core, the burden for declaring the schema for revisionable and translatable entities is even higher.
Proposed resolution
- Introduce a
SqlStorageInterface
for SQL-based entity storages. It contains agetTableMapping()
method which returns aTableMappingInterface
value object which contains a list of database fields per table for the entity type. It also stores the relationship of the database fields to the entity field definition. Both entity field definitions which provide multiple database fields (i.e. multi-column fields) and database fields that do not correspond to entity field definitions (i.e. denormalized database fields such as 'default_langcode') are supported. Because of this complexity a value object with utility methods is chosen rather than a simple array. As the table mapping is needed during querying and saving the respective code is put directly on the storage class. The table mapping has a switch statement that returns a different table structure, depending on whether the revisionable or not, translatable or not, or any combination of that. This will allow for easier refactoring once we splitContentEntityDatabaseStorage
into different classes for the different types. - Introduce a
ContentEntitySchemaHandlerInterface
that uses the entity type, the entity type's field definitions (and their respective schema arrays) and the table mapping (see above) to generate the full schema array for all of the entity type's schema in agetSchema()
method. As this code is only used during module install (currently) this code lives in a separate class, theContentEntitySchemaHandler
. To hide this code organization detail from the outsideContentEntityDatabaseStorage
implementsContentEntitySchemaHandlerInterface
. ThegetSchema()
method ofContentEntityDatabaseStorage
simply forwards the call toContentEntitySchemaHandler
, which it lazy-instantiates directly (i.e. using the class name directly). While using direct instantiation in a class is generally discouraged neither of the 3 methods we have for swappability in core are applicable here:- We cannot make it a service as the schema handler needs the entity type injected. Therefore we would need per-(content)-entity-type services, but this is not possible as you cannot call
EntityManagerInterface::getDefinitions()
during container building - We could make it a top-level entity controller/handler but the whole concept of schema is bound to the specific implementation of the storage. I.e. there might be storages which do not need a schema at all and, thus, it would be semantically incorrect to expose this as part of the API
- The entity query factory factory pattern was invented to allow for this sort of easy swapping per entity type. Following that pattern we could introduce a
getSchemaHandlerServiceName()
method onSqlStorageInterface
. Then we could use the storage to figure out which schema handler to use. But since the storage needs the schema handler (constructor) injected there is a chicken and egg problem: Between the storage and the schema handler each one needs the other to be instantiated. This could be solved by using setter injection but we don't really have a way to use setter injection for entity controllers/handlers.
The fact that we have this separation is quite neat as entity types do not actually have to override the schema handler but can just provide additional schema tables that are not (in part not yet) auto-generated in the storage class.
- We cannot make it a service as the schema handler needs the entity type injected. Therefore we would need per-(content)-entity-type services, but this is not possible as you cannot call
- Entity queries are updated to use the table mapping instead of
drupal_get_schema()
- Entity saves and loads are updated to use the table mapping instead of
drupal_get_schema()
anddrupal_write_record()
- Because entity type tables are no longer registered in
hook_schema()
DrupalUnitTestBase::installSchema()
fails for entity type tables. Therefore aDrupalUnitTestBase::installEntitySchema()
is introduced.
Remaining tasks
- More reviews?
User interface changes
None.
API changes
- Modules should/can no longer declare their entity type schema in
hook_schema()
. Instead the schema is generated automatically for them. In case they want to specify additional schema information (denormalization tables, indexes, ...) they need to provide a custom storage class and overridegetSchema()
. If they do want to avoid this behavior for some reason they can provide a storage controller which does not implementContentEntitySchemaHandlerInterface
DrupalUnitTestBase::installSchema()
no longer works for entity tables.
API additions
- Introduce a
TableMappingInterface
- Introduce a
SqlStorageInterface
- Introduce a
ContentEntitySchemaHandlerInterface
- Introduce a
DrupalUnitTestBase::installEntitySchema()
Follow-ups
Follow-ups needed if/when this is committed:
- #2232465: Deprecate table names from entity definitions
- #2248977: Complete support for multi-value base fields in ContentEntitySchemaHandler and use it for the user.roles field
- #2258347: Consider adding hook_entity_schema_alter()
- #2274017: Make SqlContentEntityStorage table mapping agnostic
- #2277979: Menu link storage does not implement SqlEntityStorageInterface
Blockers of this issue
-
Spin off to separate issues
Good ideas of separate parts discussed or found during this issue, but not blocking this, and could be done without committing this (not follow-ups)
- #2256521: [META] New plan, Phase 2: Implement menu links as plugins, including static admin links and views, and custom links with menu_link_content entity, all managed via menu_ui module: This would allow to remove several hacks.
- #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities: Would allow us to remove one hacky line of code, where we explicitly disable
not null
- #2232427: Allow field types to control how properties are mapped to and from storage: Would allow to clean up some lines of code, that still use the D7-style
serialize
schema key. - #2209049: Allow field schemas to contain unique keys: Would allow to remove a couple lines of code where we hardcode a unique key for the UUID field.
- #2248795: Entity displays fetch the bundle field definitions for deleted bundles: Would allow to remove a very hugly hack.
- #2232471: Discuss and standardize field definition descriptions == schema descriptions: Multi-column fields currently have duplicated schema descriptions.
Comment | File | Size | Author |
---|---|---|---|
#415 | d8_entity_schema.patch | 301.55 KB | fago |
#411 | d8_entity_schema.interdiff.txt | 4.25 KB | fago |
#411 | d8_entity_schema.patch | 301.56 KB | fago |
#409 | d8_entity_schema.interdiff.txt | 1.12 KB | fago |
#409 | d8_entity_schema.patch | 299.11 KB | fago |
Comments
Comment #1
tstoecklerSo I started some work on this recently.
I used the following approach: I added a getSchema() method to FieldableDatabaseStorageController. I started by simply looping over baseFieldDefinitions() of the entity type and putting that together. I then diffed the generated schema with the current schema in hook_schema(). That made me notice a number of missing pieces in the entity field API which is why I opened the following issues:
The current status is a WIP and more of an initial stab at this. @plach's comment in #1498720-34: [meta] Make the entity storage system handle changes in the entity and field schema definitions lead to post this now, though, rather than later in order not to duplicate any efforts.
I'm providing the following files:
*-gets-schema.patch
is the actual getSchema() method in FieldableDatabaseStorageController*-diff-script.php
is the script I used to diff the two schemas*-schema.diff
is the output of the above script*-schema-changes.patch
is a bunch of minor changes to the existinghook_schema()
implementations to avoid inconsistencies and reduce the size of the diff generated by the scriptNote that I'm not at all bent on following this approach any further it just proved useful to me in finding out how our schemas are currently structured.
Comment #3
tstoecklerHere's a new patch. With this patch I am able to properly generate the entire schema of the various entity_test_* entity types. I chose entity_test specifically so we can proceed quicker on this, as there are some minor issues that remain: So far I have ignored a bunch of non-critical schema keys completely: 'description', 'default', 'not null', 'size', 'unsigned'. The first ones are actually not very hard to fix but the first ones are very inconsistent between our current hook_schema() and our field definitions. So we should figure that out in separate issues, but hopefully proceed here nonetheless.
Again, attaching the (updated) diff script that I used to diff the old and the new schema and the actual diff output.
A couple of notes for that diff output:
1. While a bunch of things are being added in the diff, they all make sense and show incompleteness and deficiencies in the current entity_test schema.
2. In order to get that diff output I had to cheat a bit and apply the following patches:
* #2209071: StringItem should cast the 'length' schema value to (int)
* #2209049: Allow field schemas to contain unique keys
3. Do not turn on entity_reference module when testing this, as that will mess with the schema of entity reference fields. Will open an issue for that as well.
The patch (mostly) contains the FieldableDatabaseStorageController::getSchema() method.
The next step would be actually using that to install the schema. That plays into #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions.
But this could now seriously use some reviews!
Comment #4
tstoecklerOpened #2209981: ConfigurableEntityReferenceItem should not mess with the field schema for the problem mentioned above.
New patch, which actually *uses* the automated schema for the entity_test entity types. It contains a couple hacks that we'll eventually want to remove, notably:
* Manually installing the schema in hook_install()
* Manually uninstalling the schema in hook_uninstall()
* Allow to pass in the schema into drupal_write_record().
But this gets us going.
This also reveals a problem, that I've noticed before but ignored so far: Schema defaults. Those are currently almost consistently missing from the field item schemas. Will open an issue to discuss that.
Let's see how this goes.
Because I intended the whole getSchema() function due to the static caching, I used -w for the interdiff, so that won't apply.
Comment #6
tstoecklerSo I apparently my logic which field gets saved into which table was wrong: Only translatable fields end up in the data table(s). I was also trying to save the values of computed fields so far, which wasn't working out very well. The former issue led to rewrite most of the getSchema() method, but that makes it much more readable now. It still needs inline docs (lacking almost entirely now), but it looks OK, to my mind. I can install with this at least.
Comment #8
tstoecklerCrossing my fingers that that was the only problem...
Comment #10
tstoecklerHmm... must have been a context change, no merge conflicts.
Comment #12
xjmBeta-blocking as a part of #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions.
Comment #13
tstoecklerOK, discussed this with @plach, @fago and @Berdir today at Drupal Dev Days. Here's a new patch that implements the EntitySchemaBuilder that @plach proposed in #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions. As discussed I added this as a separate service.
We also discussed making it a top-level controller but decided against this because this would leak the concept of 'schema' into the top-level API, while it is really a very storage-related concepts that might or might not apply depending on the storage controller. The fact that we have to pass around $entity_type, however, into every single method of that class, really feels like we should have per-entity-type instances for this. That would allow to pass in the entity type in the constructor and then just reference it everywhere instead of passing it around. Not sure.
This also includes a temporary shiv to fall back to hook_schema() implementations if they still exist. That's because e.g. Node's schema does not at all comply with its field definitions. That depends at least on #2111887: Regression: Only title (of base fields) on nodes is marked as translatable but there might be similar things for other entity types. Anyway, getting this patch green would allow us to progress quicker here, and then figuring out the rest of the entity types in parallel.
So this patch (just like the previous ones) just removes entity_test_schema(). Let's see. I can save nodes successfully at least...
Providing two (consecutive) interdiffs, but it's pretty much a full rewrite, so it might make more sense to review the patch in its entirety.
Comment #15
tstoecklerThe reason this fails install is that by ditching drupal_write_record() we lost the auto-serializing support for fields that have 'serialize' => TRUE in their schema. Shortcut entities have the 'route_parameters' field which is an array, so when saving that things blow up. So we somehow need to bring that back.
Also I forgot the actual installation of the dynamic schema.... *slapsforehead* thanks @plach for pointing that out.
Comment #16
plachThis is already looking very good, +1 for per-entity-type services. Aside from that:
I think we should remove tables from the entity type definition: they don't belong there, they are a detail that matters only for the storage controller. The table name logic should be hard-coded (aside from allowing temporary table names). We may want to introduce a different key to specify the entity is not stored, but I guess it wouldn't specify a storage controller then.
Ok, but see above.
Please no, it took me 30 seconds to parse this :)
As I was saying above I think we should hard-code these.
PhpStuff :)
Since this is a generic API would it make sense to avoid the "table" word? I guess the returned array shouldn't be required to be a valid Schema API array. This would be the case only for controllers implementing the
SqlStorageInterface
. Maybe just implementing on getSchema() and just returning the full array?Recreating the schema would help making entity tests pass :)
Why is this needed?
Comment #17
tstoecklerFollowing changes:
- Fixed the serialization problem:
- This forced me to remove one very strange line from Shortcut::getRouteParams(), which I do not understand how it currently works anyway, but...
- Added automatic installation and uninstallation to ModuleHandler::instal()/uninstall()
- In order to fix some of the tests (or attempt that, at least) added a DrupalUnitTestBase::installEntitySchema() as DUTB::installSchema() doesn't work with auto-generated entity tables anymore.
- Fix some weirdnesses in the entity API tests. Most notably they are working with a field that's completely undefined, i.e. no base field definition?! I must be missing something...
- Renamed SchemaStorageControllerInterface to SqlStorageControllerInterface per #2079019: Make Views use SqlEntityStorageInterface. We should postpone and repurpose that issue, I think.
Regarding the review above:
- Fixed 3. I changed it to a in_array() + array_search() combination which is one function call more, but should be more readable I think.
- Fixed 5.
- 7. should be fixed per the above.
- 8. is "needed" to make the auto-generated schema match the current one. We can of course remove that, if wanted.
Still @todo.
- Fix the 1000s of tests that will probably fail. (But hopefully this will install now...)
- Investigate using per entity type services.
- Investigate hardcoding the entity tables in the storage controller. (1., 2., 4., of the review above). I agree with this point but I'm not yet sure how big the scope of that change will be.
Comment #19
plachIf it turns out that removing table keys from entity type annotation is too big of a change, probably it make sense to defer that to a follow-up.
Comment #20
plachJust a reminder: we should investigate whether we need an actual schema array or a lighter data-structure is enough. Maybe we can add a @todo?
I guess per-entity-type services is the way: we should pass this information just once. @Crell was pointing out we should not require storage controller to have a schem builder definied, though.
Should we consider a follow-up to add a serialize() method (actual name TBD) to the field item class? Regular items would just return their value while valuess needing to be serialized would get an automatic serialize() call.
I don't think we should bake this logic into the module handler, it's a new dependency I don't think we *need* to introduce. What about implementing
entity_modules_installed()
?This is really a cool change :)
Comment #21
tstoecklerOhh, I will (not) miss you drupal_write_record(). You had some nifty features, however...
Let's see if we can get some useful test output this time.
This does not address #21 yet. Thanks for the review, though! Very much appreciated.
Comment #23
tstoecklerThis fixes 1.,2. and 4. from above. This now does not actually use per-entity-type services, as you cannot call EntityManager::getDefinitions() during container building as that will blow up. I then explored implementing a factory-factory logic similar to entity query, so that each storage controller can specify the service name of its schema builder. That however did not work because it introduced a circular dependency issue:
- In order to instantiate the schema builder you need to call the storage to call the getSchemaBuilderServiceName() method.
- In order to instantiate the storage controller, you need the schema builder, as that gets injected.
So this now just introduces a schemaBuilder() method which manually instantiates the schema builder on demand. It's not as nice architecturally, but it's overridable and its lazy-load and it works :-)
Injecting the entity type greatly simplifies the logic in EntitySchemaBuilder. @plach you were right from the beginning that it's better this way, I just needed to see this for myself :-)
I like 3. But let's do that in a follow-up.
Sorry, no interdiff as I fucked my branch up locally. It's pretty much a rewrite anyway, though.
Comment #25
tstoecklerAhh, the revision issue conflicted with this. HEAD moves fast this week.... :-)
Also removed some bogus field definitions that I had added. @plach said (correctly) that those don't make sense.
And I changed EntitySchemaBuilder to account for $field_definition->hasCustomStorage().
Let's see.
Comment #27
tstoecklerSo in my many refactorings I seemingly removed a necesseray check which was causing the fatal.
Also removes a bunch of leftover stuff that I saw in self-reviewing the patch above.
*crossingfingers*
Comment #29
jessebeach CreditAttribution: jessebeach commentedSome boilerplate generation spillover.
Params need to be included in the function signature and given default values. Objects should be identified by Interface. There's a use statement missing for
\Drupal\Core\Field\FieldDefinitionInterface
The @file info is wrong here.
Going to do some manual testing now.
Comment #30
plachCan we rename those
ContentEntitySchemaHandler(Interface)
? I just realized that schema builder is the name I wanted to give to the service actually triggering (re)builds (see #1498720-48: [meta] Make the entity storage system handle changes in the entity and field schema definitions). The prefix Content is to communicate the fact we are dealing with content entities.Can we move these close to the lines where we are actually using it?
80 chars
bogus blank line
"a SchemaBuilder"
I think :)
Please let's add a comment here: every time I see
drupal_get_schema()
I think "WTF?!" :)80 chars
$other_
is a really poor variable name prefix :)I had to read the code multiple times to understand what was going on, what about
$data_
?I don't get what this is doing: shouldn't we use
array_combine
here?Actually the base table does not hold the language column.
bogus blank line
bogus parameter
We need at very least indexes/unique keys before this can be committed. We probably need a spearat method like
addTableIndexes()
to make it overridable per-entity-type. Or maybe those can go into theprocessTable*
methods?This looks very neat now :)
This change really does not look right to me: you are using the test entity storage controller to create field/instance entities...
This does not look right either: the definition will be provided by the Field module once we create the related field_config/field_config_instance entities.
Comment #31
plachYep, it's totally follow-up material :)
Comment #32
plachWorking a bit on this.
Comment #33
tstoecklerComment #34
plachLet's see how many failures we have now...
Comment #36
mauzeh CreditAttribution: mauzeh commentedFixed some more tests
Comment #37
mauzeh CreditAttribution: mauzeh commentedposting interdiff
Comment #39
plachMore fixes from my and @mauzeh's branches.
Comment #40
plach@mauzeh:
Reviewing your code, a couple of notes:
Do we need still need entity/entity_test modules here?
I think we should call the module "simpletest_test" or something. It has nothing to do with entities so the current name is a bit misleading. By calling this way we allow it to be used also for other purposes.
Why do we need these permissions?
Comment #42
plachThis might be green finally.
(crossing-fingers)
Comment #44
plachThat was unfair, marvin :(
Comment #45
plachThis refactors the whole stuff a bit: as discussed yesterday with Tobias, we are moving the
getTableMapping()
method on the storage, so we can properly lazy load the schema builder (which will grow bigger in the follow-ups). Interfaces have been refactored accordingy.Comment #46
plachSorry, wrong patch. The interdiff in #45 is good.
Comment #48
BerdirQuick review...
I don't even...
There's special cases like uid 0, wondering if that could be a problem. You don't really override that, though.
How are multi-field indexes going to work?
Are you sure this is not too late? modules might already want to create entities in hook_install()?
So we add those by default now?
No longer needed, .module files are optional.
FYI: #1709960: declare a maximum length for entity and bundle machine names adds a constant for 32.
Ah, I see, this is manually added now, I guess to test how this will work for nodes?
Those two fields only exist on the revision tables now, do we keep that? We don't want to see those values if we don't load a revision I think, or we risk duplicating it on save/showing it by default.
Also, log shouldn't be Translatable, not in EntityTestRev.
Comment #49
plachSorry, just saw #48, I will address it in the next iteration.
Comment #50
plachReverted some probably unneeded changes after addressing #48.
Comment #52
plachThe content translation test failures are expected because we are currently disabling translatability for entity types that do have translation support, as their schema has not been converted yet. We probably need some kind of BC layer until we are able to switch table layout, which probably needs to wait for #2068325: [META] Convert entity SQL queries to the Entity Query API.
Comment #54
plachThis might be green again.
Comment #59
plachThis should fix installation.
Comment #61
plachLet's see whether this is better:
Comment #63
plachMore fixes
Comment #65
plachThis includes the previous fixes plus support for index and keys generation. We should be ready to remove the BC layer now.
Comment #67
plachFixed unique keys generation.
Comment #68
plachJust a quick and dirty attempt to remove all entity schemas. I was able to complete astandard installation, let's how this performs.
Comment #70
plachComment #72
plachThis one should be cleaner:
Still missing:
buildSchema()
methods)drupal_get_schema()
BC layerWe should be ready for some serious review though.
Comment #75
tstoecklerFixing some of the test fails now. Starting from the bottom of the list in case anyone is already working from the top.
Comment #76
tstoecklerFound two issues:
- As entity query now uses getTableMapping() it fails to find the 'default_langcode' field on data tables, as we currently don't expose that as it's an implementation detail of the storage controller.
- Shortcut currently doesn't have stuff like shortcut_set on the data table denormalized (like node does) which means you need to reinstall after applying this patch, because after the patch it does. Or you can just use minimal profile.
Comment #77
tstoecklerAlso:
- Forum module is currently broken due to the issue mentioned in #48.4: Forum module tries to save a taxonomy term in forum_install(). This no longer works. :-/
Comment #78
tstoecklerAlso:
- Node module should have a separate storage controller which adds {node_access} to the schema.
(Sorry for the noise, just going through the tests one by one and don't want to lose this information)
Comment #79
tstoecklerOne last problem:
- tracker_install() queries {node}
This one fixes all those fails that are related to legacy usage of installSchema() in DUTB tests. Let's see.
Comment #80
tstoecklerSome of the aggregator fails are due to #2227367: StringLong's schema is broken, so marking that as critical. Applying that patch doesn't actually fix aggregator because of the entity key defaults stuff. Aggregator items don't have a UUID (see also #2149851: Remove todo about GUID field on the 'aggregator_item' entity and add UUID field), but we're trying to add a unique key for the 'uuid' column nevertheless. I would suggest to simply roll the defaults stuff back. Also #2209049: Allow field schemas to contain unique keys would allow us to remove that hardcoded unique key from the schema handler in the first place.
Comment #82
plach@tstockler:
Thanks for keeping the work going!
I think this is not correct, we should definitely have
default_langcode
in the table mapping: it's true it's a table to field representation, but I think all actual table columns should be included.We can move the term installation to
hook_entity_schema_installed()
(seeuser.install
) to fix that.There is the node access storage that should be the perfect candidate to do that, but I think this is follow-up material.
Can we do a quick entity query conversion or it's too much work?
I think it's fine: now that stuff is properly refactored to leverage those properties the right way, removing entity key defaults and add some @todo should not be problematic.
I don't think that's a big deal: if that gets in before this we can adjust our code, otherwise it's a just few lines that will need to be refactored.
Comment #83
YesCT CreditAttribution: YesCT commented#20 3. (follow-up needed still?)
Also, tagging for issue summary update as a lot seems to have been sorted out since this issue was created.
Let's get a section for follow-ups and blockers.
Comment #84
tstoecklerRe #82: Thanks for the feedback. Agreed on all accounts.
Re #83: Yes, that follow-up still needs to be created. I will hopefully get to that tonight.
...Along with another re-roll which should then seriously bring down the failure count
If I don't get to that, or life happens..., here's my interdiff for what I have so far. I will assign tonight if I get to it, so as long as this is unassigned, feel free to work on it, but then don't forget to assign yourself.
Comment #85
tstoecklerOops, I did forget to assign in the end. Oh well.
Here we go: This should have a lot less fails. At least all of the problems mentioned above.
The fact that we now have non-entity table fields in getTableMapping() forced me to introduce a weirdness: There really was no sensible array key to use (as the key is the field definition name, but we don't have a field definition here), so I used '' as a key. To clean that up I would like to introduce a small value(ish) object for the table mapping á la Url which has some methods for building and returning the various parts of the current multi-dimensional array. Let's first get this to green, though.
I also introduced a hack in the form of db_table_exists() in tracker_install(). Will have to think more about how to avoid that, couldn't come up with anything so far.
I also removed the drupal_get_schema() backwards-compatibility layer as that should no longer be needed.
I'm pretty tired so won't update the issue summary and open follow-ups just yet, but hopefully tomorrow.
Comment #87
tstoecklerHere's some more trivial test fixes. The rest are actual failures. I really thought there would be less of those :-( Some of those might be trivial to fix as well, but we'll have to look into them one by one to figure them out. Before working on those I'm going to a) update the issue summary and open a bunch of follow-ups and b) complete the unit test that I have locally, at least with some basic coverage. Maybe that will turn out something.
Maybe I can summon @mauzeh to help with some of the test fails?! (Or anyone else of course!)
Comment #89
tstoeckler'PDOException' with message 'SQLSTATE[08004] [1040] Too many connections'
Assuming that's a testbot problem. Re-uploading patch.
Comment #92
tstoecklerRewrote the issue summary and opened a bunch of separate issues. There's only one that's truely a follow-up, though. And none of them are critical or major, so don't panic! :-)
Comment #94
BerdirThis should fix a lot of tests...
Comment #96
BerdirMore test fixes, done for the moment.
Comment #99
plachJust a reroll before starting work
Comment #101
plachNow for reals
Comment #103
plachA couple of fixes, I hoped to be able to work on this a bit more...
Comment #105
BerdirThis seems like the wrong fix? It's the same problem as feed.queued had, which had no default value. Instead, we should set the default_value on the status field...
Same here I guess?
Comment #107
plachNot many fixes again...
Comment #109
tstoecklerOK, I'll take a stab at this.
This one already updates the patch to use the new EntityTypeInterface::BUNDLE_MAX_LENGTH and ContentEntityTypeInterface. Additionally it fixes ViewEntityDependencyTest.
It seems we need to open a follow-up to auto-generate views data from the auto-schema. I.e. that that would no longer be a nice-to-have but at least a major bug. See the @todo in the interdiff. Of course there might be a way to just temporarily fix the problem in some way that I haven't thought of, which would mitigate this.
Comment #111
plach@tstoeckler:
We have #1740492: Implement a default entity views data handler postponed on this work. I already linked it in a @todo or two in a previous patch, I think we can do the same with the @todo you added in the last one.
I disagree with this change, sorry: those constants (and the methods below) are all details of our core storage implementation. Other SQL-based implementations might want to store data in a completely different way. Those constants/methods would make no sense in that context. I originally thought about
SqlStorageInterface
as a marker interface, then I realized that we would need the::getTableMapping()
method to let View generate its data automatically, but I think there's not much more we can abstract. I think it's fair to putContentEntityDatabaseStorage
in our type-hints as in those cases we are explicitly referring to our well-known core implementation.Comment #112
tstoecklerOk here we go. Will try to tackle some more stuff later today, but unassigning for now.
Re #111: As I really like type-hinting interfaces I did the following as a compromise: I introduced a new
ContentEntityDatabaseStorageInterface extends SqlStorageInterface
where I moved the constants and the get*Table() methods. I don't feel *super* strongly about this, so if this doesn't work for you @plach, then I'll give up :-) and type-hint ContentEntityDatabaseStorage again.- This also reverts some of the changes to the node defaults that were problematic:
- The $node->status field default, for instance, depends on the node type configuration. So I added that default in bundleFieldDefinitions(). I also found that the field definition cache is not cleared when a bundle entity is updated, so I fixes that. Should probably be moved to a separate issue.
- The uid and revision_uid should not default to 0, IMO, but uid should be set to the current user by default and revision_uid should be set to the same value as uid if it is not provided. This is not yet implemented, so a couple tests that passed above will probably fail. I got a little confused however, which tests pass and which fail, so uploading this to get a status quo from the bot.
This also includes a baseline unit test. For now only the LAYOUT_BASE schema is tested without a UUID key, so there's a lot of more coverage to provide but it's a start.
Comment #114
tstoecklerHmm... damn. Will look into this later today.
Also forgot to mention: The interdiff in #112 includes the one from #107, at least those parts which I don't change further. I had forgot to apply that to my branch and noticed in the middle of fixing stuff.
Comment #115
Sylvain Lecoy CreditAttribution: Sylvain Lecoy commentedYou might be interested in looking at the reverse procedure: e.g. building an entity object from Schema API: http://drupalcode.org/project/doctrine.git/blob/48b52566638b3e31e3e0ebff...
If you have any questions i'll be glad to help :)
Comment #116
Sylvain Lecoy CreditAttribution: Sylvain Lecoy commentedThis can also help: http://drupalcode.org/project/doctrine.git/blob/48b52566638b3e31e3e0ebff...
Comment #117
andypostThere was a trouble introduced in #2225399-65: Apply formatters and widgets to Feed base fields
Comment #118
tstoecklerHere we go.
This fixes all failures from below, except for ConfigImportAllTest, which I did not run locally (yet). The EntityReferenceAdminUiTest depends on #2225739: Remove usage of field_info_instances(), use EntityManager::getFieldDefinitions() instead in order to pass, so this is now blocked on that. Therefore the patch also includes that for now. (I left that out of the interdiff(s), though.)
I also fixed a few things here and there that I noticed reading through the patch, so even though I tried to be very careful I might have broken this or that. I really hope this gets us into the single digits, though, in terms of fails.
Other than that I added the hook documentation for hook_entity_schema_installed() (and for the corresponding hook_entity_schema_uninstalled() that I introduced for consistency). It's not the coolest hook in the world and we might be able to remove it at some point but for now this gets us closer to commit.
Also added test coverage for an entity with a UUID field. Adding test coverage for the other layout types (i.e. the actual nitty gritty) is up next.
Unassigning for now, though. I'll try to work a little bit more on test coverage but that shouldn't block anyone else taking a stab.
Comment #120
tstoecklerOne last try for today. This should fix all but one of the tests that previously failed. Given how things have gone in this issue I expect a whole bunch of new tests to fail now... Let's see.
Comment #122
martin107 CreditAttribution: martin107 commentedIt looks like there has been a change in policy
from Node.php
So anonymous user changes will be marked as owner changes
and from UserCancelTest.php
The test marked with a star (line 294) is expecting attribution by anonymous user.
There is more than this going on as the compiler error is
Comment #123
tstoecklerRe #122: Good catch! That check should do a strict check for === FALSE.
Regarding the UserCancelTest failure that's a different problem. I already have that fixed locally, will upload patch tomorrow.
Comment #124
tstoecklerActually, reading #122 again, the conclusion is incorrect. $node->getRevisionAuthor() will actually return the user entity, not the ID, so even for anonymous users the current check is sufficient.
Also, for anyone wondering: In core there is no UI for setting the revision_uid column. The current behavior of always assigning the ID of the current user is not mitigated by the referenced code. I verified that.
Here is the patch with the fix to UserCancelTest. That test actually fails due to a pre-existing bug: When a user account gets cancelled and the nodes get reassigned node.module fails to re-assign the node_revision.revision_uid database field. There is actually test coverage for this (which failed with the above patch), the reason this is not failing in 8.x must be some weirdness in EntityReferenceItem. I will look into that.
Anyway, this one should ***finally*** be green. Let's see.
Comment #125
tstoecklerSo in terms of bringing this along additional test coverage would be awesome, if anyone wants to work on that.
I also still want to work on converting the table mapping to a value object before this lands, as the current empty string hack is really not so nice. My thoughts on that were something like the following:
Of course naming, etc. should be improved. I think those methods should cover the current usage of the table mapping, though. I might be wrong, however.
Comment #126
plachAwesome work! I will try to review the patch ASAP.
About #125 it might be a good idea but we should try to get feedback from at least @fago and @berdir, before going on with coding.
Comment #127
tstoecklerThis needed a re-roll after #2225739: Remove usage of field_info_instances(), use EntityManager::getFieldDefinitions() instead.
I removed the Entity Reference hacks, as those should no longer be necessary. Let's see.
I also fixed a small oversight that SqlStorageInterface wasn't extending EntityStorageInterface.
Comment #128
tstoecklerNeeded a reroll after #2232037: taxonomy_index should have a primary key.
Comment #129
tstoecklerGit doesn't like me today....
Let's try this again. Also with an interdiff, as I removed a check that should not be necessary.
Comment #132
tstoeckler#2226197: Introduce FieldStorageDefinitionInterface in the Entity Field API broke this one (yay!!!).
Comment #133
plachJust a reroll to merge/update my dev branch. Tomorrow I will post my review.
Comment #135
fagoAmazing to see this one being green, great work!
ad #125, TableMappingInterface: Yeah, I really like this idea. It's way cleaner and easier to use than an array with some documented keys. Even further, in future it might be possible to do a DatabaseStorage class that works independently from the table mapping? E.g., it just has save() and saveRevision() methods that work through all respective tables according the mapping. We could prepare for that by making sure TableMappingInterface already provides all the information needed for doing that.
Regarding uid default values, there is #1979260: Automatically populate the author default value with the current user which is working on that already.
Is it a builder or handler? :-)
This and the fuddling with constants does not seem too nice. Also, what does it add compared to checking whether an entity type is revisionable or translatable, or both?
The check for the data table seems unnecessary for me. If it is translatable it *has* to be there (given this storage class is used), i.e. we can throw an exception e.g. during construct if it's not defined.
Once that is done, couldn't the schema handler go and do just $entity_type->isTranslatable() / $entity_type->isRevisionable() / $entity_type->isTranslatable() && $entity_type->isRevisionable() to do the checks?
getColumnMapping() then maybe?
Imo that's deprecated. We've default values in field definitions which have been applied to $entity - those should be saved, nothing else.
what about ContentEntityDefaultDatabaseStorageInterface ?
-> this would reflect that any new tandem of storage + schema classes should add their own interface variants.
Also, documentation should clarify this is the interface of the default database storage, and contrib is free to add/use their own.
Is there already an issue for that?
So as it doesn't belong in here, shouldn't it better stay in node_schema() for now?
Again, should stay in user_schema() then?
Comment #136
plachFixed 1 and 3 from #135 + some other minor clean-up.
@fago:
2. Actually those constants aren't as useful as I had imagined, however we might keep them for now, as for BC reasons we need to keep those checks around and we cannot simply rely on the entity type keys.
These don't look very clean, I am wondering whether we can come up with something better.
We can probably remove this check, as we no longer apply defaults at SQL level. This might imply some tricky test failure though, if that's the case we can defer this to a follow-up.
I am really not feeling the need for this interface, I cannot imagine why one would need to provide an alternative implementation of the very same concept without subclassing
ContentEntityDatabaseStorage
. Anyway, I am not feeling strongly about this either, if we go this way I agree with @fago that we need a more specific name.If we don't drop this, I guess
getTableLayout()
would be a better name, now that I think about it :)This should move onto the storage class: we are once again instantiating the schema handler at every request atm.
This should be referring to the interface (if we don't drop constants).
Nice catch :)
Not sure this is the best example we can provide: we want to encourage people to use fields for this kind of additions :)
These should be
buildSchema()
otherwise we would bypass the alter hook, but...... I agree with @fago that these shuld be left in the various
hook_schema()
implementations :)We can get rid of this change then.
Ouch, luckily we are addressing this in #2068333: Convert node SQL queries to the Entity Query API.
We could try and just do a couple of entity saves.
I guess we could call this
$dbStorage
or just$storage
. The current name might be ambigous wrtSqlStorageInterface
.Comment #138
plachBetter one
Comment #139
plachFixed aggregator, block and node indexes. I will finish them tomorrow.
@tstoeckler:
Do you want to take care of the reviews above or do you want me to go ahead with them?
Comment #140
plachMmmh, I will get rid of this in the next reroll :)
Comment #141
plachAbout #125: what about
getExtraColumns()
instead ofgetOtherColumns()
?Comment #143
sunHm. I noticed a level of excitement about this issue/patch slowly getting ready
I want to encourage you to keep on working on this, but I hope that you're aware that there are some critical problems in the currently proposed code...? Sorry for being snarky, but it's my honest first time impression after glancing over this patch :)
LOLWUT @ http://ibin.co/1JAhv20hGyHa
There are so many
array_
functions involved, I... am not even remotely able to decipher what this code tries to do, and thus cannot suggest a more appropriate solution...However, in case I get the gist right, then that crazy code cries for a dedicated domain data object with proper methods...?
Comment #144
plachI AM excited, honestly :)
Well, I guess it depends on the definition of "critical": we are aware there's something we didn't address properly yet. Hopefully we are not missing anything that would require to rewrite the whole thing :)
Did you express all your main concerns in #143 or there's something more you did not have to the time to write down?
I know I know :)
(see #136.1)
Basically it's a way to ensure well-known columns come before arbitrary field columns. I guess there are ways to clean-up that code, but I really didn't have the time/motivation to do that yet.
If I get you right, we were discussing about that in #125 and following.
Comment #146
plachRerolled
Comment #147
plachThis fixes a problem in the node indexes and entity schema uninstallation.
Comment #149
plachJust an attempt to fix schema uninstallation (not really final, sun, all that code should be OOPified ;) + some test fixes. Partial interdiff, sorry I messed it up.
Comment #151
plachThis should get rid of notices.
Comment #152
BerdirWent through the module/test changes bottom to top.
Same as file_usage.
This needs the same @todo as similar tables, or we'd simply keep it in hook_schema(). Like we for example do for shortcut.module. I don't really understand why we have to do this? And in some cases don't...
Are we sure this hook is called when forum and taxonomy_term is not enabled at the same time? Again, another problem that we could avoid if we install the schema earlier I think.
typo.
Can we maybe use the existing test module that we use for database tests instead of adding a new table for this?
Can't we install the entity schema before calling hook_install(), like we do normal schema? Yes, it has to be hardcoded in ModuleHandler, but so what? Or we could maybe use hook_preinstall?
#2227381: Apply formatters and widgets to User base fields 'name' and 'email' is the issue for this.
Why? (@todo should say). Temporary todo or do we need a follow-up for this?
I'm removing $field_info and injecting $entity_manager in #2116363: Unified repository of field definitions (cache + API), if we can get that in first then there's a bunch of stuff that won't be needed anymore.
Comment #153
plachJust a reroll. New patch following...
Comment #157
plachThis completes the entity-type-specifc index definitions. I will start addressing the reviews above soon.
Comment #159
plach@Berdir:
We have a tricky issue with the
UriItem
whose schema defines atext
column having 2048 as maxlength, while the current{file}.uri
column is avarchar(255)
. Can you have a look to the interdiff?Comment #161
plachRerolled
Comment #163
plachAnother try...
Comment #165
plachThis reverts the changes to
hook_schema()
implementations not related to entity storage classes plus some other schema fix:Some DUBT test will probably fail due to these changes, but posting anyway to identify them all.
After fixing test failures I will start addressing the reviews above if @tstoeckler does not beat me to it.
Comment #168
jessebeach CreditAttribution: jessebeach commentedMassive patch! Awesome work! I'm 50% through reviewing. I think I got through the meat of it and the rest seems to be conversions, but I'll finish up tomorrow to be sure.
This @todo needs an isuse logged for it.
What should get removed from what? The recommendation is ambiguous.
This could do with a comment to explain what this bitwise operation is doing.
Again, a comment would be most helpful to explain what this bitwise operation is doing.
Secret hooks :)
I added this to the issue summary.
This @todo needs an issue.
This @todo needs an issue.
Is there no default? Perhaps an error?
Is
drupal_write_record
removed or deprecated in this patch?Is $schema just an array at this point?
Perhaps a specific Exception here and not the general one.
Comment #170
plachThanks Jesse! I am going to fix those failure and then I will address your review and the ones above :)
Comment #171
tstoecklerSorry @everyone, the last week was a bit crazy for me. I really wanted to push this along, but never got around to it. Thanks for keeping this going, especially @plach.
This should fix tests. Will work on addressing the reviews now.
Also noticed this is still assigned to me, sorry for that.
Edit: Crosspost. Damn, hopefully I wasn't to late.
Comment #172
plachOuch, I was about to post more or less the same patch :(
I think this can be fixed in the parent class.
Some answers:
@Berdir (#152):
Honestly I didn't think about harcoding it in the module handler: one reason we might want to avoid that and leave everything under the Entity module's control is that, when switching to dynamic schema handling, we will have a class (probably a service) handling schema installation/uninstallation/reinstallation/manipulation, well, you get me :)
The reason why
hook_preinstall()
does not work is that it's too early. Originally we were doing it there, but we found a nasty bug: when installing the Locale module, entity type definitions were parsed before Locale schema was installed, thus causing the string translator service, which was already aware of the Locale backend, to break as it tried to query (yet) unexisting tables. We might try to work around this, but actually this is just an example of a situation that theoretically any contrib module might trigger, so I thought it would be better to just define a dedicated hook, also because more of them will be added or renamed when unifying field storage and implementing dynamic schema.@jessebeach:
I didn't qualify the @todos yet as I wanted to have a final patch, with some +1 before :)
1. The base|data|revision|revision_data table entity type keys
2. Ok, anyway @fago suggested to remove that code so we might go that way instead
7. Not sure what you mean :)
8. Nope, we are just no longer using it in the storage classes
9. Yep, it should always be a Schema API array
Comment #174
tstoecklerLet's try that again.
Comment #176
tstoecklerOops, accidentally included some local, non-comitted changes and also forgot to merge 8.x for the 1000th time...
This should be better. Interdiff was correct.
I agree with putting the schema creation back into ModuleHandler. Once #2206347: Use event system in ModuleHandler is in, we can have the entity system register an event subscriber to clean that up.
Comment #177
plachI am not clear why the event system should let us fix something hooks are not a good fit for: we'd still have to find the proper point in the execution flow: a new module-system hook or a new event, what would be the actual difference?
Comment #178
tstoecklerRe @plach the difference is that the entity *system* - i.e. \Drupal\Core\Entity - cannot implement hooks. Hooks can only implemented by modules. That's why we had to put the stuff in entity.module for now. With the event system we can register arbitrary subscribers in core.services.yml. So the entity system can totally provide an event listener in \Drupal\Core\Entity\EventSubscriber to react to the module handler to install/uninstall the schema.
Comment #179
tstoecklerAhh, sorry I misunderstood your comment. Yes, you are of course correct and I am wrong.
Unlike I tried to allude to in #178 the problem is *not* that the code isn't being called, but the problem is *when* it gets called. Thanks for pointing that out! I will think about this more, maybe we need a new event over there or a new hook without that issue.
Comment #180
BerdirThe new event will have to be called before hook_install(), so the tables will be available there. The event is not a replacement for hook_install(), it's a replacement for the 10 one-off calls that are currently in ModuleHandler::install() to clear various caches, rebuild stuff and so on.
That's also what we did with the plugin cache clearer service, added a hardcoded call there and will then move to an event in that issue.
Comment #182
plachBefore hook_install() sounds much like hook_module_preinstall(), which would re-introduce the bug described above. From a theoretical POV #180 works for me, but we will have to be very careful with the actual implementation :)
Comment #183
jessebeach CreditAttribution: jessebeach commentedWhere do these db fields get added back? I'm looking in
\Drupal\comment\CommentStorage::buildSchema()
, but I don't see them declared.Comment #184
BerdirThey are not re-added. They're generated based on Comment::baseFieldDefinitions() and the schema() method of the corresponding field type/field item classes. that's the main thing this issue is doing, generating the schema for the entity tables. buildSchema() overrides are only necessary for things that can't be generated automatically like additional indexes and tables like user roles or taxonomy term hierarchy stuff.
Comment #185
jessebeach CreditAttribution: jessebeach commentedUML: http://www.lucidchart.com/invitations/accept/5356c34a-9470-474a-8732-696...
The UML is open for editing.
\Drupal\Core\Entity\Schema\ContentEntitySchemaHandlerInterface
is not doing much for us yet.Also, both
\Drupal\Core\Entity\ContentEntityDatabaseStorage
and\Drupal\Core\Entity\Schema\ContentEntitySchemaHandler
implement this interface, even thoughContentEntityDatabaseStorage
contains an instance ofContentEntitySchemaHandler
. We end up with a circuitious route to getting a schema:which calls
which calls
It seems improper to share an interface like this. Does
ContentEntitySchemaHandler
really need to implementContentEntitySchemaHandlerInterface
?All but one method of
ContentEntitySchemaHandler
is defined outside an interface as well.This is the current HEAD class/interface UML for reference.
Comment #186
jessebeach CreditAttribution: jessebeach commentedAdding the images with different names. Grumble, process, grumble.
Comment #187
plachI don't see any problem in having an object referencing another object and proxying some calls to it. It's a technique used in several common patterns. For a detailed explanation of why we went this way see the issue summary (bullet 2 of the proposed resolution).
Comment #188
Berdirfile.uri is special because it's quite a different thing than a usual (http) URI, it's something like public://file_path, and that coudn't be much longer than 255 anyway, so I'd suggest to simply set the max_length of that to 255 instead of doing it on the index only.
We do have an issue for this I think.
I had the same problem with the plugin cache clearer, the answer was simply that the plugin cache clear needs to happen after the module schema is installed, I think it's the same here, first install tables from hook_schema(), then entity tables, then hook_install(). The order of calls *is* very fragile there, but I think it should be possible to find something that works.
I don't see another way, default stuff needs to work in hook_install(), I'm pretty sure that forum only works in the tests because it's always enabled at the same time as taxonomy.
Oh, I missed this in the issue that limited entity_type to 32. Because this is broken, if I would really store a longer entity_type in there, then things could break if the only difference is the longer entity_type. Not related to this issue, but I did notice that you also have a limit on the entity_type index for comment to 32, that should also no longer be necessary because the field length should now already be limited to 32 (if it is not, then we need to adjust it accordingly). We should open a separate issue for this, though.
Comment #190
jessebeach CreditAttribution: jessebeach commentedFair point. I concede it's not the pattern that concerns me, just some of the details. So if we can work through the details, I'll be assuaged :)
Firstly, if the intent is to make it possible to swap out the schema handling (this is what I understand the intent to be), then we have a method in
ContentEntitySchemaHandler
that isn't declared inContentEntitySchemaHandlerInterface
, but thatContentEntityDatabaseStorage
assumes exists.public getFieldColumnName()
as in
Any schema handler would need to implement this function outside of the
ContentEntitySchemaHandlerInterface
, no?Also, in bullet 2 of the issue summary, the following statement is made:
I hadn't recalled this relationship so I added
SqlStorageInterface
to my map and discovered that the implementation has deviated from this description.SqlStorageInterface
extendsEntityStorageInterface
. Maybe we just need to update the description in the issue summary?And the last observation. The docblock of
ContentEntitySchemaHandlerInterface::getSchema()
indicates that the method "Gets the full schema array for a given entity type.". This is true forContentEntitySchemaHandler
. ForContentEntityDatabaseStorage
, thegetSchema
method implementation also invokes theentity_schema
alter hook. So in this case, the returned array is the full schema plus anything else modules alter into it. Perhaps that should be noted in the docblock forContentEntityDataStorage::getSchema()
implementation in addition to the@inheritdoc
statement.Comment #191
jessebeach CreditAttribution: jessebeach commentedSo, in the spirit of "putting it all together", I've mapped out the landscape and a little context for the
getSchema
method in its various incarnations.In the graph, the blue-background boxes are "field" class or interfaces. The pink boxes are notes indicating what exactly
getSchema
is doing on this class.For fields,
getSchema
is defined in\Drupal\Core\Field\FieldStorageDefinitionInterface
. BothFieldInstanceConfig
andFieldConfig
implement this interface.FieldInstanceConfig
delegates itsgetSchema
method to its stored instance ofFieldConfig
.For entity storage,
getSchema
is defined in\Drupal\Core\Entity\Schema\ContentEntitySchemaHandler
. BothContentEntityDatabaseStorage
andContentEntitySchemaHandler
implement this interface.ContentEntityDatabaseStorage
delegates itsgetSchema
method to its stored instance ofContentEntitySchemaHandler
.The
getSchema
methodContentEntitySchemaHandler
loops through its list ofFieldItemInterface[]
instances and calls theirgetSchema
methods, aggregating the returned arrays by field name.Comment #192
plachHere is reroll after #2225955: Improve the DX of writing entity storage classes, I hope I didn't break anything.
I am going to work a bit on making the schema match (almost) exactly the current HEAD one (I'm focusing on FKs now), hopefully my changes won't conflict with Tobias' ones.
@jessebeach:
Yep, that's the main goal :)
Good catch, actually I was planning to move it on the storage class for two reasons:
ContentEntitySchemaHandlerInterface
is storage-agnostic: it does not know (nor it needs to know) anything about tables and columns, in fact aMongoDatabaseStorage
class could use it to return a schema definition array that makes sense for its implementation, possibly not even a Schema API array. By the way this is the reason why we decoupled it from theSqlStorageInterface
:)Yep, we definitely do. Moreover your nice diagram made me notice that
ContentEntityDatabaseStorageInterface
(which does not make too much sense to me, btw) should probably extend alsoFieldableStorageInterface
.I think the description is still valid as what's returned is always the entity schema: it's up to modules altering it in a sensible way. Adding a note the method implementation might make sense, but I think the hook documentation could be enough.
Comment #196
plachThis should fix foreign keys.
Comment #202
tstoecklerOK, sorry it took me so long (again...) :-(
This fixes all reviews above, except:
The code there is just being moved, I'd hate to introduce failures or anything due to an unrelated change. Opened #2249113: Use an entity save instead of db_insert() in user_install() for that.
No we're not. In fact I'm pretty sure that it won't be. See below for my thoughts on the schema installation
I simply forgot those. Although see below, for my thoughts on the layout type.
See below for my thoughts on hook_entity_schema_alter()
OK, so there is a couple of things I would like to do:
Comment #203
tstoecklerOh, and I opened a bunch of issues :-)
(Well Berdir opened the MapItem one...)
Comment #204
plachThanks! Some answers to #202:
hook_schema_alter()
too, but when talking about this stuff with @Berdir on our way back from Szeged, he pointed out that it's not our job to prevent people from doing stupid things and that an alter hook could be useful. At least this is what I remember of our conversation :)I am more than open to reconsidering it, if @Berdir does not feel strongly about it. We could even remove it for now and add it back later if needed, it would just be an API addition. If we go this way we can get rid of the
::buildSchema()
methods and just go with the::getSchema()
ones.Going to review the code now.
Comment #206
plachThis should fix many tests.
Comment #208
tstoecklerThis should be green again.
Some explanations:
- DrupalUnitTestBaseTest:
- Some assertions are removed, but no actual test coverage is removed: The test previously did the following:
- First test that installing a table from an enabled module works
- Second test that installing a table from a non-enabled module does not work
- Third enable the module from the previous step and test that installing the table now works.
- As you can see, the third step and the first provide identical coverage. That is way some assertions are removed and a little code is moved around.
- MigrateTaxonomyTermTest:
- Taxonomy terms are incredibly weird right now, as:
- They have a 'parent' field item which is defined as a single-value integer field in the field definitions, with a default value of 0.
- It is in fact a multi-value field that is stored in taxonomy_term_hierarchy table. This is fixed in the attached patch. Previously we were generating a schema field for the parent due to this, which this also fixes.
- When it doesn't have an actual parent it gets a single value of 0.
- The test was asserting that the field value is actually NULL, where it is 0 due to the above. I changed it to assert the correct (?) behavior.
- I have no idea why this patch changes that behavior.
Also looked into the taxonomy_term.vid max_length override and - instead of hardcoding the max length in the schema - I added 'max_length' support to EntityReferenceItem. The field definition already declares a max_length, so we just have to use it. Checked that the schema is generated correctly.
Comment #210
tstoecklerHere we go. Should be green for realz.
Also unassigning. I do plan to work on this and finish off the few remaining todos but don't want to hold anyone back in case I don't get to it.
Comment #211
tstoecklerHere we go. This drops
hook_entity_schema_alter()
. Discussed this briefly in IRC with @Berdir and he didn't have a concrete use-case in mind either. Surely people will find some crazy way of using but I'd rather have them alter the field definitions or swap out the schema handler if they really need it. Since we want to support schema updates when field definitions change having altered schemas is really, really weird. I personally see this similar to an alter hook for config data, which we also don't have.I also tried to employ the try-catch mechanism for creating schemas that the cache system uses. I literally copied a lot of code from there. Because the storage controllers already have a save->doSave separation I couldn't use that. I am currently using anonymous functions for that but would appreciate better suggestions.
This allowed me to remove hook_entity_schema_(un)installed(). Which should make this patch at least a tad smaller. This also makes #2249113: Use an entity save instead of db_insert() in user_install() not only a random nice-to-have issue but actually a (soft) blocker of this. I put a workaround in user_install() for now.
I suppose we could also remove DrupalUnitTestBase::installEntitySchema() but I didn't do that for now, it should work as is.
This will make #2068325: [META] Convert entity SQL queries to the Entity Query API a hard blocker of this, unfortunately. I hacked around one query in comment_get_thread() so I could at least save and display a node.
I also had to introduce 2 hacks into Views module, which I don't really know what to do with. We're trying very hard to discourage people from querying the DB directly but Views is built on the opposite. At the very least we should have a
EntityQuery extends Query
(well not *that* EntityQuery, but...) so that we don't have to put the hack in the generic Views Query class, but have a dedicated Entity one. Similarly, we should have a EntityPager class.Let's see what else breaks :-/
Feel free to suggest to revert the try-catch changes. I too really wanted to avoid the entity query issue as a blocker. I *think* hacking the schema installation into the module handler - for now - would sufficiently solve the problem discussed above as well, but @plach was fairly adamently against it. So not really sure how to proceed.
Comment #212
tstoecklerDoh, forgot to merge 8.x. Sorry for the noise. The interdiffs are correct, though.
Comment #213
tstoecklerSo this hunk is of course not intentional, will remove on next re-roll:
Comment #216
plachI am definitely not against hard-coding the entity schema installation in the module-handler (see #204.1): please (please!), let's revert the try/catch change, it feels very fragile and awkward :(
I reviewed all the lastest interdiffs, the only stuff that is bothering me (aside from the try/catch change) is from #202:
This shows exactly what bothers me of this interface: interfaces are used to hide implementation details, this one mandates or at least strongly suggests an exact implementation. Any storage class implementing it and not respecting some strict assumptions will probably end-up breaking stuff. This interface is a lie IMHO :)
This is overly verbose, would just
$custom
work?I am not sure about this change: this means that the storage class will no longer take care of configurable fields. Even if it's just temporary, this may negatively affect the parent issue, so I'd like to find another solution if possible.
Why we removed this? Language is revisionable and storage classes need to now it.
Comment #218
tstoecklerThanks for the review!
1. I removed ContentEntityDefaultDatabaseStorageInterface now. I did like it, but I'm fine with removing it.
2. We have a few other $include_foo parameters in EntityManagerInterface so I went with $include_custom as a compromise.
3. I reverted this and added a hack with a lengthy comment in ContentEntityDatabaseStorage. Note that currently using hasCustomStorage() would not be a problem but with #2144263: Decouple entity field storage from configurable fields or a follow-up thereof it would eventually become problematic.
4. I personally think isRevisionable() and isTranslatable() does not make sense at all for entity key fields as it's not at all defined what that actually means, but I reverted that change for now.
Comment #220
plachThe idea is that we use "base entity schema" only for base fields, configurable fields should be bundle fields if I am not mistaken.
typo
Comment #221
tstoecklerRe #220.1: Well field instances (FieldInstanceConfig) are bundle field definitions, but fields themselves (FieldConfig) are field storage definitions. Which makes sense, that is what they are conceptually. They are added in hook_entity_field_storage_info(). Bundle fields themselves cannot have any storage, as we don't have per bundle storage.
Comment #222
tstoecklerSorry, should have tested this. This one installs at least and I can add a node. Let's see if I broken anything inadvertantly.
Comment #223
tstoecklerComment #224
plach@221: We are having troubles because we are retrieving field storage definitions, then we need to filter out field config storage definitions obviously. If we just get base field definitions (which ARE field storage definitions), through
EntityManagerInterface::getBaseFieldDefinitions()
, we don't need such a check. This was one of the main points we discussed in Szeged: we decide whether a field is stored in the base tables or gets CCK schema based on it being "optional", i.e. belonging only to certain entity type bundles.Comment #226
plachLet's try this
Comment #228
tstoecklerLet's see if this is already enough. It fixes VocabularyUnitTest at least and ContentEntitySchemaHandlerTest.
Comment #230
tstoecklerAhh, so the fails from above were actually fixed, the new fails are due to a bug that was introduced by #2201051: Convert path.module form alters to a field widget. PathItem does not declare any 'columns' in its schema (because the values are stored externally) But getSchema() calls
$schema['columns']
unconditionally to see if the field item declares columns with reserved words. It's just that PathItem::getSchema() is never called in 8.xComment #231
tstoecklerOops forgot the patch+interdiff.
Comment #232
tstoecklerThis introduces
TableMappingInterface
.I think I've found an API that is reasonably nice for both using the mapping as well as building it. I'd love your thoughts on it, though!
Let's see what this breaks.
Comment #234
tstoecklerYeah, late night patch rolling is not always a good idea.
Comment #236
plachOverall, changes look good to me, thanks! I still think in the stroage class we should be using base field definitions instead of storage definitions (see #224). Storage field deifnitions make sense in the table mapping stuff though, as we should probably return also cck tables in the
getTables()
method (to be done in #2144263: Decouple entity field storage from configurable fields, probably). We can differentate between the base vs bundle fields by checking whether the storage definition implementsFieldDefinitionInterface
.Missing double underscore.
I think 'title' should be replaced with 'value' if I am understanding this correctly.
I am wondering whether this description is too implementation-specifc. Would it make sense to write something more generic?
Comment #237
plachComment #238
tstoecklerRe 2.: No in fact 'title' is correct there. The point of this paragraph is to try to explain which parts are implementation specific and which parts are generic. In this case naming the database column after the field name (== 'title') is the implementation detail. We could just as well choose to always use the property name as well even for single-column fields - i.e. 'title__value'. Please do provide suggestions on how to document this better.
I tried to expand the documentation a little bit in this patch.
Re 3.: The problem is that for the mapping to be truely generically usable as is, i.e. without further looking into field definitions, etc. We need all three pieces of the following pieces information and we need the relationship between them:
- the field name
- the column name as defined in getSchema()
- the actual db column name
Again, I am more than open to improvement suggestions, but I didn't really see what structure would be better for this.
Hmm... thinking about this I have the following idea: Instead of
getFieldColumns()
returning the nested structure as above. We have agetFieldNames($table_name)
that simply returns a list of field names that are stored per entity type, and then we have agetFieldColumnMapping($field_name)
which returns thearray('value' => 'title')
orarray('value' => 'description__value', 'format' => 'description__format')
structure, respectively. Thoughts?According to the previous test result this should be green. Let's see.
Edit: Forgot the most important part: Thanks a lot (!) for the review. This feels close..
Comment #239
tstoecklerThis implements the idea from #238. It forced me to add a line of code in a few places but it allowed to remove the lengthy documentation on the array structure, which IMO is a sign of a more sane API.
This also changes back to using baseFieldDefinitions() instead of fieldStorageDefinitions(). I still think the latter is more correct, but we can re-hash that debate when we add support for multi-value fields at which point there will actually be a difference. For now it's just a semantic debate.
I will work on adding test coverage next.
Btw: I have one question, that I forgot to ask the whole time. Why are we using 'field__' for field-level unique keys/indexes/foreign keys and not the entity type ID? I mean they are still within the realm of an entity type?!
Comment #241
jessebeach CreditAttribution: jessebeach commentedThe test fail in #239 is due to
$entity_manager->getBaseFieldDefinitions($entity_type->id())
, where$entity_type->id()
equalsshortcut
, is returning NULL. This is happening in the__construct
method ofContentEntityDatabaseStorage
.Comment #242
jessebeach CreditAttribution: jessebeach commentedDerp, my bad. base fields are returning just fine for shortcut. Still looking :)
Comment #243
tstoecklerSorry, I should really know to run unit tests locally by now, before submitting patches.
Here we go.
Working in better test coverage now.
Comment #244
tstoecklerHere we go. This brings 100% coverage for DefaultTableMapping and greatly expands the test coverage of ContentEntitySchemaHandler. Still no revisionable / translatable layouts, but full test coverage of index / foreign key generation.
Speaking of which, I think our naming standard for foreign keys is a little bit strange: for single-column fields it is "field__$field_name" and for multi-column fields it is "field__$field_name_$specifier" where $specifier is the original name of the foreign key. I don't really see a need for this differentiation but I'll leave that up to @plach to decide.
Writing the test coverage made be refactor a few internal things minorly. I.e. before we were calling getSchema() and getDescription() multiple times, so I optimized that. I also noticed that for multi-column fields we duplicate the schema description currently, i.e. all columns for a field simply get the field description as schema description. Let's sort that out in #2232471: Discuss and standardize field definition descriptions == schema descriptions, though. I think I might have an idea for that...
Comment #245
yched CreditAttribution: yched commentedFrom #239 :
Yeah, that sounds wrong. Probably an overlook from when we moved storage from per-field to "per [entity type, field name]" in #1497374: Switch from Field-based storage to Entity-based storage ?
Agreed that the difference is not ideal. Then again, doesn't it (kind of) reflect the current column names for base fields in base tables ? (just $field_name for single-column field, $field_name__$column for multi column). I guess it's better to keep the two in sync.
Comment #246
plachI quickly skimmed through the changes and they (again) look good to me thanks! I will do a deeper review later today. I think we still owe @fago the removal of the layout type constants, but I can try to work on that myself if you wish to work on improving test-coverage further.
The original idea was reducing the risk of clashes between stuff automatically generated from field definitions and things manually added by the storage class. If not prefixing field stuff with the entity type id feels wrong I think we can safely add such a prefix:
I don't remember whether this was intentional, but I don't see the need for such a difference either. Probably it's just a failed attempt to implement what @yched says in #245.2 :)
Oops :)
Comment #248
plachSince we are getting close to being ready here I performed a diff between the entity schema tables before/after the patch. These are the relevant differences:
int(11)
vsint(10) unsigned
.These are the table-specifc differences:
Some of the above can be fixed by changing field definitions but I think most of them is ok. I will post a detailed plan tomorrow.
Comment #249
plachMinor adjustment
Comment #250
tstoecklerHere we go. This fixes #245/#246 and completes the test coverage for ContentEntitySchemaHandler. The only thing left to test is ContentEntityDatabaseStorage::getTableMapping().
Discussed the index and foreign key naming with @plach and we agreed on the following pattern: ENTITY_TYPE_ID_field__FIELD_NAME__KEY where KEY is the key used for the index/unique key/foreign key in the field schema declaration.
This also removes the table layout constants per the above.
I had to produce several interdiffs to due a bunch of merges.
Marking "needs review" for the bot, but this is still needs work per #248.
Edit: Crosspost with #249, but included that as well (including interdiff :-P).
Comment #252
tstoecklerA) I accidentally removed more code that I wanted to.
B) I fixed the NOT NULL => TRUE for entity keys above, but that apparently causes other problems. I will have to investigate that.
Comment #254
tstoecklertstoeckler--
Comment #255
tstoecklerThis should work. Added a lengthy comment on why revision_id cannot be NOT NULL.
Comment #257
tstoecklerAhh, that 'label' is also an entity key always throws me off. That should not be required either.
Also updating the issue summary a bit.
Comment #260
BerdirCrazy stuff, as usual with rest.module ;)
Shortcut is also tricky, the existing code there was broken, WimLeers recently noticed that too in #2241235: Shortcut/ShortcutSet entity types should use cache tags, this is a slightly different fix without the weird isOriginalId() condition. Also added a not-syncing check there.
Based on a discussion with @alexpott, this code should probably live in the UI and not in postSave(), it depends on the user creating the shorcut to check which one is the default, so that's pretty weird to live there.. but that goes beyond what we have to fix here.
Comment #261
fagoGreat to see the TableMapping class + interface, that's much better than the arrays. Even more awesome you've been able to do away with the table layout constants already!
I think this should be called setFieldNames as it overwrites existing ones. Maybe a better name would be setTableFields() as it defines which fields go into that table.
I think this can be called just getColumns($table_name) - as there are no different ways between getting all or not-all columns.
I must say this is a bit confusing as it's a mapping part of another mapping (table mapping). Could we just call this getTableColumnNames() maybe, i.e. always differentiate between a "field column" and a "table column" ?
protected function processIdentifierSchema(&$schema, $key) {
hm, could we solve that by adding a separate "identifier" field type maybe? That said, we should probably mark the "revision id" as revisionable as it values change by revision also.
Awesome to see the table mappings being applied in mapToStorageRecord(), #2232427: Allow field types to control how properties are mapped to and from storage is a great follow-up here.
@ContentEntityDataBaseStorage
As I'd see it, all that table mapping customizations do not fit in the storage class as the one controls the table mapping should be solely the schema handler. The schema handler tells us how the table mapping looks like, so it should be created/configured there. So could we move that code over there and just forward getTableMapping() to the schema handler?
Ideally, I think the storage class itself would not know about the table mapping at all and just handle all tables in a generic fashion, similar as attachPropertyData() already does. That's something we could look at in a follow-up though.
Found a typo:
Comment #262
plachThis would imply loading the schema handler at each request, which we are trying to avoid as it will grow bigger. I think it's ok for the storage class to know how fields are mapped to tables. Note that this information does not prevent us from cleaning-up all the current assumptions and just relying on the mapping in the load/save code.
Comment #263
plachI am wondering whther it would make sense to open a follow-up to allow the langcode to be actually NULL. I am not sure why we should special case language, since the other entity keys can be NULL.
Comment #264
BerdirThe problem with langcode is that we re-initialize it automatically in onChange(), so it's set to an empty value instead of NULL.
Wasn't sure if ContentEntityBase would be able to deal with that being NULL but maybe we can add a check to only update it if it's not NULL? somehow... +1 to try and improve that in a separate issue, but I still think that the crazy stuff that rest.module there is doing is well... crazy. There should be a better way than having to differentiate between setting a list to NULL and array().
Comment #266
tstoecklerHere we go.
This adds some test coverage for the methods in ContentEntityDatabaseStorage (except for getTableMapping(), didn't get to that one yet) and fixes the above review, except for:
Well, there's still
getExtraColumns()
, so I think having justgetColumns()
in addition would be confusing.I don't really care at this point, but as @plach above disagrees, I'll let you guys fight this out before changing anything. :-)
Let's please handle those in a follow-up, as both them are non-trivial/controversial IMO.
Comment #268
plachIf the end goal is having both a storage class and a schema handler class that are decoupled from the actual table layout, what about having a separate factory encapsulating the table-layout logic to instantiate the table mapping? We could exploit per-entity-type services to make the factory swappable.
Comment #269
tstoecklerPhpStorm-- I don't know why it didn't rename the actual usages of the method. I should have checked, though, so tstoeckler-- also.
This should fix that.
Also added test coverage for getTableMapping() for simple and revisionable entity types. Translatable and "both" are up next.
I like #268. If noone beats me to it, I will look into that after completing test coverage. However, if I'm not mistaken
will not work for the same reason it didn't work for the schema handler.
Comment #271
tstoecklerFix new test introduced in #2257421: Convert CommentDefaultFormatterCacheTagsTest from web test to entity unit test.
Also completes test coverage. We now have complete test coverage for the introduced code.
Comment #273
tstoecklerThe drop is moving fast today... This time it was #2224549: Simplify checking whether an entity type is revisionable. That's a good thing, though. Because we can use the method that that introduced here.
Comment #275
jessebeach CreditAttribution: jessebeach commentedA fine goal. Two questions:
Comment #276
jessebeach CreditAttribution: jessebeach commentedThank you for the test coverage for
DefaultTableMappingTest
.This patch has well passed that size where one can comfortably review it in a day :) So, we need to satisfy the following steps, I believe, to feel confident that it's ready.
1. Testbost is green ('natch).
2. New test coverage is sufficient
3. Generated schemas match the schemas in 8.x HEAD
tstoeckler added test coverage for
DefaultTableMapping
. As far as I can tell the rest of the test coverage looks sufficient.So, let's establish that the auto-generated schemas are the same as the declared schemas in HEAD and then move on this patch. I'm going to attempt to do this.
Comment #277
jessebeach CreditAttribution: jessebeach commentedAlight, I've created SQL dumps for an 8.x HEAD install and a site install with this patch. I installed every module and applied config for each core field. If you'd like to install this testing site, I've pushed it up to a sandbox. The dump files are attached.
https://drupal.org/project/bloat/git-instructions
I diffed them and noticed the following inconsistencies:
comment.install
, we had the following entry for the Comment entity'scid
field.which ends up having length 11.
In the patch, we declare this field as an
int
, which ends up having length 10. Many incrementing fields were 11 and now are 10 in length.'file_managed'
schema has numerous inconsistencies in field types:filesize
was abigint
, now it's anint
.shortcut
table is missing the following properties:weight
,route_name
,route_parameters
Before
After
The missing properties are now found in the
shortcut_field_data
table. I'm not sure if this done intentionally or not.Before
After
Comment #278
tstoecklerHmm... actually the ID field of an entity is declared as
serial
. SeeContentEntitySchemaHandler::processBaseTable()
. I don't know where the different lengths come from.Yes, this is intentional. Those properties were probably not put on the base table because they are not translatable. However, we put all fields on the data table - translatable or not - so that we don't have to join on the base table on normal load queries. We should check the actual queries, however, whether this actually works currently. But, yes, that is how it should be. This issue has just revealed an inconsistency where Shortcut module is not following best practices.
Comment #279
plach@269:
I probably did not explain myself well: when I say per-entity-type services I just mean that every storage class can implement its own
::createInstance()
method and define its own dependencies and services. As long as we return an implementation ofTableMappingFactoryInterface
(I guess), there is really no point in requiring the service string identifier to be fixed. For instance:@275:
We could do this as a follow-up, but I think it would imply a small (probably irrelevant) API change: we would probably end-up changing the services passed to
ContentEntityDatabaseStorage::__construct()
. I think it would be perfectly fine, though, this is getting huge.Comment #280
plachWrt the schema differences outlined in #248 and #277, here are my thoughts:
int(11)
andint(10) unsigned
would display 10 digits (as they both have a 4 byte storage capacity -> 2^32), but the former would optionally display also the minus sign. From my POV the patch is actually fixing things by declaring ids with the unsigned attribute, but we can remove it otherwise.{aggregator_feed}.image
column smaller is a problem (IMHO the current limit of 65535 bytes is ok), we should change the field definition tostring_long
.homepage
column, in fact by changing it fromvarchar
totext
we might make certain type of queries slower:(from the MySQL reference manual)
To address this we should probably change the
UriItem
schema definition tovarchar(2048)
. This size is supported since MySQL 5.0.3 so we should be fine as our minimum requirement is MySQL 5.0.15.log
fromlongtext
tovarchar(255)
, which is going to severely reduce the revision log maximum size. We probably need to change the log definition type tostring_long
.uri
column would be fixed by theUriItem
change suggested in #3; thefilesize
column probably requires a newbig_int
field item, à lastring_long
; thestatus
column needs to become a boolean item.type
column needs to be limited to a max of 32 as the other bundle names;status
/promote
/sticky
definitions should become boolean; thelog
column should become astring_long
.shortcut_set
column needs to be limited to a max of 32 as the other bundle names.varchar
lengths should be fixed by specifying the original lenghts asmax_length
in the related field definitions, the schema handler should use that information to set the max size ofvarchar
columns.I think we can fix most differences here but the ones requiring bigger changes (if any) could be deferred to a follow-up (discussed this with @catch in Szeged).
Comment #281
jessebeach CreditAttribution: jessebeach commentedI doubt anyone will be this deep into entities and storage during the beta phase. I'd venture we'll be well within the limits of acceptable churn to do this as a followup rather than further complicate an already massive patch.
I logged the issue #2265779: Allow storage classes to define their own dependencies and services.
Comment #282
BerdirNobody is quite sure on how to deal with the dependency injection for base/subclasses and if it's an API change or not I think. We'd like say it's not one but the problem is that as soon as you have a storage that needs something else, you need to duplicate the parent create() and __construct() methods, so changing it *will* break those subclasses.
We also do that for content entity storage, see comment and user implementations. Those would have to change too.
But agreed that we can still do that later.
Comment #283
jessebeach CreditAttribution: jessebeach commented#280:2 "For aggregator stuff: if making the {aggregator_feed}.image column smaller is a problem (IMHO the current limit of 65535 bytes is ok), we should change the field definition to string_long."
This gives us, conservatively, about 16K characters. I think this length will be plenty.
#280:3, updated UriItem to varchar(2048)
#280:4, the log field is now string_long
#280:5.1, fixed with the change in #280:3
#280:5.2, introduced
integer_big
type. For the life of me, I can't find tests for the data types, so I don't know how to add coverage for this.#280:5.3, changed
integer
toboolean
.#280:6-8, Haven't gotten to them yet.
Comment #285
tstoecklerI think we should rather introduce a new setting on the existing
Integer
class. Even though we have String and StringLong, etc. we did the same thing for the 'unsigned' setting instead of introducing an UnsignedInteger class.Yay, this is an oversight. Nice catch!
It seems we should update the description also?
Comment #286
tstoecklerBig merge conflict due to #2263273: config_snapshot table can be created by the storage class itself and #2068333: Convert node SQL queries to the Entity Query API. Let's see if this still passes.
Comment #288
tstoecklerReverting the UriItem schema change from #283 for now, apparently SQL doesn't like that for some reason.
Comment #290
tstoecklerThat was my fault. Incorrect merge above...
Comment #291
plachThis was addressing #280, we should fix the test instead.
Comment #292
jessebeach CreditAttribution: jessebeach commentedThanks for the reroll on those 8.x changes. That was a nasty merge indeed.
The Feed entity sets the max_length setting to NULL, causing the length of our varchar to be
(int) NULL
...BOOM. I've removed the settings override in the Feed Entity so it uses the 2048 default.I'm working on the remaining comments in #280 now.
Comment #294
jessebeach CreditAttribution: jessebeach commentedBUNDLE_MAX_LENGTH
They are already. Maybe tstoeckler updated these?
Updated.
Set the
max_length
field setting toBUNDLE_MAX_LENGTH
.varchar
definitions look fine. The title of the shortcut_set went from 32 to 255. That seems fine.I just had the File entity set the size of the filesize field to
big
.Indeed, updated to: "'The status of the file, temporary (FALSE) and permanent (TRUE).'".
The interdiff includes the UriItem change from #292. I neglected to push that commit to my diffing repo for piling on a few more commits, so...it's there.
Comment #296
tstoecklerWithout having diffed the schema myself (in a while, at least) the changes look good to me.
Except for one little thing:
Here and elsewhere:
Minor, but please use setSetting() multiple times instead of setSettings(). See #2236903: setSettings() on field definitions can unintentionally clear default values for the reasoning. (Only fix this when re-rolling anyway: again, it's minor! :-))
Comment #297
tstoecklerOops, forgot this part: This is now EntityTypeInterface::BUNDLE_MAX_LENGTH
That's why the install is failing.
Comment #298
jessebeach CreditAttribution: jessebeach commentedNamespace the constants!
Comment #300
jessebeach CreditAttribution: jessebeach commentedsetSettings()
tosetSetting()
.Comment #301
yched CreditAttribution: yched commentedI don't fully get the introduction of the 'max_length' setting for EntityRef fields.
- It's a bit awkward as it's meaningless for content entities.
- Why would it be the responsibility of the code defining the field to know the length of IDs used by the referenced (config) entity type ?
- Also, unless I'm mistaken, ER fields on config entities store the full config ID, right ? So why would some fields use a value different than 255 ?
Also, FWIW, I'm not too fond of switching to individual setSetting() calls - left a note in #2236903: setSettings() on field definitions can unintentionally clear default values.
Comment #302
fagooh no, more than 300 comments! :D
Agreed. I'm wondering whether those could be constraints only even, but we have already max_length as setting so let's continue like that for now.
Comment #303
jessebeach CreditAttribution: jessebeach commentedAn excellent question. It took me a while to tease this apart.
Firstly, the
EntityReference
field makes a distinction between content and config entities in terms of the field type:For content entities, the field type is
int(10)
. For config entities, we usevarchar(255)
.To test this, I created a node called: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.
I created a field called: field_bbbbbbbbbbbbbbbbbbbbbbbbbb
I then referenced this field instance from an entity reference field on an article. The
target_id
saved to the DB is: node.aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.field_bbbbbbbbbbbbbbbbbbbbbbbbbbThis is a
varchar(255)
field.The Node entity defines a base field
type
and the Shortcut entity defines a base fieldshortcut_set
that reference aNodeType
config entity. Without overriding themax_length
property, this would be avarchar(255)
, like any other config entitytarget_id
. But the length of the ID of aNodeType
(bundle) can never be more than 32, so setting themax_length
to 32 on these entities' entity reference base fields -- creating avarchar(32)
db field -- is an optimization from the default db field length.Comment #304
jessebeach CreditAttribution: jessebeach commented#302,
I did this is the latest patch.
IntegerBigItem
is removed. Thefilesize
base field is now just anint
with an overridden size property:Comment #305
jessebeach CreditAttribution: jessebeach commentedOn comparing the generated table schema again, I noticed numerous inconsistencies where a db field had been
NOT NULL
and after this patch, becameNULL
.I think I've tracked down why in
Drupal\Core\Entity\Schema\ContentEntitySchemaHandler::addFieldSchema()
. There's a loop in this method that sets thenot null
property of a fields schema to FALSE if the field name isn't in the keys defined for the table. See the last line of the code below.So any non-key field that invoked
setRequired(TRUE)
on the field definition would have this required property reversed here.Which means fields that had been marked
'not null' => TRUE
become'not null' => FALSE
.I updated the code as follows, only setting the
'not null'
to true if the field is one of the keys:Which brought up some errors on install with required fields that don't have default values.
I'm fixing these now.
Comment #306
yched CreditAttribution: yched commentedYes, I'm not sure it's worth the extra setting, and the added onus on the "field definer" to kow which max length makes sense depending on which config entity type is referenced. I'd tend to think that just using 'int' for content entities and varchar(255) for content entities would be good enough - but that's just my 2 cts :-)
Comment #307
BerdirThe problem with that is that those fields are often used in indexes for multiple fields, sometimes even as part of a primary key, and if the field is set to 255, then you need to manually limit the size of the field in the index, that's what we cleaned up in the issue that enforced that bundles must not be longer than 32 characters.
Comment #308
jessebeach CreditAttribution: jessebeach commentedCommitting #2249113: Use an entity save instead of db_insert() in user_install() fixed the user status field empty problems noted in #305.
This patch update fixes the problem that all non-key db fields were set to
'not null' => FALSE
(meaning NULL is allowed) regarding of their DataType definition settings.Now, having resolved that, I surfaced another issue. The method
\Drupal\Core\TypedData\DataDefinition::setRequired()
, which we see on the Node title field definition here:Does not end up changing the value of the
not null
db field property; the default for theDataType
, in this casestring
with a default'not null' => FALSE
, is always used. We can override a setting like'length'
with thesetSetting()
method because the static methodStringItem::schema()
calls up the field definition setting, which is directly manipulated in thetitle
field declaration above.So, this brings up the issue that settings in
schema()
are essentially hard-coded unless the value is pulled from the field definition settings with a default supplied.I'd love some help with the
setRequired()
problem. I'm not sure where this should be handled. I fixed this forStringItem
, but this approach isn't great since we'll need to change each of the otherDataType
items to allow'not null'
to be manipulated withsetRequired()
.Comment #310
yched CreditAttribution: yched commented@Berdir : hm, indexes... yeah, makes sense I guess. I still find the "max_length setting" approach less than ideal, but I'll have to live with it.
Any chance config entity type could publicize somewhow the corect "length" to use when referencing them ? That way the 'length' could be silently set to the correct value by ERItem::schema() ?
Comment #311
yched CreditAttribution: yched commentedRegarding NULL / NOT NULL / isRequired() :
Yeah, historically in CCK, NOT NULL in the schema has never been tied to whether the field is required.
IIRC, handling of NULL was a nasty pain back in CCK with the "mixed / dynamic table layout" ("per-bundle" tables & "per-field" tables, and the fact that a field could move from one to the other).
It's possible that those reasons are mostly gone with the current "always store configurable fields in per-field tables" layout for configurable fields". Since we don't write any record in the per-field table when a field is empty, NULL or NOT NULL is not too relevant.
But then again we're now back to generating schemas for tables that hold several fields in base tables...
And the fact that a field is required doesn't necessarily mean you have valid, non NULL values to put in the tables :
- imagine a field that's "not required", and is stored in a table along with other fields : then, in the row for a given entity (revision) ID, you have columns for every field stored in the table, and the columns for that non-required field will have NULL values when the field is empty.
- now switch the field to "required" : this change only applies to the subsequent entity submits, existing entities still have the field empty, and those NULL values are still there in the table, so you can't suddenly switch the schema to NOT NULL = TRUE
Also, depending on the field type, a field value can store several "columns", and it's not obvious to determine for which columns the "required" property should translate to a NOT NULL. A "secondary" column can totally be NULL while the field itself is not empty.
Long story short : back then, we gave up on trying to closely assign NOT NULL in field schemas, and went with 'not null' => FALSE for all field column schemas...
Comment #312
plachI didn't read the last comments with the attention they deserve, but I am not sure we need all this NOT NULL futzing: before coding we agreed that all schema columns will be nullable except for entity keys as defaults should be enforced at data structure level.
Comment #313
tstoecklerAlso, I already opened #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities a while back to discuss that and that is still waiting for review. (And yes, it's in the issue summary... :-)) So can we move that discussion there?!
Comment #314
jessebeach CreditAttribution: jessebeach commented#311,
That makes sense. I was erroneously equating the two.
#312,
My only concern here is that many fields are explicitly marked as "NOT NULL" in the HEAD
hook_install
schemas. After the conversion in this patch, they've lost that value and are now simply NULL. The title field of nodes is canonical example. By not exposing the 'non null' property of column items for manipulation throughsetSettings()
, all fields based on the core data types will be locked into the default value of 'not null' for that type.Comment #315
jessebeach CreditAttribution: jessebeach commentedFolks can also extend the basic data types in order to alter the column item values.
Comment #316
sunRe: To NULL or to NOT NULL:
As long as the columns are not part of an index, the change doesn't make a difference. However, for columns that are part of an index, allowing NULL values should be used wisely, because affected indexes require an additional internal flag to track emptiness. That makes the index larger, and a larger index negatively affects filter/group/sort performance.
This is part of the one-pager about most basic database performance optimization rules, which everyone should know inside-out and follow by heart:
http://dev.mysql.com/doc/refman/5.6/en/data-size.html
Comment #317
jessebeach CreditAttribution: jessebeach commentedThe latest schema dumps.
Comment #318
jessebeach CreditAttribution: jessebeach commentedWe can address the NULL/NOT NULL differences in #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities, a much smaller and focused patch. At 300KB, this issue's patch is already unwieldy. Consensus is that the schema differences introduced by this patch with NULL/NOT NULL values are not regressing current behavior.
Comment #319
jessebeach CreditAttribution: jessebeach commentedThe 'bigint' for file size isn't being plumbed through:
Comment #320
jessebeach CreditAttribution: jessebeach commentedWith the change to
ContentEntitySchemaHandler:: addFieldSchema()
in #308,non null
is no longer explicitly set on all db fields. The existing schema verification tests assumed thatnot null
is always present, so I removed the values from the items that no longer explicitly set it.The
isRequired()
experiment code in #308 is NOT included in this patch.I also fixed the
bigint
problem noted in #319.Those are the last of my concerns. I'm ready to move on this patch.
Comment #322
jessebeach CreditAttribution: jessebeach commentedThese merge conflicts are blowing my mind. I can't tell where to take out the field info and where to keep it. I'm going to have to start this again tomorrow morning.
Anyone else is more than welcome to tackle this mess before then :)
Comment #323
BerdirYeah, sorry, that was to be expected, plus side is, you don't need to worry about the field.info service anymore :)
Re-roll is based on a rebase, so no useful conflict diff or something, as that worked best..
Comment #325
plach@sun:
Thanks for pointing that out, in IRC we agreed to address indexes + NULLable stuff in #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities. Basically the idea is the we make NOT NULL every field for which we add an index automatically, fields adding indexes in their definition are in charge to evaluate whether it's the case to make their columns NULLable or not.
Comment #326
tstoecklerChecking the schema generation code (\Drupal\Core\Database\Driver\mysql\Schema::getFIeldTypeMap()), possible values are 'tiny', 'small', 'medium', 'big', and 'normal', where 'normal' is the 'default'. So I think we should use 'normal' as the default value here. Also, since the defaultInstanceSettings() is the one and only source for information on the possible settings right now it would be extra awesome to add a comment listing the allowed values.
You don't seriously think you could geth this into core unnoticed?! :-P Leave the ponies out of my otherwise beautiful unit test!
Comment #327
jessebeach CreditAttribution: jessebeach commentedDrat! You and your attention to detail!
Honestly though, I realized I'd let that slip in. I was going to pull it out in a reroll. Honestly. I swear.
Comment #328
jessebeach CreditAttribution: jessebeach commentedGoodbye my ponies.
The two test fails arose because the
IntegerItem
now has a default size property with the valuenormal
. The expected return value of this field type in two migration tests just needed this property added to them.Comment #330
jessebeach CreditAttribution: jessebeach commentedOops, forgot the comment requested by tstoeckler in #326.
Comment #331
pwolanin CreditAttribution: pwolanin commentedthe MapItem field is currently getting saved incorrectly. There is a patch to fix it, but berdir thought the change might have some impact on or overlap with this patch.
see: #2267753: ContentEntityDatabaseStorage::mapToStorageRecord hard-codes $entity->$name->value
Comment #332
tstoecklerThanks @pwolanin for the pointer. We're aware of the problem space and that issue is one step in the right direction. We still need #2232427: Allow field types to control how properties are mapped to and from storage however, to properly "fix" this and remove the 'serialize' hack here.
Comment #333
plachI just completed a full review of the code and I think it looks quite good. We still have lots of @todos, but I think we should be ready to go. The amount of test coverage we are introducing is impressive, btw :)
Final reviews would be welcome now, as we should not be too distant from RTBC.
Do we still need this?
Comment #334
tstoecklerThanks @jessebeach for keeping the ball rolling. Yay!
1. Simplified the 'not null' adding for entity keys.
2. Explicitly added 'not null' => TRUE for fields used in indexes to avoid the problem mentioned by @sun in #316
3. Removed the 'max_length' "feature" from EntityReferenceItem in favor of an explicit check for $entity_type->getBundleOf() to resolve @yched's concern from #310.
4. Removed unnecessary change per #333 that was still leftover from the lazy-schema creation.
I also think this is ready now. Everything from above has been resolved and we discussed this with @plach, @fago, @Berdir, and @jessebeach in IRC last week and also nothing came up there that hasn't been resolved now. So...
...who will dare to RTBC?! :-)
Comment #336
jessebeach CreditAttribution: jessebeach commentedThe
mail
base field of theUser
entity cannot be set to'not null' == TRUE
to optimize performance; the field might be empty (an empty string for example) because a user does not have an email address. This is the case for the anonymous user. Install failed on this SQL exception.I propose we set a 24 hour auto-RTBC on this issue since any qualified to RTBC has touched this issue in a significant way.
Comment #337
tstoecklerYay, @jessebeach, you rock! Makes a lot of sense. Thanks for always cleaning up after me... :-) !
Comment #339
plachWhy are we reintroducing this SQL query? The entity query was fine, my only doubt was about doing it only after doing a SQL-specific check...
About RTBC, I think @fago didn't write code for this, so he could be a good candidate to RTBC it. It would also be good to get a +1 from @yched, if possible.
Comment #342
jessebeach CreditAttribution: jessebeach commentedplach, I don't see this code in the patch in #336. Maybe you were reviewing an early version?
Comment #343
jessebeach CreditAttribution: jessebeach commentedThe patch in #336 no longer applies. Rerolled
Comment #345
yched CreditAttribution: yched commentedDidn't delve in the meaty parts yet (ContentEntityDatabaseStorage, Core\Entity\Schema, Core\Entity\Sql),
posting what I have on "the rest" - mostly minor stuff.
Unintended empty line removal
I can't seem to find where this behavior lived previously, and thus don't fully get how moving it is related to that patch. No biggie, just wondering.
Does this only affect nodes / nodeTypes ?
Isn't that potentially true for any (config) entity that is a bundle_of a (content) entity ?
@todo needs an issue link ?
(+ minor : the code would be slightly easier on quick visual parse if the $field_name == 'language' condition came first)
@todo needs an issue link ?
Not relevant anymore, according to @tstoeckler's #334 ?
Also
- one in Node.php
- MigrateFieldInstanceTest::testFieldInstanceSettings() adds a 'max_length' to the expected settings of a file field
- CustomBlock::baseFieldDefinitions() still has a 'max_length' for the 'type' field
- Term::baseFieldDefinitions(), same for 'vid' field.
(What I don't get is that those last two are already in current HEAD - got added in #1709960: declare a maximum length for entity and bundle machine names. Yet AFAICT EntityRefItem defines no 'max_length' setting ?)
minor - needless hunk, we could avoid touching that file completely (well, not that patch size matters at this point :-p)
Comment #347
jessebeach CreditAttribution: jessebeach commentedFixing failures.
Comment #351
jessebeach CreditAttribution: jessebeach commentedHuh, so
$entity->getRouteParams()
returnsarray()
in 8.x andNULL
with the patch. I can find no obvious reason for this so far. PassingNULL
as the$parameters
value toUrlGenerator::doGenerate()
results in a PHP warning because thatNULL
gets passed toarray_replace
.Comment #352
jessebeach CreditAttribution: jessebeach commentedThe fields on the Shortcut entity are causing the problem and it seems to be because we switch the
shortcut
andshortcut_field_data
tables. I'm getting closer to the issue...I'm currently looking at how the entities are constructed that get passed toContentEntityDatabaseStorage::attachPropertyData()
.In HEAD and the patch, the following query is run:
In HEAD, this simply returns four fields in the table
shortcut_field_data
:id
,langcode
,default_langcode
andtitle
.In the patch, it returns the base fields, which includes the field
route_parameters
, which allowsNULL
and has no default, so it gets the valueNULL
.route_parameters
is aLONGBLOB
and amap
field item, so mayberoute_parameters
just needs to set a default ofarray()
?Comment #353
jessebeach CreditAttribution: jessebeach commentedAlright, these Shortcut field values are simply being stored differently.
Comment #354
jessebeach CreditAttribution: jessebeach commentedSo, the problem is the use of the
MapItem
. berdir notes that #2235457: Use link field for shortcut entity will change theroute_parameters
field from aMapItem
toLink
type field.Comment #355
jessebeach CreditAttribution: jessebeach commentedI'm trying to get around the shortcut
route_parameters
issue just to keep us moving here with a simple workaround by somehow setting a default value ofarray()
on the field, but nothing seems to stick. I've tried setting the value withFieldDefinition::setSetting('default_value', array());
; I've tried setting a default value inShortcut::preCreate()
andShortcut::preSave()
. Nothing wants to work to get an empty array in place.Comment #356
jessebeach CreditAttribution: jessebeach commentedActually, I'm unpostponing this. I think we can find a simple solution to move this issue along. It might not be the most elegant or perfect solution, but it just needs to enable us to make progress. Postponing this on an issue that is itself postponed is just going to grind progress into the dirt.
Comment #357
BerdirThis was not properly merged with #2267753: ContentEntityDatabaseStorage::mapToStorageRecord hard-codes $entity->$name->value. That issue adds a check for the main property in mapToStorageRecord() and if that is NULL, it uses all values of the field and stores those. That check got lost.
I'm not sure if that logic should happen inside the table mapping somehow or outside, probably inside (make column name NULL or something to indicate that?), but I don't know this stuff well enough to provide a fix.
Might also be worth to keep some of the documentation/logic of the changes that that method from the other issue, check how it looks right now with always working on the first field item explicitly.
Comment #358
fagoI think the table mapping is fine as it is, i.e. value => $field_name. So attached patch re-implements the logic of #2267753: ContentEntityDatabaseStorage::mapToStorageRecord hard-codes $entity->$name->value and re-adds + updates the comment.
Comment #360
yched CreditAttribution: yched commentedPutting the case for "count($columns) == 1" within the loop on $columns looks a bit weird :-)
Also, can we preserve the "start by unconditionally doing
$items = $entity->get($name)->first()
, and then handle the various cases" shape that was added in #2267753: ContentEntityDatabaseStorage::mapToStorageRecord hard-codes $entity->$name->value ?The mapToStorageRecord() method only knows how to store single-values fields, let's make this explicit upfront. The current patch has one "if" branch that does an explicit $field_name->first(), and the other that does an implicit $field_name->$column_name.
Comment #361
fagoMakes sense, ok.
yep, that's a little bit confusing but I was not able to model the logic simply in another way. Howsoever this code is only temporarily until we get to #2232427: Allow field types to control how properties are mapped to and from storage anyway.
Comment #362
jessebeach CreditAttribution: jessebeach commentedThis should resolve the test failures. There are two exceptions in the
KernelTestBaseTest
that I'm seeing on HEAD as well.Comment #365
yched CreditAttribution: yched commentedMaking my way in the EntityStorage / tableMapping / SchemaHandler stack.
A couple high-level remarks for now :
--> rename to SqlEntityStorageInterface ?
But Mongo is a database too.
--> rename to ContentEntitySqlStorage ?
makes it clear that core stores in SQL by default, not just "in a database".
The only thing such an object does is generate a schema
--> we could be more specific than a vague "handler" (basically = "doer of some stuff") and rename to ContentEntitySchemaGeneratorInterface ?
OK, ContentEntityDatabaseStorage has getSchema(), which proxies to $this->schemaHandler->getSchema()
Not sure that should mean ContentEntityDatabaseStorage implements ContentEntitySchemaHandlerInterface.
Not clear what this brings, and muddies the waters IMO : a ContentEntityDatabaseStorage is not a SchemaHandler / Generator, and I don't think we have a use case for a ContentEntityDatabaseStorage facading as one ?
--> The getSchema() method could simply added to SqlStorageInterface ? (which would make sense anyway IMO - an SqlStorage needs to have a schema)
--> Or at least do "SqlStorageInterface extends ContentEntitySchemaHandlerInterface", and then ContentEntityDatabaseStorage only implements SqlStorageInterface ?
(I'd still vote for the former though)
nitpick : "entity fields for a table" isn't too clear. List of fields stored in the table ?
+ : why return just names, and not definitions directly ?
- DefaultTableMapping can do it since it stores $storageDefinitions it receives from the EntityStorage.
- It would simplify ContentEntitySchemaHandler, which then doesn't need to store storageDefinitions itself
--> it seems we could at least remove it from ContentEntitySchemaHandler with the previous point
--> we could also remove it from DefaultTableMapping if its construct received the ContentEntityDatabaseStorage ?
Basically : you are a TableMapping or a SchemaHandler *for a given EntityStorage*. If you need the list of field definitions, ask it to your EntityStorage.
ContentEntityDatabaseStorage::$storageDefinitions = not clear which "storage definitions" we're talking about :-)
--> we should go with the exact name IMO and rename to $fieldStorageDefinitions
Comment #366
yched CreditAttribution: yched commentedAlso :
- DefaultTableMapping.php
Why "Default" ? There's no real logic/behavior in there, it's basically a glorified array - the mapping is designed from the outside by EntityStorage, in a series of if(revisionable/translatable) in getTableMapping(). So, not sure why we have a "Default" TableMapping class that you might want to override - direclty naming it TableMapping would fly ?
Alternatively, we could move "design the mapping for the various cases" logic in the Mapping class itself, and have different classes for "revisionable", "translatable", "revisionable + translatable".
EntityStorage::getTableMapping() would just instanciate the right class in its if(revisionable/translatable) checks ?
No need for public setFieldNames() / setExtraColumns() methods anymore, this is done by each class internally according to the case it handles. They can probably all inherit from a TableMappingBase that is basically the current DefaultTableMapping, with the set*() methods moving to protected ?
Not sure which is preferable, just putting the idea out there. I must say delegating some complexity out of ContentEntityDatabaseStorage kind of appeals to me.
Comment #367
fagoAdded my thoughts regarding #365 and #366:
True. But what about SqlEntityStorageInterface and SqlContentEntityStorage ? We should be consistent where to put the SQL.
Seems reasonable. Also the interface is totally unrelated to content entities - so I was wondering whether it should just be EntitySchemaHandlerInterface/EntitySchemaGeneratorInterface
Good question. However, once you can get the table mapping from the storage, it would make sense to be able to get the schema also. Moreover it should be required during installation right now. Instead we could just move getSchema() to the SqlEntityStorageInterfae - if you implement that you should provide a schema as well right? Then, EntitySchemaHandlerInterface could be used only internally and is not required to be known outside the entity system.
Yes that! ;)
Not sure how it would get the field definitions from the storage?
Seems reasonable.
Interesting idea, I could see this help to better organize and separate things - thus I like it. What do others think?
Imo, this would be something we could take care of together with the discussed follow-up of improving the separation of the main storage and the table mapping. I was not able to find an issue for that though, so I created one: #2274017: Make SqlContentEntityStorage table mapping agnostic .
Comment #368
plach@365:
1: Makes sense (with @fago's correction :)
2: Makes sense too, but I am not sure this is the right place to make it happen. The patch here is already huge and this change would increase it unnecessarily.
3: My plan is adding more methods and functionalities to this class in #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions to handle changes in entity/field definitions, so it will become a full-fledged schema handler. Thus I think the name is fine as-is.
4: Given 3, I think your proposal makes sense: we would be forced to add more methods to the storage class that would make no sense for it, so just adding an "unrelated"
getSchema()
method to the storage would make sense.5: No strong opinion about this. I guess we should try and be consistent with the rest of core code: if usually we return a list of definitions, let's do it here too.
6: Not sure about this: it might complicate the inter-dependencies between the three classes, which are already a bit unclear. What about addressing this in #2274017: Make SqlContentEntityStorage table mapping agnostic ?
7: Agreed
I like 366 too: I spent quite some time trying to figure out how to isolate that logic and the table mapping seems to be a good choice. I agree we could discuss this in #2274017: Make SqlContentEntityStorage table mapping agnostic where more clean-up should happen.
Working on those failures now.
Comment #369
plachThis should be green again.
Comment #370
plachHere is the interdiff
Comment #371
plachAddressed 365, except 5 (it was too late):
The most relevant change is the introduction of
EntitySchemaProviderInterface
, which exposes only thegetSchema()
method and is implemented by the storage class.EntitySchemaHandlerInterface
extends it and will expose additional methods after #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions.@Jessebeach:
If you are going to post other patches for this, please use the sandbox and pick a topic branch named after you, so managing interdiffs will be easier, thanks!
Comment #375
xjmThis seems like good DX. However, the documentation here doesn't explain how I'd want to use this method (e.g., overriding it to add the
{taxonomy_index}
table).Comment #376
xjmI also added this information to the change record:
https://drupal.org/node/2259243/revisions
Comment #377
fagoI reviewed the whole patch again and fixed small things while going through it. Attached is a patch which does various small documentation fixes/improvements. Besides that,
SqlEnttiyStorageInterface
should extend fromEntitySchemaProviderInterface
, thus I've done so also.Interdiff & patch attached, I've pushed it to 8.x-et-entity_schema_auto-2183231-fago also.
I do not think this matters to someone *using* the interface, but it matters for someone defining the storage of an entity type. Thus, I added respective documentation to ContentEntityDatabaseStorage - please see the interdiff attached.
Further issue I noted while going through the patch:
Having "string_long" + integer and a size option is a DX wtf. Thus, I think string should have a respective size option as well.
Update: That's pre-existing though, i.e. integer and strings do it already differently. Thus, it should be a separate issue.
Then I figured we have quite some "Not null" fields which are not marked as required, e.g. comment pid. That's not related to this issue though, so I created #2274597: Not all required fields are marked as required
Comment #378
fagoAttached patch/interdiff addresses that, other remarks of #377 are left for follow-ups as said.
Comment #379
plachChanges look good to me, I hope they won't prevent @fago from RTBC'ing this ;)
That was intentional as the schema handler knows it deals with the
ContentEntityDatabaseStorage
and I didn't want to pollute the public API with those methods that are implementation-specific. Anyway, I agree those are probably going given the latest discussions, so no need to worry about them.Tomorrow I will hopefully finish to address @yched's reviews (I think also #345 is still pending), and then I guess we should be done, at least with this first step.
Comment #380
aspilicious CreditAttribution: aspilicious commentedIn the change record I'm missing some info about "changing the schema". Everyone knows how to alter a database schema in an update function for the regular hook_schema. How are schema changes handled now?
Or are the schema's of content entities fixed by default.
Comment #381
fagoad #380: That's not implemented yet, but stuff left for [#2259243].
I've worked over the change record draft, to explain that + improve it in general. I've removed the class diagram from there as it was hard to understand and I don't think internals need to be in the change record, but feel free to add it back if you think it should be there. Changes: https://drupal.org/node/2259243/revisions/view/7280867/7282547
Comment #382
plach@aspilicious:
One of the goals of this issue is reducing the need of manually altering the entity schema, as this is (mainly) derived from entity/field definitions. In most cases you would change the related definitions instead of actually touching the schema definition (this will be handled in #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions). Whether allowing direct manipulation of the generated schema is going to be discussed in #2258347: Consider adding hook_entity_schema_alter(), but personally I'd hope for a won't fix, although concrete use cases might be presented that would make me change my mind :)
Comment #383
xjmAgreed, the class diagram might be relevant handbook documentation on the Entity API's architecture (or even linked from the Entity API @defgroup) but it doesn't really belong in a change record.
Comment #384
xjmRerolled for #2167167: Remove field_info_*() -- merge conflict was the removal of core/modules/field/lib/Drupal/field/Tests/FieldInfoTest.php. And a small doc fix. (I don't have access to the sandbox.)
Comment #386
jessebeach CreditAttribution: jessebeach commentedAttached are the two latest SQL dumps for diffing. I don't see any major inconsistencies.
Let's get this in!
Comment #387
jessebeach CreditAttribution: jessebeach commentedI'm pre-emptively rerolling this on #2247991-57: [May 27] Move all module code from …/lib/Drupal/… to …/src/… for PSR-4, which is scheduled to land today.
Comment #388
jessebeach CreditAttribution: jessebeach commentedTo psr4-ify this patch, grep for the following pattern in the patch:
\/lib\/Drupal\/(?!Core)[a-z_]*\/
and replace the pattern with this:
/src/
I ran this in Sublime Text 2. In the past, my regexes have not reliably run in other REGEX flavors, so buyer beware :)
Comment #389
jessebeach CreditAttribution: jessebeach commentedI'll set back to Needs Review when #2247991: [May 27] Move all module code from …/lib/Drupal/… to …/src/… for PSR-4 is committed, which should be shortly.
Comment #390
plach@jessebeach:
I had to ignore your latest patch to be able to work in the sandbox (please push your next changes to it in topic branch named after you). Would you mind to do another conversion to PSR-4 to this one? I'd like the bot to have a pass to it.
@yched:
I tried to implement this, but it made lots of parts of code more complicated as we actually need field names in many places, so I don't think it's worth.
This should address also #345:
2: I think previously we were just allowing for NULL values to be stored but now we don't.
6: My understanding was that we would fix just ER items, not that we would be reverting #1709960: declare a maximum length for entity and bundle machine names.
Comment #391
plachRe-uploading just in case...
Comment #392
plachRerolled after #2247991: [May 27] Move all module code from …/lib/Drupal/… to …/src/… for PSR-4, git did all the work, let's see whether this is still green.
Comment #393
fagoChanges look good, except for:
Comment exceeds 80chars.
Anyway, there is nothing left which should this one up - we can do address improvements and field definition vs names questions in follow-ups. Let's unblock the remaining blockers and get this in! As I only did some small comment fixes I take the liberty to mark this RTBC anyway.
Comment #394
yched CreditAttribution: yched commentedI don't understand this. If it returns an array of Definitions keyed by field name, then both cases ("caller needs definitions" and "caller just needs names") are covered ?
Re: 'max_length' on ER field definitions : well, that setting currently has no effect whatsoever. Maybe we don't want to remove the existing ones from core in this patch here, but at any rate we shouldn't be adding new ones ?
Comment #396
catchRead through the whole patch except I ran out of steam at the test changes.
I couldn't find anything to complain about, this looks great. So many hook_schema() gone and the code that replaces hook_schema() is very light. Thanks for all the work comparing the generated schema, that was my main worry with this patch.
Do we need a follow-up for getFieldNames()?
I'm confused by this comment:
Isn't this still a varchar(255)?
Not committing just yet but I don't have any objections if someone else does.
Comment #397
jessebeach CreditAttribution: jessebeach commentedre:396, I think that's related to what's happening in #2275905: Clicking the enable language support link for a field settings form results in an SQL Insert error.
Comment #398
fagoOpened #2277169: Convert TableMappingInterface::getFieldNames() to return field definitions
Comment #402
xjmNot just the --stat. ;)
Comment #404
alexpottThe logic here is curious - shouldn't we be prioritising getting the table mapping from the storage over using
drupal_get_schema
Comment #407
fagoProblem is that menu links are not covered by the table mapping yet, that's content entity only and menu links are no content entities. However, even with custom-tailored schema it should be possible to implement the respective interfaces to provide a tablemapping. @plach, @tstoeckler, what do you think?
Comment #409
fagoThis should fix the test fail.
Comment #411
fagoBesides that the test now requires existing files, it was actually broken although green else. The test needs a custom changed date, but that's forced to REQUEST_TIME when doing an entity save. Turned out retrieveTemporaryFiles() is broken also, as it says to return array but returns a database result object. Here is a proper fix, fixing the implementation of retrieveTemporaryFiles() by making it return a list of file identifiers.
Looking at the menu-link issue now.
Update:
grml, I had the first two files deleted :/ The right ones are the last two, please look at those only.
Comment #412
fagoSo here is an updated patch + interdiff that moves the menu link schema to its storage controller.
Problem is that it uses drupal_write_record() which we cannot easily replace as it applies defaults and takes care of some serialization also. I introduced an optional $schema parameter which allows using drupal_write_record() with passed in schema also.
Comment #415
fago#412 will need some more review and changing db_write_record() is out of scope for this issue, so let's better move this to a follow-up -> created #2277979: Menu link storage does not implement SqlEntityStorageInterface.
Attached patch reverts #412 and adds the following interdiff for making the fallback to db schema nicer.
Comment #416
tstoecklerThanks @fago. I agree that it makes sense to explore the menu link issue in a follow-up. The interdiffs look good and address @alexpott's only concern. Back to RTBC.
Comment #417
alexpottCommitted 1ab2651 and pushed to 8.x. Thanks!
Comment #420
yched CreditAttribution: yched commentedCongrats to everyone involved !
Opened #2278051: Remove uses of non-existing 'max_length' setting on EntityRef fields.
Comment #421
tstoecklerYay, thanks so much everyone for keeping this going in the last couple of weeks. AWESOME!!!! :-)
See you in the various related / follow-up issues.
Comment #422
plachA W E S O M E work guys!
Comment #423
jaredsmith CreditAttribution: jaredsmith commentedThis commit breaks installation of Drupal 8 HEAD on PostgreSQL.
During the installation, it tries to create a database table with illegal syntax:
Please notice the Array in the second unique key.
Comment #424
tstoecklerFound the following issue: #2278493: Drop support for unique keys specified as an array of column name and length (breaks PG installation)
The attached patch avoids that error by not sepcifying a unique key as an array of column name and length in the first place.
This still does not get through the installer. The next fail is:
Haven't looked into that yet.
Comment #425
jessebeach CreditAttribution: jessebeach commented@tstoeckler, great catch. We should open this as its own bug given we're at comment #425 here and this issue might need further discussion.
Comment #426
Crell CreditAttribution: Crell commentedThe Postgres driver didn't work right even before this patch went in; See #2160433: [policy, no patch] Move PostgreSQL driver support into contrib and #2157455: [Meta] Make Drupal 8 work with PostgreSQL or remove support from core before release.
Please file a new issue for the postgres driver with that meta as a parent. Thanks.
Comment #427
XanoThis issue conflicts with my content entity type, whose fields do not match the table schema. This has the following consequences:
drupal_write_record()
, which means data is no longer automatically serialized based on the schema. Even though this is easily dealt with, it is something to take into account.I'm mostly interested in solutions to my query problems, as I want to disable the field > column matching completely.
Comment #428
tstoecklerRe #425: I agree that that should be a separate issue, but I never really understood why we did the file.uri unique key thing, so @plach would have to open that issue. I wouldn't know what to call the issue! :-)
Re #426: #2278493: Drop support for unique keys specified as an array of column name and length (breaks PG installation) is that issue for a concrete problem found already. Marked that as a child of #2157455: [Meta] Make Drupal 8 work with PostgreSQL or remove support from core before release. Will open an issue for the shortcut problem and any other ones, as soon as I figure out what the problem is.
Re #427:
So I'm not sure whether this is true, depending on what specifically you mean. How serialization works currently is that it checks the
serialize
key in the field item schema. SeeMapItem
for example.SqlStorageInterface::getTableMapping()
in your storage.SqlStorageInterface::getTableMapping()
and use$table_mapping->setExtraColumns()
.It would be pretty cool if we could support that optimization. But while I'm not yet 100% sure we can do that in a clean way let's discuss that in #2107249: Don't assume that content entities have numeric IDs in EntityReferenceItem.
Comment #429
Soul88It doesn't.
http://dev.mysql.com/doc/refman/5.7/en/char.html
Comment #430
Berdir3. Thanks for the clarification there, @Soul88. What the table schema supports automatically is limiting the size of the bundle config entity reference to 32, because it's often used in combined keys. Everything else can be done by altering the generated schema as pointed out by @tstoeckler for the other parts.
I'm also confused by your other tasks, content entities are essentially nothing but a container for a set of fields, they shouldn't consist of anything that's not a field and therefore, I'm also not sure why you need those requirements with differently named columns and additional columns. Entity query was already strictly based on the field definitions before this issue.
Comment #431
XanoSome of the 'problems' I am experiencing are because my content entities contain plugins. The entity types are quite a bit older than much code in Drupal 8, so maybe there is a solution for this that was added after creating the entity types and that I am not aware of yet.
The thing with plugins is that plugin IDs and plugin configurations must be stored, but when entities are loaded, this data must not be available as fields, but as plugin instances.
I also just published the change record for this issue. I did not see that last night, because it was still unpublished, so I may have missed some important information. I will check it today and report back later.
Comment #432
yched CreditAttribution: yched commentedWah, plugins in a content entity :-). Not sure what your use case is, but that sounds like it would require a custom "plugin id + config array" field type.
D8 is "If it needs to be in a content entity, it has to be a field".
Comment #433
jaredsmith CreditAttribution: jaredsmith commentedRe: 426, PostgreSQL may not have been working perfectly before this patch was committed, but at least we could complete the installation process and add basic content, etc. Once this patch was committed, we lost the ability to even complete a Drupal 8 installation on PostgreSQL. I'm fine with moving this to a new issue, but I think saying "PostgreSQL was already broken, so it's not a big deal" is a bit cavalier.
Comment #434
tstoecklerSo took the patch from #424 to its own issue: #2279363: file_managed.uri should not specify a prefixed unique key. @Berdir's comment in #2278493: Drop support for unique keys specified as an array of column name and length (breaks PG installation) convinced that it's an actual bug.
Will now look into the shortcut problem and link it from here, as soon as I find anything.
@Xano: Can we discuss your use-case in a separate issue? It seems that is a sufficiently complex problem space that that would deserve it. I'm happy to help/discuss there, though!
Comment #435
tstoecklerOpened #2279395: The PostgreSQL backend does not handle NULL values for serial fields gracefully and #2279405: ContentEntityDatabaseStorage inserts bogus NULL values for new entities. Applying the patch from the latter together with the one from #2279363: file_managed.uri should not specify a prefixed unique key lets me install without errors on PostgreSQL again. Phew! And sorry for the trouble everyone.
I think this is now finally finished.
Comment #437
plachRelated issue: #2322097: Enforce data tables for translatable entity types in the SQL entity storage.
Comment #438
david_garcia CreditAttribution: david_garcia commentedPossibly related issue:
#2342699 ContentEntityDatabaseStorage tries to update primary key values by default
As for now, statements against the database are issued with UPDATES to PRIMARY KEY fields. Some strict engines, such as SQL Server, complain because of this. A PRIMARY KEY cannot be updated.
Comment #439
yched CreditAttribution: yched commentedThe people in this issue will probably be interested in #2371709-22: Move the on-demand-table creation into the database API...
Comment #440
yched CreditAttribution: yched commentedThis added a 'size' setting to IntegerItem, as a *field* setting rather than a storage setting. Yet it's used in the schema.
#2376689: IntegerItem 'size' setting should be a storage setting
Comment #441
effulgentsia CreditAttribution: effulgentsia at Acquia commentedAnyone here remember why this was added? It's being targeted for removal in #2933408: Remove broken allowance of empty langcode in a PATCH request, so please comment there if you have any concerns about that.