Updated: Comment #60
Problem/Motivation
In Drupal 8 we introduced the concept of base fields which are "full blown" fields, but their schema is still hardcoded in hook_schema()
implementations of their entity types. If we want to be able to auto-generate the entity base tables for multilingual and performance purposes (don't create revision tables if the entity type is not revisionable), we need an API to retrieve the schema for base fields.
--> This task is a blocker for #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions.
More generally, we want to get rid of the separation between 'field types that can be used for base fields" & "fields types that can be used for configurable fields" - we have one unique notion of "field type", but we still have sub-interfaces for "field types that can be used for configurable fields". This is another step in that direction.
Proposed resolution
Make all field types provide a schema(), and clarify the relation between the schema and the "property definitions" and the respective ways to access them.
Remaining tasks
None.
User interface changes
None.
API changes
- The schema definition method
ConfigFieldItemInterface::schema()
moves up a level toFieldItemInterface::schema()
, making it a requirement for all field types (configurable + base fields) - Consequently,
Drupal\field\FieldInterface::getSchema()
moves toDrupal\Core\Field\FieldDefinitionInterface::getSchema()
, makingFieldDefinition
the authoritative way of getting the schema for a field type
Comment | File | Size | Author |
---|---|---|---|
#66 | interdiff.txt | 1.44 KB | effulgentsia |
#66 | move_schema_to_field_definition-2144327-66.patch | 48.05 KB | effulgentsia |
#63 | interdiff.txt | 703 bytes | amateescu |
#63 | move_schema_to_field_definition-2144327-63.patch | 47.61 KB | amateescu |
Comments
Comment #1
yched CreditAttribution: yched commentedComment #2
amateescu CreditAttribution: amateescu commentedHere's the patch that I started while working on #2142549: Support multi-column field types in base tables. Didn't continue in there because there's a weird coupling of
Drupal\field\FieldInterface
withDrupal\Core\Field\FieldDefinitionInterface
which doesn't make sense, so I thought we need a separate issue to sort this out.Also, I think this is part of #2144263: Decouple entity field storage from configurable fields so I'll assign that as a parent.
Comment #3
dasjothis would be helpful for #1740492: Implement a default entity views data handler.
the patch didn't apply anymore FieldDefinition.php, attached is a re-roll without any other chances
Comment #4
amateescu CreditAttribution: amateescu commentedThe patch was createed on top of #2047229: Make use of classes for entity field and data definitions, which is a prerequisite anyway, that's why it didn't apply :)
Comment #5
dasjoops, sorry for the noise then :)
Comment #6
fagoOne important thing we need to figure out here is the relationship of property definitions and the schema. Right now, we kind of assume schema column == property name everywhere if a property is stored. I think that's reasonable to assume, but still this should be documented to be required then.
Comment #7
amateescu CreditAttribution: amateescu commentedSo.. #2047229: Make use of classes for entity field and data definitions got in, let's get back to this patch. I don't see any alternative than duplicating the code for
getSchema()
andgetColumns()
betweenDrupal\field\Entity\Field
andDrupal\Core\Field\FieldDefinition
so maybe that needs to be cleaned-up in another issue, probably #2114707: Allow per-bundle overrides of field definitions?Any proposal on where to document the relationship between property names and schema columns?
Comment #8
yched CreditAttribution: yched commentedMaybe in the phpdoc for FieldItemInterafce::schema() ?
"The 'columns' need to be a subset of the properties defined in getPropertyDefinitions()" ?
Comment #10
amateescu CreditAttribution: amateescu commentedThere we go. Does it make sense to postpone this on #2112239: Convert base field and property definitions?
Comment #11
yched CreditAttribution: yched commentedWhy do you feel it could be affected by #2112239: Convert base field and property definitions ? (maybe it is, just not clear to me)
Just a couple remarks after a quick look, doesn't really qualify as a review :-)
The part about "This specifies what comprises a value... combination of 'value' and 'format'" is actually out of place now, we should remove that.
Defining "what comprises a value for a field type" is done in getPropertyDefinitions().
Hm - why is this needed ? sounds weird to have a base FAPI element refer to a constant defined by the "email field type" class ?
Won't work, should be $field_definition->getFieldType()
Comment #12
amateescu CreditAttribution: amateescu commentedBecause I thought we might want to remove hardcoded things like 'not null', 'default', 'length' from these schemas and populate them based on the field definition and it would be nice to be able to use the class-based one :/
Fixed all those points.
Comment #13
yched CreditAttribution: yched commentedNot fully sure what you have in mind, but at any rate the $field_definition received by the schema() method should already be FieldDefinitionInterface objects anyway, even if not all baseFieldDefinitions() methods have not been converted from the old array syntax ? There's a BC layer that converts them to objects just out of baseFieldDefinitions().
Comment #14
amateescu CreditAttribution: amateescu commentedThen I guess I'm just tired :P
Comment #17
amateescu CreditAttribution: amateescu commentedOk, I remembered what I was thinking initially and it's not ideal. I think we want to leave the population of those schema properties ('not null', 'default', 'length') to the code that creates the tables.
Comment #18
yched CreditAttribution: yched commentedI'm not sure I understand - how could some external code decide of 'not_null', 'default' values, if not the FieldItem::schema() ?
(although, not sure either why we'd need to provide a 'default' at all ? The current "configurable" field types that already provide a schema() in HEAD don't provide any ?)
Comment #19
amateescu CreditAttribution: amateescu commentedBecause it needs to decide at the time when a module that provides an entity type is installed. After that, if some property of a base field of that entity type changes, the schema generated for it won't match what's already in the database.
I'm considering all this from the perspective of "auto-generate entity base tables" using base field schemas, and it's also very possible that I'm just overthinking it :)
Comment #20
yched CreditAttribution: yched commentedRelated: I opened #2150511: [meta] Deduplicate the set of available field types
Comment #21
amateescu CreditAttribution: amateescu commentedRerolled.
Comment #22
fagoYeah, the schema is generated based on field definitions at install time. If you change your definitions, you have to take care of applying the schema changes. Sure it would be nice if we could generate updates.. - but that's out of scope for now. Consequently, I think we should include NOT NULL as needed.
But.. I'm not sure about default value. So far configurable fields did not have it, but base fields had. Still, default values are populated at API level anyway and are sometimes dynamic (time, language..). Any default values at db-level won't be picked up or used ever if you go with the API as you should - so what's the point in having and maintaining them?
Comment #23
fago$this->schema is an array, i.e. it getSchema() returns a copy each time it's used. So is this really necessary?
This could use a little more clarification, as - as of now - you have all objects you need when you look at the interfaces. Also, for base fields you actually could do so.
So maybe we should add that we cannot instantiate field items in every case as field instance settings might be missing.
We might want to document that computed field should return an empty array here, or even false?
Comment #26
effulgentsia CreditAttribution: effulgentsia commentedPer http://buytaert.net/the-next-step-for-drupal-8-is-a-beta, if this is going to happen in D8, it should happen by beta.
Comment #27
amateescu CreditAttribution: amateescu commentedI looked at it with xdebug just to be sure, and it seems that it is indeed necessary. getSchema() does not return a copy because memory addresses were the same on subsequent calls to getColumns().
Fixed the other two points.
Comment #28
yched CreditAttribution: yched commentedWIll try to review monday.
Regarding 'default_value':
- Yes, there's the unfortunate clash with the Entity/Field API notion of "default value", that is actually unrelated. Would add confusion, and we can't really rename either one.
- We always save "complete" db records, with data for all columns (be it NULL, ''...). So a 'default value' specified in the schema will never be about "what value gets written on partial inserts".
- The remaining case is "when adding a new column to an existing table, what to put in the column for the existing rows".
This is why D7/configurable fields never had to bother with that (per field tables, we don't add new fields to a table), and the only code that did was hook_updates that added a new "column/property" to an existing field type - but that was only in updates, and the hook_update contained an explicit schema without calling hook_field_schema().
It seems even now that we're putting several (base) fields in the same table, adding new ones after the table creation will be a job for hook_update ?
So yes, I don't think I see a use case for adding 'default value' in the field schemas ?
Comment #29
yched CreditAttribution: yched commentedHah - nothing in core currently uses the 'date' field type :-)
Well, I guess the target would be to store a timestamp, yes ?
The "configurable" DateTimeItem that was added in core from contrb date.module choses to store dates as ISO strings (2013-12-18T03:01:00), but we're probably not going to switch node.created and friends to that format - at any rate not in that issue ;-)
So yes, I'd tend to think we go with just 'type' => 'int', as node_schema() currently does ?
Then again, if this 'date' field type was intended for things like node.created, not sure why it currently uses an 'integer' field instead.
@fago, does it ring any bell on your side ?
For consistency with configurable fields (and for our current base table schemas) we should add a not_null TRUE in there - and in the other converted base fields as well. Only varchars should have not_null FALSE.
Why not parent::schema() ?
The class still uses this, no ?
Leftover :-)
Oops - this should never have been ->settings['foo'], but ->getFieldSetting('foo') :-)
Seems weird hat this gives two violations now.
So it looks like the "too long" check is already performed somewhere else:
getPropertyItems() defines the 'value' property as being of DataType Drupal\Core\TypedData\Plugin\DataType\Email, which defines a constraint using \Symfony\Component\Validator\Constraints\EmailValidator, which performs a filter_var($value, FILTER_VALIDATE_EMAIL);
So maybe we shouldn't be adding an additional constraint for max length at the field level then ?
Comment #30
Berdir1. Yes, date is a date string just like the configurable one, so should have the same schema definition for that column. We additionally have the timestamp data type, which is an integer. The advantage of that is that it also implements DateTimeInterface, so you can get a DateTime object from it. There's currently no TimestampItem. Something that we want to add there too is a special field type for changed fields, that knows that it has to update itself in preSave().
7. I had that that problem too in #2023563: Convert entity field types to the field_type plugin, we then reverted the merge of the e-mail classes there for that reason. Yes, the constraint in the configurable e-mail item class is unnecessary and incomplete (it's the only e-mail validation that is applied there), without it, we do have the UX problem of not knowing to which field the error relates to. But there's a separate issue to fix that.
Comment #31
yched CreditAttribution: yched commentedThanks for the explanations, Berdir :-)
re: date fields, added some considerations in #2150511-10: [meta] Deduplicate the set of available field types
Comment #32
amateescu CreditAttribution: amateescu commented'default' => 0
to go along with'not_null' => TRUE
...Comment #34
amateescu CreditAttribution: amateescu commentedComment #35
yched CreditAttribution: yched commented3. Ah true. Well then:
ConfigEntityReferenceItemBase::schema() does "if (old hook_field_schema exists) {return schema from the old hook}". We could add an "else {return parent::schema()}" after that ?
Then ConfigurableEntityReferenceItem::schema() can just do "parent::schema()", and will require no further change when the "old hook" BC layer is removed ?
7. I think @Berdir's comment means we should keep the duplicate constraints / violations for now ?
Comment #36
yched CreditAttribution: yched commentedAlthough, @Berdir re email constraint / violations:
it surprises (and worries :-)) me that we're able to assign field-level violations back to the right widget, but not property-level violations. You say there is an existing issue for that ?
Comment #37
BerdirThe problem is form API, which displayes validation messages at the top, *we* can map it correctly AFAIK. That's why we're messing with the validation error message and inject the field label. #2023465: Allow validation constraints to add in message replacements is the issue about that
I'd be fine (more than that, actually) with dropping that custom validation constraint, if we're ok with loosing that info until solve the issue above. As mentiones above, it only covers a tiny subset of validating an e-mail anyway.
Comment #39
amateescu CreditAttribution: amateescu commentedI'd prefer keeping the second validation, at least as a reminder that there's something we need to fix.
Fixed the failures and #35.3.
Comment #41
amateescu CreditAttribution: amateescu commentedAnd the correct patch this time.
Comment #42
fagoThis needs to go on the field definition also. Then, I'd suggest rephrasing to:
"Computed fields having no schema should return an empty array."
plus @return should tell us that as well, e.g. An empty array if there is no schema, or ...
Comment #43
amateescu CreditAttribution: amateescu commentedLike this?
Comment #44
yched CreditAttribution: yched commentedFrom the omment I just posted at #2116341-1: Apply defaults to definition objects:
What has "get", what doesn't have "get" ?
Comment #45
amateescu CreditAttribution: amateescu commentedSo.. we want to rename it to getSchema() here since we're already touching most of the code?
Comment #46
yched CreditAttribution: yched commentedI'd hate to derail that issue on naming, but... yes, that the question :-)
We have:
- static FieldItemInterface::schema($field_definition)
- FieldDefinitionInterface::getSchema()
The former is the source, the latter is the API accessor which fetches from the former. IIRC, this distinction was the reason why I went with slightly different names - even though, well, those actual names don't really convey what the difference is :-/
Don't know what I had in mind exactly - the idea that a static method cannot be a getter, or maybe that having a "get" prefix hints "call me, not the other one"...
Since this might start a pattern with [#8269327] (which also carries an issue of "which is which, which one should you call"), and we're touching all those lines here for schema(), we might as see if we want to reconsider that ?
Again, i don't think we want to derail that one, just pointing similarities and patterns. Unless we have a quick consensus, we can simply say "let's think about it in [#8269327]".
Comment #47
amateescu CreditAttribution: amateescu commentedI don't have any strong opinion on schema() vs. getSchema() so I'm fine with renaming it here in the name of progress and to unblock further work on entity storage :)
Comment #49
fagoI think static methods just defining something are ok without a "get" prefix, else I think we started doing the get prefix pretty much everywhere.
We also have Entity::baseFieldDefintions() - so that would be inline here.
FieldItem::getPropertyDefinitions() is a bit special as it is the definition + the public getter at the same time, but that's subject to with #2002134: Move TypedData metadata introspection from data objects to definition objects & co.
Thus, something along the lines of the following would make sense to me:
If it's used by devs to get something, it should have a get prefix - if it's used by devs to define something and not directly invoked, it's fine without.
ad #43: Looks good, thanks.
Comment #50
amateescu CreditAttribution: amateescu commentedMissed a few.
Comment #51
amateescu CreditAttribution: amateescu commentedIt seems that @fago prefers to not add 'get' for this method, so #43 is the patch to RTBC :)
Comment #52
yched CreditAttribution: yched commentedSorry @amateescu, I didn't intend to say "let's do it now" but rather "let's discuss a bit", I should have been clearer :-/
So yes, agreed with no get prefix for the static "initial definition" methods. and static FieldItem::getPropertyDefinitions() would be changed to comply in #2002134: Move TypedData metadata introspection from data objects to definition objects.
So yes, it looks like #43 is RTBC.
Reuploading it for clarity.
Comment #53
yched CreditAttribution: yched commentedMeant to do this.
Comment #54
catchThis no longer applies. Sorry it took me a while to get to it.
We should add $this->t() to the base class in a follow-up no? This is in the original code that was moved so shouldn't hold up commit.
Comment #55
Berdir@catch: That message will go away when we have a fix for #2023465: Allow validation constraints to add in message replacements. It's only there for the $name prefix so that we know which field the error message is for as form api displays them so badly.
The email property of the email item already contains complete e-mail validation, this just checks a single thing (length) and is therefore pretty stupid. That's why we had to change one of the user tests because there are now currently two validation messages in that case (the form will only display one)
Comment #56
amateescu CreditAttribution: amateescu commentedRerolled and added a schema for MapItem.
Comment #57
BerdirVerified that there are no other changes in there, back to RTBC.
Comment #58
Dries CreditAttribution: Dries commentedIt would help to have a better issue summary. For those of us not working on the Entity API every day, it's not clear why we are doing this, or why it is major. A better issue summary would help with the reviews.
Comment #59
moshe weitzman CreditAttribution: moshe weitzman commentedFeel free to set back to RTBC once issue summary is done.
Comment #60
amateescu CreditAttribution: amateescu commentedWrote an issue summary and rerolled the patch. This is major because it blocks #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions, which is a beta blocker/critical on its own.
Comment #63
amateescu CreditAttribution: amateescu commentedMessed up rebase :/
Comment #64
yched CreditAttribution: yched commentedThanks @amateescu.
Expanded the summary a bit.
Comment #65
effulgentsia CreditAttribution: effulgentsia commentedActually, that makes this a critical beta blocker then.
Comment #66
effulgentsia CreditAttribution: effulgentsia commentedIssue summary looks great.
I was about to RTBC this per #59, but since I haven't previously looked at this patch, I decided to review it. It all looks good, except I stumbled on #29.7. Reading the following comments about that, here's what I came up with to get that info into the patch itself.
If this is ok, please RTBC.
Comment #67
yched CreditAttribution: yched commentedYup, fine if green
Comment #68
chx CreditAttribution: chx commentedThe migration team fully supports this patch and would love to use it. Currently we write the identifier schemas into our migrations because there's no way the sql idmap plugin could figure out how to store an 'integer' or a 'string' and if this goes in, it could!
Comment #69
alexpottPatch looks great - one question before committing.
Hmmm... I understand that the current email field has a max length of 254 but can we really drop the length by 1 here. What are we going to do about those users with emails of 255 characters on d6 and d7? I know this is an edge case - but it is something we should consider. And yes this probably is an existing issue.
Comment #70
amateescu CreditAttribution: amateescu commentedThe 'email' column from
user_schema()
is defined with a length of 254 (at least in D7, didn't check D6), so we're good :)Comment #71
alexpottCommitted bbed679 and pushed to 8.x. Thanks!
The change notice what this allows us to do should be covered by the change notice for #1498720: [meta] Make the entity storage system handle changes in the entity and field schema definitions.
Comment #72
penyaskitousers table mail column in D6 is varchar(64), so it's fine.
Comment #73
chx CreditAttribution: chx commentedBTW. this is 100% broken #2174509: Drupal\Core\Field\FieldDefinition::getSchema() is broken