Early Bird Registration for DrupalCon Portland 2024 is open! Register by 23:59 UTC on 18 March 2024, to get $100 off your ticket.
PHP has interface. Let's rely on those for checking primitive types instead of putting it in the type definition (it seems really silly that we've been doing that).
Related "Make TypedData API sane" issues:
#1928938: Improve typed data definitions of lists
#1988492: Avoid having empty field items by default
#2002134: Move TypedData metadata introspection from data objects to definition objects
#2002138: Use an adapter for supporting typed data on ContentEntities
#1867880: Bring data type plugins closer to their PHP classes
Comment | File | Size | Author |
---|---|---|---|
#45 | primitive-interfaces-2002102-45.patch | 41.84 KB | Berdir |
#45 | primitive-interfaces-2002102-45-interdiff.txt | 2.11 KB | Berdir |
#42 | primitive-interfaces-2002102-42.patch | 41.57 KB | Berdir |
#42 | primitive-interfaces-2002102-42-interdiff.txt | 5.5 KB | Berdir |
#40 | primitive-interfaces-2002102-40.patch | 41.11 KB | Berdir |
Comments
Comment #0.0
fagoUpdated issue summary.
Comment #0.1
fagoUpdated issue summary.
Comment #1
dixon_Comment #2
fagotagging
Comment #3
dixon_Here's a first iteration which is sort of an intermediate step before starting work on #2002138: Use an adapter for supporting typed data on ContentEntities. In that issue
getValue()
andsetValue()
will probably be moved around a bit.I still need to change references to the primitive constants to use the interfaces for checking types.
Also, additional tests needs to be written. I'm only testing the new getters currently.
Although there's work still to be done, I'll let the bot have a taste of this.
Comment #4
BerdirBoolean and others extending string seems a bit weird?
I don't understand why we have to rename getValue() to get$Type(), understand getValue() is confusing if it's on every level and behaves differently, but why not have a shared getValue()/setValue() on PrimitiveInterface?
Otherwise that's going to make the code in e.g. FieldItemBase way more complicated, no?
Comment #5
dixon_Something that did struck when catching up on the code was that -- do we really have a need for making a distinction between primitive types and other data types? The initial use case was serialization, but that logic seems to be doing just fine without the need for primitive types during parsing.
Of course, there might be use cases outside of core for a limited set of primitive types that everything boils down to in the end. But right now I don't see it (I only see
PrimitiveTypeConstraint
which probably could be written differently if we didn't make a distiction between "normal" data types and primitive data types).Or have I missed something? Will something be different when we have converted everything to the new Entity Field API?
Comment #6
dixon_@Berdir I'm a bit torn here as well. The goal over at #2002138: Use an adapter for supporting typed data on ContentEntities is to remove the
TypedDataInterface
wheregetValue()
,setValue()
andgetString()
currently lives. The idea initially was that we should be able to cast most types into a string, for printing in the UI etc. And sinceTypedDataInterface
is going away, I thought it made sense for types that can be casted to extendStringInterface
instead.And wasn't the decision that
getValue()
andsetValue()
should go away eventually, because it's very generic and confusing? I don't necessarily agree the code will become more "complex", it will get more explicit :-) But yes, it will take more lines of code to set values... Hmm...Comment #7
msonnabaum CreditAttribution: msonnabaum commentedI also think it's a bit odd to see those interfaces extending StringInterface.
When I think of interfaces, I think of them as roles. If I have a float, I wouldn't say that it's playing the role of a string, but rather an object that can be cast to a string. I might go for something like StringableInterface or something like that, with only a toString() method.
Comment #8
msonnabaum CreditAttribution: msonnabaum commentedConcerning getValue, I think that's probably fine to stay as is. It makes more sense to see BooleanInterface::getValue than it does BooleanInterface::getBoolean. If the object's identity is already clear from the interface it implements, it doesn't need to reiterate it in that method.
That's not to say we shouldn't have more domain-specific methods on each interface where appropriate. For example, DurationInterface might have getStartTime() and getEndTime().
Comment #9
dixon_I thought extending
StringInterface
was a simple way to solve casting ;) ButStringableInterface
orStringCastableInterface
makes more sense indeed.Regarding the method names we came to the conclusion that we needed more explicit names, because when chaining on entity objects we'll have
getValue()
on each and every level.Comment #10
msonnabaum CreditAttribution: msonnabaum commentedIs there a specific example where that'd happen? It's difficult to consider that in the abstract.
Comment #11
Berdir@dixon_ Yes, but I think we're talking about the same level here. So TypedDataInterface::getValue() will go away, but a PrimitiveInterface::getValue() still makes sense because each primitive has a value.
Comment #12
dixon_Ok, so here's another iteration with
PrimitiveInterface
wheregetValue()
andsetValue()
is defined.There's still some failing tests, because we need to re-think the validation a tiny bit, since it's not enough to just pass around the result of
getValue()
anymore, since we now want to check on intefaces. We could avoid this if we explicitly map a validator to each primitive type, which might make sense.And that leads me to another thing I've been thinking of; do we really need to make a distinction between normal data types and primitive data types? We don't really use it in core anywhere since the serilization (the primary use case originally) is doing fine without primary types.
Comment #14
fagoAs discussed on IRC I still think there is value in differentiating primitives + others. Also, the interface makes sense - so why not have it? The question is whether we should allow other data types to be primitives as well? E.g., a mail data type is also a primitive, thus it should be able to inherit from StringInterface also.
Well, having a bunch of classes with just one simple check seems a bit of an overkill to me. I'm fine with having a PrimitiveTypeConstraint account for all core primitives, but I guess it should not fail if the primitive is not among the ones defined by us. Why bother?
Howsoever, one constraints or many - that's just an implementation detail so I don't care really.
At least it's confusing if you can do $entity->getValue(), I'm not sure it's so confusing on the other EntityNG levels, so we might keep it there.
Right. So one use-case we need to account for is the old out-standing issue #1867888: Improve TypedData date handling. Initially we started having $node->created->value as php DateTime objects, what we thought is nice API-wise. However, it resulted in some performance troubles and additionally complexity on the storage layer as it requires the storage layer to know on how to map it back to storage representation.
So, instead - right now $node->created->value just remains an integer - it's a timestamp, what we *have to* solve as it loses the meaning of being a date. So what we need here is a way keep that value as integer, but to tell the system that this can be read and written as a date also.
So the idea behind an interface like
DateInterface {
function getDate();
function setDate();
}
was that this allows that we implement the DateInterface alongside the IntegerInterface.
So, how should we approach that best?
For strings, I agree that having everything implement the StringInterface is a bit weird, and it's unnecessary. If you have an integer, you know you can cast it to a string - we don't have to specify that. But we have to declare that $node->created->value is an integer that works as date also.
So thinking about that, a date or a duration is actually always represented as string or integer, which then can be somehow interpreted as a date. So what about treating DateInterface and DurationInterface different in that, such that those have getDate(), getDuration() methods which we can implement in addition to the regular getValue() method of a string or integer interface?
Comment #15
msonnabaum CreditAttribution: msonnabaum commentedIf getValue returning an instance of DateTime is a problem, that's a tough one. My instinct would have been to do that as well.
The performance issue could maybe be solved with lazy instantiation and if that causes an issue with the storage controller, it would seem to me that there needs to be a method on the interface that always returns a value that's storable.
Comment #16
fagoWe did the lazy-instantiation. I think it was Drupal the variation of the class that added quite some of the weight, but still, we cannot simply do away with that.
>"returns a value that's storable."
Dates can be stored in various different ways, so just having a single method would not cut it. I actually think having $node->created->value be the thing that is stored is good, it makes $entity appear as it's stored and not something totally arbitrary different we make up out of the stored data.
Also that way, the storage is able to put a timestamp into it and gets a timestamp back - quite reasonable. Still, you could work on top of the DateInterface to treat is a date object?
So what about that?
Consequently, we could move Date and Duration out of primitives. In PHP those are not primitives anyway, and every primitive could use the getValue() approach. Then, it's up to the storage to decide whether a timestamp will be stored as integer (we do that) or uses some Date storage or index functionality.
Comment #17
BerdirSomething like this?
- Renamed DateInterface to DateTimeInterface as discussed, add getDateTime() and setDateTime() methods, no longer extends from PrimitiveInterface
- Renamed data type to datetime
- Renamed the Date class to DatetimeIso8601 and $this->value is an ISO 8601 date string.
- Adjusted validation and tests to show how this works now for Date, others will need to be adjusted in a similar way.
Comment #18
fagoExactly - this looks good to me!
If it doesn't implement PrimitiveInterface it should inherit from TypedData - getValue() doesn't get you the DateTime object.
However, I'm not sure about the internal implementation.
I think it should generate the DateTime object on the fly + just store the value given - so we can validate it later on. There might be validation that wants to validate the given string.(edited: that's the case)Then, setValue() right now accounts for timestamps also - what is a bit strange in a DateTimeIso8601 class. I'd rather move that to a separate Timestamp class and only support it there. Passing a timestamp to a DateTimeIso8601 is nothing we need to support I think, you still can create a DateTime object out of it + use the date setter if you need to.
Comment #20
BerdirWorked a bit more on this.
- Removed support for date time objects and timestamps in setValue(), cleanup
- Added a Timestamp type that implements IntegerInterface and DateTimeInterface but the value is a unix timestamp. Working nicely.
- Reverted validation to rely on types again, we can't pass through the primitive objects as $value, that kills all other validations. Had to trick a bit to make Date work as we want the getDateTime() method, so passed the typed data through the constraint. As datetime/timestamp are no longer primitive, it probably also shouldn't be validated in there. Instead, the default false should be removed I think and those objects should get their own constraints/validators. But keeping it seems to fix the typed data and entity validation tests for now.
Comment #21
fagoIt's weird when the primitive type constraint receives a non-primitive type as constraint option?
Also, I don't think passing through $typed_data like this is a clean solution. Afaik the options-list patch implements a $metadata->getTypedData() helper method for that, which allows constraints validators to get the typed data object. Maybe we can borrow it from there?
Left over debug statement.
Agreed. If the class does not implement a primitive interface - the constraint should not be added. If it does, it should validate that then, e.g. keep checking interfaces there + validate the matching interface. Then the question on whether we want to default to FALSE here, boils down on whether we want to allow contrib to add their own primitive type? If stuff like Date/Duration is no primitive, I cannot think of a use-case for that.
For doing clean date/duration validation I think we then should do
- allow data types to specify their interfaces in addition to classes, so we can make a "date" data type which only refers to the DateInterface + defines its constraints. (so class becomes optional, can we use the interface for annotations?)
- add a type matching helper, which uses the interface to check whether something is a date
- in the manager, get parent interface, get types for those interface, add validation of those types
I feel like we should do that in its own issue and go with all the validation hard-baked in the one constraint for now, so we could focus on that separately?
e-mail is not a primitive, so that's weird. As said, let's better keep checking for the interfaces here?
Comment #22
BerdirI reverted the change to pass the object in to the validators, so I had to go back to the type as I no longer had an object to check against an interface. But then noticed that I do need the object for a date validation so I added the constraint hack. If there will be standard way to get the typed data object great, then we can go back to the interfaces.
A mostly wanted it to get back to green so that we now if the fails are only related to broken validation or if there's anything else. Looks like there isn't, which is good.
Comment #23
fagook, worked a bit on this one. Patch is not done yet (has test fails), but should be good enough for some first reviews.
Comment #25
fubhy CreditAttribution: fubhy commentedOnce #1867856: Use annotation discovery for data type plugins is in I'd like to also suggest that we inverse typed data type discovery and base it on the interfaces of our objects, making those interfaces unique and bound to a single typed data type. I'd like to see the annotations for the typed data types on the interfaces instead of the actual objects in that case so that we can ensure that, for example, each entity type has to have a unique interface (Node => NodeInterface, specific to entity:node). This gives us the degree of type safety that would be required for making #1906810: Require type hints for automatic entity upcasting absolutely reliable and collision-proof. E.g. if I am trying to identify a typed data type based on reflection through the typehint of an argument on a controller it might potentially fail if the interface I am typehinting to is ambigious in typed data land. The only way to make that reliable would be to introduce a 'interface' key to the typed data definition and the logical consequence of that is that, instead of putting the @DataType annotation on the class, we would put it on the interface instead.
Comment #26
fagoYeah, I think enforcing an interface to represent the type is cool to do. Let's open a follow-up issue to this one?
This one needs-work for the test fails though.
Comment #27
fubhy CreditAttribution: fubhy commentedSorry, didn't mean to change the status. Had it open before the patch got uploaded :).
Comment #28
BerdirFixed most of those tests, except the context plugin stuff.
Comment #30
BerdirThe filter stuff was a HEAD fail, but no need to re-test, the Context fail is valid.
Comment #31
msonnabaum CreditAttribution: msonnabaum commentedThe date stuff still looks pretty awkward to me, but this is still an improvement, so perhaps that can be a followup.
Comment #32
fubhy CreditAttribution: fubhy commentedI opened #2028097: Map data types to interfaces to make typed data discoverable as per #25
Comment #33
fagook to me.
Attached patch fixes the context fail by re-using the validator configured by the typed data manager. Also, I had a look at the timespan issue: I think it's perfectly fine to keep the duration in seconds even when represented as duration object as we cannot safely convert the duration to something that involves days/months/years. Thus, I improved comments in that regard.
Comment #34
BerdirI think that comment is missing the word *no* (... generally valid...)
Comment #35
fagoops, added the "no".. Skipping tests as it's just a comment change.
Comment #36
fagoWe just ran into a use case where we would implement StringInterface for a Map. There having getValue() return the string is not so nice as getString(). Therefore coming back to this one:
BooleanInterface is not the object's identity, it's a role the object plays. There might be multiple. So for the boolean "role", a way to get me a boolean makes sense. However, the boolean class' identity is being a boolean, so its getValue() should be the same as getBoolean()?
Comment #37
msonnabaum CreditAttribution: msonnabaum commentedIf there are multiple "get" methods that are giving different representations, they should be "to" methods.
From my perspective, a StringInterface means the class implementing it is going to primarily be a string. I'd expect methods like length(), toUpper(), etc. If we have a data type that is not primarily a string but can be represented as one, I'd expect that to implement something like StringableInterface that has a single toString() method.
Comment #38
dixon_Can you describe the use case in more detail, or is there an issue for it? Would it maybe make sense to "keep it simple" in favour of code reuse and just duplicate a bit of code in Map?
Comment #39
fagoThe use case is #1842858-18: [PP-1] Convert menu links to the new Entity Field API.
Well, what I dislike about StringableInterface is that it sounds quite made-up and not intuitiv to me. Then, when we do it that way e.g. for options-field use case storage would have to check for a) stringInterface and b) StringableInterface + take extra care for calling the right setter when loading again.
Whereas else it could just say that if it's implementing StringInt. it's stored that way.
Comment #40
BerdirRe-roll. That was fun ;)
Comment #42
BerdirStupid annotations ;)
Comment #44
effulgentsia CreditAttribution: effulgentsia commentedThis looks awesome! Minor nits, which I'd be fine with punting to a follow up.
Since we know this to be wrong, should we instead throw an exception when someone uses this type for durations longer than a month?
s/constraint data/constraint for data/
Shame to lose info about which type is needed. Is there a way to bring that back?
Comment #45
BerdirFixed those tests, the id for duration was wrong, wrong namespace for timestamp and also noticed that the use for PluginException in the typed data manager was missing as the test exploded. And fixed the comment.
- Not sure about throwing an exception, I'd rather make it a validation, if anything.
- The validation message was useless anyway, what you got in there was the constant value, so it told you "should be of type 5". *very* helpful ;)
Comment #46
effulgentsia CreditAttribution: effulgentsia commentedThis looks ready to me.
Comment #47
fagoYep, I agree we should move on with this.
Comment now exceeds 80chars - should not block commit though.
Comment #48
alexpottCommitted 2e9c7fe and pushed to 8.x. Thanks!
Comment #49
BerdirUpdated the example in https://drupal.org/node/1806650.
Comment #51
fagoSmall follow-up #2047119: Remove deprecated documentation in DataType annotation.
Comment #51.0
fagoUpdated issue summary.
Comment #52
sunIsn't a URI a string?
Shouldn't the UriInterface at least extend the StringInterface? (similar to Email)
Comment #53
chriscohen CreditAttribution: chriscohen commented#44 makes a very valid point about error reporting that I don't believe has been addressed. The error reporting is woefully incomplete and I came across an issue where I had to spend a large amount of time trying to figure out what the error actually meant because it was way too generic.
If there's any way to go back to the previous method or even increase the usefulness of the error message, I think this should be seriously considered.
If this needs to be done in a separate issue because this one is closed, let me know and I can set it up.