Problem
As #2002138: Use an adapter for supporting typed data on ContentEntities revealed, the current validation of typed data references is problematic as entity constraints end up being called with two different arguments: The entity object or the typed entity object wrapping it.
This is critical, because the critical #2429037: Allow adding entity level constraints is postponed on it in order to avoid an API change for entity-level constraints.
Proposed resolution
Add support for intelligently unwrapping typed data objects, such that objects directly implementing the interfaces are not unwrapped while wrapping objects are removed.
Remaining tasks
RTBC and commit th patch.
User interface changes
none.
API changes
- TypeDataInterface::unwrap() is added. This method is implemented by the base class TypedData, so will not affect many implementations.
- Validators on the entity level now always receive the entity object itself and don't have to check if it is a typed data adapter or not.
Original issue by fago:
As #2002138: Use an adapter for supporting typed data on ContentEntities revealed, the current validation of typed data references is problematic. It applies the constraints specified on the data referenced target are applied on the data reference value. However, when the target implements ComplexDataInterface it leads to strange behaviour of passing unwrapped data to the constraints for the reference (as it does not implement ComplexDataInterface), while the data is passed wrapped to the constraint when validation is invoked from the typed data target object.
To fix, \Drupal\Core\TypedData\TypedDataManager::getDefaultConstraints should not add in the target reference constraints, but instead the validation metadata classes should make sure to trigger validation of the data reference targets based on its defined constraints.
Comment | File | Size | Author |
---|---|---|---|
#67 | interdiff.txt | 2.16 KB | rteijeiro |
#67 | d8_data_reference-2346373-67.patch | 14.73 KB | rteijeiro |
#64 | d8_data_reference.patch | 14.53 KB | fago |
#64 | d8_data_reference.interdiff.txt | 889 bytes | fago |
#62 | d8_data_reference.interdiff.txt | 7.35 KB | fago |
Comments
Comment #1
fagoComment #2
fagoBumped into a problem with data reference validation again with #2429037: Allow adding entity level constraints. After thinking about the whole problem space I think we should do the following to solve both issues:
- Allow data types (being objects) to annotate whether they are wrapped or not, while default to "wrapped". Only field items (and so their lists) would fall into the category of not being wrapped, but directly implementing the typed data interface.
- Leverage the 'wrapped' annotation to improve DX and always pass on unwrapped data, while skpipping unwrapping for not-wrapped data types (=fields).
- Given the improved data unwrapping, constraint validators will receive the same data in both situations.
- Do not add the constraints of referenced data in in TypedDataManager::getDefaultConstraints any more - not all of those constraints might be relevant for validating the referenced data is right, but might validate the data itself. Instead just add constraints to the the data reference property (entity, language) directly, what will now work nicely as the passed on data is always the same.
In case additional validation constraints should be added, it's fine to just add them to the data reference property (entity, language) directly. I do not see any reason they would have to be on the target definition as well - the target definition is only required for typed data to know what kind of data is referenced here, such that a typed data object for it can be created.
Comment #3
fagoPatch attached which implements points 1-3. Point 4 is an not required to fix this, but to solve the problem of constraints being applied wrongly to references as run into at #2429037: Allow adding entity level constraints. Thus it's best addressed there.
The patch introduces TypedDataManager::unwrap() and applies it accordingly in the systems using typed data - currently that's validation and contexts only. In both cases this ensures that entities and fields are always passed on in the best format. For contexts, this means that field items will now be returned as field item object from $context->getContextValue() also. DX++.
Comment #5
fagooh, this broke the ComplexData constraint as now complex data isn't necessarily passed as typed data any more. Attached patch fixes that.
Comment #6
klausiAre we only doing this because FieldItemBase is special? Shouldn't FieldItemBase::getValue() just return itself? Or should we implement the same adapter approach for fields?
I just think it is a bit ugly to pass the typed data manager around everywhere just because sometimes data objects implement TypedDataInterface themselves and sometimes not. I wonder if we can make it consistent somehow?
Comment #7
fagoYes, and FieldItemLists. But any other object could opt to do it the same way.
Nope, it needs to be idempotent with setValue().
This was considered a while ago when adapters for entity are planed, but there were no clear benefits for doing such a massive change back then. Right now, it's way too late for that nor are there good reasons to do it.
Not sure what you mean? Do you dislike injecting the manager into more objects?
I don't see why that would be a problem, moreover it makes sense for code dealing with typed data to have the manager at hand. We could have an unwrap() method on TypedDataInterface also, but it doesn't make so much sense there as not all typed data objects have to be unwrapped. Secondly, we try to keep typed data interface slim, such that there are not too many Typed Data API related methods on object that opt to implement typed data interface directly like fields.
Adding the 'wrapped' key was already considered back #2002138: Use an adapter for supporting typed data on ContentEntities but postponed there as it wasn't necessary to get the task done. But this issue and #2429037: Allow adding entity level constraints show that is critical for improving DX in conjunction with typed data, as clearly entity validation constraints would except to receive entity objects instead of typed data objects also. We'd face the same issue with other sub-systems: Consider implementing tokens based on typed data objects: Same thing, entities should stay entities, fields fields and primitives the primitives - adding the simple 'wrapped' flag allows us to do that.
Comment #8
yched CreditAttribution: yched commentedNo clear or strong opinion either way, but if we had an unwrap() method on TypedData, would it mean we don't have to put the info in the data definition ?
Comment #9
klausiYes, I think that way we could avoid it. If we add the unwrap() method to the interface with a default implementation at TypedData which is just
Another problem with having it on the typed data manager is that this is not extensible. The typed data object cannot decide how itself should unwrap; the logic is hardcoded in TypedDataManger::unwrap() which only supports ListDataDefinitionInterface and the "wrap" flag at the data type definition.
I don't know if there will be many additional uses cases to customize the unwrap() behavior, but we already have 3 possibilities implemented in the patch at TypedDataManger::unwrap(), so this looks suspicious. Maybe this justifies adding a new method to the interface?
Comment #10
fagoYeah, then we could do it without the info - right. I agree that the method would be bit more flexible, so if you are fine with having a method $field_item_list->unwrap() which just returs $field_item_list and the same for $field_item, I'd be happy to update the patch.
We can document it to remove any wrapping typed data object, and as there is no wrapping object in case of fields - it does nothing. Makes sense?
Comment #11
yched CreditAttribution: yched commentedI kind of like the "TD::unwrap()" over the 'TDM::unwrap(TD) + flag in the definition",
but it's not clear to me what the difference with getValue() is - getValue() is about unwrapping too, isn't it ?
So getValue() *always* unwraps, while unwrap() doesn't always unwrap ? It feels a bit confusing :-)
Comment #12
fagogetValue() gives you the internal value, whatever that is (free to define by the class), but it's the same value as setValue takes.
Right, depends on what you call "wrapper". I'd not consider the field class a wrapper, it's the main, primary API and not a wrapper object. Does there is nothing to unwrap. Not sure this point is simple to make in the docs though.
Comment #13
yched CreditAttribution: yched commentedYeah, independently of the approach ("TD::unwrap()" or "TDM::unwrap(TD) based on how the data type describes itself"), this notion of "the data object wraps something or not" feels a bit confusing atm.
- all $data objects have internal values, and those are distinct from the object itself ($data->getValue() is always !== $data),
- but some "wrap" their values ($data->unwrap() === $data->getValue()), and some don't ($data->unwrap() === $data) ? What's the difference ? A given data type choses to do describe itself as one or the other based on what ?
Plus, I think it's a common beginners's introduction to TypedData to explain that "it wraps native PHP data structures into a common interface" ? But it's not the same kind of "wrapping" we're talking about here ?
So, I'm not sure we have nailed it conceptually yet :-) Maybe 'wrap' is not the right terminology for what this tries to fix ?
Comment #14
yched CreditAttribution: yched commentedAlso, wondering if this is a question of "different "kinds" of TypedData need different base classes or interfaces" ? (like at some point we considered whether we needed something like LitteralTypedData)
Comment #15
fagoYes, but sometimes the "native PHP structure" directly implements the common interface - such as for fields. So there is no wrapper.
I'd explain it as
"it makes PHP data structures accessible via a common interface" and append "For that, objects may implement the interfaces directly or use wrapper objects implementing the interface."
But maybe let's better discuss this in IRC.
Comment #16
fagoThe critical #2429037: Allow adding entity level constraints is now postponed on this. This makes this critical.
Comment #17
fagoComment #18
larowlanI think this needs work based on comments?
Comment #19
yched CreditAttribution: yched commentedSlowly trying to get a better understanding of the problem space - not there yet, but making progress :-/
Meanwhile, stumbled on #2443733: TypedDataTrait should be TypedDataManagerAwareTrait
Comment #20
yched CreditAttribution: yched commentedAlso, reading up the thread, the original report says the source of the issue is that :
"Validation applies the constraints of the referenced target data to the reference data value" - i.e:
When $e->field_ref->entity references $e2, then validate($e->field_ref) runs the constraints that exist on $e2.
Then the problem is that it does so in a way where the validators receive the value in a format that is different from the one they receive when validate($e2) is called directly ($entity_adapter in one case, $entity in the other)
But I thought this behavior ("validate($e->field_ref) runs the constraints that exist on the referenced entity") was something we discussed we should stop doing in the first place ? If we do so, what is there to fix then ?
Comment #21
fago#2429037: Allow adding entity level constraints currently has the hunk which moves the constraints from the reference target definition to the reference property, but that does not fix this issue. Still, the reference property validation constraints would receive the unwrapped entity, while the constraints on entity level receive the wrapped entity. So this issue fixes that by improving how we unwrap data: it's not the validator integration applying a rule of thumb that complex data should be passed wrapped, it let's the data decide how it's main representation should be.
Comment #22
klausiI think the next step here is to move the unwrap() method from the TypedDataManager to the TypedData objects themselves.
We have not heard any alternatives to the name "unwrap" yet, so let's use that for now.
Comment #23
sidharrell CreditAttribution: sidharrell commentedwhile trying to grok this issue (complete fail, btw :)
I realised that #5, that passed testbot, is just the interdiff.
I think I'm starting to get the hang of this:
I rolled my git back to the commit that #3 was applied against in the qa.drupal.org log above.
made a branch for the issue, applied #3, applied #5, merged current head into my branch, created the patch.
See, this is what I meant by low hanging fruit. And I didn't even have to understand a bit of the code.
Comment #24
sidharrell CreditAttribution: sidharrell commentedgonna try to move the unwrap function, but don't wait on my account.
Comment #25
sidharrell CreditAttribution: sidharrell commentedk, I understood nothing of the surrounding context, but I understood, move method from here to there.
So don't laugh when this blows up. I'll just be happy if testbot survives.
Comment #26
sidharrell CreditAttribution: sidharrell commentedComment #28
sidharrell CreditAttribution: sidharrell commentedThis fails locally, too, but slightly less stupidly.
And with that I will bow out as gracefully as I can, and go trawling for some lower hanging fruit.
Comment #30
larowlanthank you @sidharrell
Comment #31
sidharrell CreditAttribution: sidharrell commentedspotted a couple things looking at the patch anew today.
Comment #33
sidharrell CreditAttribution: sidharrell commentedIn trying to figure out why that blew up, looks like in this call stack
Drupal\Core\TypedData\Validation\PropertyContainerMetadata->accept(Object, Object, 'Default', '')
Symfony\Component\Validator\ValidationVisitor->validate(Object, 'Default', '', , )
The second parameter to accept is not typedData.
Comment #34
sidharrell CreditAttribution: sidharrell commentedrevert to #23, which was fago's original patches.
Comment #35
sidharrell CreditAttribution: sidharrell commentedok, I really do apologize for being unnecessarily noisy on the issue, but I'm really trying to understand, so can someone maybe be so kind as to clarify:
PropertyContainerMetadata holds TypedData through composition ($this->TypedData), but also implements the MetadataInterface, which requires it to implement the accept method. PropertyContainerMetadata's implementation of the accept method's second parameter is $typed_data. Now is there any requirement that that $typed_data is the same as $this->TypedData?
Comment #36
sidharrell CreditAttribution: sidharrell commentedlooks like this patch may conflict with #2343035: Upgrade validator integration for Symfony versions 2.5+
should one or the other be postponed blocked on the other?
Comment #37
klausiThey can be developed independently and can be rerolled once the other issue gets committed.
Comment #38
fagoI think we can just update the patch which ever lands second.
Given, the way the metadata is initialized, it's the same as long the $typed_data being validated is not NULL or a primitive.
I picked up #31 and worked over it, so we have an updated implementation using a method only instead of the annotation. Patch is definitely simpler and more straight-forward compared to #5.
Comment #39
fagoComment #41
sidharrell CreditAttribution: sidharrell commentedadded an unwrap method to Map. It was using the TypedData unwrap method, which returned an array, which was passed into visit on line 33 of PropertyContainerMetadata.
Don't know if that's the right approach, but it fixed the test locally.
Comment #42
fagoWe cannot assume that complex data always use no wrapping typed data objects, e.g. entities do - thus restored the fix of #5 instead of #41.
Comment #43
klausiCool, IMO this is RTBC now; @yched: can you sign this off, too?
Comment #44
YesCT CreditAttribution: YesCT commentedDo we need a change record for this?
Or do we need to find other change records to update.
Comment #45
fagoAs there was no easy way to add entity-level constraints so far, I doubt that anyone has done so. I think it's enough to add a change notice as part of #2429037: Allow adding entity level constraints for that being added.
I don't think so.
Comment #46
jibranI think @larowlan has done something similar here in #2405467: Add EntityType and Bundle constraint on entity property of DynamicEntityReferenceItem
Comment #47
larowlanLooks good to me, one minor nit than can be fixed at commit time if possible
nitpick: s/objects directly implement/object directly implements
Comment #48
arlinsandbulte CreditAttribution: arlinsandbulte commentedJust edited the patch file to implement #47.
Keeping RTBC
Comment #49
arlinsandbulte CreditAttribution: arlinsandbulte commentedComment #50
yched CreditAttribution: yched commentedSorry for letting this slide, having a hard time carving out d.o time lately.
I still find this really confusing, so, trying to summerize what the patch does and why I feel this adds confusion - not sure I have better to propose though :-/
This patch adds unwrap() as a new way of "getting the content of a TypedData", in addition to the existing getValue().
It is "related" to getValue() (in the default implementation, $this->unwrap() === $this->getValue()), but different (in some other cases $this->unwrap() === $this).
- It is up to each TypedData subtype to decide what its unwrap() should do.
As a developper of a data type, it's unclear based on what I should chose one or the other behavior for my unwrap()
(side note : unwrap() either returns $this or $this->getValue(), right ? Is there any other valid behavior ? If not, shouldn't we instead put a flag in the @TypedData annotation, and have the base implementation in TypedData::unwrap() do one or the other based on the value of the annotation ?)
- It is up to each caller to figure out whether it wants to do $data->getValue() or $data->unwrap() - the difference still confuses me, personnally.
Current places that use unwrap() (instead of getValue() so far) :
Core\Plugin\Context\Context::getContextValue() - note the *Value* in the method name...
Core\TypedData\Validation\Metadata::accept()
Core\TypedData\Validation\Metadata::getProperty*Value*() - note the *Value* in the method name...
Core\TypedData\Validation\PropertyContainerMetadata::accept()
This is pretty meta stuff :-), which doesn't help making sense of the specific semantics of unwrap() as opposed to getValue().
(side note : not that it would make the difference clearer, but couldn't unwrap() be replaced by getValue(some flag) ?)
Comment #51
webchickSounds like this could use more discussion.
Comment #52
fagoSee #23 for a ready patch using annotations. However, once we use a method on the typed data object I do not think there is a point in using an annotation, why not just use the method to define the behaviour?
Right, but the difference is not new - it's already in the system in that we have objects being wrapped by typed data and objects implementing the interface directly. We just add a way to deal with it.
However, I agree that having both methods on the object might be confusing - this is way I avoided introducing yet another method on the TypedDataInterface in the initial patch (#23), such that the method can be hidden on the TDM for systems interacting with data based on the Typed Data API.
Yeah, but the value of the context - that does not mean it has to return the value of a field item for a field item instead of the object. But I guess where you are heading to is that it should be called a "value" also?
My initial reaction to that was that it muddies the waters, but it might be another of implementing things while avoiding the confusing choice:
- make getValue() of FieldItem and FieldItemList return $this.
- keep it mostly idempotent by supporting taking values over when an object is passed in setValue()
- there is already toArray() for getting the internal value
This would make getValue() the new unwrap() and remove the choice of how to get the value on the Typed Data layer. We could also add an in optional $internal flag to it, to make getValue() behave like toArray()? That means, object could opt to have an internally different value representation.
Comment #53
yched CreditAttribution: yched commentedMy suggestion was about making it clear that unwrap() can only return one of two things : either $this, or $this->getValue(). So, having the logic contained in the base Typeddata::unwrap() implementation, controlled by a boolean flag - as opposed to having to open each unwrap() implementation to see which one of the only two options it choses to do.
But yeah, annotations typically hold data to be used by an external class (e.g. the plugin manager), rather than used by the class itself to control its own implementation (it doesn't really need the annotation for that, it can use a static protected boolean property or method), so agreed it might be a bit uncommon. Not married to the idea.
Well, it would still leave the questions of "should I $data->getValue() or should I $tdm->unwrap($data) ?" / "what is unwrap() ?" :-) But yeah, one upside is that having unwrap out of the interface could mean more people will simply ignore the dilemma.
Yeah, I guess it's that distinction I'm having a hard time figuring out exactly :-)
- FieldItemList::unwrap() returns $this (a TypedDataInterface), but ItemList::unwrap() returns $this->getValue() (an array).
- Same distinction between FieldItem and Map.
Why ? What is the intrinsic conceptual difference between FieldItemList and ItemList, or FieldItem and Map, regarding how they relate to their "content" ? I have a vague feeling that the difference is not really intrinsic to those classes, but maybe rather intrinsic to how their respective *parents* typically chose to use them in a nested structure (ContentEntity -> field item list -> field item -> property) ?
My struggle with the current patch is that I have the impression that, because a couple very specific visitors (the places pointed at the end of #50, Context and Validation) will need a TypedData instead of an array for some operations on some classes, we're trying to cram an opportunistic conceptual difference where I have a hard time clearly seeing one. It would be just fine if the method name could be clearly scoped to a specific domain behavior (e.g. $data->howIShouldBehaveForValidation()). But I fail to see the more generic conceptual notion of "unwrap'.
Interesting. I think we have another issue about "$entity has toArray(), $entity->field has getValue(), $entity->field[0] has both, how do I do debug dumps ?".
TypedData-wise, toArray() is currently only for ComplexDataInterface, right ? Do you think of promoting it to TDI ?
Still haven't completely parsed how it would play with the issues above, but wouldn't that be a massive API change ?
Comment #54
sidharrell CreditAttribution: sidharrell commentedWhat if unwrap always returned a typedData? So getValue is for getting at the raw value, but unwrap is for getting at the lowest layer that is still typedData ( implements the typedDataInterface). So you still get the desired behaviour, entityAdapter unwraps to it's entity, and a Field returns itself, but the API definition is cleaner.
Comment #55
yched CreditAttribution: yched commented@sidharrell : "what if [...] unwrap is for getting at the lowest layer that is still typedData (implements the typedDataInterface) [...] so entityAdapter unwraps to it's entity"
Entity doesn't extend TypedData anymore, so that wouldn't work :-)
Comment #56
alexpottDiscussed with @effulgentsia, @webchick and @xjm. There was some discussion as to why this was critical if it not necessary to do #2429037: Allow adding entity level constraints. However as the issue summary points out this allows us to make that change without an API change - so leaving at critical. However, given the discussion in #50 and onwards maybe it is worth reconsidering whether or not we should change the API in #2429037: Allow adding entity level constraints?
Comment #57
alexpottAdding tag that I should have added in #56
Comment #58
jibranTo #44 I think we should create a change notice for sanity sake.
With all due respect this doesn't mean people(like me) are not trying to understand the validation api. Change notice would be a great help to understand this change. Thanks for the work on this patch.
Comment #59
fagoIt's intrinsic to what is the primary representation of the data - the internal value or the object implementing the typed data inferface. It's not only the parent, it's every other component dealing with the data which wants/should deal with it based on it's primary representation.
It's different to tell what the general "map" and "list" classes should do here as we do not really use them on its own. But thinking about e.g. a list of strings, it seems to be the saner default to unwrap as the contained items in the returned are array are just strings, while the contained values from an ItemList object are again TypedData objects.
As explained above, I think it's really a general behaviur of the object which is not special for a certain use-case. Field items should always be given to other APIs / system components as field item objects while entities always should be as entities. I don't see why that would differ by use-case - it only differs on whether the leveraging component wants to make use of the TypedDataAPI or not, but in the "not" case it's always the same unwrapping the behavior that we need. Thus, baking that in only special to our current two use-cases would a) introduce weird dependency cycles of TypedData depending on it's leveraging components and b) fail to provide a solution for any further leveraging components we do not know yet.
Imho, it should be in ListInterface as well - or better just in a common parent #2350597: Extract a common DataContainerInterface for lists and complex data. Non-containers are generically not representable as arrays so it does not make much sense on TDI.
To field items and item lists, yes - so I agree that it's probably not a good idea to do it at that point.
Yep, I think if we are not able to come to an agreement here soon, we'll have to move on with #2429037: Allow adding entity level constraints to unpostpone things and either do another API change later or live with this and the bad DX around entity-level constraints.
Comment #60
fagoI figured that we should be able to also discover whether something needs to be unwrapped via #2028097: Map data types to interfaces to make typed data discoverable also - as we can easily check whether the provided interface extends TypedDataInterface. So that helps us removing the way to specify the "unwrap" annotation, but it doesn't solve the problem of having to have a method for unwrap()/getData()/... in addition to getValue().
So still, we need to add a reasonable named helper to TDM, as the previously suggested TDM::unwrap().
Comment #61
fagodiscussed things through with yched per skype (thanks!). We worked out the semantics and came up with the following suggestion:
- Have TDM::getCanonicalRepresentation().
- Add to data type annotation: unwrapForCanonicalRepresentation().
Also, we figured that it's better to have the method on the TypedDataManager instead of having it on all the classes, as still the method does not make sense from the perspective of an object implementing the Typed Data API and using this object as canonical representation - as fields do.
Comment #62
fagoHere is a patch which implements discussed changes. Also, I add some more docs to explain things - please review.
Comment #63
jibranThanks for the improvements.
it took me while to understand the doc but after reading 3-4 times it makes a lot of sense. This is a huge improvement.
It would be nice if this is explained in a comment.
Comment #64
fagoThanks, jibran. Added that comment.
Comment #65
jibranI think it's ready but let's wait for @yched.
Comment #66
yched CreditAttribution: yched commentedNice ! Thanks @fago for a great discussion yesterday, and for being so quick with the patch :-)
Mulled a bit more on the doc explanations, and I'd like to propose the following for the phpdoc of TDM::getCanonicalRepresentation() - adjusted from @fago's proposal. If polemical, we can commit as is, and refine later :-)
Other than that :
That hunk confuses me a bit. That's a case where we need to do the opposite, get the TDI if a value was passed unwrapped ? How can we do that ?
Comment #67
rteijeiro CreditAttribution: rteijeiro commentedUpdated the comment with #66 suggestions.
Comment #68
fago@docu-improvments: Thanks, that works for me.
Generally, we cannot. It's up to the component API to provide you with that flexibility if you want to enable such generic validators / actions / or general implementations. Thus, in that case it's the validator metadata which makes it possible - in the context API case it's the context API which alternatively allows you to fetch the typed data object for such situations. It worked out already this way in Rules d7, so I think the approach is reasonable (and not new to this issue).
Comment #69
fagoAdded change record draft as requested: http://drupal.org/node/2462107
Comment #70
klausi"unwrap_for_canonical_representation" is not pretty ... naming this is really hard. I'm fine with the docs on TDM::getCanonicalRepresentation(), so it hopefully is enough to explain this.
CR also looks good, RTBC.
Comment #71
fagoyeah, our thinking was - better long and descriptive than short and unclear.
Comment #72
alexpott@fago and @yched nice work! This issue addresses a critical bug and is allowed per https://www.drupal.org/core/beta-changes. Committed 2575356 and pushed to 8.0.x. Thanks!
Removed these unused use statements on commit.
Comment #74
YesCT CreditAttribution: YesCT commentedthis helps unblock #2429037: Allow adding entity level constraints which is blocking #2395831: Entity forms skip validation of fields that are not in the EntityFormDisplay
Comment #75
klausiOops, the typed data manger is used wrong in the Context class here, filed follow-up: #2467411: Context class does not use typed data trait correctly, leading to fatal errors.
Remember: always use the trait methods (of TypedDataTrait for example), never the trait instance variable directly (unless you are absolutely certain that it has been populated already).