Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
Discovered in #2930028-42: Comprehensive JSON API integration test coverage phase 1: for every entity type, individual resources only.
#2543726: Make $term->parent behave like any other entity reference field, to fix REST and Migrate support and de-customize its Views integration added support to core for entity reference fields with null
values. JSON API does not support this yet. And consequently, it is not able to properly work with e.g. the $term->parent
field yet if it has "root" as the parent.
Proposed resolution
Update JSON API's entity reference field normalizer similarly.
Remaining tasks
Roll a patch
User interface changes
None.
API changes
None.
Data model changes
None.
Comment | File | Size | Author |
---|---|---|---|
#45 | 2940339-45.patch | 12.02 KB | Wim Leers |
| |||
#45 | interdiff.txt | 3.04 KB | Wim Leers |
#43 | 2940339-43.patch | 10.58 KB | Wim Leers |
| |||
#43 | interdiff.txt | 2.07 KB | Wim Leers |
#40 | 2940339-41.patch | 11.09 KB | Wim Leers |
Comments
Comment #2
Wim LeersComment #3
Wim LeersComment #4
Wim LeersNow really starting this… because this is now also blocking #2945093, per #2945093-53: Comprehensive JSON API integration test coverage phase 3: test JSON API-specific use cases: related/relationship routes, includes and sparse field sets.
Fortunately, there already are some existing test assertions that we'll need to make pass in HEAD. So let's start with those.
Comment #5
Wim LeersThis one should not be a problem.
This one could be trickier. Hopefully not!
Comment #6
Wim LeersThis would be simpler if #2940342: Cacheability metadata on an entity fields' properties is lost had already landed, because that's changing quite a bit wrt normalizing fields, this will most certainly conflict.
Comment #7
Wim LeersThere are six cases to support:
'data': null
target_id
pointing to non-existent entity:'data': null
'data': {…}
'data': []
'data': [{…}, {…}, {…}]
target_id
pointing to non-existent entity:'data': [{…}, null, {…}]
What do we support, and what do we have test coverage for?
More tomorrow.
Comment #8
gabesulliceAren't 2 and 6 data integrity issues?
Comment #9
Wim LeersNo. A top-level
Term
entity'sparent
field is explicitly set to0
(zero) to indicate that its parent is the non-existent "root"Term
.Comment #10
gabesullice0 !== NULL
and it's not a valid entity ID. Perhaps we should make a special case here for an entity ID === 0 and treat it as if it were the same as case #1.What about 6?
Comment #11
Wim LeersQuoting
\Drupal\Core\Field\Plugin\Field\FieldType\EntityReferenceItem::isEmpty()
, which we should respect:i.e. any value being set on
target_id
means it's not empty. ForTerm
entities, it's0
. For others, it could be'foobar'
.Data integrity is up to the entity type's validation constraints to guarantee. JSON API should not try to interpret it in some way.
Comment #12
Wim LeersTwo and six are the same thing. In case of e.g.
Term
s: a term could have both<root>
(target_id = 0
) andblabla
(target_id = 34598
) as a its parents. That should then result inComment #13
Wim LeersActually
this is wrong!
It should be
because the JSON API spec says at http://jsonapi.org/format/#document-resource-object-linkage:
and because the
parent
always is a to-many relationship:Fixing that expectation. The test will still fail though: HEAD is still broken.
Comment #14
Wim LeersAnd this makes
TermTest::getIndividual()
pass!However, this sadly causes e.g. the
PATCH
test coverage to fail. Because that doesn't know how to deal withnull
values.That being said, I think that the expectations in #13 are technically violations of JSON API… because it breaks linkage.
But this is how
Term
entities are designed to work! I'm not sure how to continue here. Thoughts?Comment #16
Wim LeersClarifying that this needs feedback.
Comment #17
gabesullice<can_of_worms>
This brings me back to my comment in #10.
I think what we need to remember here is that *the purpose of the
relationships
member is not to serialize entity references.*The purpose of relationships are to represent links between resources. If the target resource does not exist, there simply cannot be a relationship between a thing and a thing that does not exist, by definition.
Now, it just so happens that the entity reference field is what is immediately available to us to access that information, but I don't believe entity references are a perfect 1:1 mapping to relationships (this difference of understanding is probably related to our conversation in #2949019: Untangle Relationship(Normalizer) and EntityReferenceFieldNormalizer).
Going back to something you said above:
I think what I'm getting at is that JSON API is interpreting entity reference values. It's interpreting them as relationships, not attribute values (which is a slightly different thing and imposes different constraints on our interpretation of a data reference).
This is a little mind-bending. How can a parent by both a child and not child at the same time? This is an impossible state in the current Drupal UI. The only way I can imagine this working is if there were some new concept of distinct hierarchies within a vocabulary (which doesn't exist yet?). Is this ground work for something else?
I think it would be within reason to say that an entity reference field cannot have
NULL
values. This would be in line with our statement that "we will choose JSON API spec compliance over Drupalisms".Long story short: I think we need special casing for targeted resources which do not exist. Regardless of whether or not entity reference field values allow it or not.
</can_of_worms>
Comment #18
Wim LeersThat just meant there can be multiple parents? E.g.
is a child of both and .That's also what I was leaning towards… thanks for the sanity check!
Comment #19
gabesulliceIn #7.6, you gave the example of
'data': [{…}, null, {…}]
, which would indicate that the term was both a root and a child simultaneously, which is what I was confused about. Also, keep that nationalist propaganda to yourself, everyone knows dutch chocolate is the best 😂Don't forget that this leads to a deep, dark rabbit hole where we have to worry about accidentally changing field deltas or not removing
null
/non-existing references. It's not entirely sane, but it is "the right way to think about it."Comment #20
Wim LeersIndeed, and that's entirely valid/possible!
And in fact, that's exactly my main concern here: if a term has both the root as its parent and a non-root term, then it'd still end up being serialized to
'data': [{…}]
. Now it seems the term does NOT have multiple parents, but only a single parent!So while I agreed with
… I don't yet know how to do that. The best idea I've had is coming up with a pseudo root term with an all-zero UUID.
Comment #21
gabesulliceIt's only valid/possible internally. And that's why I said, "this is an impossible state in the current Drupal UI." In CS terms it's also an invalid graph—one can't have an edge to a node that doesn't exist.
Heh, I feel like we're going in circles now :\
This makes me think that I didn't do a good job of explaining my thoughts in #17.
If you do that, you're still "serializing the entity reference field", not representing relationships to resources. There is no relationship to a non-existant term, even a pseudo-term. We shouldn't try to represent it for the sake of matching the storage system.
I think where I'm headed and why I said this was a can of worms is because I think we need to drop PATCH support for relationship routes and keep only POST/DELETE. On GET, we need to elide reference items from relationships when they point to non-existant resources.
When we get a post, we simply add a reference item, when we get a delete, we remove a matching reference item, if it exists. We then don't have to worry about delta or accidentally removing NULL values.
Comment #22
gabesulliceAdditionally, we could also add entity references back to
attributes
usingtarget_id
. Then we could support the full range of possibilities. Of course, then we would need to use a different member name for relationship fields so we didn't have duplicate field names in attributes and relationships (which is forbidden by the spec).Comment #23
Wim LeersThat's not true. It's totally possible to have the root node not exist in memory, and still be able to traverse the tree. It'd just mean special casing that one possible parent. And that's what we did here.
Dropping
PATCH
support altogether and eliding fromGET
responses still does not solve the problem: you still won't be able to know whether a term has the root term as a parent. And that's crucial to know.AFAICT that wouldn't work either, because the "related" and "relationships" responses still won't be able to make sense/be consistent with what's in
attributes
.Comment #24
gabesulliceHm, that seems a little too clever. I wasn't thinking about it this way. I'll roll with it though.
There's nothing in the spec that requires us to make up a UUID. An ID must just be unique. Perhaps that simplifies things?
Comment #25
Wim LeersI'm not sure where you're going with #24 :) Can you elaborate?
Comment #26
gabesulliceWhat I'm trying to say is that GETing from a
relationship
route is not meant to tell you about field values. It's telling you about linkage. It's about edges. When you DELETE an edge, you should think of it like cutting the link. When there's a NULL target_id, it just doesn't make sense in terms of edges. IOW, you shouldn't be able to add an edge to nothing. IOW, you shouldn't be able to POST NULL.(A) this is what you're describing I think. It's
target_id === [root]
(B) this is
target_id === [1, 2]
,you can't ever create thisif we're talking about theparent
field (this is a totally valid entity reference field otherwise).(C) this is
target_id === [1, NULL]
, which, as I hope I've shown, may be a valid value for the entity reference field, but not for relationships between nodes. This is what I meant by: "'serializing the entity reference field', not representing relationships to resources."(D) this is
target_id === [1, root]
. Again, a valid entity reference value, but if referencing "root" is meant to mean that a term is a root term, then you're saying that the term is both a child and a root. which is very strange. If you go all in on the "virtual" node, it's equivalent to B, which is still impossible to create.EDIT: just tried this in the UI, apparently both B and D are possible to create (but there might be a bug in the UI). It's still very strange to me, but I guess it's possible nonetheless.
Of course! I just meant that under
data
we could simply return:Notably, resource identifiers don't have a
links
member. Meaning they don't actually have to reference a valid URI, nor do they have to have valid UUID. They simply must "uniquely identify a resource", which in this case, they do, they identify the virtual root of thetaxonomy_term--foo
tree. It's okay if/jsonapi/taxonomy_term/foo/virtual
is404
.The only concern might be "resource linkage", but the spec says:
Which we may be able to narrowly get away with because the "MUST" is about orphaned resources not orphaned resource identifiers.
It's worth noting that we already have orphaned resource identifiers when they reference a resource to which the user does not have access.
I think the conclusion to take from this is that we can probably map all non-null values to resource identifiers. In the case of
<root>
, we would make theid
"<root>
" (the only validity constraint on IDs is that it must be a string, except for client-generated IDs, which SHOULD be valid UUIDs).This leaves only one remaining concern, can we differentiate between
NULL
and<root>
target IDs?Comment #27
Wim LeersAh, of course, +1!
Nice ASCII art, although I don't fully grasp it :P Because I don't see how B has
target_id === [1, 2]
. Doesn't matter though, I can reply to A/B/C/D.In this A/B/C/D, I think you or I or both of us are mixing up the possible values for
target_id
(Drupal/database land) with those in the JSON API representation. Only the JSON API representation can containnull
!<root>
in the UI), which is virtual because noTerm
entity actually exists in the DB that represents it. It's essentially an unmodifiable, unloadable, unviewableTerm
.target_id = [1, 0]
(Drupal/DB land) maps torelationships = [1, null]
(JSON API land)? But if so, that's the same as (D). In your ASCII art, you say there is no node in the graph (hence the NULL?) and are asking whether this then even is a valid graph. The implication being that this cannot occur. I agree, I don't think actually can occur.😍 I DID NOT REALIZE THIS WAS OKAY! I thought the JSON API spec mandated that every resource identifier's
id
contained a UUID, but apparently that's not the case! In fact, the JSON API spec only mentions UUIDs in the context of client-generated IDs: http://jsonapi.org/format/#document-resource-identifier-objects.Cool, this is a great way out!
First off: THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU THANK YOU for your deep dive on this, your analysis here was super helpful, super insightful, and obviously unblocks further progress! 🎉 👏 Without you, we wouldn't be at this place where that is the only remaining question :)
Now, to your sole remaining concern!
I'm not yet convinced that a
null
target ID is in fact possible. See the code I quoted in #11:\Drupal\Core\Field\Plugin\Field\FieldType\EntityReferenceItem::isEmpty()
:IOW: if it's
NULL
, then it's empty, and hence entity reference fields can only contain non-NULL values. Zero is an example, but generally it'll be non-zero, because it'll reference entities by non-zero numerical ID.Comment #28
Wim LeersGetting this going again.
The last iteration of this patch is >2.5 months old. Rebased, now retesting, to see how it fares.
Comment #29
gabesulliceis_null
?Nit: 80 line CS
How?
s/relation ship/relationship/
Nit: can this be on 3 lines?
Also: This is what is failing the tests. That's because entity collections can also receive an
EntityAccessDeniedHttpException
object.Do we have tests for NullRelationshipItems in a single-cardinality field?
Comment #30
Wim Leers#29.5: 🙏 — apparently that changed since #14 from 2.5 months ago!
Only fixing that in this reroll.
Comment #31
Wim LeersWOW, I did not expect that!
I was hoping we'd go back from #28's 92 fails to #14's 8 fails.
RelationshipItemNormalizerValue
+ updated.Still to do: implement the outcome of the discussion in #17–#27! That includes addressing #29.6.
Comment #32
gabesullice1. Just a style nit. Take it or leave it :)
3. I don't see the change. By "How?" I just meant, "can you expand this comment?"
4. 👍
6. 👍
Comment #33
Wim LeersI believe this implements #17–#27. This is the goal:
Because these "null relationships" or "virtual relationships" now need to generate a proper resource identifier object (
{'type':'something','id':'virtual'}
) rather thannull
— as this test expectation diff shows, it no longer makes sense to have a separate normalizer value and normalizer. We just need to updateRelationshipItem
a bit.It also makes the diffstat significantly smaller! From
to
Comment #34
gabesulliceThis is great. I'm so excited that this will finally be handled!
My one remaining concern is that this is going to be very confusing when it's encountered by a developer.
Let's make a change record/documentation for this and then make a follow-up to add a
meta
member to these resource identifier objects so we end up with:Comment #35
gabesulliceAfter we have a CR and docs on d.o, this is RTBC.
Comment #36
gabesulliceand followup
Comment #37
Wim LeersI can do that here.
Comment #38
gabesulliceI'm fine with that, but don't forget that entity reference field properties (like
alt
) get normalized as properties under "meta" too. We'll need to special case thelinks
member so that we don't have (de)normalization problems. I didn't want to hold this up if that turns out to be big deal.Comment #39
Wim LeersI thought I was going mad, I couldn't get the green patch at #33 to pass locally! Turns out it was only passing on 8.5, not on 8.6. The 8.6 test I queued confirmed that.
This will make it pass on both branches.
Comment #40
Wim LeersNow it should pass on 8.6. The reason that the above patches were only failing on 8.6 is because Drupal 8.6 fixed the
parent
field. Which is why only on 8.6 the "relationships" routes will work as expected forTerm
entities.Also fixed CS violations.
Comment #42
gabesulliceWhy
virtual
instead ofhelp
? I was thinking about https://www.w3.org/TR/html5/links.html#link-type-help, as referenced by https://www.iana.org/assignments/link-relations/link-relations.xhtmlComment #43
Wim LeersOops, that
-41
patch should not have been uploaded. It was not ready. Please ignore it. This patch will be relative to #40.The #40 interdiff fixed
Drupal\Tests\jsonapi\Functional\TermTest::testGetRelationships()
, but nowDrupal\Tests\jsonapi\Functional\TermTest::testRelated()
is failing.(That, and #40 has
::testGetRelationships()
still failing on 8.5, because #40 didn't think to only do it for 8.6. Even though I mentioned in the comment for #40 that it had 8.6-only behavior. I forgot to add a hunk.)This fixes both of those. Was a nightmare to solve!
#42: because
help
would be the link label/key, and picking something super generic likehelp
prevents us from using this for more generic purposes in the future.Comment #44
Wim LeersWHEW!
Green again at last! Now green on both 8.5 & 8.6!
Now to finally address #34 + #35.
Comment #45
Wim LeersI had accidentally uploaded a patch that implements #34. @gabesullice of course saw this,
and already reviewed it in #42. I gave my rationale in #43, but still looked at the links Gabe provided. One of them said:
… which of course is what this does. So, did that. Now it's exactly as #34 suggested :)
Also:
Comment #46
gabesullice😗👌
Comment #49
gabesulliceComment #52
gabesulliceMy first commits accidentally included a hunk from #2956084: Impossible to raise an error when an `include` is requested for an inaccessible relationship field..
Comment #53
Wim LeersThis caused a regression: #2977879: Regression in #2940339: when multiple vocabularies exist, normalization of Terms fails. But the root cause actually lies in Drupal core: #2977882: Term entity type's "parent" field does not specify "target_bundles" setting: a term from *any* vocabulary can be chosen.
Comment #56
rpayanmMoving to Drupal core's issue queue.
I'm working on https://www.drupal.org/project/drupal/issues/3122113