Problem/Motivation

Part of the docs for FieldResolver::resolveInternalIncludePath

Each resource type may define its own external names for its internal field names. As a result, a single external include path may target multiple internal paths.

I've stumbled upon it once I've started tuning my includes (as they are MANY) in #3039730: Include paths are resolved for every resource in a resource collection, instead of once per unique resource type.

The issue is that only the first match was added to the results.

Issue comes from the way data is aggregated and effectively skips anything but the first found item.

Proposed resolution

I will be posting a patch that was tested to work correctly for the scenario defined above.

Remaining tasks

Patch, review / discussion.
Maybe add tests to comply with the specs (as tests are passing, docks are not behaving correctly).

Note:
The currently proposed patch is adding a measurable overhead of 10-20% on top of the existing execution time of the method.
All the optimizations that I had reducing the includes count from this were lost due to the added overhead of managing the uniqueness.

User interface changes

None expected.

API changes

None expected.

Data model changes

None expected.

Release notes snippet

Bug fix...

Comments

ndobromirov created an issue. See original summary.

ndobromirov’s picture

Issue summary: View changes
ndobromirov’s picture

Status: Active » Needs review
StatusFileSize
new696 bytes

Here is the promised patch. Once that was applied the end includes started working as expected.

The issue was due to the fact that operator += in PHP is adding only missing indexes to the array. As we are concatenating always arrays with index zero, anything except the first was ignored :D...

Status: Needs review » Needs work

The last submitted patch, 3: jsonapi-3040280-3.patch, failed testing. View results

ndobromirov’s picture

Status: Needs work » Needs review
StatusFileSize
new1.48 KB

It ideally works OK when data goes to buildTree, but there are some duplicates that can crop up from just how data is made to relate.

Here is a revised patch that makes the tests pass, keeping the fix in.

wim leers’s picture

Status: Needs review » Needs work
Issue tags: -bug fix +API-First Initiative, +Needs tests

Thanks for digging into this and providing a patch!

Based on what I read in this issue, it should be quite simple to expand the existing FieldResolver test coverage to prove the bug in HEAD and prevent i from ever regressing again :)

wim leers’s picture

Status: Needs work » Postponed (maintainer needs more info)

Oh, and what do you mean by "local includes"?

ndobromirov’s picture

Status: Postponed (maintainer needs more info) » Needs review

I think internal... Wrong wording.

P.S. the other issue is resolving this as well in a different way without the big performance regression (10-20%).

ndobromirov’s picture

Title: FieldResolver::resolveInternalIncludePath does not handle multiple local includes corectly » FieldResolver::resolveInternalIncludePath does not handle multiple internal include paths for a single public one.
Issue summary: View changes

Back to needs work for the missing tests.
Updated IS and title.

wim leers’s picture

Title: FieldResolver::resolveInternalIncludePath does not handle multiple internal include paths for a single public one. » FieldResolver::resolveInternalIncludePath() does not handle resolving a single public include path to multiple internal include paths
Status: Needs review » Needs work
Related issues: +#2973681: Regression introduced by #2953207: Deep nested include on multi target entity type field fail

Back to needs work for the missing tests.

Now actually back to Needs work :)

Also clarified the title further! And linking to the issue that introduced this code.

wim leers’s picture

Part of the docs for FieldResolver::resolveInternalIncludePath

Each resource type may define its own external names for its internal field names. As a result, a single external include path may target multiple internal paths.

Just wanted to call out that @gabesullice added this comment in #2973681-36: Regression introduced by #2953207: Deep nested include on multi target entity type field fail 👍

wim leers’s picture

Status: Needs work » Needs review
Issue tags: -Needs tests
StatusFileSize
new4.04 KB

Here's the missing test coverage. This patch should fail.

wim leers’s picture

StatusFileSize
new5.61 KB

And this then is #12 (test coverage only) with #5 (@ndobromirov's solution) added to it. This should be green

wim leers’s picture

StatusFileSize
new643 bytes
new4.04 KB
new5.61 KB

Sorry about that, one extra \n caused the coding standards violations.

wim leers’s picture

StatusFileSize
new1.52 KB
new5.67 KB

And finally, a review of @ndobromirov's solution in #5:

  1. +++ b/src/Context/FieldResolver.php
    @@ -197,7 +197,7 @@ class FieldResolver {
             // Each resource type may resolve the path differently and may return
             // multiple possible resolutions.
    -        $resolved += static::resolveInternalIncludePath($relatable_resource_type, $remaining_parts, $depth + 1);
    +        $resolved = array_merge($resolved, static::resolveInternalIncludePath($relatable_resource_type, $remaining_parts, $depth + 1));
    

    👍 This change totally makes sense! This is just clearly wrong, even just based on the comment. We should've spotted this in #2973681: Regression introduced by #2953207: Deep nested include on multi target entity type field fail :(

    (I really wish PHP had stronger typing available for when you want to use it… that'd have prevented this for sure.)

  2. +++ b/src/Context/FieldResolver.php
    @@ -216,9 +216,17 @@ class FieldResolver {
    -    return array_map(function ($possibility) use ($internal_field_name) {
    -      return array_merge([$internal_field_name], $possibility);
    -    }, $resolved);
    +    $resolved_unique = array_unique(array_map(function ($possibility) use ($internal_field_name) {
    +      // Convert to strings so we can make uniqueness easier.
    +      return implode('.', array_merge([$internal_field_name], $possibility));
    +    }, $resolved));
    +
    +    // Have the unique list converted back to what it needs to be.
    +    $result = array_map(function ($string) {
    +      return explode('.', $string);
    +    }, $resolved_unique);
    +
    +    return $result;
    

    👎 This can be quite a bit simpler, and we can avoid some unnecessary operations by ensuring uniqueness before prepending the current field name.

The interdiff is quite confusing, but the end result is a patch that is simpler, that makes fewer changes.

The last submitted patch, 14: 3040280-14-test_only_FAIL.patch, failed testing. View results

e0ipso’s picture

Apologies, no dreditor in this laptop. I only have nits. And props for the included tests.


+ $resolved_as_strings = array_map(function ($possibility) use ($internal_field_name) {

I don't see the need of $internal_field_name, can we drop it?

Also, I see that we're hashing the array items using the implode function to assess uniqueness. Instead, I think that a more explicit approach would be better.

+    // Remove duplicates by converting to strings and then using array_unique().
+    $resolved_by_hash = array_reduce($resolved, function ($by_hash, $possibility) {
+      $hash = spl_object_hash((object) $possibility);
+      $by_hash[$hash] = $possibility;
+      return $by_hash;
+    }, []);
+    $resolved_unique = array_values($resolved_by_hash); // <-- Not necessary, but just to make the point.
wim leers’s picture

StatusFileSize
new685 bytes
new5.64 KB

I don't see the need of $internal_field_name, can we drop it?

Good catch! Done.

Also, I see that we're hashing the array items using the implode function to assess uniqueness. Instead, I think that a more explicit approach would be better.

We want to assert uniqueness in terms of include path, and the received include path from the URL (e.g. ?include=foo.bar.baz) has been mapped to an array (e.g. ['foo', 'bar', 'baz']). So I think using implode() is fine here: it makes it clear what uniqueness we're comparing.
Your proposal uses spl_object_hash() to achieve the same, but I've frankly seen that function be used so rarely that I think the current patch is much easier to understand?

wim leers’s picture

Status: Needs review » Reviewed & tested by the community

I tried implementing @e0ipso's suggestion, but that didn't work out: https://3v4l.org/uW39Z. @e0ipso said "wfm" in chat, so I'm gonna go ahead and commit this :)

wim leers’s picture

Status: Reviewed & tested by the community » Fixed
effulgentsia’s picture

Nice fix for resolveInternalIncludePath(). Does the same bug exist for resolveInternalEntityQueryPath()? If not, why not?

I guess what I'm asking is: do we have test coverage for 2 different fields, one on bundle1 and another on bundle2, both having the same public name. And an ER field that allows targeting either bundle. And then doing a filter query using the common public name? Will such a filter correctly query across both fields?

wim leers’s picture

I'm glad you like what you see here!

And great question.

Will such a filter correctly query across both fields?

Related: #2953207, and specifically in #2953207-21: Can't get the right target type when filtering on relationship with bundle-specific target entity type and later comments shows that today's Entity/Field Query API does not allow for that, and hence JSON:API does not allow for it either. This is why \Drupal\jsonapi\Context\FieldResolver::getDataReferencePropertyName() asks you to be more specific whenever doing filtering, and even that was only made possible relatively recently, in #2424791: Entity query hardcodes entity_reference and entity specifier. #2953207 opened #2971281: Test coverage: ambiguous filter expression paths for testing this more explicitly.

An important difference is of course that JSON:API resolving includes does not rely on the Entity/Field Query API. Hence it is not bound to the same limitations. It's far more common to request includes across a complex set of resources (with different resource types) than it is to filter across a complex set of resources, especially because #2956414: Support mixed-bundle collections (e.g. `/jsonapi/node`) is still open.

You may wonder how that can be common then. Well: lots of entity reference fields allow resources (entities) of different resource types to be referenced (same entity type, different bundles, e.g. "talk" and "article" nodes), and it's typical to need to fetch (and hence ?include) all those related resources (referenced entities). That's what this patch made work even in this fairly uncommon scenario.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.