Problem/Motivation

I'm doing a performance audit of a site using JSON:API which is trying to build responses with large numbers of complex entities (500 at a time).

When profiling, I noticed about 500ms on cold caches spent in writing the normalisation cache. This is designed to happen end of request after the response is sent, however I also noticed that ddev's default Nginx configuration appears to gzip JSON:API responses, removing Drupal's content length header, which will cause end of request tasks to be blocking. No idea how prevalent that would be in production sites but I did find various stack overflow threads asking about missing content length for non-HTML responses etc. so seems like something people run into.

Equally/more of a problem - retrieving the serialisation cache is more expensive than the normalisation, at least in this case. I'm uploading xhprof screenshots with the caching in place on a warm cache, and also with the cache get/set etc hacked out (not a proper MR yet), and without the caching is more function calls, but also faster because it's skipping the i/o (in this case to Redis, but it would be slower with the database backend), and much lower memory because less deserialisation happens.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

Issue fork drupal-3572098

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

catch created an issue. See original summary.

catch’s picture

On this particular site there's a cache redirect for the variation cache items (the requests are made with authorisation so permissions/access etc. is involved), this means double the cache gets which probably doesn't help the time, but it's a real install and that will also happen on other installs.

catch’s picture

This was added in #2819335: Resource (entity) normalization should use partial caching which had various benchmarks and profiling, not sure why I'm seeing such dramatically different results but also that issue was a while ago so possibly other things have changed in the meantime.

catch’s picture

OK reading that issue maybe I have an idea. In the specific request I'm profiling, it's a custom JSON:API resource that constructs an array out of the values of 3-4 different entity types for each row, these then get serialized. Any field API overhead in retrieving values etc. is happening before serialization and the array only contains what's needed. This could explain why the caching is more expensive than the serialization.

#2819335: Resource (entity) normalization should use partial caching on the other hand is dealing with collections where the resource objects are the entities and they could have tens of fields each etc.

So rather than removing the caching (which also looks like it would be complex, it's very baked in), maybe we just need a way to suppress it.

catch’s picture

Status: Active » Needs review

Pushed a branch that might be close to enough - ResourceObjects are cacheable metadata, so if we set maxAge to 0, we can entirely skip the entire cache process, which it really ought to anyway. It's possible that variationcache already handles not getting the items (did not test this yet), but the custom end of request setting also needs to be bypassed since we don't want to add all those resource objects to memory for the duration of the request either. I can try this on site I'm looking at setting max age to 0.

catch’s picture

catch’s picture

Confirmed this is enough to bypass the caching if you set max-age 0 on the ResourceObject, which fully removes the overhead for me.

longwave’s picture

Should we add test coverage for this?

bbrala made their first commit to this issue’s fork.

bbrala’s picture

Status: Needs review » Reviewed & tested by the community

This is lovely and simple.

I was looking into testing this, thought it could be close to ResourceObjectNormalizerCacherTest::testMaxAgeCorrection but that code adjusts max-age this after normalization. Which is not really relevant here. I added a new test that just tests the max-age fix.

I kinda wondered, wouldn't this break in some way because we skip the max_age normalization, but that is not true since we only skip when 0 which is fine.

Added a change record.

Since the test is so minimal, I'm gonna be bold an RTBC.

  • longwave committed d8e08b13 on 11.x
    perf: #3572098 JSON:API normalisation caching can be more expensive than...

  • longwave committed 03d85879 on main
    perf: #3572098 JSON:API normalisation caching can be more expensive than...
longwave’s picture

Version: main » 11.x-dev
Status: Reviewed & tested by the community » Fixed

Committed and pushed 03d858794c4 to main and d8e08b13bc4 to 11.x. Thanks!

Also published the change record.

Now that this issue is closed, review the contribution record.

As a contributor, attribute any organization that helped you, or if you volunteered your own time.

Maintainers, credit people who helped resolve this issue.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.