Problem/Motivation

I experienced this when looking at 404s and noticed they weren't caching in the proxy. The 404 has a surrogate-control on every request, and never actually gets a cache hit in the page_cache module and never gets the headers to get cached at the proxy.

Digging further it seems if you have a request that generates a subrequest (such as a 404 which sub requests to /system/404) then the sessionless render strategy will kick in on the sub request, and in onRespond of the sub request get wrapped as a BigPipeResponse. Then, when the final outer response hits onRespond, it will wrap it again. Then when it unwraps it later to inject into page_cache using getOriginalHtmlResponse, it is not a HtmlResponse, but a BigPipeResponse, so the DenyPipePipeSessionlessResponses kicks it and refuses to store it.

Thus, cache is completely killed on all 404 responses.

It is also worth noting, when the sub request completes it is effectively a 200 response. The DefaultExceptionHtmlSubscriber changes this to a 404 before responding in the main request. Therefore, it's not as simple as only wrapping in the sub request (where rendering takes place...) because the status code is 200, and that's what will be pushed into the cache (via getOriginalHtmlResponse) - as opposed to the modified version.

Steps to reproduce

1. Install bigpipe_sessionless and page_cache and test pages are caching
2. Generate a 404 and repeat, notice Surrogate-Control always present and never get a X-Drupal-Cache HIT

Proposed resolution

(Removed old proposed resolution as it caused 404 to become 200)

Likely this needs some reconsideration of how the original vs streamed pages are kept and maintained, and what that looks like at a main request level if the response was generated in a sub request like in the 404 route.

Remaining tasks

-

User interface changes

None

API changes

None

Data model changes

None

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

Driskell created an issue. See original summary.

driskell’s picture

Issue summary: View changes
driskell’s picture

Updated with new findings.

GaëlG made their first commit to this issue’s fork.

gaëlg’s picture

Status: Active » Needs review

I stumbled upon the same problem. Thank you for your time spent on this and the info sharing.

Yes, we cannot wrap only the sub-request, as you explained. But we can wrap only the main request (see my commit in the issue fork). It seems to fix the Drupal page cache problem.
Of course, the problem of doing that is that the sub-response won't actually be streamed. It seems acceptable to me because I guess sub-requests only happen on 403, 404,... What we want to stream are "real" pages.

gaëlg’s picture

My previous change introduced a bug: some placeholders where not handled, resulting in missing html elements in the page (Drupal messages in my case).
This was because we cannot assume that we can bypass HtmlResponseBigPipeSubscriber::onRespond() just as soon as it's a subrequest. HtmlResponseBigPipeSubscriber::onRespond() transforms the response into a BigPipeResponse if there are Big Pipe placeholders. If so, they need to be handled.
So a better solution is to avoid the creation of big pipe placeholders in case we are on a subrequest, which I did (https://git.drupalcode.org/project/big_pipe_sessionless/-/merge_requests...). Then no need to bypass HtmlResponseBigPipeSubscriber::onRespond().

It seems to work well (pages with subrequests are still cacheable and now Drupal messages appear).

But Drupal placeholders is such a complex system that now I'm wondering if there could be placeholders into placeholders, which might result in less streamed parts than before? I don't think so and it wouldn't be such a problem, so I stop thinking about it!

Wim Leers made their first commit to this issue’s fork.

wim leers’s picture

Title: Pages generated via sub request never get cached » Pages generated via subrequest never get cached
Status: Needs review » Needs work
Issue tags: +Needs tests

Thanks to both of you for your work on this issue, @Driskell & @GaëlG! 🙏

Without test coverage, I cannot commit this unfortunately 😅

The good news is that it should be relatively straightforward to add test coverage to the existing \Drupal\Tests\big_pipe_sessionless\Functional\BigPipeSessionlessTest::testBigPipeNoSession() method 😊 That happened most recently in #2997281: Pages are cached while the request policy was set to deny. Also, thanks to GitLab CI testing, the error output is much better and we can test on all supported core versions simultaneously! 🎉

In fact, you already have a good chunk of what you need right in the issue summary:

Generate a 404 and repeat, notice Surrogate-Control always present and never get a X-Drupal-Cache HIT

I merged in upstream changes so that this MR will now also run the tests on GitLab CI 👍