Problem/Motivation

Once #606840: Enable internal page cache by default lands, we'll finally be shipping Drupal with the internal page cache enabled by default. Every anonymous page request by Drupal henceforth will be served by the built-in reverse proxy! (Unless opted out.)
In doing so, the existing integration tests uncovered dozens of bugs. Those bugs pointed out problems that would've broken Drupal 8 when used in combination with a reverse proxy.

Surely we didn't catch all such bugs yet, since we don't have 100% integration test coverage. This issue is about finding the remaining pages/scenarios that are broken when page caching is enabled. The majority of pages/scenarios are working, as is proven by the integration test coverage. But surely a minority is still broken.

Proposed resolution

Please test the (a)typical scenarios that you use in Drupal 7, and verify that they work in Drupal 8. If not, create a new child issue of this meta!

The goal: make sure that while you as a site builder/site administrator/content editor are changing settings, creating content, updating content…, that anonymous users immediately see the updated content.

Environment requirements:

  1. Drupal 8 HEAD
  2. 2 browser windows or 2 browsers: in one you're logged in as the site builder/admin/editor, in the other you're the anonymous user

Step-by-step plan:

  1. Test scenario. E.g. as the admin, modify node 1, save it. As the anonymous user, you should immediately see the updated node 1 on both the frontpage and at /node/1.
  2. Repeat step 1 for scenarios you care about/are interested in, until you find something that doesn't work.
  3. Once you find something, you'll want to figure out the root cause. 95% of the time, the problem will be missing cache tags. 3% of the time, the problem will be missing cache tag invalidations, 2% of the time will be more complex, e.g. #2465053: Drupal 8 only allows one user every 6 hours to register when page caching is enabled — caused by entity UUID in form state.
    So, to find the root cause:
    1. Modify \Drupal\Core\Cache\Cache::invalidateTags(): add debug($tags) to the function. Now repeat your scenario. You will see which cache tags were invalidated. If no cache tags were invalidated, or no cache tag associated with the thing you've modified was invalidated, then you've found something that should have a cache tag, but doesn't. Congrats :) Please open a new issue and talk to a mentor if this looks daunting. Go back to step 1.
    2. Now that you know the cache tag that is being invalidated (let's call it "X"), we need to make sure that the page cached for the anonymous user in the page cache also has this cache tag. The easiest way to look at that is by using the developer tools:

      If "X" is not in there, then the problem is that the rendered content that depends on the thing that has the "X" cache tag, didn't actually set the cacheability metadata of the thing on the render array. First, open an issue, and say at which specific URL the problem exists. Second, find the route at which this happens: find the route in one of the *.routing.yml files that corresponds to the URL. Add this to the issue. Third, find the code generating the render array: the _controller points to a class + method that generates the content for that route. Add this to the issue. Fourth, find in the controller's code where the thing that has the "X" cache tag is being used. Use RendererInterface::addDependency() (see the concrete example at https://www.drupal.org/developing/api/8/render/arrays/cacheability for more) to make the render array "depend" on the thing that has cache tag "X", this will automatically make the render array "absorb" all the cacheability metadata of the thing. Now test again: the bug should be fixed! Time to post a patch and go back to step 1!

    When opening an issue, always mark this issue as the parent issue, so we can find it from here!)

Remaining tasks

Novice todos, i.e. missing cache tags and/or missing test coverage (in no particular order):

  • None right now!

Done (in chronological order: first listed is first committed):

  1. #2471473: REST responses should have proper cache tags
  2. #2472281: 404/403 responses for non-existing nodes are cached in Page Cache/reverse proxy, are not invalidated when the node is created
  3. #2464657: Remove unnecessary cache clear in Views tests
  4. #2388023: File/Image field formatters don't add a cache tag for the file they display
  5. #2428837: Adding/updating interface translations should invalidate page & render caches Assigned to: swentel
  6. #2477157: rest_export Views display plugin does not set necessary cache metadata
  7. #2465053: Drupal 8 only allows one user every 6 hours to register when page caching is enabled — caused by entity UUID in form state
  8. #2464409: Search results should bubble rendered entity cache tags and set list cache tags
  9. #2579847: /node/add is lacking cacheability metadata, causes problems when cached by Dynamic Page Cache and "Use admin theme when editing or creating content" is turned off
Files: 

Comments

Wim Leers’s picture

Status: Active » Postponed
Wim Leers’s picture

Issue summary: View changes
cosmicdreams’s picture

So, I get that a more detailed plan is coming, but I have what I think is a simple question:

What are the kind of things we should be looking out for while testing?

Broken pages?
Broken parts of pages?
Stale pages?
Lots of errors in the error log?
Weird behavior / interaction with javascript?

If you set the expectations of what we should be hunting for in our tests we can provide the feedback you're looking for.

Wim Leers’s picture

Title: [PP-1] [meta] Find, fix & finish cache tag support » [meta] Find, fix & finish cache tag support
Issue summary: View changes
Status: Postponed » Active
FileSize
140.3 KB

#606840: Enable internal page cache by default landed, this is now unpostponed!

Searched the issue queues for all relevant issues that are already open, updated the IS with them.

#3: Thanks, that's awesome! IS updated to clarify all that :)

Wim Leers’s picture

Title: [meta] Find, fix & finish cache tag support » [meta] Find, fix & finish cache tag support: try to find broken scenarios as the anonymous user
dawehner’s picture

Simple scenario:

Create a view sorting by random.

Expected result:
The randomness works

Actual result:
The randomness doesn't work

Enable time based caching for that view

Expected result:
Page caching is valid for the specified time of that caching.

Wim Leers’s picture

Issue summary: View changes

#6: Thanks! Those are related to max-age. Because not everything correctly sets a max-age yet, we're still using D7's behavior of a page-level override. Once max-age bubbles all the way to the page (Response) level, that'll be fixed. See #2352009: Bubbling of elements' max-age to the page's headers and the page cache.

Added to the IS.

Berdir’s picture

Important and useful scenarios are everything with language and translations. Create content, configuration and interface translations, change them, then verify that the relevant caches have been invalidated. (interface translations will very likely not work yet, we'll have to decide that we want to do about that).

Wim Leers’s picture

Issue summary: View changes

#8: yes, for interface translation we have #2428837: Adding/updating interface translations should invalidate page & render caches. Added that to the IS and made this issue its parent. Basically, importing interface translations or changing them does not yet have the necessary cache tag support.

Content translations should work, but yes, please do test all scenarios you can think of!

marcvangend’s picture

I've been trying to find cache problems, but didn't find any. Short descriptions of the scenario's I tested:

  • create content (devel generate) and view on home page
  • post comment
  • approve comment
  • create a view of users (page becomes available)
  • change settings on a view of users
  • delete comment
  • register a new user
  • edit content
  • edit content using IPE
  • change image of node
  • replace image on node with another image with exact same filename and alt text
  • change field display settings
  • edit referenced content (field formatter rendered entity)
  • edit chained referenced content (field formatter rendered entity -> rendered entity)
  • edit username of author of chained referenced content (field formatter rendered entity -> rendered entity)
  • unpublish referenced content from the content management page
  • delete author of content, including all his content
  • change display author setting
  • change node's text format
  • change text format settings (disable auto-link url's)

In all cases, the anonymous user could immediately see the changes.

Wim Leers’s picture

Hurray, that's wonderful! Sounds like we should have a prize not for the most fixed scenarios, but for the most found scenarios, because that's apparently the real challenge :P

Wim Leers’s picture

Issue summary: View changes

#2464409: Search results should bubble rendered entity cache tags and set list cache tags is actually also too tricky for a novice to start with. Perhaps once I (or somebody else) has rolled an initial version, a novice can continue with it.

xjm’s picture

Issue tags: +D8 Accelerate Dev Days
Wim Leers’s picture

Issue summary: View changes

pwolanin and klausi found #2471473: REST responses should have proper cache tags but already fixed it today :)

Wim Leers’s picture

Wim Leers’s picture

Wim Leers’s picture

Wim Leers’s picture

jhodgdon’s picture

See #606840-133: Enable internal page cache by default ... I brought up the question of "Time ago" or "Time hence" formatting, such as what is used on Forum topic listing pages to say "Most recent comment was 3 months ago on this thread". And I was directed to this issue.

The problem is that the next time the page is requested, those "ago" time stamps could be all wrong (they're based on the interval from "now" to the most recent comment, so if "now" changes, they need to change). So any page displaying these directly is not cacheable, even for anonymous users.

On comment #136 of that issue, Wim Leers said that because of this, all "time ago" and "time hence" formatting must be done in JavaScript. [That seems reasonable to me.]

So...:

a) Is it currently in Core all being done in JavaScript? I don't think so -- there are many uses of the formatInterval() method in Core that are not wrapped in JavaScript as far as I can tell, and this is what is commonly used to create time ago/hence output. Basically, any call to formatInterval() needs to be checked to make sure it's not being put into a render array directly, and any views using time ago/hence formatting are suspect, as well as any time/date fields set up with time ago/hence formatting.

I'd also take a special look at Forum, User, and anything else displaying history, all of which display "ago" I think [forum lists, user profiles for "member for x days", and any list of "latest comment" in history etc.].

b) It seems like we should add a note to the formatInterval() method, and any related methods that do time ago/hence formatting, and make sure that developers know that using that in a render array directly will make the page uncacheable and suggesting they use JavaScript instead.

c) It also seems like we should add documentation in the UI for time ago/hence formatting (for time/date fields and for Views displays of time/dates) that page caching is not going to work if you choose this format. These formatters do not use JavaScript.

d) Do we have a Views field display that would use JavaScript to display date/time in "ago" format, as would be needed by any regular user who wanted to use "ago" in a cacheable page?

Wim Leers’s picture

This is one thing that was A) partially forgotten, B) not worked on initially because fixing this doesn't cause BC breaks. The current "time ago" field formatter for example should indeed specify max-age = 0 (to indicate that it is not cacheable). But it does not yet. However, the real solution, is to just output the absolute time, attach JS to turn that into a relative timestamp, and voila: all done. This is exactly how drupal.org comments' "time ago" info works :)

We should fix it, but because there are more important (potentially BC-breaking) things out there first, this will not yet be worked on, unless somebody steps up. But we must definitely fix that before release.

jhodgdon’s picture

Sounds like we need a sub-issue for it then? Or is there one already?

Wim Leers’s picture

#22: yes, starting a child issue for that would be splendid!

catch’s picture

We're missing cache tag invalidation for config.system.site:

The front page setting invalidates what gets shown as the front page.

The site name and etc. settings invalidate the page cache (but should probably invalidate the entire render cache).

Discovered via #2480811: Cache incoming path processing and route matching.

jhodgdon’s picture

Oh, I was kind of hoping you'd file the issue. ;) OK, I bit. Here it is: #2500525: Time ago/hence date/time formatting breaks caching; needs appropriate max age. Please review the summary and fix the mistakes I undoubtedly made.

Wim Leers’s picture

I said "must fix before release" in #21, but of course I meant "should".

Wim Leers’s picture

Wim Leers’s picture

Berdir’s picture

Something with comments is definitely not quite right yet, will open a separate issue when I have time to investigate a bit more:

To reproduce:

1. Create an article
2. Add a comment
3. Go to frontpage (still logged in), comment shows up
4. Separate browser window, anon, frontpage, so that it gets cached
5. Back to the logged in user. add another comment.
6. Back to frontpage with that user. Updated ( I have no idea how *that* works?)
7. Refresh page in anon browser window. Still shows the old comment count.

Additionally, node/1 with comments adds the comment_list that. That seems like a really bad idea, because any added or updated comment will invalidate all the render cached nodes.

Unless I'm missing something, I think we can fix both issues by removing the comment_list cache tag from the comment formatter and instead invalidate the cache tag of the commented entity. I've actually been doing this in a custom hook for months in my install profile, I just noticed another related bug there that is specific to my install profile when (finally) enabling page cache by default which reminded me of this.

Berdir’s picture

Ah, much stupid. The frontpage obviously works for authenticated because those links are placeholdered. Which I really, really want to remove.

Opened #2530846: Fix and improve comment cache tag usage

pfrenssen’s picture

I have also found a case where the front page view is not cleared correctly, I can replicate it with the following Behat test:

@api
Feature:
  @d8
  Scenario:
    Given "article" content:
      | title    |
      | Page one |


  @d8
  Scenario:
    Given I am logged in as a user with the "administrator" role

  @d8
  Scenario:
    Given "article" content:
      | title    | promoted |
      | Page one | 1        |
    When I am on the homepage
    Then I should see the link "Page one"

Note that in Behat after each scenario runs the created entities are removed again. So this is roughly the steps that are performed:

  1. An article is created using entity_create().
  2. The article is removed.
  3. The home page is visited with an anonymous user.
  4. An administrator user is created.
  5. The administrator user logs in and visits the home page.
  6. The administrator user is removed programmatically.
  7. A promoted article is created using entity_create().
  8. An anonymous user visits the home page but the promoted article is not visible.

If I remove either the first scenario (creating and removing the first article), or the second scenario (creating and removing the administrator user) then the test is green.

If I put the first two scenarios together then the test is also green.

Berdir’s picture

Are you sure that's not a behat problem?

The cache tag invalidation has a static cache, it is possible that that's not correctly reset when you create and delete content in the behat process. I had to add "\Drupal::service('cache_tags.invalidator')->resetChecksums();" in an @AfterScenario to fix similar errors in my tests.

Wim Leers’s picture

Berdir++

pfrenssen’s picture

@berdir, indeed that was the problem! Thanks!

I should've known, I had actually reported this problem with the static caches of the cache tags checksums half a year ago :-/

Wim Leers’s picture

Issue summary: View changes
Status: Active » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.