Steps to reproduce

I used a drupal 9.5 installation, installed the path_alias and redirect modules, added two nodes and two aliases foo and bar for the nodes.

In the cache_data bin I noticed that if the "Enforce clean and canonical URLs" is activated that there are no cache items for the aliased paths.

with Enforce clean and canonical URLs the cache_data bin looks like:
no preload-paths

w/o Enforce clean and canonical URLs it looks like (ignore the admin paths but the /node items definitely appear only in this scenario):
preload-paths

Question

Is this behavior intentional, and if yes why?

CommentFileSizeAuthor
preload-paths-2.png112.78 KBharlor
preload-paths-1.png78.05 KBharlor

Issue fork redirect-3340999

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

Harlor created an issue. See original summary.

catch’s picture

Priority: Normal » Major

This seems major to me.

berdir made their first commit to this issue’s fork.

berdir’s picture

At first I thought I couldn't reproduce, because I did see some preload-path entries, for example from the frontpage.

But it does (not) happen on other most regular pages. The problem seems to be the \Drupal\Core\Routing\UrlGeneratorInterface::generateFromRoute() call that happens in \Drupal\redirect\EventSubscriber\RouteNormalizerRequestSubscriber::onKernelRequestRedirect(), this calls into \Drupal\path_alias\AliasManager::getAliasByPath before setCacheKey() is being called. Which appens from PathAliasSubscriber, a Controller event, while we current run on request.

This results in no preload happening, and $this->cacheNeedsWriting is not set to TRUE. Unlike other things, we don't update caches here, so we never set it to TRUE then afterwards.

I'm unsure if PathAliasSubscriber could run earlier, but I guess the assumption is that it doesn't have to.

So either we set the cacheKey ourself and replicate that, or we move the request normalizer event subscriber to after PathAliasSubscriber.

Looking at the events, we are currently before dynamic page cache, so we'll need to make sure this doesn't cause any weird caching issues. We do things like removing index.php from the URL, so we do need to account for that. But if it works, it would have the additional benefit of being able to cache the checks that we are doing.

berdir’s picture

Title: Feature or Bug? - The path-alias preloading is not used if the "Enforce clean and canonical URLs" option is enabled » The path-alias preloading is not used if the "Enforce clean and canonical URLs" option is enabled
Status: Active » Needs review

Can't use controller event because that doesn't allow to set a response. Would need some weird trickery through a controller.

Went with explicitly setting the cache key myself. Also weird that setCacheKey() isn't actually on the interface but is documented as such.

One thing I'm wondering is how beneficial the preload path cache actually is these days, with things like render cache and dynamic page cache. I guess if you have many variations of dynamic page and render cache entries for the same URL? But if you for example have dynamic page cache variations and most things are a render cache hit, you likely load a ton of aliases that aren't actually needed? Or a different scenario, since the path alias cache is only set for 24h, you might run into a cache miss on such a partial cache hit and then only put a small amount of the paths into cache for another 24h?

Seems like something that might vary depending on the site and can probably only be answered with database query monitoring on a production environment before/after this change. To see how much it reduces path alias lookups vs. the cost of cache lookup and the preload query.

Pushed a MR, mostly untested and tests will need to be adjusted.

catch’s picture

One thing I'm wondering is how beneficial the preload path cache actually is these days, with things like render cache and dynamic page cache. I guess if you have many variations of dynamic page and render cache entries for the same URL? But if you for example have dynamic page cache variations and most things are a render cache hit, you likely load a ton of aliases that aren't actually needed? Or a different scenario, since the path alias cache is only set for 24h, you might run into a cache miss on such a partial cache hit and then only put a small amount of the paths into cache for another 24h?

The preload cache item is only loaded when a path is requested (in PathAliasManager::getAliasByPath()), so if you have a dynamic page cache hit and there are no paths to render, it won't be loaded at all. Obviously this ceases to be the case as soon as one alias gets requested.

It's definitely less useful with render caching but not sure we're there yet. The extreme case would be a similar case to d.o issues where you have a lot of updates to the same content like an issue - then the preload cache item will last across multiple render cache clears. On a site where it's going to be all dynamic page cache hits for 24 hours, then it's not adding anything extra.

berdir’s picture

Tested this a bit. In my case, with all render caching disabled, this replaces 15 alias lookup queries with the preload query (+ cache get). How often that happens like that is hard to say, but it's true that render cache can have quite a few variations, e.g. different permissions, languages and so on.

Updated unit tests, but since this was conditional, no existing tests failed on this, now the expected call is defined.

catch’s picture

Opened #3496369: Multiple load path aliases without the preload cache based on this discussion - might be away to keep the benefits of the preloading without the actual caching.

  • berdir committed 192d972d on 8.x-1.x
    Issue #3340999 by berdir, harlor, catch: The path-alias preloading is...
berdir’s picture

Status: Needs review » Fixed

Decided to merge this. The fix is a bit ugly, but costs us little and it should save a decent amount of queries in many cases. It's not up to redirect.module to decide whether or not this feature is beneficial. At worst it will need to be adjusted in the future if core decide to change how it works.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.