Problem/Motivation
After updating Drupal from version 10.2.7 to version 10.30, a 404 error occurred!
Bilingual website pl/en
the https//mydomain/en version works fine
the https//my_domain version (by default pl as the domain's primary language) shows a 404 error after clearing the cache.
Warning: Undefined array key "route:[language]=pl:[query_parameters]=:/ " in Drupal\Core\Cache\DatabaseBackend->getMultiple() (line 155 of core/lib/Drupal/Core/Cache/DatabaseBackend.php).
Drupal\Core\Cache\DatabaseBackend->getMultiple(Array, ) (Line: 126)
Drupal\Core\Cache\DatabaseBackend->get('route:[language]=pl:[query_parameters]=:/') (Line: 169)
Drupal\Core\Routing\RouteProvider->getRouteCollectionForRequest(Object) (Line: 252)
Drupal\Core\Routing\Router->getInitialRouteCollection(Object) (Line: 142)
Drupal\Core\Routing\Router->matchRequest(Object) (Line: 90)
Drupal\Core\Routing\AccessAwareRouter->matchRequest(Object) (Line: 105)
Symfony\Component\HttpKernel\EventListener\RouterListener->onKernelRequest(Object, 'kernel.request', Object)
call_user_func(Array, Object, 'kernel.request', Object) (Line: 111)
Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher->dispatch(Object, 'kernel.request') (Line: 157)
Symfony\Component\HttpKernel\HttpKernel->handleRaw(Object, 1) (Line: 76)
Symfony\Component\HttpKernel\HttpKernel->handle(Object, 1, 1) (Line: 53)
Drupal\Core\StackMiddleware\Session->handle(Object, 1, 1) (Line: 48)
Drupal\Core\StackMiddleware\KernelPreHandle->handle(Object, 1, 1) (Line: 28)
Drupal\Core\StackMiddleware\ContentLength->handle(Object, 1, 1) (Line: 32)
Drupal\big_pipe\StackMiddleware\ContentLength->handle(Object, 1, 1) (Line: 106)
Drupal\page_cache\StackMiddleware\PageCache->pass(Object, 1, 1) (Line: 85)
Drupal\page_cache\StackMiddleware\PageCache->handle(Object, 1, 1) (Line: 50)
Drupal\ban\BanMiddleware->handle(Object, 1, 1) (Line: 48)
Drupal\Core\StackMiddleware\ReverseProxyMiddleware->handle(Object, 1, 1) (Line: 51)
Drupal\Core\StackMiddleware\NegotiationMiddleware->handle(Object, 1, 1) (Line: 36)
Drupal\Core\StackMiddleware\AjaxPageState->handle(Object, 1, 1) (Line: 51)
Drupal\Core\StackMiddleware\StackedHttpKernel->handle(Object, 1, 1) (Line: 741)
Drupal\Core\DrupalKernel->handle(Object) (Line: 19)
P.S.
There is no cache problem with a single-language website.
I think the problem is that garbage collection corrupts the path to the home-page
does not allow null values in the main language.
After entering the language prefix it works fine.
But I guess that's not the point, because it destroys the links already provided.
Steps to reproduce
Install the standard profile.
TRUNCATE cache_data;
Visit https://drupal-dev.ddev.site/
Then https://drupal-dev.ddev.site/:%20
Without the MR, you will get a 404 and see the following three cache entries:
MariaDB [db]> SELECT cid FROM cache_data WHERE cid LIKE 'route:%';
+-----------------------------------------------------+
| cid |
+-----------------------------------------------------+
| route:[language]=en:[query_parameters]=:/ |
| route:[language]=en:[query_parameters]=:/%20 |
| route:[language]=en:[query_parameters]=:/system/404 |
+-----------------------------------------------------+
With the MR you will get a 200 on the path with %20 at the end, and there'll be a single cache entry:
MariaDB [db]> SELECT cid FROM cache_data WHERE cid LIKE 'route:%';
+-------------------------------------------+
| cid |
+-------------------------------------------+
| route:[language]=en:[query_parameters]=:/ |
+-------------------------------------------+
1 row in set (0.001 sec)
Proposed resolution
While there are not steps to reproduce yet, route lookup caching is inconsistent for urlencoded URLs, due to the way PathProcessorDecode happens during routing after the cache ID is created.
Instead of a path processor, we need to normalize the incoming path with urldecode directly in RouteProvider.
This means that https://example.com/%20 will consistently return a 200 (for the front page) instead of a 404. We might want a follow-up to redirect trailing urlencoded spaces in .htaccess
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
| Comment | File | Size | Author |
|---|---|---|---|
| #76 | Screenshot 2024-08-02 at 10.13.46 AM.png | 97.03 KB | banoodle |
| #22 | Screenshot 2024-07-13 at 9.12.25 AM.png | 196.22 KB | banoodle |
| #13 | Screenshot 2024-07-12 at 11.04.47 AM.png | 176.91 KB | banoodle |
| #3 | path after.jpg | 129.24 KB | pingwin_cracow |
| #3 | path before.jpg | 127.75 KB | pingwin_cracow |
Issue fork drupal-3456244
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
pingwin_cracow commentedComment #3
pingwin_cracow commentedComment #4
rbomhof commentedWe just updated to 10.3 and sporadically (so far once a day) the english version of our main homepage serves a 404 until we resave it, very frustrating. Did you find any better way to address this?
Comment #5
longwaveI assume this only happens on multilingual sites but are there concrete steps to reproduce? e.g. if you have a multilingual site and clear cache then visit the homepage, does it happen then? If not, can you work out the sequence of events that cause it?
Downgrading to "major" as this is not a site-down event and it appears from reports the site mostly works, it is only the homepage that is affected and even then not all the time.
Comment #6
rbomhof commentedUnfortunately we have yet to be able to reproduce this by taking any manual steps, and didn't notice anything in lower environments. So far once per day or so our multilingual site is marked down, and the default english version of the homepage serves a 404 until we resave that node, at which point it pops back in and loads as normal. The error we see in the logs is similar to what is mentioned above:
Warning: Undefined array key "route:[language]=en:[query_parameters]=:/ " in Drupal\Core\Cache\DatabaseBackend->getMultiple() (line 155 of /code/web/core/lib/Drupal/Core/Cache/DatabaseBackend.php)
Comment #7
fkildoo commentedI encountered this issue today as well. I also have a bilingual website with English as the primary language. The home page would go to a 404 page - "The requested page could not be found". It resolved after I cleared the cache. I am on Drupal version 10.3.1. I also received this error in my logs:
Warning: Undefined array key "route:[language]=en:[query_parameters]=:/ " in Drupal\Core\Cache\DatabaseBackend->getMultiple() (line 155 of /var/www/usccr.gov/html/core/lib/Drupal/Core/Cache/DatabaseBackend.php)
I'm not sure if it's related but with Drupal version 10.3.1, it is required to set the state cache flag $settings['state_cache'] to TRUE in settings.php. My 404 issue happened before I made that change. See https://www.drupal.org/node/3177901.
Comment #8
banoodle commentedWe are encountering this problem as well. We have a client with a very high-profile humanitarian aid site that is being affected by this. I think this is critical. For our client, the homepage being down is as bad as the entire site being down. And the homepage will only answer up again if a human intervenes (in the middle of the night).
This site is hosted on Pantheon - curious if the other reported incidents are also on Pantheon?
I plan to open a support ticket with them. I will report back anything I learn.
Comment #9
fkildoo commented@banoodle Our site is on AWS Linux, not Pantheon.
Comment #10
banoodle commentedThanks @fkildoo. Interesting.
We are now looking into adding the
$settings['state_cache'] = TRUE;to settings.php mentioned in this change record.Not sure this is our problem, but we are hoping it might help.
Comment #11
longwaveIt should not be required to enable state caching - that is an opt-in feature added in Drupal 10.3 and enabled by default in Drupal 11. However if this does make a noticeable difference here, that is an interesting data point that we should look into further.
Comment #12
fkildoo commentedEnabling state caching didn't make a difference for me. I just received the same error in the log I received earlier. However, my home page is still ok.
Comment #13
banoodle commented@fkildoo after running the suggested queries in the change record, we have concluded we might benefit from setting $settings['state_cache'] to FALSE, so we are giving that a try. Our incidents of this problem are infrequent (2 in 5 days), so it will be difficult to assess the success or failure of this setting.
One bandaid we may try for now is to set up a ping to hit https://oursite.com/en because @pingwin_cracow mentioned that working to resolve their issue once it occurs. At least if it does happen again, we're thinking the ping might resolve the situation quickly. We are considering pinging the site every 3 minutes.Edit: our bandaid idea above was based on my misreading of the description. I now realize the reporter meant he configured /en in the language settings.
In our case, saving the homepage node resolved the issue.
Comment #14
catchThe same undefined array key warning shows up in a similar, old, bug report for redirect module #3257559: Redirection issue in multilingual setup, suggests we might be hitting similar conditions.
If the state caching situation is not a red herring, then this might be related to the changes we made to Drupal\Core\Routing\RoutePreloader in 10.3.0 which removed the separate cache item in front of the state API.
For people experiencing this:
1. Do you have redirect module enabled
2.Please report back if state caching makes a difference, and also if it doesn't, once you've had a chance to try it out.
Also it sounds from reports like this might be an intermittent issue - e.g. clearing cache temporarily fixes it for a day or so, then it re-occurs?
Comment #15
fkildoo commented1. Yes, I have the redirect module enabled
2. I have state caching enabled but I don't think it makes any difference. I received the error and the 404 on my home page before I enabled it. After I enabled it, I received the error once again but not a 404 error on the home page.
Today was the first I became aware of the issue. I checked my logs back to 7/6 and there were no other errors.
Comment #16
banoodle commentedYes, we are using the Redirect module.
Today, we set state cache to FALSE (because we had a high number of states in the key_value table, and the change record suggested this).
So far, it is too early to tell if this will help, but we will let you know if the problem recurs.
We have only seen the failure 2 times in the 5 days since we upgraded to 10.3. The client has monitoring enabled on the homepage, so we are pretty sure it has only been the 2 times.
Comment #17
banoodle commentedOh and we aren't sure if clearing caches will fix the issue. So far, our team has just saved the english translation of the homepage node to correct the problem when it happened. We've asked everyone to try flushing caches first if it recurs, because we are curious about that.
Comment #18
banoodle commentedWe had another occurrence of the 404 error on the homepage about 2 hours ago.
So it looks like the state change had no affect on this issue.
We were able to correct things by flushing/rebuilding cache.
Comment #19
catch#1491748: Crawlers hitting /%20 can replace the home page with a 404 is an old IIS specific issue but shows a 404 request essentially replacing the front page cache.
Another question for people hitting this - do you have browser language detection enabled? In which case it would be #2430335: Browser language detection is not cache aware
Comment #20
banoodle commentedWe aren't using Browser detection.
That first issue doesn't seem related (d7). In any case, I tried the repo steps here and couldn't create the problem that way.
Comment #21
banoodle commentedOne thing all 3 of our incidents have had in common, is they occurred when the site was relatively lightly visited. The site gets a lot of traffic, but this occurred when it wasn't getting much traffic.
Could just be a coincidence, I suppose.
Comment #22
banoodle commentedComment #23
banoodle commentedI got a hold of a copy of the database taken shortly before the latest homepage 404 happened.
In the cache data table, I see the culprit...

Once caches are flushed, this bad actor is eliminated. We still can't figure out how this bad actor gets added to cache in the first place.
The :/%20 entry beneath the top :/ entry does seem to support @Catch's suggestion of this issue being related, but even if I enter myulr.com/%20 on my site, I can't make the problem happen.
In case it is helpful, here is the bad actor's blob data:
a:3:{s:4:"path";s:2:"/ ";s:5:"query";a:0:{}s:6:"routes";O:41:"Symfony\Component\Routing\RouteCollection":4:{s:49:"Symfony\Component\Routing\RouteCollectionroutes";a:4:{s:8:"<button>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:0:{}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:4:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:8:"_no_path";b:1;s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}s:7:"<front>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:1:{s:6:"_title";s:4:"Home";}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:3:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}s:8:"<nolink>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:0:{}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:4:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:8:"_no_path";b:1;s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}s:6:"<none>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:0:{}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:4:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:8:"_no_path";b:1;s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}}s:50:"Symfony\Component\Routing\RouteCollectionaliases";a:0:{}s:52:"Symfony\Component\Routing\RouteCollectionresources";a:0:{}s:53:"Symfony\Component\Routing\RouteCollectionpriorities";a:0:{}}}The following is the blob data from the :%20 entry:
a:3:{s:4:"path";s:2:"/ ";s:5:"query";a:0:{}s:6:"routes";O:41:"Symfony\Component\Routing\RouteCollection":4:{s:49:"Symfony\Component\Routing\RouteCollectionroutes";a:4:{s:8:"<button>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:0:{}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:4:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:8:"_no_path";b:1;s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}s:7:"<front>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:1:{s:6:"_title";s:4:"Home";}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:3:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}s:8:"<nolink>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:0:{}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:4:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:8:"_no_path";b:1;s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}s:6:"<none>";O:31:"Symfony\Component\Routing\Route":9:{s:4:"path";s:1:"/";s:4:"host";s:0:"";s:8:"defaults";a:0:{}s:12:"requirements";a:1:{s:7:"_access";s:4:"TRUE";}s:7:"options";a:4:{s:14:"compiler_class";s:33:"Drupal\Core\Routing\RouteCompiler";s:8:"_no_path";b:1;s:4:"utf8";b:1;s:14:"_access_checks";a:1:{i:0;s:20:"access_check.default";}}s:7:"schemes";a:0:{}s:7:"methods";a:2:{i:0;s:3:"GET";i:1;s:4:"POST";}s:9:"condition";s:0:"";s:8:"compiled";O:33:"Drupal\Core\Routing\CompiledRoute":11:{s:4:"vars";a:0:{}s:11:"path_prefix";s:0:"";s:10:"path_regex";s:8:"{^/$}sDu";s:11:"path_tokens";a:1:{i:0;a:2:{i:0;s:4:"text";i:1;s:1:"/";}}s:9:"path_vars";a:0:{}s:10:"host_regex";N;s:11:"host_tokens";a:0:{}s:9:"host_vars";a:0:{}s:3:"fit";i:1;s:14:"patternOutline";s:1:"/";s:8:"numParts";i:1;}}}s:50:"Symfony\Component\Routing\RouteCollectionaliases";a:0:{}s:52:"Symfony\Component\Routing\RouteCollectionresources";a:0:{}s:53:"Symfony\Component\Routing\RouteCollectionpriorities";a:0:{}}}Comment #24
fkildoo commentedOur issue happened once again at 6 PM on Sunday. The same error and 404 occurred on the home page. Flushing the cache brings the page back. I did notice in the logs the automated cron was run right before it occurred.
We do not have browser language detection enabled.
Edit: As a band-aid, I'm thinking about setting up a cron job to flush the cache run right after the twice-a-day automated Drupal cron.
Comment #25
catch@banoodle can you double check the state of the database when there's no 404 - i.e. that you definitely don't have the
%20cache entry, and that it's the only difference from the working state.It would help if other people on this issue could take the same steps banoodle has taken too.
Comment #26
fkildoo commentedI ran the same query @banoodle did (below) but it returned an empty result because I ran it after I flushed the cache. I know, not very helpful.
SELECT * FROM `cache_data` WHERE `cid` LIKE '%route"[language]=en:[query_parameters]=:/' ORDER BY `cache_data`.`cid` ASCIf I enter myulr.com/%20 on my site, I get the same error in my log, but no 404 on the home page.
I have a dev site with almost the same setup and the issue is not occurring there. Caching and CSS/JS aggregation are enabled on both. I am trying to identify all the differences. One difference (which I don't think is preventing the issue) is my dev site is using the Search 404 module which will send the user to my custom Search page (using Solr Search) when a 404 occurs.
This is hard to reproduce and I'm not great at debugging. I have read through the related issues to see if I can get any useful info.
For now, my non-ideal temporary fix (fingers crossed) is to set up a cron job to clear the Drupal cache every 6 hours, 5 minutes after the automated Drupal cron is scheduled.
cd /var/www/site/html && /var/www/site/html/vendor/bin/drush --uri=https://siteurl crOther temporary fixes I've considered:
-- Add a rewrite rule to strip out trailing spaces (and trailing %20)
-- Exclude caching for the front page using the Cache Exclude module or another way
Comment #27
catchI can't reproduce the bug as such, but had a look at some things.
First I had a quick look at what it would take to sanitize the incoming URL to remove the trailing
%20before we do any routing.Symfony doesn't let you replace the request URI in the
Requestobject, which would mean either replacing the entire object in a middleware, or in index.php before the request object is created. I had a look at doing the latter, and it's possible, but it means that instead of example.com/%20 being a 404 (which it should be), it just gets treated the same as the front page (which makes sense because we'd be normalizing it to the same thing), but that doesn't seem like what we want at all.I checked an 11.x Umami install, and route lookup does get cached for invalid paths, so the caching itself is not the problem, but
%20does return something for the route collection differently to%20abcde- even if that doesn't lead to a 404 on the front page most of the time, it's a bug either way.Then I had a look at the
cache_datatables (this is with Umami installed, so a multilingual site).e.g.
I then noticed we have the
PathProcessorDecodeclass in core, this will decode %20 to a space, which then gets rtrimmed by RouteProvider.That rtrim leads to a mismatch between $this->currentPath (which will have the decoded space) and the path passed to ::getRoutesByPath().
If we trim in PathProcessDecode, then routing finds the front page - this seems wrong, but would probably workaround the issue here. I've added an MR for this, first to see whether we have any test coverage it would break, secondly for people to try out if they want.
What I think we should probably do is not decode the path at all, putting up an MR for that to see what happens to test coverage. If tests pass with that, it would also be useful to see if people could try applying that change as well.
Full steps to reproduce from a clean install would be really useful still here, but I think we've got enough to go on to make some progress.
Comment #31
catchI didn't realise this was against 10.3.x so have put up two 10.3 MRs..
Neither were fully done anyway, just to get a test run with different approaches, so I've add an additional 11.x MR which includes a deprecation in 10.4.x: https://git.drupalcode.org/project/drupal/-/merge_requests/8789
If that really doesn't break anything, we could potentially backport to 10.3 (without the deprecation).
If we only want to handle the front page in 10.3 and save the behaviour change for a minor release, we might want to keep exploring both approaches.
Comment #33
catchOK not decoding fails a lot of tests, so https://git.drupalcode.org/project/drupal/-/merge_requests/8790 is the one to look at - just adding a trim.
Comment #37
catchComment #38
catchComment #39
quietone commentedMaybe a change of title to "Normalize the incoming path with urldecode directly in RouteProvider"
I left a question in the MR, so setting to NW.
Comment #40
catchNew title is good.
I also tried to address the MR feedback.
Comment #41
catchHas anyone been able to test the MR on a site experiencing this issue?
Comment #42
longwaveAdded some code comments, I think the code could be cleaner and there is an open question about the cache key.
Comment #43
catchAccepted the suggestion and I'm not sure about the open question because we need to backport this to a patch release.
Comment #44
catchBack to needs review because I'm not sure what to do about #43.
I think we could add the extra parameter in 11.1 in a separate MR, and then the existing approach (probably with forward slashes trimmed too) for the backport branches.
Or go ahead here and open a follow-up to add the extra parameter in 11.1
Comment #45
longwaveMaybe we just copy the code to both methods for now, and consider unifying it later?
Comment #46
catchCopied the code more properly, added a todo and a follow-up issue at #3462696: Add path as a parameter to getRouteCollectionCacheId().
Comment #47
longwaveLooks good to me, plus gives a tiny performance improvement in every page load given we are one event listener lighter.
It would be really helpful if one of the reporters in this issue can test the patch but in the meantime this is RTBC.
Comment #48
larowlan@fkildoo, @rbomhof or @banoodle - any chance you could try the RTBC MR for a few days and report back if it fixes your issue?
Updating issue credits
Comment #49
fkildoo commented@larowlan - Sorry, you all lost me a while ago! I don't have any experience with or knowledge of testing a merge request. I know how to implement a patch with Composer so I'm currently reading up on how to download and use a patch file from a merge request. If successful, I will test it and report back.
On another note, this error has occurred 18 times starting on 7/12. Below are the frequency and times it has occurred, if that is helpful.
2024-07-22 13:45:26.248
2024-07-22 11:24:41.247
2024-07-20 11:48:51.248
2024-07-15 13:15:26.248
2024-07-14 23:12:35.248
2024-07-14 23:12:30.253
2024-07-14 23:12:15.248
2024-07-14 23:10:22.248
2024-07-14 23:00:52.248
2024-07-14 22:00:51.248
2024-07-12 19:15:05.441
2024-07-12 11:32:16.441
2024-07-12 11:28:39.149
2024-07-12 11:28:34.934
2024-07-12 11:24:59.982
2024-07-12 11:20:46.981
2024-07-12 11:00:48.981
2024-07-12 10:00:46.982
Comment #50
banoodle commented@larowlan we can try it, but ever since we set state cache to true last Saturday morning (7/13/24), we haven't had a recurrence of the problem (so it has been over a week since the last incident).
Comment #51
larowlanThanks @banoodle
@fkildoo - you can turn an MR into a patch by adding `.diff` to the URL - for this issue the patch is at https://git.drupalcode.org/project/drupal/-/merge_requests/8790.diff - however you shouldn't reference that URL directly (as anyone can get access to the branch). Instead you should download the file into e.g. a PATCHES folder inside your codebase, and then with composer patches reference it using a file URL (e.g. "./PATCHES/name-of-the-saved-file.patch". Thanks again 💪
Comment #52
fkildoo commented@larowlan Thanks, I figured it out and tested it on my dev site. Will apply to production soon and report back if I get the error again.
Comment #53
fkildoo commentedI applied the patch to my production site. So far so good. I've visited all the URLs below (with the site URL in front) and they all redirect to my home page without a 404 or an error in my log. I'll keep monitoring my logs for the error.
/%20
/index.php/
/
Comment #54
john_b commentedI have seen the issue on a site which never was bilingual, and uses language undefined. Not sure what triggered it.
Comment #55
banoodle commented@larowlan looks like I spoke too soon: we had an incident last night. This was with state_cache set to true. We had gone 10 days since the last incident.
We have applied this issue's patch and will let you know if we have another incident.
Comment #56
fkildoo commentedJust an update... 3 days since applying the patch and there have been no errors. I assume the patch did the trick, but the last time there were 5 days between errors, so I can't be 100% sure yet. @banoodle went over 10 days between errors. If we both go over 11 days without seeing the error, I'd say it's probably good. I'll report back in a few days.
Comment #57
larowlanThanks
Comment #58
xjmPosted a comment on the MR. The deprecation for this should be targeting 11.1. If it gets backported to Drupal 10, that should happen without any added deprecations. No?
Comment #59
catchComment #60
allahnoor turab commentedThanks @larowlan,@banoodle. I applied the patch #51 on multilangual production site yesterday, so for gives positive result.
Comment #61
catchSince the deprecation change was minimal, moving back to RTBC.
Comment #62
john_b commentedShouild be "Critical". Since last D10 core update, our monolingual site is randomly returning 404s for key landing pages. We will be grateful for a fix in D10, where patch does not apply cleanly.
Comment #63
alexpottUnfortunately I'm not sure that this fix is going to resolve the underlying issue as reported initially here. I think the problem is due to case-insensitive queries in MySQL - see #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item - that issue also suggests a work around but we need to fix the underlying issue.
Comment #64
catchI think we should work on #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item separately to the bug discovered here. We don't know exactly what caused the 10.3.0 issue, and both predate 10.3.0.
For clarification, the bug I found here is that a request to https://example.com/%20 matches routes like this even though it ends up serving a 404 response:
Whereas a request to https://example.com/ matches the front page, and a request to https://example.com/%20abc is a proper 404. But it will probably be a week or so before we can confirm whether it fixes the bugs reported here or not due to the bug itself being so intermittent.
Comment #65
alexpottCan people who have the problem originally reported here confirm the following details:
select cid from cache_page limit 1;and then use the cid returned to a select where you have changed the case of the cidselect cid from cache_page where cid = 'cid_case_changed';show create table cache_page;(mysql) or the CLI commandpg_dump --table cache_page --schema-only(postgres)Thanks for helping! Eventually this will help the other issue.
Comment #66
catchMR up on #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item now, quick summary of the differences between two issues:
1. The bug I found here (which may or may not fix the reported issue) is that the routing system returns different results for
example.com,example.com%20andexample.com%20abcd, and the result forexample.com/%20is completely invalid - the client gets a 404, but the routes returned include 'fake' routes defined with an empty string for things like<button>. This is partly at least because the routing system inconsistently handles trailing whitespace in URLs.2. #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item is for when a literal trailing space is passed to the database cache backend as part of the cache ID, .e.g.
|foo |. MySQL conditions on varchars can (but not always, e.g. LIKE doesn't) treat|foo|and|foo |as the same, meaning that|foo |returns a cache item when it shouldn't, but then breaks the cache backend logic that reconciles the PHP cache IDs to what's stored in the database. The MR there ensures that cids with trailing spaces get hashed so that they're treated differently.These are somewhat related in that they both deal with trailing spaces, but separate bugs in between the routing and cache systems.
It seems like there is some kind of potential scenario where a URL like
|example.com%20|gets URL encoded to|example.com |which then gets passed to the database cache backend, but I can't yet see how that would happen in the routing system at least.Comment #67
alexpottVery happy to see all the new information and understanding. Feel like we're getting to a decent understanding. I guess the question we should ask here is whether or not we should be rtrim'ing paths. Is this expected behaviour? Also I'm pondering about path column in the router table and how lookup with trailing spaces should work on that. Ah interesting. It's sensitive to trailing spaces - this is because it is varchar... the cid column is varchar binary - it's the binary part that makes it case sensitive but not right padding sensitive. FUN.
I'm a bit conflicted with the fix here though. I think returning a 404 for http://example.com/%20 is better than normalising. But that's a gut feel rather than anything backed by hard evidence. This change would normalise it to homepage.
Comment #68
alexpottI've tested a few sites... wordpress seems to normalise %20 and rtrim... see https://www.nasa.gov/%20 and other wordpress sites... but other big websites like google, amazon all 404.
Comment #69
alexpottDrupal 7 404s... https://www.drupal.org/%20 :D
Comment #70
thejimbirch commentedUg.
I’ve made my team aware of the request in #65. banoodle or rbomhof will post an update.
Comment #71
catchI have a slight preference for this too, but technically doing it the opposite way around opens up a massive can of worms.
The URL decode in
urldecode()in PathProcessorDecode::processInbound() naturally strips the encoded space from the end of the URL.We would need to explicitly detect
%20at the end of a URL, and then either add it back after url decoding (which is not really url decoded then), or set the current path to the path with whitespace at the end, pass it into ::getRoutesByPath() etc. all of which is not a valid URL at all - i.e. you can't have an actual URL with a literal space at the end of it, that's invalid.That means the only real alternative is to explicitly add back the
%20after URL decoding:I guess I can make an extra MR that does that instead, and see what the pipeline looks like.
Comment #72
catchYeah so the problem if we do that, is that we then immediately run into #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item - i.e. if you hit example.com/%20 first, you get a 404, and the cache key that gets stored is for example.com/, so that gets a 404 too. But cane make it work if we merge #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item in. I updated the new tests here, but deliberately didn't run any other tests so we can get a clear idea of what else breaks, if anything.
Comment #74
fkildoo commented@alexpott in response to #65:
1. My database is MySQL v8.0.39
2. Running the query "select cid from cache_page limit 1;" returns 1 result (https://www.example.com//blog:), but I'm unclear on the rest of your request -- "...use the cid returned to a select where you have changed the case of the cid select cid from cache_page where cid = 'cid_case_changed';"
3. ?
Since applying the patch I haven't received the error. My Drupal version is 10.3.1.
Comment #75
catchhttps://git.drupalcode.org/project/drupal/-/merge_requests/9046 is green and is less ugly than I thought - we just have to restore the %20 at the end of the URL after urldecode() and incorporate #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item.
Comment #76
banoodle commented@alexpott thank you for working on this. Here are the requested details from our site:
Database: 10.4.25-MariaDB-log
The first query returns a cid of:
https://www.rescue.org/eu/search-eu/site/Child%20protection?page=16:If I change the case and run this query, it returns no results:
select cid from cache_page where cid = 'https://www.rescue.org/eu/search-eu/site/child%20protection?page=16:';Here is the output from
show create table cache_page;:Comment #77
fkildoo commentedAdd on to my comment #74:
2. I ran the query: "select cid from cache_page where cid = 'https://www.example.com//blog:';" which returned the same result as the first part.
3. Output of show create table cache_page:
Comment #78
alexpott@banoodle and @fkildoo thanks for the info - we found the bug in the cache layer and it has been fixed and will released in the next non-security releases of Drupal 10 and 11. See #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item. I'm pretty sure that's going to fix the issue that was originally reported here. This issue will now address the bug that @catch found when doing http://example.com/%20
Comment #79
smustgrave commentedCan anyone confirmed if the issue mentioned in #78 addressed the issue here please?
Comment #80
banoodle commented@smustgrave I'm not sure how we could confirm it. Since applying the MR on this issue over 2 weeks ago, there haven't been any further incidents.
We have never been able to reproduce the 404 on the homepage on-demand, so I think the only way we could validate this, would be to rollback the MR fix, apply the patch for the other issue and then wait to see if it happens again. The longest error-free interval we had when we still had the problem was 10 days.
Also we are on 10.3.1 and it looks like they haven't provided a patch for that yet (but from the latest comment it looks like one may be forthcoming soon). If it is, would we want to apply that on top of the MR from this issue?
Comment #81
catch@smustgrave 'needs more info' isn't particularly helpful here, we know there's a bug in the current logic by reading it and the contents of the router cache with the steps to reproduce in the test coverage. What we don't know is whether this issue or the other bug report recently fixed definitively fixes the original 404 issue people have been seeing, but since it's so hard to reproduce, I personally think we should go ahead with the issues individually, and then ask people to report back if they ever see the bug again. If they don't, then great we fixed it, if they do, then we might have yet a third bug to track down, but the two MRs definitely fix bugs, even if we can't guarantee it's what fixes the original 404 report.
@banoodle https://www.drupal.org/project/drupal/releases/10.3.2 includes the database cache backend fix for the other issue. If you update to that, and never get the problem again, that issue by itself may have been enough to fix it.
The diff from the MR here should apply to 10.3.x I think, but would probably need to wait and see whether you get the issue again on 10.3.2 first, then apply the change from here, then wait again to see if it re-occurs.
Comment #82
tonka67 commentedI just upgraded to 10.3.2 and still see this issue. My site is English, not bilingual. It seems to affect one set of content page types, but not others, but they're identical except for the block layout.
Comment #83
catchThe bug has so far only been reported for the front page on multi-lingual sites - are you getting 404s on other pages? Could use a bit more information.
Comment #84
tonka67 commentedYes, I've got one content type set to display author bio pages. It's a simple node with the following field types:
image, markup, body (text formatted, long, with summmary), 2 text (plain), and link field
It does have a views EVA footer, though, so I'm looking now to see if that might be the cause.
Comment #85
tonka67 commentedHmm, so the error disappeared in production. I cloned production back into dev (I was overdue for a cleanup reset anyway), and it cleared. I'm going to assume it persisted due to a legacy dev experiment that no longer exists.
But, yes, the issue did affect pages for me other than the home page. I also had a view last week that suddenly failed and triggered the same warning.
I'll keep an eye on things and will post if I see it again.
Comment #86
tonka67 commentedAaaand, it's back. Again, this is on a node page, not the front page. I was building the page directly from the node content. When I rebuilt it in Layout with view blocks and a contextual filter instead, the error cleared. Since this is a routing problem, I'm guessing it has something to do with passing the parameter from the URL to pull the correct content directly from the node.
I tried stripping out the EVA and anything other non-standard fields. Same issue.
For what it's worth, I'm also on Pantheon.
Comment #87
john_b commented@catch
I have seen the bug on a monolingual site, on pages with content type='home' which are not front page. However I do not know if it was a white space bug or a %20 bug. I am not aware of any occurrence since applying the white space patch.
The problem node had langcode 'und'. Nodes migated from D5/6/7 have langcode 'und'. Nodes created post-migration have langcode 'en'. In some cases nodes with langcode 'und' redirect to nodes with langcode 'en'.
Comment #88
smustgrave commentedCan Steps to reproduce section be filled out please
Comment #89
catchAdded steps to reproduce. The original report was fixed by #3368537: Ensure trailing whitespace at the end of a cache ID results in a unique cache item however in the process of debugging that I found the route normalization bug too.
Comment #90
catchComment #91
smustgrave commentedHey sorry but there are 2 MRs both against 11.x which one should be reviewed?
Comment #92
catch9046 is the one to review. I moved the other to a draft rather than hiding because it's an alternative approach.
Comment #93
smokrisChanging status back to Needs review since catch clarified which MR to review.
Comment #94
needs-review-queue-bot commentedThe Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".
This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #95
smokrisI rebased !9046 on current
11.x; setting status back to Needs review.Comment #97
dcam commentedI'm unable to verify that the fix corrects the problem as described in the steps to reproduce. When I try it with the MR I get the same 404 result and the
route:[language]=en:[query_parameters]=:/%20cid. I've tried it a few times now and I don't think I'm missing anything. I'm definitely on the3456244-11x-dont-trim-pathbranch.In any case, the deprecation versions need to be updated. So I'm setting this to Needs Work.