I have enabled the path prefix for URL language detection (http://drupalsite.docksal/en).
When I try to open /sitemap.xml (http://drupalsite.docksal/sitemap.xml), a nonexistent route is processed.

/sitemap.xml 301 redirects to /en/sitemap.xml
/en/sitemap.xml 301 redirects to /en/sitemaps/en/sitemap.xml
/en/sitemaps/en/sitemap.xml returns 404

After some investigation I found that this bug is related to the changes in https://www.drupal.org/project/simple_sitemap/issues/2930027
According to the comment https://www.drupal.org/project/simple_sitemap/issues/2930027#comment-127...
the URL template for a sitemap variant is {variant}/sitemap.xml.

This change leads to incorrect sitemap processing, because the language path prefix is processed as a sitemap variant, and no variant with that name exists.
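
To illustrate the collision, here is a minimal sketch in plain PHP. It is not the module's code: it assumes the variant template is matched as a single path segment followed by /sitemap.xml, and the function name matchSitemapVariant is hypothetical.

```php
<?php

/**
 * Hypothetical helper: returns the segment that a "{variant}/sitemap.xml"
 * style template would capture from a path, or NULL if it does not match.
 */
function matchSitemapVariant(string $path): ?string {
  // Exactly one path segment, then "/sitemap.xml", anchored at both ends.
  if (preg_match('#^/([^/]+)/sitemap\.xml$#', $path, $matches) === 1) {
    return $matches[1];
  }
  return NULL;
}

// The language prefix "en" is captured as if it were a sitemap variant.
var_dump(matchSitemapVariant('/en/sitemap.xml'));

// No leading segment, so the variant template does not match.
var_dump(matchSitemapVariant('/sitemap.xml'));
```

Under these assumptions the first call yields "en", which is a language prefix rather than a configured variant, so the lookup that follows has nothing to serve.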

Comments

knyshuk.vova created an issue. See original summary.

knyshuk.vova’s picture

Status: Active » Needs review
Attached file (new): 2.38 KB

I overrode the LanguageNegotiationUrl class so that a path containing "sitemap.xml" is left unchanged when it is opened.

gbyte’s picture

Hi knyshuk.vova,

thanks for catching and patching this!

I have mainly altered the logic of the plugin class, because it appears that in your code calling parent::processOutbound() without returning its result does nothing. Can you please apply my patch and test whether everything works?

Please also explain how you get this error, so I can test the patch as well. When I enable two languages and language detection by prefix, /sitemap.xml does not redirect to {prefix}/sitemap.xml. What am I missing?

gbyte’s picture

Priority: Major » Normal
Status: Needs review » Postponed (maintainer needs more info)
aspilicious’s picture

You need to set a prefix on the language to see the problem.

aspilicious’s picture

Status: Postponed (maintainer needs more info) » Needs review
Attached file (new): 1.78 KB
gbyte’s picture

Status: Needs review » Needs work

@aspilicious I have done that and I do not get the issue. Could you please describe exactly how I can reproduce this on e.g. simplytest.me?

This issue is now blocking the release:
#3002212: Create a stable release for 3.x

One more thing I don't like about the patch is

if (strpos($path, 'sitemap.xml')) {
  return $path;
}

A path may contain 'sitemap.xml' without being a simple_sitemap path. Can we make this string check more specific?
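
One way to tighten the check, offered only as a sketch and not as the check that was later committed, is to match the whole path against the shapes a sitemap URL can take instead of searching for a substring. The function name isSitemapPath is hypothetical, and the sketch assumes sitemap paths are either /sitemap.xml or /{variant}/sitemap.xml. Note also that strpos() returns 0, which is falsy, when the match is at the very start of the string, so a bare if (strpos(...)) misses that case.

```php
<?php

/**
 * Hypothetical helper: TRUE only for "/sitemap.xml" or "/{segment}/sitemap.xml".
 */
function isSitemapPath(string $path): bool {
  // Optional single leading segment, then the literal file name, fully anchored.
  return preg_match('#^/(?:[^/]+/)?sitemap\.xml$#', $path) === 1;
}

var_dump(isSitemapPath('/sitemap.xml'));                  // bool(true)
var_dump(isSitemapPath('/default/sitemap.xml'));          // bool(true)
var_dump(isSitemapPath('/blog/about-sitemap.xml-files')); // bool(false)

// The substring check returns int(0) here, which a bare if () treats as false.
var_dump(strpos('sitemap.xml', 'sitemap.xml'));           // int(0)
```

Anchoring the pattern at both ends is what keeps unrelated paths that merely contain the string from being skipped by language negotiation.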

  • gbyte.co committed ddaeedd on 8.x-3.x
    Issue #2996734 by gbyte.co, knyshuk.vova: Page not found opens instead...
gbyte’s picture

Status: Needs work » Fixed

I have not had the weird redirection issue, but I have seen the language negotiation trying to process the sitemap path, which is not necessary. Hence I am committing this. The check for the sitemap path is now more thorough; please test this and get back to me.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mitrpaka’s picture

@gbyte.co Please re-open this issue.

Problem/Motivation

On multilingual sites where language negotiation is enabled with a path prefix, the [webroot]/sitemap.xml path is processed in PathProcessorSitemapVariant to /sitemaps/[language prefix]/sitemap.xml. As a result, a 404 is returned instead of the sitemap.xml content.

How to reproduce

- Add two or more languages to the site.
- Go to Detection and selection tab (/admin/config/regional/language/detection).
- Enable the URL detection method and add a path prefix configuration for each language.
- Enable, configure and create sitemap.xml
- Try to open /sitemap.xml

Proposed resolution

See patch attached

gbyte’s picture

Status: Closed (fixed) » Needs review

Anyone else still having problems with this? This would be a 3.0 blocker.

mitrpaka’s picture

Status: Needs review » Active

I did an additional test run with this and now have a better understanding of why the [webroot]/sitemap.xml path does or does not work with path prefixes.

On multilingual sites where language negotiation is enabled with path prefix:
- If the site is not using the Redirect module (or its "Enforce clean and canonical URLs" option, which is enabled by default, is disabled), the [webroot]/sitemap.xml path will work, but e.g. [webroot]/en/sitemap.xml will not (404 error).
- If the Redirect module is enabled with the "Enforce clean and canonical URLs" option, the [webroot]/sitemap.xml path is processed in PathProcessorSitemapVariant to /sitemaps/[language prefix]/sitemap.xml, which causes a 404 error.

Patch applied in #12 should address both use cases.

How to reproduce (revised)

- Add two or more languages to the site.
- Go to Detection and selection tab (/admin/config/regional/language/detection).
- Enable the URL detection method and add a path prefix configuration for each language.
- Enable, configure and create sitemap.xml
- Try to open /sitemap.xml (Should work if Redirect module is not enabled)
- Try to open /en/sitemap.xml (Should fail)
- Add and enable Redirect module
- Try to open /sitemap.xml (Should fail)
- Try to open /en/sitemap.xml (Should fail)

mitrpaka’s picture

Status: Active » Needs review
kirkkala’s picture

8.x-3.0-rc3 with our multilingual setup with path prefix (and fi as the default language) serves a 404 when sitemap.xml is requested from the root.

Examples without patch:

example.com/sitemap.xml -> /sitemaps/fi/sitemap.xml (404)
example.com/fi/sitemap.xml -> /sitemaps/fi/sitemap.xml (404)

The workaround was to add a path alias to the database pointing sitemap.xml to /sitemaps/default/sitemap.xml

But RC3 with patch #12 fixes the issue and makes alias redundant:

example.com/sitemap.xml -> /fi/sitemap.xml (displays the default sitemap /sitemaps/default/sitemap.xml)
example.com/fi/sitemap.xml -> /fi/sitemap.xml (displays the default sitemap /sitemaps/default/sitemap.xml)

It would be good to have someone else test this as well and then mark it RTBC.

gbyte’s picture

Thanks for testing guys. I've altered some of it, would you mind testing the prettified patch again? I can't test myself right now, but I am happy to commit if you find it working for you.

  • gbyte.co committed 60b09f4 on 8.x-3.x
    Issue #2996734 by gbyte.co, mitrpaka, kirkkala: Page not found opens...
gbyte’s picture

Status: Needs review » Fixed

The problem with the above approach is sitemap.xml still redirects to /en/sitemap.xml with the redirect module's 'enforce clean and canonical URLs' option enabled.

I am implementing a cleaner approach by reverting the sitemap's path processor code to what it was before and adding the _disable_route_normalizer directive to the routes.
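
As a rough sketch of what such a directive can look like in a routing file: the route name and controller below are hypothetical placeholders, not the module's actual definitions, and the fragment assumes the Redirect module reads _disable_route_normalizer from the route defaults (which become request attributes) and skips normalization when it is set.

```yaml
# Hypothetical route definition, for illustration only.
example.sitemap_default:
  path: '/sitemap.xml'
  defaults:
    # Placeholder controller; not the module's real class name.
    _controller: '\Drupal\example\Controller\ExampleSitemapController::getSitemap'
    # Assumed opt-out flag for the Redirect module's route normalizer.
    _disable_route_normalizer: 'TRUE'
  requirements:
    _access: 'TRUE'
```

With the flag on the route, the canonical-URL redirect is not applied to the sitemap request, so no language prefix is prepended before the path is resolved.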

Test cases with the English language prefix enabled:

redirect module disabled:

  • sitemap.xml -> 200 -> pass
  • default/sitemap.xml -> 200 -> pass
  • en/sitemap.xml -> 404 -> pass
  • en/default/sitemap.xml -> 404 -> pass

redirect module enabled along with 'enforce clean and canonical URLs' option:

  • sitemap.xml -> 200 -> pass
  • default/sitemap.xml -> 200 -> pass
  • en/sitemap.xml -> 404 -> pass
  • en/default/sitemap.xml -> 404 -> pass

I have committed this because the above test cases all pass. Thank you @mitrpaka for establishing the redirect module as the culprit.
It would be nice if one of you could add functional tests.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.