Problem/Motivation

When Simple XML Sitemap receives a request with non-ASCII characters in the variant, an exception is thrown. Repeated requests of this kind could fill up the logs or be used for denial-of-service attacks. Simple XML Sitemap should not route such requests.

Currently, the inbound path processor takes the variant without checking its characters and rewrites {variant}/sitemap.xml to the route path /sitemaps/{variant}/sitemap.xml. From there, the variant is compared against configuration to see whether it is a defined variant. However, the config table stores the config name using the ASCII character set and the ascii_general_ci collation, so the comparison fails with "Illegal mix of collations".

Steps to reproduce

  • Install Simple XML Sitemap.
  • Request /bfg6417%EF%BC%9Cs1%EF%B9%A5s2%CA%BAs3%CA%B9hjl6417/sitemap.xml
  • Note that an exception is generated: SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (ascii_general_ci,IMPLICIT) and (utf8mb4_general_ci,COERCIBLE) for operation '=': SELECT "name", "data" FROM "config" WHERE "collection" = :collection AND "name" IN ( :names__0 ); Array ( [:collection] => [:names__0] => simple_sitemap.sitemap.bfg6417<s1﹥s2ʺs3ʹhjl6417 )
    Source: /var/www/html/docroot/core/modules/mysql/src/Driver/Database/mysql/ExceptionHandler.php:56
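To see which characters in the reproduction request fall outside ASCII, the variant segment can be percent-decoded and inspected. The following is a small standalone Python sketch for illustration only. It is not part of the module, and the variable names are made up for this example.

```python
from urllib.parse import unquote

# Percent-encoded variant segment taken from the reproduction request.
encoded_variant = "bfg6417%EF%BC%9Cs1%EF%B9%A5s2%CA%BAs3%CA%B9hjl6417"

# unquote() decodes percent-escapes as UTF-8 by default.
decoded_variant = unquote(encoded_variant)

# Collect every character outside the 7-bit ASCII range with its code point.
non_ascii = [(ch, f"U+{ord(ch):04X}") for ch in decoded_variant if ord(ch) > 0x7F]

for ch, code_point in non_ascii:
    print(code_point, ch)
```

Each of the four multi-byte sequences decodes to one non-ASCII character. A variant containing any such character is the kind of value that cannot be compared against the ASCII-collated config name column.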

Proposed resolution

This could be fixed in the path processor (ensuring that $arg 1 contains only ASCII characters before altering the path), but I think a much simpler fix is to add a requirement to the simple_sitemap.sitemap_variant route: variant: '^[\x21-\x2E\x30-\x7E]+$'
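As a rough illustration of what the proposed pattern accepts and rejects, here is a standalone Python sketch. It only exercises the character class itself and says nothing about how Drupal or Symfony routing applies a requirement. The helper name is_routable_variant and the sample variant "default" are assumptions made for this example.

```python
import re

# Pattern proposed above: printable ASCII, excluding space (0x20) and "/" (0x2F).
VARIANT_PATTERN = re.compile(r"^[\x21-\x2E\x30-\x7E]+$")


def is_routable_variant(variant: str) -> bool:
    """Return True if every character of the variant is in the allowed ASCII range."""
    # fullmatch() requires the whole string to match, so a trailing newline
    # or an empty string is rejected as well.
    return VARIANT_PATTERN.fullmatch(variant) is not None


# Hypothetical plain-ASCII variant name.
print(is_routable_variant("default"))

# Decoded variant from the reproduction request (contains U+FF1C, U+FE65, U+02BA, U+02B9).
print(is_routable_variant("bfg6417\uff1cs1\ufe65s2\u02bas3\u02b9hjl6417"))
```

The same kind of check could be applied in either place discussed here, as a route requirement or as a guard in the path processor before the path is altered.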

Comments

dgroene created an issue. See original summary.

dgroene’s picture

Patch for 4.2.2

gbyte’s picture

Very good! How did you catch it?

Your solution didn't work for some reason, but I don't mind making the check in the context of loading the variant, which would be the path processor (the alternative you suggested). What do you think of the patch?

dgroene’s picture

@gbyte: your version is better and is working for me. I am not sure whether you created it against dev, but it did not apply for me in 4.2.2, so I am attaching one that does apply.

  • gbyte committed 9debae02 on 4.x
    [#3554196] fix: Non-Ascii Characters In Request Variant Cause Exception...
gbyte’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.