With a page path like /fr/article/à-propos-de-nous we're seeing the module's cache tag cause a fatal database exception.

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[HY000]: General error: 1271 Illegal mix of collations for operation ' IN ': SELECT tag, invalidations FROM {cachetags} WHERE tag IN ( :tags__0, :tags__1, :tags__2, :tags__3, :tags__4, :tags__5, :tags__6, :tags__7, :tags__8, :tags__9, :tags__10, :tags__11, :tags__12, :tags__13, :tags__14, :tags__15, :tags__16, :tags__17, :tags__18, :tags__19, :tags__20, :tags__21, :tags__22, :tags__23, :tags__24, :tags__25, :tags__26, :tags__27, :tags__28, :tags__29, :tags__30, :tags__31, :tags__32, :tags__33, :tags__34, :tags__35, :tags__36, :tags__37, :tags__38, :tags__39, :tags__40, :tags__41, :tags__42, :tags__43 ); Array ( [:tags__0] => config:block.block.customblockslandingpageteasersblock [:tags__1] =>
...
[:tags__40] => optimizely:* [:tags__41] => optimizely:/article/à-propos-de-nous-0 [:tags__42] => optimizely:/node/* [:tags__43] => optimizely:/node/1391 ) in Drupal\Core\Render\RenderCache->set() (line 248 of core/lib/Drupal/Core/Render/RenderCache.php).

I'm not sure why both the node/ID path and the alias need to be tagged, but I think the path alias should be encoded as follows, on line 238 of optimizely.module, to prevent this error.

  if ($current_path_alias) {
    $page['#cache']['tags'][] = 'optimizely:' . UrlHelper::encodePath($current_path_alias);
  }
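
For illustration, here is a minimal standalone sketch of what that encoding does to the failing alias. It assumes UrlHelper::encodePath() behaves like rawurlencode() with slashes left intact; the helper name encode_path_sketch() is hypothetical and not part of the module. The file must be saved as UTF-8.

```php
<?php

/**
 * Hypothetical stand-in for UrlHelper::encodePath(): percent-encode a path
 * but keep "/" readable. Assumed behavior, not copied from the module.
 */
function encode_path_sketch(string $path): string {
  return str_replace('%2F', '/', rawurlencode($path));
}

$alias = '/article/à-propos-de-nous-0';
$tag = 'optimizely:' . encode_path_sketch($alias);

// "à" is two bytes in UTF-8 (0xC3 0xA0), so it becomes "%C3%A0".
echo $tag, PHP_EOL;

// TRUE when the tag contains only 7-bit characters, i.e. it fits an ASCII column.
var_dump(preg_match('/[^\x00-\x7F]/', $tag) === 0);
```

With the encoded form, every character in the tag is plain ASCII, so the comparison against an ASCII column no longer mixes collations.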

Attachment (comment #2): accent-cache-tag-2834892-1.patch, 643 bytes, by awolfey

Comments

awolfey created an issue.

awolfey’s picture

Status: new, file size: 643 bytes

Here's a patch.

tz_earl’s picture

Assigned: Unassigned » tz_earl

@awolfey Thanks a lot for reporting this and for the work you've done on it.

Note that when adding the cache tags, it's also necessary to check that the identical tag names are being used for cache invalidation.

I should be able to take a close look at this issue and your patch in about a week or so.

awolfey’s picture

@tz_earl, for cache invalidation, would the paths passed in to doRefresh() also need to be encoded in order to clear the cache correctly?

tz_earl’s picture

@awolfey Again, I haven't taken a good look at this yet, but the tag names passed into doRefresh() must match the tag names that are attached when the page is rendered. It's just a string match.
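
The string-match requirement can be sketched in plain PHP with no Drupal dependencies. The function names optimizely_tag_for_path(), tags_attached_on_render(), and tags_to_invalidate() are hypothetical; they only illustrate routing both sides through one helper, and are not the module's actual code or the signature of doRefresh().

```php
<?php

/** Hypothetical single source of truth for the tag name of a path. */
function optimizely_tag_for_path(string $path): string {
  // Assumed encoding: percent-encode everything except "/".
  return 'optimizely:' . str_replace('%2F', '/', rawurlencode($path));
}

/** Tags a page would carry when rendered (illustrative only). */
function tags_attached_on_render(string $alias): array {
  return [optimizely_tag_for_path($alias)];
}

/** Tags an invalidation pass would target for the given paths. */
function tags_to_invalidate(array $paths): array {
  return array_map('optimizely_tag_for_path', $paths);
}

$alias = '/article/à-propos-de-nous';
$attached = tags_attached_on_render($alias);
$invalidated = tags_to_invalidate([$alias]);

// Invalidation only reaches the page if the strings are identical.
var_dump(array_intersect($attached, $invalidated) === $attached);

// A raw, unencoded tag built on one side only is never found among the attached tags.
var_dump(in_array('optimizely:' . $alias, $attached, true));
```

If only the render side encodes the alias, the invalidation side would target a tag that was never attached, and the cached page would not be cleared.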

tz_earl’s picture

@awolfey I have not been able to reproduce this by creating an article with an alias that contains an accented character, namely, "/à-propos-de-nous".

But please note that I'm using the following.

(1) Optimizely 8.x-1.1, which is the newest release. This version incorporates a rewrite of the caching mechanism so that it works more efficiently. You really should upgrade to this version of the module if you haven't already. It should be forward-compatible with your site.

(2) Drupal 8.1.0, default installation and configuration.

(3) MySQL 5.5. Collation for all the tables shows as utf8mb4_general_ci. I'm not doing anything multilingual, so that may make a difference. If you can, could you check the collation type of the tables?

tz_earl’s picture

Status: Active » Postponed (maintainer needs more info)

awolfey’s picture

Hi tz_earl,

I am using 8.1.1, but that version isn't listed in the version select list when creating an issue. I'll look into your other questions.

Thanks.

grendzy’s picture

Version: 8.x-2.15-rc1 » 8.x-3.0
Status: Postponed (maintainer needs more info) » Needs work

This is occurring for me on 8.x-3.0. The error is:

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[HY000]: General error: 1267 Illegal mix of collations (ascii_general_ci,IMPLICIT) and (utf8mb4_general_ci,COERCIBLE) for operation '=': SELECT tag, invalidations FROM {cachetags} WHERE tag IN ( :tags__0 ); Array ( [:tags__0] => optimizely:/blog/staying-organized-–-tools-trade ) in Drupal\Core\Render\RenderCache->set() (line 269 of drupal/core/lib/Drupal/Core/Render/RenderCache.php).

This is with MySQL 5.5.53. The cachetags table default is utf8mb4, and the "tag" column is ASCII:

mysql> show create table cachetags\G
*************************** 1. row ***************************
       Table: cachetags
Create Table: CREATE TABLE `cachetags` (
  `tag` varchar(255) CHARACTER SET ascii NOT NULL DEFAULT '' COMMENT 'Namespace-prefixed tag string.',
  `invalidations` int(11) NOT NULL DEFAULT '0' COMMENT 'Number incremented when the tag is invalidated.',
  PRIMARY KEY (`tag`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='Cache table for tracking cache tag invalidations.'

grendzy’s picture

Instead of URL-encoding, I think hashing the alias would be more reliable. The url_alias table stores aliases as varchar(255) in utf8mb4. In the worst case, an alias would require 3,060 bytes to store as URL-encoded ASCII (255 characters × 12 bytes, since a 4-byte UTF-8 character takes twelve bytes when formatted as %xx%xx%xx%xx).

But the cachetags "tag" column can only fit 255 bytes (including the 11 bytes used for the "optimizely:" prefix).
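
The size arithmetic, and the fixed-length alternative, can be sketched in plain PHP (7.0 or later for the "\u{...}" escape). The choice of SHA-256 and the helper name hashed_cache_tag() are assumptions for illustration; nothing here is taken from the module.

```php
<?php

const TAG_PREFIX = 'optimizely:';
const TAG_COLUMN_LIMIT = 255;

/** Hypothetical helper: build a fixed-length, ASCII-only tag from an alias. */
function hashed_cache_tag(string $alias): string {
  // hash() returns lowercase hex, so the result is plain ASCII.
  return TAG_PREFIX . hash('sha256', $alias);
}

// Worst case for URL-encoding: 255 characters that each need 4 bytes in UTF-8.
$worst_alias = str_repeat("\u{1F600}", 255);
$encoded_length = strlen(rawurlencode($worst_alias));
echo 'URL-encoded length: ', $encoded_length, PHP_EOL;

// The hashed tag has the same length no matter how long the alias is.
$tag = hashed_cache_tag($worst_alias);
echo 'Hashed tag length: ', strlen($tag), PHP_EOL;

// TRUE when the hashed tag fits within the column limit.
var_dump(strlen($tag) <= TAG_COLUMN_LIMIT);
```

The trade-off is that a hashed tag is no longer human-readable, and the invalidation side has to hash exactly the same input string to produce a matching tag.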

tz_earl’s picture

@grendzy Thanks very much for reporting another occurrence of this problem.

Are you saying that this issue is still occurring with version 8.x-3.0 of the module?

If so, please provide me with an example of a path that is causing it.

That's a good thought about using a hash value of the path as the tag rather than its urlencoding as a way to prevent the size of the tag column from being a problem. Thank you for noticing this and mentioning a possible solution.

tz_earl’s picture

@grendzy My apologies for asking for information that was in your original comment, which I would have noticed if I'd read your comment more carefully.

Thanks again for your bug report, and also for your idea about using a hash to generate a tag.