Problem/Motivation

During cache clears or other activity that leads to a lot of config cache writes, race conditions can result in uncatched database exceptions, which exception seems to depend on the MySQL transaction level, it can either be a deadlock or an integrity constraint violation.

To reproduce, run something like ab -n 100 -c 10 http://d8/ and then do one or multiple drush cr while it is running.

Proposed resolution

The current patch addresses the problem in two ways:

- Provide a mysql specific implementation that uses REPLACE queries, this does not only avoid the use of transactions, it also just needs a single query and it is safe, as the actual change operation is atomar, it will transparently update existing data.

- Add a exception check for the integrity constraint violation exception, we have no generic exception for the deadlock, so we can't do the same there.

Remaining tasks

We still need to figure out how this behaves on PostgreSQL/Sqlite, but I think we can look into fixing those in a separate issue, if a fix is needed.

User interface changes

API changes

Original report/one possible exception message

Drupal\Core\Database\DatabaseExceptionWrapper: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction: INSERT INTO {cache_config} (cid, data, expire, created, serialized, tags, checksum_invalidations, checksum_deletions) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7), (:db_insert_placeholder_8, :db_insert_placeholder_9, :db_insert_placeholder_10, :db_insert_placeholder_11, :db_insert_placeholder_12, :db_insert_placeholder_13, :db_insert_placeholder_14, :db_insert_placeholder_15), (:db_insert_placeholder_16, :db_insert_placeholder_17, :db_insert_placeholder_18, :db_insert_placeholder_19, :db_insert_placeholder_20, :db_insert_placeholder_21, :db_insert_placeholder_22, :db_insert_placeholder_23), (:db_insert_placeholder_24, :db_insert_placeholder_25, :db_insert_placeholder_26, :db_insert_placeholder_27, :db_insert_placeholder_28, :db_insert_placeholder_29, :db_insert_placeholder_30, :db_insert_placeholder_31), (:db_insert_placeholder_32, :db_insert_placeholder_33, :db_insert_placeholder_34, :db_insert_placeholder_35, :db_insert_placeholder_36, :db_insert_placeholder_37, :db_insert_placeholder_38, :db_insert_placeholder_39), (:db_insert_placeholder_40, :db_insert_placeholder_41, :db_insert_placeholder_42, :db_insert_placeholder_43, :db_insert_placeholder_44, :db_insert_placeholder_45, :db_insert_placeholder_46, :db_insert_placeholder_47), (:db_insert_placeholder_48, :db_insert_placeholder_49, :db_insert_placeholder_50, :db_insert_placeholder_51, :db_insert_placeholder_52, :db_insert_placeholder_53, :db_insert_placeholder_54, :db_insert_placeholder_55), (:db_insert_placeholder_56, :db_insert_placeholder_57, :db_insert_placeholder_58, :db_insert_placeholder_59, :db_insert_placeholder_60, :db_insert_placeholder_61, :db_insert_placeholder_62, :db_insert_placeholder_63), (:db_insert_placeholder_64, :db_insert_placeholder_65, :db_insert_placeholder_66, :db_insert_placeholder_67, :db_insert_placeholder_68, :db_insert_placeholder_69, :db_insert_placeholder_70, :db_insert_placeholder_71), (:db_insert_placeholder_72, :db_insert_placeholder_73, :db_insert_placeholder_74, :db_insert_placeholder_75, :db_insert_placeholder_76, :db_insert_placeholder_77, :db_insert_placeholder_78, :db_insert_placeholder_79), (:db_insert_placeholder_80, :db_insert_placeholder_81, :db_insert_placeholder_82, :db_insert_placeholder_83, :db_insert_placeholder_84, :db_insert_placeholder_85, :db_insert_placeholder_86, :db_insert_placeholder_87), (:db_insert_placeholder_88, :db_insert_placeholder_89, :db_insert_placeholder_90, :db_insert_placeholder_91, :db_insert_placeholder_92, :db_insert_placeholder_93, :db_insert_placeholder_94, :db_insert_placeholder_95), (:db_insert_placeholder_96, :db_insert_placeholder_97, :db_insert_placeholder_98, :db_insert_placeholder_99, :db_insert_placeholder_100, :db_insert_placeholder_101, :db_insert_placeholder_102, :db_insert_placeholder_103), (:db_insert_placeholder_104, :db_insert_placeholder_105, :db_insert_placeholder_106, :db_insert_placeholder_107, :db_insert_placeholder_108, :db_insert_placeholder_109, :db_insert_placeholder_110, :db_insert_placeholder_111), (:db_insert_placeholder_112, :db_insert_placeholder_113, :db_insert_placeholder_114, :db_insert_placeholder_115, :db_insert_placeholder_116, :db_insert_placeholder_117, :db_insert_placeholder_118, :db_insert_placeholder_119), (:db_insert_placeholder_120, :db_insert_placeholder_121, :db_insert_placeholder_122, :db_insert_placeholder_123, :db_insert_placeholder_124, :db_insert_placeholder_125, :db_insert_placeholder_126, :db_insert_placeholder_127), (:db_insert_placeholder_128, :db_insert_placeholder_129, :db_insert_placeholder_130, :db_insert_placeholder_131, :db_insert_placeholder_132, :db_insert_placeholder_133, :db_insert_placeholder_134, :db_insert_placeholder_135); Array ( [:db_insert_placeholder_0] => language.en:field.storage.block_content.body [:db_insert_placeholder_1] => b:0; [:db_insert_placeholder_2] => -1 [:db_insert_placeholder_3] => 1410422618.847 [:db_insert_placeholder_4] => 1 [:db_insert_placeholder_5] => [:db_insert_placeholder_6] => 0 [:db_insert_placeholder_7] => 0 [:db_insert_placeholder_8] => language.en:field.storage.comment.comment_body [:db_insert_placeholder_9] => b:0; [:db_insert_placeholder_10] => -1 [:db_insert_placeholder_11] => 1410422618.847 [:db_insert_placeholder_12] => 1 [:db_insert_placeholder_13] => [:db_insert_placeholder_14] => 0 [:db_insert_placeholder_15] => 0 [:db_insert_placeholder_16] => language.en:field.storage.media.field_copyright [:db_insert_placeholder_17] => b:0; [:db_insert_placeholder_18] => -1 [:db_insert_placeholder_19] => 1410422618.847 [:db_insert_placeholder_20] => 1 [:db_insert_placeholder_21] => [:db_insert_placeholder_22] => 0 [:db_insert_placeholder_23] => 0 [:db_insert_placeholder_24] => language.en:field.storage.media.field_image [:db_insert_placeholder_25] => b:0; [:db_insert_placeholder_26] => -1 [:db_insert_placeholder_27] => 1410422618.847 [:db_insert_placeholder_28] => 1 [:db_insert_placeholder_29] => [:db_insert_placeholder_30] => 0 [:db_insert_placeholder_31] => 0 [:db_insert_placeholder_32] => language.en:field.storage.micro.field_body [:db_insert_placeholder_33] => b:0; [:db_insert_placeholder_34] => -1 [:db_insert_placeholder_35] => 1410422618.847 [:db_insert_placeholder_36] => 1 [:db_insert_placeholder_37] => [:db_insert_placeholder_38] => 0 [:db_insert_placeholder_39] => 0 [:db_insert_placeholder_40] => language.en:field.storage.micro.field_image [:db_insert_placeholder_41] => b:0; [:db_insert_placeholder_42] => -1 [:db_insert_placeholder_43] => 1410422618.847 [:db_insert_placeholder_44] => 1 [:db_insert_placeholder_45] => [:db_insert_placeholder_46] => 0 [:db_insert_placeholder_47] => 0 [:db_insert_placeholder_48] => language.en:field.storage.micro.field_link [:db_insert_placeholder_49] => b:0; [:db_insert_placeholder_50] => -1 [:db_insert_placeholder_51] => 1410422618.847 [:db_insert_placeholder_52] => 1 [:db_insert_placeholder_53] => [:db_insert_placeholder_54] => 0 [:db_insert_placeholder_55] => 0 [:db_insert_placeholder_56] => language.en:field.storage.node.body [:db_insert_placeholder_57] => b:0; [:db_insert_placeholder_58] => -1 [:db_insert_placeholder_59] => 1410422618.847 [:db_insert_placeholder_60] => 1 [:db_insert_placeholder_61] => [:db_insert_placeholder_62] => 0 [:db_insert_placeholder_63] => 0 [:db_insert_placeholder_64] => language.en:field.storage.node.field_blog_tags [:db_insert_placeholder_65] => b:0; [:db_insert_placeholder_66] => -1 [:db_insert_placeholder_67] => 1410422618.847 [:db_insert_placeholder_68] => 1 [:db_insert_placeholder_69] => [:db_insert_placeholder_70] => 0 [:db_insert_placeholder_71] => 0 [:db_insert_placeholder_72] => language.en:field.storage.node.field_design_details [:db_insert_placeholder_73] => b:0; [:db_insert_placeholder_74] => -1 [:db_insert_placeholder_75] => 1410422618.847 [:db_insert_placeholder_76] => 1 [:db_insert_placeholder_77] => [:db_insert_placeholder_78] => 0 [:db_insert_placeholder_79] => 0 [:db_insert_placeholder_80] => language.en:field.storage.node.field_drupal_version [:db_insert_placeholder_81] => b:0; [:db_insert_placeholder_82] => -1 [:db_insert_placeholder_83] => 1410422618.847 [:db_insert_placeholder_84] => 1 [:db_insert_placeholder_85] => [:db_insert_placeholder_86] => 0 [:db_insert_placeholder_87] => 0 [:db_insert_placeholder_88] => language.en:field.storage.node.field_files [:db_insert_placeholder_89] => b:0; [:db_insert_placeholder_90] => -1 [:db_insert_placeholder_91] => 1410422618.847 [:db_insert_placeholder_92] => 1 [:db_insert_placeholder_93] => [:db_insert_placeholder_94] => 0 [:db_insert_placeholder_95] => 0 [:db_insert_placeholder_96] => language.en:field.storage.node.field_gallery [:db_insert_placeholder_97] => b:0; [:db_insert_placeholder_98] => -1 [:db_insert_placeholder_99] => 1410422618.847 [:db_insert_placeholder_100] => 1 [:db_insert_placeholder_101] => [:db_insert_placeholder_102] => 0 [:db_insert_placeholder_103] => 0 [:db_insert_placeholder_104] => language.en:field.storage.node.field_links [:db_insert_placeholder_105] => b:0; [:db_insert_placeholder_106] => -1 [:db_insert_placeholder_107] => 1410422618.847 [:db_insert_placeholder_108] => 1 [:db_insert_placeholder_109] => [:db_insert_placeholder_110] => 0 [:db_insert_placeholder_111] => 0 [:db_insert_placeholder_112] => language.en:field.storage.node.field_single_image [:db_insert_placeholder_113] => b:0; [:db_insert_placeholder_114] => -1 [:db_insert_placeholder_115] => 1410422618.847 [:db_insert_placeholder_116] => 1 [:db_insert_placeholder_117] => [:db_insert_placeholder_118] => 0 [:db_insert_placeholder_119] => 0 [:db_insert_placeholder_120] => language.en:field.storage.node.field_technical_details [:db_insert_placeholder_121] => b:0; [:db_insert_placeholder_122] => -1 [:db_insert_placeholder_123] => 1410422618.847 [:db_insert_placeholder_124] => 1 [:db_insert_placeholder_125] => [:db_insert_placeholder_126] => 0 [:db_insert_placeholder_127] => 0 [:db_insert_placeholder_128] => language.en:field.storage.user.user_picture [:db_insert_placeholder_129] => b:0; [:db_insert_placeholder_130] => -1 [:db_insert_placeholder_131] => 1410422618.847 [:db_insert_placeholder_132] => 1 [:db_insert_placeholder_133] => [:db_insert_placeholder_134] => 0 [:db_insert_placeholder_135] => 0 ) in Drupal\Core\Database\Connection->query() (line 569 of /var/www/ef8/web/core/lib/Drupal/Core/Database/Connection.php).

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Berdir’s picture

Any idea how this happened?

catch’s picture

dawehner’s picture

@cache
Thank you for more links.

Note: this is stock ubuntu 14.04, so version: 5.5.40-0ubuntu0.14.04.1
The transaction isolation level is REPEATABLE-READ

Berdir’s picture

Berdir’s picture

Component: configuration system » cache system

I'm seeing slightly different version of this, a lot:

Drupal\Core\Database\IntegrityConstraintViolationException</em>: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry &#039;block_content.type.basic&#039; for key &#039;PRIMARY&#039;: INSERT INTO {cache_config} (cid, data, expire, created, serialized, tags, checksum)

I'm not sure if it is the same problem, but it looks related.

This has to be something in setMultiple(), which I think explains why this is happening only for configuration, but it really isn't a configuration problem, it's a cache issue.

> SELECT * FROM information_schema.session_variables
    -> WHERE variable_name = 'tx_isolation';
+---------------+----------------+
| VARIABLE_NAME | VARIABLE_VALUE |
+---------------+----------------+
| TX_ISOLATION  | READ-COMMITTED |
+---------------+----------------+

Version: MariaDB 5.5.41

Based on how often I'm seeing and how annoying it is, I think this is critical.

Berdir’s picture

maybe the different error is because of READ COMMITTED? I've just had an error locally as well, and there it was a deadlock.

catch’s picture

See #1650930: Use READ COMMITTED by default for MySQL transactions for lots of discussion on READ_COMMITTED.

Berdir’s picture

Discussed this a bit with @DamZ.

Looks like the best option, short of locking the whole process of fetching config from the database and putting them into cache, which would be quite expensive, is to simply ignore those kind of exceptions.

We more or less do the same in the Merge query implementation, where we have the same scenario (check if it exists and if not, insert it.), except there we play save and do the update of that row, but I think we don't need that in this case, two concurrent cache writes likely have the same information to put into the cache, otherwise that would be.. strange.

So, we should simply check for those two specific errors (deadlock and IntegrityConstraintViolationException) and ignore them. Do we need a special exception for Deadlock, just like we have for integrity constraints, so that we can abstract them from the used backend?

Berdir’s picture

One other idea I had was to put a write lock on the whole table but I don't think we can do that in a database agnostic way.

Berdir’s picture

Status: Active » Needs review
FileSize
4.55 KB

Let's presume for a minute that we live in a mysql-only world and can use REPLACE.

Note: INSERT ON DUPLICATE KEY UPDATE would also be an option, but I have no idea how to use that with set multiple.

Note2: As discussed with @catch and also mentioned by @Fabianx somewhere, we can also have a special implementation that we use when mysql is enabled, but that won't actually solve this problem for different backends. So we probably still want to do the part about eating the exceptions.

Status: Needs review » Needs work

The last submitted patch, 10: replace-2336627-10.patch, failed testing.

Berdir’s picture

Status: Needs work » Needs review
FileSize
1.09 KB
5.16 KB

Haha, fastest test run ever.

Looks like something has been eating those exceptions on setMultiple() previously.

Berdir’s picture

FileSize
5.15 KB

Here's the same as an optional subclass that is used if the database type is mysql.

Fabianx’s picture

I like this, any numbers on performance, yet?

swentel’s picture

FileSize
5.15 KB
1.18 KB

Found two nitpicks, rolled quickly.

Berdir’s picture

I can reproduce the deadlock quite easily, just run something like ab -n 100 -c 10 http://d8/ and then do a drush cr while it is runner.

I see a few of those deadlocks every time I run it.

With the patch, they're gone. Did run drush cr in a row 5 times, now errors.

What's interesting is that I can however reproduce the PHP Fatal error: Class '__TwigTemplate_61f82e0aac09f5d1126d7ded09d615cd03eb2a460783f2d84db6dec1de74ecfa' errors with the same process. They come with a few notices and errors in the watchdog, will update the twig issue.

Not sure what else we need to do here? Maybe we should check that the error is silently ignored in the default implementation as well, and we should possibly make sure to test both implementations in the generic cache backend for mysql. not sure if we have an easy way to run a test just with MySQL? Or are the implicit tests everywhere else good enough?

Fabianx’s picture

Issue tags: +needs profiling, +Performance

The bug fix is RTBC, but for the new MySQL backend, I think we need:

a) some profiling
b) maybe some tests?

Not sure ...

Also: Berdir is this ready enough for warranting RTBC from your POV? (e.g. is the patch finished)

Berdir’s picture

I wouldn't know what else to do, so I think so, yes. Since I'm building manual params and so on, the REPLACE query building should be double-checked for correctness and security.

a) Pretty hard to reliably profile I guess, as it depends a lot on your mysql configuration. What I've done is compare drush cr runtimes, 3 runs each.

HEAD: 5.69s, 5.81s, 5.72s => ~5.74
Patch: 5.61s, 5.49s, 5.31s => ~5.47

That's ~5% faster... (with xdebug and uprofiler disabled)

Uhm, I didn't make those numbers up, even if it looks like that ;) Either way, please run the same yourself to get different numbers.

b) Yeah, as mentioned, the problem is I think running that test only when the database type is mysql. Note that by default, the new backend is used, so *all* existing tests except DatabaseBackendUnitTest, which hardcodes the class name, are using the mysql specific implementation. So we have tons of coverage ,just not explicit one. We could have the same logic in the unit test class, but then we're not testing the default implementation anymore at all on mysql testbots..

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -needs profiling

RTBC, I am fine with the implicit coverage, the queries are easy enough. We hopefully have postgres and the unit test to test the non-SQL part of the cache backend.

I am happy with the benchmarks and the fix of the deadlock. Around 5% is what I expected :).

catch’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/lib/Drupal/Core/Cache/MySqlDatabaseBackend.php
    @@ -0,0 +1,82 @@
    +  protected function doSet($cid, $data, $expire, $tags) {
    

    Why not call through to doSetMultiple()?

  2. +++ b/core/lib/Drupal/Core/Cache/MySqlDatabaseBackend.php
    @@ -0,0 +1,82 @@
    +    $this->connection->query("REPLACE {" . $this->bin . "} (cid, created, expire, tags, checksum, data, serialized) VALUES (:cid, :created, :expire, :tags, :checksum, :data, :serialized)", array(
    

    Needs an escapeTable().

  3. +++ b/core/lib/Drupal/Core/Cache/MySqlDatabaseBackend.php
    @@ -0,0 +1,82 @@
    +    $query = "REPLACE {" . $this->bin . "} (cid, created, expire, tags, checksum, data, serialized) VALUES " . implode(', ', $query_parts);
    

    Same here with escapeTable()

Placeholder generation looks OK. This is pretty ugly but so are deadlocks.

daffie’s picture

Berdir asked me to test the patch from comment #15 with a postgreSQL database. I have ran "drush cr" 6 times and had no problems and no watchdog-messages.

Berdir’s picture

Title: Deadlock on cache_config » Deadlock on cache_config (DatabaseBackend::setMultiple())
Issue summary: View changes
daffie’s picture

With the postgreSQL database with the patch applied:
- I ran ab -n 100 -c 10 http://www.drupal80.com/
- While apache bench was running I did a drush cr
- After apache bench finished I did drush ws
From the watchdog:

 ID   Date          Type      Severity  Message                                 
 273  10/Mar 16:21  php                 Drupal\Component\Plugin\Exception\Plugi 
                                        nNotFoundException: The "" entity type  
                                        does not exist. in                      
                                        Drupal\Core\Entity\EntityManager->getDe 
                                        finition() (line 257 of                 
                                        /var/www/drupal80.com/drupal            
 272  10/Mar 16:21  php                 Notice: Undefined index: entity_type in 
                                        Drupal\views\Plugin\views\row\EntityRow 
                                        ->init() (line 93 of                    
                                        /var/www/drupal80.com/drupal/core/modul 
                                        es/views/src/Plugin/views/row/EntityRow 
                                        .php).                                  
 271  10/Mar 16:21  php                 Drupal\Component\Plugin\Exception\Plugi 
                                        nNotFoundException: The "" entity type  
                                        does not exist. in                      
                                        Drupal\Core\Entity\EntityManager->getDe 
                                        finition() (line 257 of                 
                                        /var/www/drupal80.com/drupal            
 270  10/Mar 16:21  php                 Notice: Undefined index: entity_type in 
                                        Drupal\views\Plugin\views\row\EntityRow 
                                        ->init() (line 93 of                    
                                        /var/www/drupal80.com/drupal/core/modul 
                                        es/views/src/Plugin/views/row/EntityRow 
                                        .php).                                  
 269  10/Mar 16:21  php                 Drupal\Component\Plugin\Exception\Plugi 
                                        nNotFoundException: The "" entity type  
                                        does not exist. in                      
                                        Drupal\Core\Entity\EntityManager->getDe 
                                        finition() (line 257 of                 
                                        /var/www/drupal80.com/drupal            
 268  10/Mar 16:21  php                 Notice: Undefined index: entity_type in 
                                        Drupal\views\Plugin\views\row\EntityRow 
                                        ->init() (line 93 of                    
                                        /var/www/drupal80.com/drupal/core/modul 
                                        es/views/src/Plugin/views/row/EntityRow 
                                        .php).                                  
 267  10/Mar 16:21  php                 Drupal\Component\Plugin\Exception\Plugi 
                                        nNotFoundException: The "" entity type  
                                        does not exist. in                      
                                        Drupal\Core\Entity\EntityManager->getDe 
                                        finition() (line 257 of                 
                                        /var/www/drupal80.com/drupal            
 266  10/Mar 16:21  php                 Drupal\Component\Plugin\Exception\Plugi 
                                        nNotFoundException: The "" entity type  
                                        does not exist. in                      
                                        Drupal\Core\Entity\EntityManager->getDe 
                                        finition() (line 257 of                 
                                        /var/www/drupal80.com/drupal            
 265  10/Mar 16:21  php                 Notice: Undefined index: entity_type in 
                                        Drupal\views\Plugin\views\row\EntityRow 
                                        ->init() (line 93 of                    
                                        /var/www/drupal80.com/drupal/core/modul 
                                        es/views/src/Plugin/views/row/EntityRow 
                                        .php).                                  
 264  10/Mar 16:21  php                 Notice: Undefined index: entity_type in 
                                        Drupal\views\Plugin\views\row\EntityRow 
                                        ->init() (line 93 of                    
                                        /var/www/drupal80.com/drupal/core/modul 
                                        es/views/src/Plugin/views/row/EntityRow 
                                        .php).

With no patch:

 ID   Date          Type      Severity  Message
 135  10/Mar 16:18  php                 Drupal\Core\Database\IntegrityConstrain
                                        tViolationException: SQLSTATE[23505]:
                                        Unique violation: 7 ERROR:  duplicate
                                        key value violates unique constraint
                                        "cache_config____pkey"
                                        DETAIL:  Key
 134  10/Mar 16:18  php                 Drupal\Core\Database\IntegrityConstrain
                                        tViolationException: SQLSTATE[23505]:
                                        Unique violation: 7 ERROR:  duplicate
                                        key value violates unique constraint
                                        "cache_config____pkey"
                                        DETAIL:  Key
 133  10/Mar 16:18  php                 Drupal\Core\Database\IntegrityConstrain
                                        tViolationException: SQLSTATE[23505]:
                                        Unique violation: 7 ERROR:  duplicate
                                        key value violates unique constraint
                                        "cache_config____pkey"
                                        DETAIL:  Key
 132  10/Mar 16:18  php                 Drupal\Core\Database\IntegrityConstrain
                                        tViolationException: SQLSTATE[23505]:
                                        Unique violation: 7 ERROR:  duplicate
                                        key value violates unique constraint
                                        "cache_config____pkey"
                                        DETAIL:  Key
Berdir’s picture

Thanks, that's interesting. So it looks like PostgreSQL gets the same exception as MySQL in READ COMMITTED. And it's an exception we can catch, that the patch already does.. except that it then apparently fails on something weird...

Berdir’s picture

Status: Needs work » Needs review
FileSize
4.71 KB
1.61 KB

Addressing the review from @catch, untested.

Berdir’s picture

I can't reproduce the warnings from #23 on MySQL with READ COMMITTED and the new implementation commented out. I did get some rare serialization exceptions then, but not with the new implementation, so I'd consider this issue fixed with this patch.

I suggest you open a new issue to see if you can improve something for those problems on PostgreSQL.

Anyone wants to put this back to RTBC? :)

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

I assume tests passing is test enough of the interdiff :).

Back to RTBC

alexpott’s picture

Status: Reviewed & tested by the community » Needs review
+++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
@@ -48,7 +48,13 @@ function __construct(Connection $connection, CacheTagsChecksumInterface $checksu
+    // Use the optimized mysql implementation if possible.
+    if ($this->connection->databaseType() == 'mysql') {
+      return new MySqlDatabaseBackend($this->connection, $this->checksumProvider, $bin);
+    }
+    else {
+      return new DatabaseBackend($this->connection, $this->checksumProvider, $bin);
+    }

Is this the right way to do this? Couldn't make a new query class of Replace (or something like that) that has different implementations depending on the driver? That way this is more re-usable and is easy to implement for SQLite which also can do replace queries.

andypost’s picture

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackendFactory.php
@@ -48,7 +48,13 @@ function __construct(Connection $connection, CacheTagsChecksumInterface $checksu
+    // Use the optimized mysql implementation if possible.
+    if ($this->connection->databaseType() == 'mysql') {
+      return new MySqlDatabaseBackend($this->connection, $this->checksumProvider, $bin);

Also please add that under some mysql versions this could cause transaction commit because cache_* and cachetags tables are created on demand

catch’s picture

Agreed with #28.

mikeytown2’s picture

Was chatting in IRC and I really think we should use INSERT ON DUPLICATE KEY UPDATE instead of REPLACE. I've been using "INSERT ON DUPLICATE KEY UPDATE" in my MySQL only cache backend module https://www.drupal.org/project/apdqc.

Anyway REPLACE has more drawbacks than INSERT ON DUPLICATE KEY UPDATE.
http://stackoverflow.com/questions/9168928/what-are-practical-difference...

REPLACE internally performs a delete and then an insert. This can cause problems if you have a foreign key constraint pointing at that row. In this situation the REPLACE could fail or worse: if your foreign key is set to cascade delete, the REPLACE will cause rows from other tables to be deleted

Insert on Duplicate Key update is almost 32 times faster.

One reason to use REPLACE is if you're using a FEDERATED Storage Engine; something that shouldn't be done for the cache tables: http://dev.mysql.com/doc/refman/5.0/en/federated-limitations.html

FEDERATED accepts INSERT ... ON DUPLICATE KEY UPDATE statements, but if a duplicate-key violation occurs, the statement fails with an error.

Another reason to use REPLACE is if the database does not fit inside the innodb buffer pool; thus the reads will hit disk; also if you're using TokuDB instead of InnoDB, REPLACE is better.
http://www.tokutek.com/2010/07/why-insert-on-duplicate-key-update-may-be...

jeqq queued 25: replace-2336627-25.patch for re-testing.

Status: Needs review » Needs work

The last submitted patch, 25: replace-2336627-25.patch, failed testing.

geerlingguy’s picture

chx’s picture

  1. Introduce a SimpleMerge class to MySQL using IODKU -- it should only be used if there is only one unique index (which is typically the primary key)
  2. Add a mysql.cache.backend.database service utilizing it
amateescu’s picture

I was considering something very similar to #35, but using a new Upsert query class instead, which will be implemented for PostgreSQL and SQLite as well.

Discussed with @chx in IRC and we're working together on it :)

chx’s picture

The code is http://cgit.drupalcode.org/sandbox-chx-1857558/log/?h=upsert here, the two basic tests, stolen from merge pass on mysql. My involvement ends here , amateescu is working on the postgresql and sqlite implementantions and I hope he will add more tests and in general carry this to a happy end.

Fabianx’s picture

Priority: Major » Critical

And I am gonna make that critical as I don't think we should release with a known and reproducable race condition. And obviously core commiters can decide if it should stay critical.

(Background: There was some discussion on IRC between catch and alexpott already, if PHP fatal errors that are reproducable making the system unusable are critical, so its important for the core maintainers to discuss this.)

Reasoning:

- drush cr might theoretically never complete as you always get a deadlock before it completes, also that random users get fatal PHP errors is quite a big problem.

And it is likely many other things in core and contrib will get into the same situation in the future.

Also this is a 100% reproducable problem with clear steps to reproduce.

dawehner’s picture

I agree with this being critical. I'm hitting this one way more often than the twig one.
Also the probably of hitting this afaik scales with the size of the site.

Wim Leers’s picture

Also this is a 100% reproducable problem with clear steps to reproduce.

I'm hitting this one way more often than the twig one.
Also the probably of hitting this afaik scales with the size of the site.

Those are pretty good reasons to make this critical.

catch’s picture

I've updated https://www.drupal.org/core/issue-priority to clarify why issues like this should be critical - previously any 'not common PHP error' was more or less major by default which doesn't reflect the severity of something like this. https://www.drupal.org/node/45111/revisions/view/8664850/8729606

dawehner’s picture

Linking to the issue which will provide the needed functionality: #2542776: Add an Upsert class

alexpott’s picture

Issue tags: +Triaged D8 critical

Discussed with @catch, @xjm, @effulgentsia and @webchick, we all agreed that this is a critical issue to be resolved before Drupal 8's release as this is easily reproducible on real sites with traffic.

Wim Leers’s picture

Title: Deadlock on cache_config (DatabaseBackend::setMultiple()) » [PP-1] Deadlock on cache_config (DatabaseBackend::setMultiple())

AFAICT this is blocked on #2542776: Add an Upsert class. Updating as such.

(Not sure why #2542776 is not critical.)

amateescu’s picture

Status: Needs work » Postponed
Issue tags: +scalability
FileSize
2.01 KB

When the blocker gets in, this is what we need to do here.

mpdonadio’s picture

Title: [PP-1] Deadlock on cache_config (DatabaseBackend::setMultiple()) » Deadlock on cache_config (DatabaseBackend::setMultiple())
Status: Postponed » Needs review

Status: Needs review » Needs work

The last submitted patch, 45: 2336627-45.patch, failed testing.

amateescu’s picture

The patch from #2542776: Add an Upsert class is not pushed yet.

dawehner’s picture

Status: Needs work » Needs review

There we go

amateescu queued 45: 2336627-45.patch for re-testing.

amateescu’s picture

FileSize
2.78 KB
1.01 KB

Let's be consistent and use upsert for set() as well.

Berdir’s picture

  1. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -178,8 +178,9 @@ public function set($cid, $data, $expire = Cache::PERMANENT, array $tags = array
       protected function doSet($cid, $data, $expire, $tags) {
         $fields = array(
    -      'created' => round(microtime(TRUE), 3),
    +      'cid' => $this->normalizeCid($cid),
           'expire' => $expire,
    

    Isn't it even easier if we just call setMultiple() in set() and drop all this completely?

    We used to do it because set() was faster, but if we now use an equally fast implementation, that should result in quite a bit less duplicated code?

  2. +++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
    @@ -234,34 +235,20 @@ public function setMultiple(array $items) {
    +      // Only pass the values since the order of $fields matches the order of
    +      // the insert fields. This is a performance optimization to avoid
    +      // unnecessary loops within the method.
    +      $query->values(array_values($fields));
    

    Assume that I repeat my comment here from the other issue about whether this is worth it and if we really are considering to do micro-optimizations like this, why not just built a numeric array in the first place? readability?

I assume with the simplified implementation, it is no longer possible to replace the cache tag increment in DatabaseCacheTagsChecksum::invalidateTags() with this? I guess that's not a big issue, since cache tag invalidations really shouldn't have a high concurrency...

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Looks perfect

alexpott’s picture

Status: Reviewed & tested by the community » Needs review

Going to set to needs review for #52. Wrt to 52.2 I think readability is a good reason. And it is not really a micro-optimisation with large cache inserts. 52.1 looks sensible to me - however it does make me think we're missing cid normalisation in setMultiple :(

amateescu’s picture

Yup, removing all the set() code and passing everything to setMultiple() is a very good point. As for #52.2, readability++ :)

Status: Needs review » Needs work

The last submitted patch, 55: 2336627-55.patch, failed testing.

amateescu’s picture

So we still need the code that ensures the cache table exists.

alexpott’s picture

Ah nice that the cache table existence check is in setMultiple now. This is good work.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Yup, removing all the set() code and passing everything to setMultiple() is a very good point. As for #52.2, readability++ :)

I agree this is microopt for a rare usecase.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

This issue addresses a critical bug and is allowed per https://www.drupal.org/core/beta-changes. Committed 94673d2 and pushed to 8.0.x. Thanks!

  • alexpott committed 94673d2 on 8.0.x
    Issue #2336627 by amateescu, Berdir, swentel: Deadlock on cache_config (...
Wim Leers’s picture

#57 had test failures for SQLite & PostgreSQL. Is that a problem, or?

amateescu’s picture

@Wim Leers, it's not a new problem, we still have random failures for them.

olli’s picture

Nice fix!

+++ b/core/lib/Drupal/Core/Cache/DatabaseBackend.php
@@ -169,39 +178,19 @@ public function set($cid, $data, $expire = Cache::PERMANENT, array $tags = array
-  protected function doSet($cid, $data, $expire, $tags) {
...
+  public function doSetMultiple(array $items) {

Filed a tiny follow-up to #2549745: Make DatabaseBackend::doSetMultiple protected.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

jwilson3’s picture

We're still experiencing this deadlock issue on cache_config table on a production server running Nginx & MariaDB, whenever we run a drush cr (while in maintenance mode) during a composer and drush-based release deployment procedure. We cannot reproduce the error naturally on other environments.

For others landing on this issue first based on a Google search for the error message you're seeing, I recommend checking out these other related issues:

Finally, Acquia has good documentation on how to debug deadlock errors, and particularly mentioning using the Cache Debug module to get insight on the root causes, as well as setting the Isolation level to READ-COMMITTED. They also recommend that the isolation level should be set at the query level, not for all queries.

It's possible to avoid deadlocks by changing the transaction isolation (tx_isolation) variable from the default value of REPEATABLE-READ to READ-COMMITTED.

You can change the tx_isolation setting in your connection settings because it is both a session variable and a global variable. You can change it for all queries, or for select queries that are suspect. It's preferable that you modify specific queries, rather than changing tx_isolation for all queries or changing the overall database setting. This makes it easier to track down custom settings if the website begins behaving strangely.