Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mikeytown2’s picture

Status: Active » Needs review
FileSize
3.49 KB
mikeytown2’s picture

Issue tags: +Performance, +innoDB, +database cache

Status: Needs review » Needs work

The last submitted patch, 1: drupal-2336521-1-mysqli-async-cache_set.patch, failed testing.

DamienMcKenna’s picture

Just out of interest, does it make any difference when entitycache is enabled?

mikeytown2’s picture

Looking into the failures and I believe it is caused by the async query. cache_set no longer blocks until the query has completed; thus for tests where it does a cache_set followed by a cache_get the cached object isn't there yet. I might be able to reduce this issue by using this connection for cache_get as I set the isolation level to READ-UNCOMMITTED.

I have not tested it with entity cache yet.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
4.77 KB

Main issue with the last patch was database prefixes; I've fixed that (along with some other minor things) and I now get intermittent failures on the cache tests (same test rarely fails twice in a row). To get this to work 100% of the time with simpletest I would need to use the same db connection for the cache_gets. With the new patch, speedup is only 2x instead of 5x; but cache_set seems to be working 100% of the time (was getting random failures before excluding simpletest).

Status: Needs review » Needs work

The last submitted patch, 6: drupal-2336521-6-mysqli-async-cache_set.patch, failed testing.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
6.9 KB

cache_get now uses the async connection if a cache_set was done; should make the test bot happy.

Status: Needs review » Needs work

The last submitted patch, 8: drupal-2336521-8-mysqli-async-cache_set.patch, failed testing.

Status: Needs work » Needs review
mikeytown2’s picture

One small change

Status: Needs review » Needs work

The last submitted patch, 11: drupal-2336521-11-mysqli-async-cache_set.patch, failed testing.

Status: Needs work » Needs review
mikeytown2’s picture

Latest patch failed with this message.
Field is translatable. Other field.test 3190 FieldTranslationsTestCase->testTranslatableFieldSaveLoad()

Patch passes 2nd time around.

mikeytown2’s picture

Status: Needs review » Needs work

$mysqli->reap_async_query can be super slow in rare cases. Will see what I can do to address this issue.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
6.71 KB

Ended up using a connection pool to work around the slow queries. I've recorded some cachegrinds from cache clears and it shows that drupal will sometimes spend more time in cache_get than in cache_set with this patch; mind blown because writes are effectively faster than reads.

mikeytown2’s picture

mikeytown2’s picture

Set the wait_timeout to 45 seconds and it will ping the DB after 40 seconds to make sure the connection is up before using it again.

I tried using persistent connections but $mysqli->close() seems to stall so getting that working seems like a no go.

Status: Needs review » Needs work

The last submitted patch, 19: drupal-2336521-19-mysqli-async-cache_set-D7.patch, failed testing.

Status: Needs work » Needs review

Status: Needs review » Needs work

The last submitted patch, 19: drupal-2336521-19-mysqli-async-cache_set-D7.patch, failed testing.

mikeytown2’s picture

Looks like the wait_timeout and reconnect logic needs some help. I'll up the timeout to 90 seconds as testing takes longer than 45 once I get the reconnect stuff figured out.

Fabianx’s picture

Uhm, there is a better way to get a new DB connection, you just need to use a new key in settings.php or even just a new 'target'.

mikeytown2’s picture

The big performance gain comes from asynchronous queries; which is not available with PDO
http://php.net/manual/en/mysqlinfo.api.choosing.php

Fabianx’s picture

AH :), makes sense. So we could still write a DB driver for async mysql, but would need to use mysqli directly instead of PDO.

mikeytown2’s picture

Writing a DB driver for async mysqli is essentially what I've done to a very limited extent here. I'll work on the reconnect logic and post a new patch once I have that working.

Fabianx’s picture

We discussed this briefly and this could solve the critical of #2347867: Race conditions with lock/cache using non-database storage - add a non-transactional database connection while making the async support optional, but easily available via a config change / contrib module.

mikeytown2’s picture

Status: Needs work » Needs review
FileSize
10.1 KB

The wait_timeout and reconnect is now working at 1 second; so that bug has now been fixed. This patch uses 90 seconds as that seems to be a good happy medium; in the future the current wait_timeout should be taken into account when setting this value. I also fixed some issues with the connection pool management; the $db_connections array is now keyed off of $mysqli->thread_id. This allows for way better house keeping when using mysqli_poll(); luckily $mysqli->thread_id seems to be a noop from my testing.

pg_send_query() seems to be the way to do this using PostgreSQL; I don't use PostgreSQL though.

Fabianx’s picture

Looks great! :)

btopro’s picture

is there some suggestion of how to test this? an operation that you know invokes a ton of cache sets to do an A / B comparison / try and bench mark? Thinking this would reduce the pain after a full cache clear yes?

mikeytown2’s picture

It knocks 5 seconds off of a cache clear (on our test server), so yeah things that use cache_set will see the biggest improvements

mikeytown2’s picture

Title: DEADLOCK - RECORD LOCKS on cache_field table » Use asynchronous MySQLi connection to avoid RECORD LOCKS (Deadlocks) on cache tables
FileSize
12.63 KB

Gave everything proper documentation & did some micro optimizations (1 less query per connection if # of connections is greater than 1). The only place that has hard coded values is now inside of the advanced drupal_static() pattern; thus those can be overwritten by modifying drupal_static('drupal_mysqli_get_object') if it's really needed.

mikeytown2’s picture

Little bit more refactoring; created 2 new functions and reduced code duplication... Looks like I need to merge this in better next time; some of the documentation was accidentally changed. Will wait for the testbot to pass before fixing it. Should also handle the $error array from mysqli_poll.

mikeytown2’s picture

This one handles errors (like dropped connections) now. This is in it's final form as a D7 core patch. Future goals for asynchronous MySQLi connection pools is to combine it with Cache MultiGet & Alternative Database Cache to create the best DB backed cache backend. This is needed even when things like memcache/APC are in use do to the nature of the cache_field table in D7.

Getting an asynchronous connection pool into D8 core would be nice; unfortunately I won't be the one creating the patch for it. I hope someone picks up the torch as all five issues listed here https://groups.drupal.org/node/415883 make drupal core scale way better by its self.

Anybody’s picture

Whao this looks great! Is there any change in configuration one has to make after the patch has been roled against core or does it simply work?

btopro’s picture

no. it simply "Just Works" (TM). We run this in production as part of our pressflow fork -- https://github.com/btopro/Presser-Flow-FORK

gielfeldt’s picture

Two questions:

1.) Does this belong in core? The way the Drupal is designed, this would easily fit into a custom cache backend.
2.) Ins't this approach inherently "dangerous"? Using async queries for writing cache may improve speed, but the application is now under the wrong assumption that the operation has been performed. I can think of at least one scenario where this could become a problem:

#1 Read from cache.
#2 Cache is empty
#3 Acquire lock
#4 Fetch from DB.
#5 Write to cache (cache write is actually delayed. This is my assumption of how async queries work, since I've never used the mysql async feature before, please correct me if I'm wrong. If I'm wrong, just ignore this post :-).
#6 Release lock

The above scenario is a standard cache/lock pattern, and using async queries for the cache could potentially introduce a thundering herd problem, which the lock is supposed to eliminate. It sounds like this is the issue you addressed for the test bot. However, I think your solution with re-using the connection from the pool will only solve the problem inside that particular thread. Once concurrency is added, the problem will remain.

Also; using another connection for the cache introduces this problem: #1679344: Race condition in node_save() when not using DB for cache_field

mikeytown2’s picture

Status: Needs review » Closed (won't fix)

I recently created a cache module. It works around the cache_field race condition by using the core db connection so that operation happens inside of the transaction.
https://www.drupal.org/project/apdqc

I'll be creating a lock.inc for apdqc (#2387371: Create a lock back end that uses the APDQC connection) which can have some logic so that the lock doesn't get released until the write to the db is complete.

Sara101’s picture

Is this patch still recommended if using the Asynchronous Prefetch Database Query Cache module or is the module a replacement for this issue?

ibtisem’s picture

Hi ,

I am working on Drupal 7.69 for website on linux server and mysql . when i run the cache since backend : (/admin/config/development/performance) I face an error :-

PDOException: SQLSTATE[HY000]: SQLSTATE[40001]: Serialization failure: General error: 1205 Lock wait timeout exceeded; try restarting transaction ... after that the server freezes and website inaccessible .

After searching on similar issues I found that solution suggested are using :

add to settings.php file
$databases['default']['default']['init_commands']['isolation'] = "SET SESSION tx_isolation='READ-COMMITTED'";

can this line correct the deadlock error ?

Any suggestions would be appreciated.

Thanks