Problem/Motivation

It is possible that invalidating cache tags causes race conditions in combination with slow database transactions, so that an tag is invalidated during a transaction, then another process reads that and fetches the old values from the database before the transaction is committed.

Proposed resolution

Based on #2966607: Invalidating 'node_list' and other broad cache tags early in a transaction severely increases lock wait time and probability of deadlock, we can delay cache tag invalidation to happen after the database transaction has been committed.

Note: This will need a new version of the core patch that I will upload soon.

Remaining tasks

Write an explicit test for it.

User interface changes

API changes

Data model changes

Release notes snippet

Comments

Berdir created an issue. See original summary.

berdir’s picture

Issue summary: View changes
Status: Active » Needs review
StatusFileSize
new2.29 KB

Here's a first patch, all cache related tests are passing, the only fails are in the queue implementation, but it looks like core changed/improved the test coverage there.

berdir’s picture

And a second iteration, this also uses that pattern for direct deletions, which are specifically used for deleting cached entities which is the problem we are fighting.

It could be used for other interactions too, but deleteAll(), invalidation and so on are not often used and not during transactions.

berdir’s picture

Tracked down a problem we sometimes had when a cache item was loaded again during the transaction that had been deleted before, this caused some nasty loops or other problems. Forgot to check for scheduled deletions when fetching items again.

This now also relies on the latest iteration of the core patch for the $success argument.

davidwhthomas’s picture

@Berdir, many thanks for the update.

We are also hitting with this issue when the db transaction and cache invalidation is not in sync + stale cache and looking into applying these core + redis patches.

Just noting after applying the latest core patch ( from https://www.drupal.org/files/issues/2019-09-10/2966607-127.patch ) and the redis patch here, this error:

[15-Oct-2019 03:46:37] PHP Fatal error: Class Drupal\redis\Cache\RedisCacheTagsChecksum contains 1 abstract method and must therefore be declared abstract or implement the remaining methods (Drupal\redis\Cache\RedisCacheTagsChecksum::getTagInvalidationCounts) in /.../web/modules/contrib/redis/src/Cache/RedisCacheTagsChecksum.php on line 14"

Seems need to also implement RedisCacheTagsChecksum::getTagInvalidationCounts

edit: perhaps that function isn't used so much for redis, but this approach could be ok to add that function, just a suggestion:

  /**
   * Return the tag invalidation count for a set of tags
   * @param array $tags
   *   Tags in the form of [0 => 'node:12', 1 => 'node:15'] etc..
   * @return array $invalidated tags
   *   Tags in the form of ['node:12' => 1, 'node:15' => 1] etc..
   */
  protected function getTagInvalidationCounts(array $tags) {
    // Get plain intersect array of requested and invalidated tags
    $invalidated_tags = array_values(array_intersect($tags, array_keys($this->invalidatedTags)));
    // Default to one invalidation per item
    return array_fill_keys($invalidated_tags, 1);
  }
berdir’s picture

Yes, this patch has not been updated to be used with the most recent core patch. We're using it together with https://www.drupal.org/files/issues/2019-01-26/2966607-93.patch.

I'll update it once the core patch lands or that version no longer applies.

davidwhthomas’s picture

Thanks Berdir!

Here's an updated patch for now, just in case useful later

berdir’s picture

Thanks, that implementation isn't correct however.

The work I did in the core issue in recent patches allows us to massively simplify the implementation here again, as all the static cache handling is now in the trait and we only need to do the actual work.

interdiff against my latest patch.

Not tested yet.

berdir’s picture

Same patch but without the .info.yml change.

wim leers’s picture

#6 + #7: Wow, nice coincidence in reviving this issue today and then having the core patch land the same day :D

I was gonna ask "why not queue tests?" but then I realized you'd need a Redis instance 🤦‍♂️ Good thing I did not ask that!

Does this need manual testing? I think it's sufficient for Berdir to test this? Does this warrant an 8.x-2.x branch?

davidwhthomas’s picture

Nice, thanks for the clarification! (#9) I'm not so familiar with the update and your insight is much appreciated.

I'm going to test out this latest redis patch together with https://www.drupal.org/files/issues/2019-09-10/2966607-127.patch it looks like a good update.

Just one thing I'd like to clarify, if the cache tag is queued for delayed invalidation in the request, there is logic to prevent queuing it again, however after the transaction and the cache invalidation, want to double-check the same tags be invalidated later in the same request (e.g another node update transaction for the same node in same request?)

The reason/use case being we have some long running processes (drush queue workers for import) if the same node is updated in the same drush request want to be sure the cache can be invalidated later in that same request. That looks to be the case, but good to confirm. (will test it out)

edit: just saw Wim's comment (posted as I was typing this) and the fact the related core patch is now merged, great stuff! Will test it out with that, many thanks guys.

berdir’s picture

> Does this need manual testing?

The more testing the better. But the existing patch has been tested on large D8 production sites since last february, so I think it's pretty stable.

> Does this warrant an 8.x-2.x branch?

I don't think so, there is no public API change here. We just need 8.8, so I'm going to wait on that being released and then probably commit it.

wim leers’s picture

I don't think so, there is no public API change here. We just need 8.8, so I'm going to wait on that being released and then probably commit it.

Isn't the bump in core requirement sufficient to warrant this? Your call of course, but in the Drupal world we're unnecessarily afraid to bump major versions :D

We just need 8.8, so I'm going to wait on that being released and then probably commit it.

Ohhhh … you'll be able to specify core_version_requirement: ^8.8 🤓🥳

slasher13’s picture

StatusFileSize
new7.15 KB
new587 bytes

Make it compatible with php-redis 5
see https://www.drupal.org/project/redis/issues/3068810

davidwhthomas’s picture

Just noting a followup after some patch testing:

We had some issues with the patch from https://www.drupal.org/files/issues/2019-09-10/2966607-127.patch together with the redis patch in #10 ( on core v8.5.15 )

1. New field config import failed with drush config-import

Unexpected error during import with operation create for field.field.node.recipe.field_equipment: Attempt to create a field field_equipment that does not exist on entity type node.

Perhaps missing field storage or similar information for that step. Config could import later ok via the UI + cache clear

2. Deleting a field to test, the field still showing the field UI list, even though deleted.

3. and also when deleting a comment from a comment list View: (empty comment view row render issue?)

PHP message: TypeError: Argument 1 passed to Drupal\views\Plugin\views\field\EntityOperations::getEntityTranslation() must implement interface Drupal\Core\Entity\EntityInterface, null given,

Seems maybe it was expecting cleared cache but the cache wasn't yet cleared in that transaction?

Removing the redis patch but keeping the core patch was ok and allowed both those test cases to pass.

Seems may need a core update to 8.8? or some other update to the redis patch part.

berdir’s picture

Yes, you're right, that didn't work yet at all, as I said, that was completely untested :)

This should work better.

davidwhthomas’s picture

Thanks @Berdir! the updated patch in #17 fixes the issues mentioned in #16 and it appears to be working well. Will continue to test it out but looks good for now. Nice one!

wim leers’s picture

Drupal 8.8.0 shipped. What's blocking this from getting committed + shipped in a release? :)

berdir’s picture

Well, because of https://www.drupal.org/project/memcache/issues/2996615#comment-13401697 ;)

To commit this, I need to require 8.8, which I plan to do around 9.0 beta for my projects.

wim leers’s picture

That's fair. But … wouldn't it be acceptable in this case to already tag a release now that requires Drupal 8.8? Yes, that would mean dropping 8.7 support, but for 8.7, people can continue to rely on the latest release (1.2)?

berdir’s picture

That is correct, and there are different opinions on that among contrib maintainers. https://dev.acquia.com/drupal9/deprecation_status for example says about 8.8 deprecations "Once 8.7 is unsupported", but you could either interpret that as "once your module no longer supports it" or until is actually no longer supported.

Many adopted the core cycle of supporting the version with security support as long as it's supported. I'm somewhere in the middle and start requiring the latest version somewhere inbetween.

One problem with requiring 8.8 early, until we have minor/patch releases for contrib, is that if redis would have a security release then I couldn't go back to requiring 8.7 and would force users that are still on 8.7 to update to 8.8 over night to be safe. I have a plan if that should happen between ~march and 8.9 release for one of my modules but I want to limit the time window a bit ;)

wim leers’s picture

Many adopted the core cycle of supporting the version with security support as long as it's supported. I'm somewhere in the middle and start requiring the latest version somewhere inbetween.

Yep.

I have a plan if that should happen between ~march and 8.9 release for one of my modules but I want to limit the time window a bit ;)

That's fair — I would just say that right now it makes sense to weigh the likelihood of a security release in the mean time (very low given the maturity and surface area of this particular module) versus the benefit of getting Drupal 8.8 sites to benefit from this scalability improvement. I personally think the latter means it's worth tagging a new release. It's fine if you don't agree! I just wanted to make the case, that's all :)

berdir’s picture

Here's a patch that ads the core_version_requirement and also fixes a last deprecation notice and some travis fixes, so I can test it there too: https://github.com/md-systems/redis/pull/31

berdir’s picture

Fixing the default theme, needs classy for now.

  • Berdir committed 6d5fca1 on 8.x-1.x
    Issue #3018203 by Berdir, davidwhthomas, slasher13: Support delayed...
berdir’s picture

Status: Needs review » Fixed

So, after some consideration, I decided to get this in now as it is not just D9 preparation but fixes really nasty inconsistencies.

wim leers’s picture

🥳

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.