Problem/Motivation

Using AWS ElasticCache Redis cluster is requiring php RedisCluster client, otherwise it is not working correctly.
To achieve this, we need an initial implementation of client, cache and lock classes as a beginning.

Proposed resolution

Implementing client, cache and lock classes as a beginning compatible with general approach of the module.
We need to have some additional settings for the backend + PhpRedisClusterCacheTagsChecksum.php because the current CacheTagsChecksum is tightly connected with \PhpRedis class.

CommentFileSizeAuthor
#98 2900947-98.patch21.19 KBstefan.butura
#97 2900947-97.patch20.63 KBcbccharlie
#83 2900947-interdiff-82-83.txt1.76 KBjribeiro
#83 2900947-83.patch29.75 KBjribeiro
#82 2900947-82.patch29.44 KBvijaycs85
#77 2900947-60.patch30.36 KBfosserize
#75 2900947-59.patch9.43 KBfosserize
#58 2900947-58.patch309.81 KBzhangjy
#57 2900947-57.patch46.18 KBacbramley
#49 interdiff-2900947-41-49.txt6.92 KBacbramley
#49 2900947-49.patch31.25 KBacbramley
#47 2900947-47.patch35.04 KBacbramley
#47 interdiff-2900947-46-47.txt663 bytesacbramley
#46 interdiff-2900947-45-46.txt559 bytesacbramley
#46 2900947-46.patch35.04 KBacbramley
#45 interdiff-2900947-41-45.txt1.64 KBacbramley
#45 2900947-45-no-logging.patch34.82 KBacbramley
#45 2900947-45.patch35.09 KBacbramley
#41 interdiff-2900947-38-41.txt736 bytesacbramley
#41 2900947-41.patch34.95 KBacbramley
#40 2900947-38.patch34.67 KBsteveworley
#39 2900947-38.patch34.67 KBsteveworley
#38 interdiff-2900947-36-38.txt1.03 KBsteveworley
#38 2900947-38.patch23.09 KBsteveworley
#36 interdiff-2900947-35-36.txt5.28 KBacbramley
#36 2900947-36.patch34.35 KBacbramley
#35 Implement_initial_RedisCluster_client_integration-2900947-35.patch34.62 KBslydevil
#33 redis-cluster_mode-2900947-33.patch34.51 KBbarig
#28 2900947-28.patch34.14 KBjacobbell84
#17 interdiff-2900947-16-17.txt1.9 KBacbramley
#17 2900947-17.patch25.7 KBacbramley
#16 interdiff-2900947-12-15.txt10.18 KBacbramley
#16 2900947-16.patch27.21 KBacbramley
#12 interdiff-8-12.txt5.28 KBjungle
#12 redis-rediscluster_client_integration-2900947-12-D8.patch27.24 KBjungle
#11 interdiff-8-9.txt5.28 KBjungle
#9 redis-rediscluster_client_integration-2900947-9-D8.txt27.24 KBjungle
#8 redis-rediscluster_client_integration-2900947-8-D8.patch21.96 KBskek
#7 redis-rediscluster_client_integration-2900947-7-D8.patch20.89 KBskek
#3 redis-rediscluster_client_integration-2900947-3-D8.patch18.75 KBskek
#2 redis-rediscluster_client_integration-2900947-2-D8.patch18.33 KBskek

Issue fork redis-2900947

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

skek created an issue. See original summary.

skek’s picture

Adding the initial implementation.

skek’s picture

Adding FloodInterface implementation because it is required.

skek’s picture

Status: Active » Needs review

Tested this on AWS Elastic Cache Redis cluster and it is working just fine so I can mark this as needs review.

googletorp’s picture

Status: Needs review » Needs work

I just gave it a very short read through.

You have a few places with code that is commented out - this should be deleted.
You reference Drupal 7 in the code documentation, I'm guessing you have converted the Drupal 7 implementation to Drupal 8. Is what is going on in lockMayBeAvailable still relevant for Drupal 8. If so you should update comments.

/**
+ * {@inheritdoc}
+ */
+ public function lockMayBeAvailable($name) {
+ $key = $this->getKey($name);
+ $value = $this->client->get($key);
+
+ // In Drupal 7, this method treated the lock as available if the ID did
+ // match. The database backend and test expects it to return FALSE in that
+ // case, updated accordingly.
+ return FALSE === $value;
+ }

DeleteAll seems a bit fishy, not sure if this is a good idea...

+ /**
+ * {@inheritdoc}
+ */
+ public function deleteAll() {
+ // The last delete timestamp is in milliseconds, ensure that no cache
+ // was written in the same millisecond.
+ // @todo This is needed to make the tests pass, is this safe enough for real
+ // usage?
+ usleep(1000);
+ $this->lastDeleteAll = round(microtime(TRUE), 3);
+ $this->client->set($this->getKey(static::LAST_DELETE_ALL_KEY), $this->lastDeleteAll);
+ }

skek’s picture

@googletorp,

Thank you for the review and comments.
I've actually used the D8 implementation as a base of RedisCluster implementation so maybe some leftovers in the old code.
The deleteAll() as far I remember was an interface requirement but I will check.
Thanks again for the review.

skek’s picture

Status: Needs work » Needs review
StatusFileSize
new20.89 KB

Fixed some of the issues from code review:

1. Fixing the class and functions comments.
2. Removing the commented code.
3. Adding a README.PhpRedisCluster.txt with details how to setup the client.

Regarding the deleteAll(), I get your idea. However at this point, as the main PhpRedis client use the same implementation I'm not quite sure if we need to touch it for the initial RedisCluster implementation. Otherwise I think this patch will be endless and merging in smaller chunks maybe will be a good idea - just sharing my thoughts here.

skek’s picture

Adding a PersistentLock class for PhpRedisCluster implementation.

jungle’s picture

jungle’s picture

jungle’s picture

StatusFileSize
new5.28 KB

Sorry forgot to attach an interdiff

jungle’s picture

StatusFileSize
new27.24 KB
new5.28 KB

Modify the patch file's extension from .txt to .patch.

Sorry, polluted here.

skek’s picture

Hey guys. Any chance somebody to review this patch so we can have it the module?

dakku’s picture

++

acbramley’s picture

Status: Needs review » Needs work

Back to NW after #3018203: Support delayed cache tag invalidation added doDeleteMultiple to CacheBase

acbramley’s picture

Status: Needs work » Needs review
StatusFileSize
new27.21 KB
new10.18 KB

Rerolled against latest dev and fixed up a bunch of linting errors.

acbramley’s picture

StatusFileSize
new25.7 KB
new1.9 KB

Removed a bunch of methods that were identical to CacheBase

aby v a’s picture

I have added this patch and connected redis successfully. But /admin/reports/redis page getting errors due to missing of info(),scan() function values from $client_factory->getClient() .
Please help me to resolve this errors.

berdir’s picture

Don't use the report page then. That is new and experimental and doesn't account for redis cluster yet.

aby v a’s picture

The error not only in report page. If we do any operation throug admin then getting error like below.
Predis\NotSupportedException: Cannot use 'DEL' with redis-cluster. in Predis\Connection\Aggregate\RedisCluster->getConnection()
Predis\NotSupportedException: Cannot use 'INFO' with redis-cluster. in Predis\Connection\Aggregate\RedisCluster->getConnection()
Predis\NotSupportedException: Cannot use 'SPAN' with redis-cluster. in Predis\Connection\Aggregate\RedisCluster->getConnection()

Currently i am using predis cluster. Do you know anything about this?

And I have tried phprediscluster method but the phpredis library not installed properly. where I need to put https://github.com/phpredis/phpredis folder in our drupal?

acbramley’s picture

Sorry @aby v a I can't help you there, we are using PhpRedisCluster and it's working fine for us.

acbramley’s picture

+++ b/src/Cache/PhpRedisClusterCacheTagsChecksum.php
@@ -0,0 +1,134 @@
+  public function invalidateTags(array $tags) {
...
+      $keys_to_increment[] = $this->getTagKey($tag);
...
+        $this->client->incr($key);

@Berdir I'd love your input here. We noticed that our Redis cluster had some 500k+ items without a TTL in the cache that seemed to be hanging around for a long time (i.e forever). Our Dev cluster has 250k+ items without a TTL and when investigating this all of the items without a TTL are in the form of prefix:cachetags:entity_type:entity_id

I honestly can't see what this cache item is used for apart from being incremented here.

When looking in the non-clustered implementation I noticed this logic was removed in https://git.drupalcode.org/project/redis/-/commit/6d5fca1004f94e8ef6cb14...

I'm wondering if we need to apply a similar fix here?

berdir’s picture

It wasn't removed there, just moved a few lines down.

These are cache tag invalidation counts. They are necessary. But yes, they tend to grow. The exact same thing would happen with the database cache backend as well, if you save 500k different entities, the cachetags table will have have 500k records: #2250033: Garbage collection for cache tag invalidations.

There is no API to have them purged, but even if there were, implementing it in redis is a challenge.

If you don't use redis for any persistent data, I'd consider to just add a redis-cli flushall to your deployment process to remove all data from time to time.

fabianx’s picture

Little bit off-topic, but regarding growing cache tag tables:

You could use the approach of the TimestampCacheTagsChecksum that memcache uses.

Then you can expire seldom used tags automatically via some mechanism or via setting a TTL. (yes that will have some cache misses, but it can be cleaned up):

https://git.drupalcode.org/project/memcache/-/blob/8.x-2.x/src/Cache/Tim...

In essence:

- Each item has a checksum which is the max() of the cache invalidation timestamps

- Each cache tag invalidation gets a timestamp assigned as well

That way we can afford to loose items (memcache cannot guarantee that an item never vanishes) and still have a consistent cache.

As redis has touch() you could just touch frequently used tags before they expire and let the others expire naturally.

aby v a’s picture

@acbramley #21 Could you please share how to set up phprediscluster in drupal 8. I have tried phprediscluster method but the phpredis library not installed properly. where I need to put https://github.com/phpredis/phpredis folder in our drupal? In library folder?

acbramley’s picture

@aby v a you should follow the README on the project page - this is a PHP extension not a library or module that is installed via Drupal or anything. It needs to be installed at the webserver level.

For example, in our Docker image we have:

RUN apk add php7.3-redis

aby v a’s picture

I have added this patch and connected redis successfully. But Here the node updated are not reflecting without manual cache clearing in edit pages. how can i fix this issue?

jacobbell84’s picture

StatusFileSize
new34.14 KB

I tested this against RedisCluster 5.0.6 on AWS and the caching seems to work correctly. One thing I did notice is that the Redis report screen is broken on Clustered mode (in that most of the fields are blank) because the RedisCluster class needs a specific master server to target for most information function calls. I updated the Report screen to abstract the information retrieval a little bit to clean things up and then added support for RedisCluster specific logic. In my testing this seems to work properly for both clustered and non-clustered mode.

aby v a’s picture

@gacobbell84 getting Predis\NotSupportedException: Cannot use 'DEL' with redis-cluster. in content edit. Could you please check and let me know the solution.

singularo’s picture

I'm subscribed to this issue, and keen to see it in, but in case anyone can't/couldn't wait, we've used keydb.dev in production with very good results in a two node cluster with no changes required to the redis client at all, and very simple master<->master setup. Maybe that will help someone.

jacobbell84’s picture

@aby v a, Unfortunately I'm not very familiar with PhpRedis/Predis. The update I made on the patch was focused on just tweaking the report screen to work again since other folks had done the hard parts :)

barig’s picture

Status: Needs review » Reviewed & tested by the community

Hi everyone !

I applied patch in #28 and it works well on my preproduction with 3 Redis nodes in cluster mode !

Thanks for all your work guys, hope this will get merged soon :-)

Regards,
Barig

barig’s picture

StatusFileSize
new34.51 KB

Hi everyone,

I rerolled the patch since the PhpRedisCluster.php class was generating a warning about using "assert" with a string argument (with PHP7.2+).

Regards,
Barig

jacobbell84’s picture

slydevil’s picture

StatusFileSize
new34.62 KB

Re-rolled the patch so it would apply to the latest version.

acbramley’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new34.35 KB
new5.28 KB

Fixes constructors in the new client classes and all of the missing newlines.

ugolek’s picture

I could say that this patch works well for me. I have a clustered Redis on AWS.

My settings.php configuration looks like this:

$settings['redis.connection']['interface'] = 'PhpRedisCluster';
$settings['redis.connection']['seeds'] = [
  'my-redis-cluster.rtt0bz.clustercfg.use1.cache.amazonaws.com:6379'
];
$settings['redis.connection']['read_timeout'] = 1.5;
$settings['redis.connection']['timeout'] = 2;

In the case of using Redis to store sessions, patch and some additional fixes required due to https://github.com/phpredis/phpredis/blob/develop/cluster.markdown#sessi...

$settings['redis_sessions'] = [
  'save_path' => 'seed[]=tcp://my-redis-cluster.rtt0bz.clustercfg.use1.cache.amazonaws.com:6379',
];
steveworley’s picture

StatusFileSize
new23.09 KB
new1.03 KB

Hey everyone, great work on the patch! I've been doing some testing with it and it looks like its working well so far. I have noticed that the RedisCluster client doesn't support password authentication which was throwing me for a loop.

I've rerolled the patch with password support and it seems to be working well, passing NULL allows you to connect to a cluster that doesn't require a password and setting redis.connection.password works correctly.

steveworley’s picture

StatusFileSize
new34.67 KB

Re-uploading patch 38, errant gitignore removed some things.

steveworley’s picture

StatusFileSize
new34.67 KB

Sorry wish I could remove previous comments, the latest version of the patch should be good to go.

Testing the 38_1 patch with the following config:

[
     "host" => "redis-cluster",
     "port" => "6379",
     "password" => "secret",
     "base" => 0,
     "interface" => "PhpRedisCluster",
     "read_timeout" => 2,
     "timeout" => 2,
     "seeds" => [
       "redis-cluster",
     ],
   ]

Drupal is able to connect and looks to be adding and fetching keys properly across the cluster.

$ redis-cli -a secret  --cluster call redis-cluster:6379 info keyspace
>>> Calling info keyspace
redis-cluster:6379: # Keyspace
db0:keys=3089,expires=3075,avg_ttl=31534983273

127.0.0.1:13005: # Keyspace
db0:keys=3116,expires=3110,avg_ttl=0

127.0.0.1:13004: # Keyspace
db0:keys=3116,expires=3110,avg_ttl=31534756758

127.0.0.1:13002: # Keyspace
db0:keys=3059,expires=3055,avg_ttl=0

127.0.0.1:13003: # Keyspace
db0:keys=3059,expires=3055,avg_ttl=31534989664

127.0.0.1:13000: # Keyspace
db0:keys=3089,expires=3075,avg_ttl=0

Without the password it fails without being able to build the keyspace (which is expected) and if the redis cluster doesn't require a password passing NULL to the constructor is okay.

I'll be looking to do some more testing too! Sorry about the mixups with the patch.

acbramley’s picture

StatusFileSize
new34.95 KB
new736 bytes

We had an issue with cluster exceptions popping up that were impossible to debug because there is no error handling currently in the Cache implementations. We locally patched the cache backend to log an exception with the key and were able to track down the cause of the redis exceptions (in this case it was huge entries for token_info)

I'm wondering if we should add that functionality here with something like this patch

sime’s picture

After months of tuning redis configuration with help from @acbramley and others, I have come to the conclusion that keys that start to hit 1,000,000 (bytes?) are at risk of redis exceptions when memory is hovering at 100% on AWS redis service. We have used two different policies for removing old keys.

In hook_block_alter() and hook_token_info_alter() you can use strlen(serialize($data)); to review the lengths of these likely candidates for exceptions. If you're hitting these numbers you are likely to have some governance issues with your content model, and you can judiciously consider pruning the tokens and block plugins that you are not using. Many contrib modules can add to the auto-discovered definitions and combined with a lot of entities, bundles and fields it's easy to get into this situation.

I'm sure there are optimisations that can happen but at least this info might be useful.

sime’s picture

Recommended contrib solution for block plugins https://www.drupal.org/project/block_list_override

charly71’s picture

The patch works perfectly in our redis cloud environment, but we've noticed that keys' expired time are not set (-1). I've tried patch https://www.drupal.org/files/issues/2020-09-23/redis-default_permanent_t... without success...

Thanks for your suggestions!

acbramley’s picture

StatusFileSize
new35.09 KB
new34.82 KB
new1.64 KB

I've been trying to track down a strange caching bug for the last 2 days on our site with a new feature that uses sub requests.

Basically the sub request would fetch a JSON:API endpoint during a postSave() and store the result in another entity (i.e a snapshot feature).

Locally this worked fine but on our hosting platform which uses clustered Redis, the snapshot always had the values of the entity BEFORE the save, essentially breaking the feature entirely. I was able to track it down to something in Redis, but long story short I found that the culprit was the lack of using $this->client->multi(); in the set() call to set and expire keys.

Swapping to use a multi() fixed this strange bug so I've tried to update everywhere that uses a multi() in the non-clustered classes in the relevant places.

2 patches - 1 with exception logging from #41 and one without.

Ignore this and the following 2 patches...today has been a bad day :(

acbramley’s picture

StatusFileSize
new35.04 KB
new559 bytes

There's a bug with using a multi with gets https://github.com/phpredis/phpredis/issues/876 so reverting that part of the patch.

acbramley’s picture

StatusFileSize
new663 bytes
new35.04 KB

Fixing call in PhpRedisClusterCacheTagsChecksum -_-

acbramley’s picture

Reverting to #41

acbramley’s picture

StatusFileSize
new31.25 KB
new6.92 KB

Ok - I was on to something! Just wasn't quite the right differences. But I have found something that does fix the bug originally described in #45

The bug comes from the expandEntry code where we have overridden the CacheBase implementation which contains an important check on cid's that are in the delayedDeletions list:

    // Ignore items that are scheduled for deletion.
    if (in_array($values['cid'], $this->delayedDeletions)) {
      return FALSE;
    }

I've removed the expandEntry and createEntryHash implementations from the PhpRedisCluster cache class so that we use the CacheBase versions.

I've also added the CacheTagsChecksumTrait to PhpRedisClusterCacheTagsChecksum to re-use a bunch of logic from that too.

kim.pepper’s picture

Issue tags: +#pnx-sprint
raphael apard’s picture

Patch #49 works fine for me.
Thanks

jienckebd’s picture

#49 worked great for me too -- thanks so much! <3

gaurishankar made their first commit to this issue’s fork.

sumachaa’s picture

getting an error as below

In PhpRedisCluster.php line 38:

  [RedisClusterException]
  Couldn't map cluster keyspace using any provided seed

siggy’s picture

What is the state of this feature?

antonín slejška’s picture

After update to redis 1.6.0 there is an exception:

Cannot apply patch #2900947 Redis cluster support (patches/redis/2900947-49.patch)!

acbramley’s picture

StatusFileSize
new46.18 KB

Rerolled against latest dev.

zhangjy’s picture

StatusFileSize
new309.81 KB

It seems that we didn't implement tls connection for PhpRedis, and I found this Redis cluster not considering host protocol for masters
Now I work with the AWS ElasticCache Redis cluster, and Encryption in transit is enabled, so it needs to connect with TLS.

I made a simple patch based on 2900947-57.patch

Add the SSL context parameter
$client = new \RedisCluster($this->getClusterName(), $this->getSeeds(), $this->getTimeout(), $this->getReadTimeout(), $this->getPersistent(), $this->getPassword(), ['verify_peer' => false]);

Now I tested on AWS ElasticCache Redis cluster, it works good.

digantdj’s picture

Patch #49 works fine for me until there's auto-scaling(adding shards) & slot redtribution with Predis. I understand with pipeline we will get MOVED server error, but is there a way to handle & retry this? Otherwise it defeats the purpose of going for Cluster Enabled Elasticache if we can't autoscale.

Tested into PHPRedis & didn't run into the same issue so looks like the issue is more because of Predis.

berdir’s picture

Status: Needs review » Needs work

#58 is a huge patch that seems to change file endings or so.

Can we convert this to a merge request? We have gitlab testing now and while the cluster integration won't be able to be tested there unless someone figures out how to set up a cluster with multiple redis docker services, it will at least help to make sure that no existing feature breaks.

  1. +++ b/src/Cache/PredisClusterCacheTagsChecksum.php
    @@ -0,0 +1,33 @@
    +  public function invalidateTags(array $tags) {
    +    $keys_to_increment = [];
    +    foreach ($tags as $tag) {
    +      // Only invalidate tags once per request unless they are written again.
    +      if (isset($this->invalidatedTags[$tag])) {
    +        continue;
    +      }
    +      $this->invalidatedTags[$tag] = TRUE;
    +      unset($this->tagCache[$tag]);
    +      $keys_to_increment[] = $this->getTagKey($tag);
    +    }
    +    if ($keys_to_increment) {
    +      $pipe = $this->client->pipeline();
    +      foreach ($keys_to_increment as $key) {
    

    why does this override invalidateTags() and not doInvalidateTags()? Is it possible this predates the refactoring that was done there?

  2. +++ b/src/Controller/ReportController.php
    @@ -168,20 +169,19 @@ class ReportController extends ControllerBase {
     
    @@ -192,15 +192,19 @@ class ReportController extends ControllerBase {
    
    @@ -192,15 +192,19 @@ class ReportController extends ControllerBase {
           ],
           'version' => [
             'title' => $this->t('Version'),
    -        'value' => $info['redis_version'] ?? $info['Server']['redis_version'],
    +        'value' => $info['redis_version'],
    +      ],
    +      'mode' => [
    +        'title' => $this->t('Mode'),
    +        'value' => Unicode::ucfirst($info['redis_mode']),
           ],
           'clients' => [
             'title' => $this->t('Connected clients'),
    -        'value' => $info['connected_clients'] ?? $info['Clients']['connected_clients'],
    +        'value' => $info['connected_clients'],
    

    this seems to revert a lot of changes that were done to handle different info structures. There's also a .rej and .org file in the patch.

  3. +++ b/src/Flood/PredisCluster.php
    @@ -0,0 +1,11 @@
    +/**
    + * Defines the database flood backend. This is the default Drupal backend.
    + */
    +class PredisCluster extends Predis {
    

    copy paste of an already wrong docblock?

acbramley’s picture

Assigned: Unassigned » acbramley

Let me tidy it up and rebase onto the MR.

acbramley’s picture

re #60

1. I'll look into refactoring this
2. All of that logic got pushed into $this->info()
3. Fixed

acbramley’s picture

Assigned: acbramley » Unassigned
Status: Needs work » Needs review
berdir’s picture

Status: Needs review » Needs work

Hm, it might not run the test because you're not a maintainer, I can't really see a way in the UI to trigger it, clicking on run pipeline asks me but doesn't seem to do anything.

Posted review comments in gitlab.

SpadXIII made their first commit to this issue’s fork.

spadxiii’s picture

I've added some functionality to allow setting of some options for both PhpRedisCluster and PredisCluster.
I was missing the 'scheme' option and the timeout options.

As seen here, https://github.com/phpredis/phpredis/blob/develop/redis_cluster.stub.php..., the RedisCluster class allows for a context-array to be passed into the constructor. (It's not documented properly yet, and I can't find the issue I came across that mentions this method argument :) )

With my addition, it's possible to set the scheme to tls like so: $settings['redis.connection']['scheme'] = 'tls';
The options `timeout` and `read_timeout` are also used.

acbramley’s picture

@SpadXIII the latest changes don't seem to match with what's allowing in the RedisCluster constructor.

public function __construct($name, $seeds, $timeout = null, $readTimeout = null, $persistent = false, $auth = null) {}

There's no parameter for an array of context values.

EDIT: I may have been looking at out of date docs via PHPStorm https://github.com/JetBrains/phpstorm-stubs/blob/master/redis/RedisClust...

mihaic’s picture

Hi,

Both patches #57 or #58 are not applying any longer with the latest version of the module drupal/redis (1.7.0).

Is anyone else seeing the same ?

On a separate note, we are using this cluster implementation on production for more than 6 months and is working fine for us as per #57.

I encourage maintainers to commit this work so we avoid the hassle re-working this complex patch after every module update.

Thanks

acbramley’s picture

Yes I agree with #69, we've been running this for a long time and the rerolls are getting quite difficult. It would be great if we could commit this in some form and open followup(s) for any remaining changes.

@Berdir - is there some compromise we can come to? What absolutely needs to be done before this gets committed?

milosr’s picture

The patch are not applying on the latest version of the module, we need to work on this.

ajaynimbolkar’s picture

@mihaic

Can you please provide version of redis module used where patch #57 working fine.

nterbogt made their first commit to this issue’s fork.

harlor’s picture

Caching with the new redis cluster implementation seems to work fine on our Drupal 10!

Only the recently added flush logic #2765895: Currently Drush Cr or Cache Clear UI does not flush Redis cache does not work. If I am not mistaken there are no databases in the cluster implementation - so the check for the database does not make sense for redis cluster right?

Edit: Sorry - the status of the the other issue confused me I move my comment to that one...

fosserize’s picture

StatusFileSize
new9.43 KB

Hi,
Since someone requested it, here's the latest patch

fosserize’s picture

fosserize’s picture

StatusFileSize
new30.36 KB
acbramley’s picture

Status: Needs work » Needs review

This should have been set to NR again with the feedback addressed all the way back in Feb last year.

We have been running this patch for years at this point. If this can't be committed yet, can we please get a detailed list of steps to get it to a point where it can be?

jribeiro’s picture

We are using the following settings for our AWS ElasticCache Serverless (Cluster mode on) Redis with TLS enabled.

$settings['redis.connection']['interface'] = 'PhpRedisCluster';
$settings['redis.connection']['seeds'] = [
  "{redis_host}:{redis_port}"
];
$settings['redis.connection']['scheme'] = 'tls';

It is working fine on our end.

acbramley’s picture

Status: Needs review » Needs work

Again this is now heavily conflicting with more code committed to 8.x-1.x...

acbramley’s picture

Status: Needs work » Needs review

I've gone back through https://git.drupalcode.org/project/redis/-/commits/8.x-1.x/src/Controlle... and added all changes back to the ReportController instead of trying to fix the insane conflicts (due to the refactors to that controller on this branch).

If we can get this in without having to reroll again it would make my year 😅

vijaycs85’s picture

StatusFileSize
new29.44 KB

Here is the re-roll of the MR against 1.8.

jribeiro’s picture

StatusFileSize
new29.75 KB
new1.76 KB

If you are using the failover patch (https://www.drupal.org/node/3102739) along with this change, apply the following patch.

berdir’s picture

Status: Needs review » Needs work

This will need some adjustments for the recent changes that went in 8.x-1.9 (which is being created as I'm writing this), specifically the last write timestamp optimization: #3498940: Optimize bin cache tags and last write timetamp.

> If we can get this in without having to reroll again it would make my year 😅

Sorry, as we've discussed a while ago, I'm the sole maintainer of this and I have no experience with redis cluster, so it's hard for me to commit to maintaining that.

For the reports page, there are two open issues about problems with the reports page. It would be _great_ if someone would extract the approach from this, which I really like, ensure that the two other issues are also covered (minus the small cluster specific parts). Then we can get that in, which will greatly reduce the conflict potential this has.

That would also help us move into the direction of an idea for a 2.x release of this project that I have, that would introduce an adapter that sits between cache/lock/queue/tag backends and the actual libraries and abstracts the relevant bits of the different implementations into something that each adapter can then customize. I don't know yet if that's feasible but that has the potential to be a major simplification for stuff like this, this info() method could then move into that adapter.

As for the remaining bits of this, this issue has 50 followers and cluster setups implies that there might be budget so I could invest a few hours into setting up a cluster test setup, on CI if possible, and then we could have this covered with tests. Reach out to me if you're interested.

elc’s picture

@berdir Which are the open issues regarding the report page you are referencing?

berdir’s picture

One of them is #3337750: DivisionByZeroError - Problem to access Status Page on PHP 8.1.14 that you rerolled, another is #3173988: Azure/AWS: Take into account CONFIG command may not work. They all change things in the ReportsController.

pobster’s picture

There's also an error in PredisCluster:

As this class doesn't have an $this->initSettings(); like the others tend to, and it references $this->settings[...] when it's not available (and is just $settings at this point) so always applies the defaults (and can I say, maybe "scheme" should default to "tcp" and not NULL as this just instantly throws a connection error).

 /**
   * {@inheritdoc}
   */
  public function getClient($host = NULL, $port = NULL, $base = NULL, $password = NULL, $replicationHosts = [], $persistent = FALSE) {
    $settings = Settings::get('redis.connection', []);
    $parameters = $settings['hosts'];
    $options = [
      'cluster' => 'redis',
      'parameters' => [
        'scheme' => $this->settings['scheme'] ?? NULL,
        'timeout' => $this->settings['timeout'] ?? self::DEFAULT_TIMEOUT,
        'read_write_timeout' => $this->settings['read_timeout'] ?? self::DEFAULT_READ_TIMEOUT,
      ],
    ];

    try {
      $client = new Client($parameters, $options);
    }
    catch (\Exception $e) {
      if (Settings::get('redis.failover', FALSE)) {
        return FALSE;
      }

      throw $e;
    }

    return $client;
  }

  /**
   * {@inheritdoc}
   */
  public function getName() {
    return 'PredisCluster';
  }

}
berdir’s picture

This will need another rebase, I have extracted and committed the reports stuff in #3537948: Abstract info-collection RedisController to deal with special environments (kept the cluster things, so should be able to just drop those parts here). This needs work per #87 and also the docs will conflict and should be renamed to .md and added to mkdocs.yml, there are some pending comments on them as well to not repeated general docs, which are also outdated (chained fast stuff)

acbramley’s picture

Status: Needs work » Postponed

acbramley’s picture

https://git.drupalcode.org/project/redis/-/merge_requests/62 is an initial approach to refactoring the cluster client (PhpRedisCluster only, not Predis) using the new architecture in 3080449

It would be great if anyone wants to test that branch and report back if it's working for you :)

acbramley’s picture

Latest commit on https://git.drupalcode.org/project/redis/-/merge_requests/61 has broken cluster integration due to the forced exec. I can't see a way around it at the moment so I've just rerolled the original MR for now.

berdir’s picture

Version: 8.x-1.x-dev » 2.x-dev
Status: Postponed » Needs work

Development on Redis 2.x has officially started. New features to 8.x-1.x will no longer be considered. This will need to be redone on top of the 2.x branch. There's already starting point in the referenced issue, and while there are some things to figure out around pipelines, I believe they can be resolved.

As mentioned before, I think Gitlab CI configuration is crucial for me to be able to support this. It should be possible similar to valkey and adjusted REDIS_INTERFACE constant values. If more configuration is needed, we can extend the relevant test traits and setup.

acbramley’s picture

Rebased on top of 2.x. I was going to suggest using a compiler pass to auto register the client factories inside a dedicated namespace so we don't have to manually declare each one in services.yml, I might open a new issue for that.

Still TBC how we solve the pipeline issue elegantly, especially with things like getMultiple returning from the exec command which isn't supported by default on the cluster client.

berdir’s picture

I'd recommend to create a separate branch for 2.x, people are going to need to 8.x-1.x version for quite some time, kinda too late, but you could also re-open an 8.x-1.x MR based on a patch or something I suppose.

> I was going to suggest using a compiler pass to auto register the client factories inside a dedicated namespace so we don't have to manually declare each one in services.yml, I might open a new issue for that.

I don't think that is necessary, this isn't something like event subscribers or so. almost nobody will never add another factory and a compiler pass comes at a cost as it has to check every module. I also doubt that it would be fewer LOC than the yaml service definitions.

acbramley’s picture

I'd recommend to create a separate branch for 2.x, people are going to need to 8.x-1.x version for quite some time, kinda too late, but you could also re-open an 8.x-1.x MR based on a patch or something I suppose.

There are 2 MRs open, one against 8.x-1.x and one against 2.x. The 8.x-1.x one is rerolled against HEAD.

cbccharlie’s picture

StatusFileSize
new20.63 KB

I have readjusted the patch from comment #83 to work with version 8.x-1.11.

stefan.butura’s picture

StatusFileSize
new21.19 KB

Added support for \RedisCluster::OPT_SLAVE_FAILOVER for 8.x-1.11 - see https://github.com/phpredis/phpredis/blob/develop/cluster.md#automatic-s...

Example:

$settings['redis.connection']['slave_failover'] = RedisCluster::FAILOVER_ERROR;

Context: I wanted to simulate a real life scenario with 3 separate servers, each hosting a master node and a replica of a different node. I tested the patch in #97 using Docker with a Redis cluster with 3 master nodes and 3 replicas. I then brought down one master node and a different replica node. This made Drupal throw the following error on random requests: Redis Cluster's master data is incomplete. Cluster is in invalid state. in hgetall()

Using the following option fixed the issue for me:

$obj_cluster->setOption(RedisCluster::OPT_SLAVE_FAILOVER, RedisCluster::FAILOVER_ERROR);

If I understand correctly, using this option, if a master node is down, requests to that node will be sent to its replica instead, which for me was the main appeal of using Redis Cluster.