Problem/Motivation
The FastChainedCacheBackend is very useful provided that the data is mostly read, but not written often, as any write or clear leads to a full invalidation of the whole cache bin.
#2422657: Skip fast chained cache backend during installer improved installer performance by removing the FastChainedCacheBackend during installation, in particular saving the calls to markAsOutdated().
#2423591: Optimize config entity import sees the same problem with config entity import.
However, completely valid writes can lead to the same problem.
Past a certain write rate the FastChainedCacheBackend becomes very ineffective and causes more database writes than it saves.
Proposed resolution: Shut off for a period of time
It therefore makes sense to shut off the chained backend for periods of time and use the consistent cache backend directly instead.
Shutting off is fortunately very easy:
Just write a 'future timestamp' (e.g. 10 s into the future) and ensure that the fast cache is never read, written or marked as outdated while current_ts < future_ts.
This is the first task to implement and should be fairly easy.
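A minimal sketch of the shut-off idea, written in Python rather than PHP so it can be run standalone. The class name `ShutOffChainedCache`, its attributes, and the use of plain dicts for the two backends are all hypothetical; clearing the fast store stands in for markAsOutdated(). Dropping the fast items when shutting off is an assumption made here so that nothing stale survives the pause.

```python
import time


class ShutOffChainedCache:
    """Hypothetical chained cache: a fast dict in front of a consistent dict."""

    def __init__(self, clock=time.time):
        self._clock = clock        # injectable clock, so the sketch is testable
        self._off_until = 0.0      # the 'future timestamp'; 0 means never shut off
        self.fast = {}
        self.consistent = {}
        self.invalidations = 0     # counts simulated markAsOutdated() calls

    def shut_off(self, seconds):
        # Assumption: drop fast items so nothing stale is served after the pause.
        self.fast.clear()
        self._off_until = self._clock() + seconds

    def is_shut_off(self):
        # While current_ts < future_ts the fast backend must not be touched.
        return self._clock() < self._off_until

    def get(self, key):
        if self.is_shut_off():
            return self.consistent.get(key)
        if key in self.fast:
            return self.fast[key]
        value = self.consistent.get(key)
        if value is not None:
            self.fast[key] = value
        return value

    def set(self, key, value):
        self.consistent[key] = value
        if self.is_shut_off():
            return                 # no fast write, no invalidation
        self.fast.clear()          # any write invalidates the whole fast bin
        self.invalidations += 1
```

With a fake clock, calling `shut_off(10)` makes subsequent `set()` calls go only to the consistent store until the clock passes the stored timestamp.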
However, what now remains is:
When should the fast backend be shut down?
In computer science a problem space like this is usually modeled via an "oracle" (not the company), and the number of writes per unit of time at which performance starts to suffer is called the 'break-even' time.
The oracle knows all requests in advance and can therefore derive an optimal policy for when the fast chained backend is shut off.
Fortunately, looking at the problem space like this, it can be compared with a device that 'overheats' when it gets too hot, where 'hot' is defined as the number of writes per unit of time. This means the problem can be simulated well.
E.g. a very simple policy would be: if there are more than 20 writes within an interval of one second, then shut off for 10 seconds.
In our simple model this would mean:
- cool down temperature == 20 writes / second
- a threshold of e.g. 20 writes when we shut off
- a shut down time of e.g. 10s
Whenever we call markAsOutdated() and are not in shutdown, we:
- decrement the temperature by the cooldown * the duration the backend was inactive
- increment the temperature by one per markAsOutdated() request
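The model above can be sketched as a small simulation. This is only an illustration in Python under stated assumptions: the class name `SimpleCacheTemperature` is borrowed from the task list below, but the method names, the explicit `now` argument, and the choice to clamp the temperature at zero are hypothetical.

```python
class SimpleCacheTemperature:
    """Hypothetical 'overheating' model for markAsOutdated() calls."""

    def __init__(self, cooldown_per_second=20.0, threshold=20.0,
                 shutdown_seconds=10.0):
        self.cooldown_per_second = cooldown_per_second
        self.threshold = threshold
        self.shutdown_seconds = shutdown_seconds
        self.temperature = 0.0
        self.last_write = None     # time of the last counted write
        self.off_until = 0.0       # shut-off 'future timestamp'

    def is_shut_off(self, now):
        return now < self.off_until

    def record_write(self, now):
        """Count one markAsOutdated(); return True if the backend is shut off."""
        if self.is_shut_off(now):
            return True            # writes during shutdown are not counted
        if self.last_write is not None:
            # Cool down by cooldown * the time the backend was inactive.
            # Assumption: the temperature never drops below zero.
            elapsed = now - self.last_write
            cooled = self.temperature - self.cooldown_per_second * elapsed
            self.temperature = max(0.0, cooled)
        self.last_write = now
        self.temperature += 1.0    # one unit of heat per write
        if self.temperature > self.threshold:
            self.off_until = now + self.shutdown_seconds
            return True
        return False
```

With the default numbers, a steady 10 writes per second never trips the threshold, while a burst of 21 writes at the same instant shuts the backend off for 10 seconds.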
Remaining tasks
- Discuss
- Prototype (simple)
- Implement a TemperatureAwareFastChainedCacheBackend extends FastChainedCacheBackend
- Implement a shut-off function in that class that can be called with the length of time to shut off for.
- Ensure shut off is taken into account by read / write / markAsOutdated()
- Implement a generic 'SimpleCacheTemperature' component (as this could be useful for other caches as well)
- Implement a generic 'CacheTemperatureInterface' with the function calculateTemperature(duration, number_of_writes)
- Add as service cache.temperature.simple
- Inject service into the TemperatureAwareFastChainedCacheBackend
- Benchmark
User interface changes
- None
API changes
- None
| Comment | File | Size | Author |
|---|---|---|---|
| #16 | 2431259-chainedfast.patch | 3.31 KB | david_garcia |
| #13 | fast-chained-avoid-repeated-invalidation-2431259-13.patch | 1.52 KB | berdir |
| #7 | interdiff.txt | 7.95 KB | kim.pepper |
| #7 | 2431259-temperature-cache-7.patch | 4.44 KB | kim.pepper |
| #3 | 2431259-temperature-cache-3.patch | 3.51 KB | kim.pepper |
Comments
Comment #1
fabianx commented
Comment #2
catch
This looks like a good change to me, a couple of comments:
1. I think we can just count writes/invalidations per-request rather than per-second. Module install and configuration sync are the most likely offenders and those generally happen within a single request. Also means nothing persistent to store.
2. We should try to avoid an extra consistent cache hit to get the shut off value - could possibly re-use the query for last write timestamp.
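Point 1 could look roughly like the following. This is a hypothetical Python sketch, not the patch from this issue: the class and method names are invented, and the counter is a plain attribute so that it disappears with the request and nothing persistent has to be stored.

```python
class PerRequestWriteCounter:
    """Hypothetical per-request counter of cache writes/invalidations."""

    def __init__(self, threshold=20):
        self.threshold = threshold
        self.writes = 0            # lives only as long as the request object

    def record_writes(self, count=1):
        # Called for every write or invalidation in the current request.
        self.writes += count

    def should_bypass_fast_backend(self):
        # Once the threshold is reached, use the consistent backend directly.
        return self.writes >= self.threshold
```

A new counter would be created for each request, so a single heavy request (module install, configuration sync) bypasses the fast backend without affecting later requests.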
Comment #3
kim.pepper
I've put together a very naive implementation. It just writes directly to the consistent cache if the threshold is reached.
Comment #4
dawehner
We should certainly explain the idea implemented in that class.
Comment #5
berdir
Do we really need this as a separate implementation? Shouldn't we just make it part of the existing class?
This isn't something that you want to explicitly enable IMHO, it should just work...
Comment #6
fabianx commented
Just the >= check is enough, and since that is then only two lines it can live directly in setMultiple().
Nice implementation though!
--
I agree with berdir though, let's just add this directly to the FastChainedCacheBackend given how simple the implementation here is.
CNW for simplifying that, but looks great overall.
Comment #7
kim.pepper
Thanks for the feedback.
This patch moves the threshold logic into ChainedFastBackend and inlines the condition.
Comment #8
kim.pepper
Spoke with @Fabianx at the Drupalcon sprint, and we agreed to write out how many writes we are doing per request during install to see what a good threshold is.
Comment #9
kim.pepper
Re-read the issue summary, and found we aren't using this in the installer as per #2422657: Skip fast chained cache backend during installer.
After installing, I couldn't find any good examples where we were writing to the cache more than 20 times per request, which seemed insignificant.
Comment #10
berdir
Did you try [Flush all caches] or manually clearing the cache tables?
One big problem is also cache tag invalidations. For example, try something like my field_map.php script in #2473983: [meta] Evaluate Entity Field API Scalability, there's a huge number of cache tag invalidations going on, and every time one happens, the fast chained invalidates, and that happens for any bin that's using it. Any ideas on how to improve that?
Comment #12
berdir
I don't think this patch helps much at the moment.
Writing to the fast backend is fast. We don't need to worry about that.
The expensive part is markAsOutdated(), specifically in combination with cache tag invalidations as written above. Because every invalidation in turn calls that. Keep in mind that we do that for each fast chained bin. So by default, for every cache tag invalidation, we call markAsOutdated() on 3 different bins. And if you add more (I made default also fast chained on some of my sites as there is pretty much nothing left in there), that number increases.
So one thing we could try is something similar to DatabaseCacheTagsChecksum and avoid calling it repeatedly, unless we are fetching caches again. This will only work well if we don't fetch caches in between; we need to do some tests on that.
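One way to read that idea, as a hedged Python sketch: remember that the bin has already been marked as outdated and skip further calls until a cache fetch happens. The class `InvalidationDeduper` and its method names are hypothetical, and the sketch only tracks fetches made by the current process.

```python
class InvalidationDeduper:
    """Hypothetical guard that skips repeated markAsOutdated() calls."""

    def __init__(self, write_timestamp):
        # write_timestamp: callable doing the expensive consistent-backend write.
        self._write_timestamp = write_timestamp
        self._already_marked = False

    def mark_as_outdated(self):
        """Return True if the expensive write was actually performed."""
        if self._already_marked:
            return False           # nothing was fetched since the last mark
        self._write_timestamp()
        self._already_marked = True
        return True

    def on_cache_fetch(self):
        # Items fetched from now on could be invalidated again later,
        # so the next invalidation has to write a fresh timestamp.
        self._already_marked = False
```

A run of cache tag invalidations with no reads in between then costs one write per bin instead of one per invalidation.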
Comment #13
berdir
Something like this.
To test, e.g. put something like this in a test:
In my case in FolderTest, an absolutely trivial test, the numbers went from ~320/380 to 70/79, this is already a nice improvement. I'd expect the number goes up the more modules are enabled, so we should also try something like that in a normal standard install.
Also, make sure you run the test in the browser UI ;)
Might want to open a separate issue, since this doesn't have anything in common really with this issue, but as written above, I'm not sure this issue can help much in general. We avoid the likely cheapest part of the whole thing only.
Comment #15
david_garcia commented
I've been working on something for the ChainedFastBackend and came across this issue.
The whole point of binary invalidation is to be able to have consistent and reliable cache values between environments (or "webheads"). If we only had one environment, we would actually not need this at all.
So why should we invalidate our own environment instead of only the other ones, if any? See the attached patch (quite simple indeed), where we keep track of which "environment" did the last binary invalidation and use this information to avoid invalidating the binary for the environment that made the invalidation itself.
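The idea can be illustrated with a toy model; this is a Python sketch under assumptions and not the attached patch. `SharedBin`, `Webhead`, and the sequence counter used in place of a timestamp are all hypothetical. Each webhead syncs before writing, so that invalidations from other webheads are not lost when it later sees its own id as the last invalidator.

```python
class SharedBin:
    """Hypothetical consistent backend shared by all webheads."""

    def __init__(self):
        self.items = {}
        self.invalidation_seq = 0      # stands in for the last-write timestamp
        self.invalidated_by = None     # id of the webhead that invalidated last


class Webhead:
    """Hypothetical webhead with its own local fast cache."""

    def __init__(self, head_id, shared):
        self.head_id = head_id
        self.shared = shared
        self.fast = {}
        self.seen_seq = 0

    def _sync(self):
        if self.shared.invalidation_seq > self.seen_seq:
            # Flush only if another webhead made the last invalidation.
            if self.shared.invalidated_by != self.head_id:
                self.fast.clear()
            self.seen_seq = self.shared.invalidation_seq

    def set(self, key, value):
        self._sync()                   # pick up foreign invalidations first
        self.shared.items[key] = value
        self.fast[key] = value         # keep the local fast cache current
        self.shared.invalidation_seq += 1
        self.shared.invalidated_by = self.head_id

    def get(self, key):
        self._sync()
        if key not in self.fast and key in self.shared.items:
            self.fast[key] = self.shared.items[key]
        return self.fast.get(key)
```

In this model a webhead that writes keeps its own fast items, while every other webhead flushes on its next access. Real webheads run concurrently, which this single-threaded sketch does not attempt to model.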
Indeed, I believe that with this change what happened in:
#2422657: Skip fast chained cache backend during installer
would not have been needed at all, as the install process happens on just one environment.
This is just a proof of concept, maybe I got something wrong...
Another interesting change would be to move the actual binary invalidations to the end of the request, and do them all at once.
I'm setting this to needs review because I'm curious to see if this patch passes tests.
Comment #16
david_garcia commented
Oops...
Comment #19
david_garcia commented
Moving it away to its own place: #2611400: [meta] ChainedFastBackend should not invalidate the whole fastBackend when doing a Set()
Even if that ever ends up working, the aim of this issue is still completely valid.
Comment #33
smustgrave commented
Thank you for creating this issue to improve Drupal.
We are working to decide if this task is still relevant to a currently supported version of Drupal. There hasn't been any discussion here for over 8 years which suggests that this has either been implemented or is no longer relevant. Your thoughts on this will allow a decision to be made.
Since we need more information to move forward with this issue, the status is now Postponed (maintainer needs more info). If we don't receive additional information to help with the issue, it may be closed after three months.
Thanks!
Comment #34
catch
This is probably a duplicate of #3526080: Reduce write contention to the fast and consistent backend in ChainedFastBackend at this point. The other issue is much newer and the approach is different, but it's trying to solve the same problem.
Comment #35
smustgrave commented
Works for me!