Problem/Motivation
First a brief overview of our case:
1. We have a lot of entities of a custom entity type.
2. We have a view which displays some of the properties of each entity in a single row. The view output is cached using cache tags.
3. This page is accessed very frequently.
4. Additionally, we have a drush command which is executed once per minute. It triggers a service which may change some of the entities and save them with a new revision, sometimes even creating a couple of revisions at once. Currently this process takes up to a couple of seconds.
Now here is what we've observed:
1. An entity shown in the view was saved with the changed timestamp 1473263761, which is the REQUEST_TIME of the drush command execution and therefore not equal to time() at the moment of saving.
2. The cache entry for the views field of this entity has the created timestamp 1473263765.398, which is generated with microtime(). One might say it was created 4 seconds after the entity was saved, but that would not be correct: the entity's changed timestamp is set from REQUEST_TIME, and the service might need a couple of seconds to reach the point of actually saving the entity.
3. The entity was originally in revision 20.
4. The drush command created revisions 21 and 22.
5. On the views page we still see the entity in revision 20; the new views field cache entry has been created for revision 20 instead of for revision 22.
I think this is not a problem with views but with the cache system.
I cannot explain exactly how this happens, but I assume it should be reproducible.
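As a side note on observation 2 above, the two timestamps are not directly comparable. A minimal sketch in plain Python (not Drupal code) of why the gap between them proves nothing about ordering:

```python
# The entity's "changed" timestamp is fixed to REQUEST_TIME at the start
# of the drush request, while cache entries are stamped with the wall
# clock (microtime()) at write time.
import time

request_time = int(time.time())  # REQUEST_TIME: fixed when the command starts
time.sleep(0.5)                  # the service works for a while...
entity_changed = request_time    # ...then saves the entity with REQUEST_TIME
cache_created = time.time()      # a concurrent cache write is stamped now

# The cache entry looks newer than the save, even though the data it
# was rendered from may predate the save.
gap = cache_created - entity_changed
```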
Proposed resolution
none.
Possibly resolved by the following issues (see #25):
- #3018203: Support delayed cache tag invalidation
- #2966607: Invalidating 'node_list' and other broad cache tags early in a transaction severely increases lock wait time and probability of deadlock
- #2996615: Transaction support for cache (tags) invalidation
Remaining tasks
Confirm this is still a problem.
Find a way to reproduce it.
User interface changes
none.
API changes
none.
Data model changes
none.
Comment | File | Size | Author
---|---|---|---
#22 | 2796629-patch-apply-fail.txt | 131 bytes | needs-review-queue-bot
#7 | 2796629-7.patch | 754 bytes | hchonov
Comments
Comment #2
hchonov

Comment #4
Berdir
Is this still an issue after we improved the loadUnchanged() method?
This might be a duplicate of #1679344: Race condition in node_save() when not using DB for cache_field
Comment #5
hchonov
The problem is that this has happened only once, so I am unfortunately unable to reproduce it, but I don't think it is related to loadUnchanged() or to the other issue.
The only thing I can think of is the following: during page load the entity was loaded into the static cache in revision 20. Before the page was rendered, the entity was saved in the background with two new revisions, so the cached view was invalidated. But because we had already loaded the entity in the previous revision, the view was rendered with it, and we therefore re-created the previous, already obsolete cache entry.
To fix this case I can think of only one solution: retrieve all cache tag invalidations at the beginning of a request, and when creating a cache entry check whether the cache tag invalidations relevant to that entry have changed in the meantime. If they have, skip caching the entry, because in that case we can conclude that the entry was created based on obsolete data.
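A minimal sketch of that proposed safeguard, in plain Python with hypothetical names (none of this is an actual Drupal API): snapshot the tag invalidation state at the start of a request, and refuse to write any cache entry whose tags have been invalidated again since.

```python
# Hypothetical sketch of the proposed safeguard; illustrative only.

class TagSnapshotCache:
    def __init__(self):
        self.invalidations = {}  # tag -> invalidation counter (shared state)
        self.snapshot = {}       # counters as seen at the start of the request
        self.entries = {}        # cid -> data

    def start_request(self):
        # Remember the invalidation state at the start of the request.
        self.snapshot = dict(self.invalidations)

    def invalidate_tags(self, tags):
        for tag in tags:
            self.invalidations[tag] = self.invalidations.get(tag, 0) + 1

    def set(self, cid, data, tags):
        # Skip the write if any tag was invalidated after the request
        # started: the data we are about to cache is already obsolete.
        if any(self.invalidations.get(t, 0) != self.snapshot.get(t, 0)
               for t in tags):
            return False
        self.entries[cid] = data
        return True

cache = TagSnapshotCache()
cache.start_request()                   # web request begins, loads entity rev 20
cache.invalidate_tags(['my_entity:5'])  # drush process saves revisions 21 and 22
stored = cache.set('view:row:5', 'markup for rev 20', ['my_entity:5'])
# stored is False: the obsolete entry is not written back to the cache
```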
Comment #6
hchonov
Does this sound like a reasonable explanation of the observed problem?
Comment #7
hchonov
I've encountered this problem again while debugging, and this time I was able to identify the place where the problem is caused.
In \Drupal\Core\Cache\DatabaseCacheTagsChecksum::invalidateTags() the cache tags are invalidated only once per request. This means that if an entity view is loaded by a concurrent process right in the middle between two entity saves of another process, both the entity and its rendered view will be cached and never invalidated again. I understand the idea and the goal of writing invalidations for a specific tag only once during a request in order to save resources, but the more complex and slow the system is, the higher the probability of introducing such race conditions. I would say it is better to remove this optimisation than to keep a place that might cause cache entries never to be invalidated.
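A simplified simulation of that race in plain Python (the names are illustrative, not Drupal's actual implementation): because a tag is invalidated at most once per request, the second save in the same drush run is a no-op and cannot invalidate an entry written between the two saves.

```python
# Simplified simulation of the once-per-request optimisation.

class ChecksumInvalidator:
    def __init__(self):
        self.checksums = {}       # shared "database" state: tag -> counter
        self.invalidated = set()  # tags already invalidated in this request

    def invalidate_tags(self, tags):
        for tag in tags:
            if tag in self.invalidated:
                continue          # the optimisation: invalidate once per request
            self.checksums[tag] = self.checksums.get(tag, 0) + 1
            self.invalidated.add(tag)

    def current_checksum(self, tags):
        return sum(self.checksums.get(t, 0) for t in tags)

inv = ChecksumInvalidator()
tag = 'my_entity:5'

inv.invalidate_tags([tag])                       # drush: save revision 21
checksum_at_write = inv.current_checksum([tag])  # web request caches a row now
inv.invalidate_tags([tag])                       # drush: save rev 22 -> skipped

# The entry written between the two saves still validates against the
# current checksum, so it is never invalidated.
entry_is_valid = checksum_at_write == inv.current_checksum([tag])
```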
Comment #8
Berdir
We discussed this already in another issue where you had cache problems. The optimization here is pretty critical: for example, each menu link invalidates its menu on save IIRC, so a menu rebuild can trigger hundreds of cache invalidations of the same tag.
Note that this static cache is updated when a cache entry with such a cache tag is written (not loaded, IIRC).
Comment #9
hchonov
I understand that in that case we would cause a lot of update queries, but that example is something that does not occur frequently or on every request, while the use case I am describing might be occurring all the time, on every request. Isn't that more critical, then?
What about applying this optimization only when needed and explicitly requested, e.g. via a second parameter to invalidateTags()?
Comment #10
hchonov
We also have to distinguish between systems where an improperly invalidated cache entry is acceptable and systems where such an entry is a problem and its probability should be minimized as much as possible.
Comment #22
needs-review-queue-bot CreditAttribution: needs-review-queue-bot as a volunteer commented
The Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.
Comment #23
larowlan
Would be good to get an issue summary update here.
Comment #24
quietone CreditAttribution: quietone at PreviousNext commented
I checked with larowlan in #bugsmash and as a result I am changing this to get more information.
There hasn't been discussion here for 6 years, so maybe this has been resolved?
@hchonov, is this still a problem? If not, what did you do to resolve it? If it is, is there a way to provide steps to reproduce it?
Since we need more information to move forward with this issue, I am setting the status to Postponed (maintainer needs more info).
Thanks!
Comment #25
catch
I think this has been resolved in the following three issues:
https://www.drupal.org/project/redis/issues/3018203
https://www.drupal.org/project/drupal/issues/2966607
https://www.drupal.org/project/memcache/issues/2996615
Comment #26
quietone CreditAttribution: quietone at PreviousNext commented
@catch, thanks for that.
We now have an indication that this may have been fixed by work done in several issues.
In order to keep this open we need confirmation that this is still a problem. If you are experiencing this problem on a supported version of Drupal, reopen the issue by setting the status to 'Active' and provide complete steps to reproduce it (starting from "Install Drupal core").
If we don't receive additional information to help with the issue, it may be closed after three months.