Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Update module is currently the only core module that requires a non volatile cache. The efforts of bringing Localization Update module into core add a second user of non volatile caching. Both modules collect data from remote and need to store this without it being flushed before the cache is expired. Update module has it own cache api which is not re-usable by other systems.
This patch does three things:
- Introduces a new cache backend class NonVolatileBackend which implements Drupal\Core\Cache\DatabaseBackend. NonVolatileBackend does not support the minimum cache lifetime.
- Adds a CacheBackendInterface::getPrefix method to get cached data with cid's like "fetch_task:*" and "available_releases:*" form the cache.
- Modifies all uses of cache in Update module. Deletes the update cache API functions
Comment | File | Size | Author |
---|---|---|---|
#57 | update-cache-key-value-1547008-57.patch | 42.47 KB | Berdir |
#57 | update-cache-key-value-1547008-57-interdiff.txt | 825 bytes | Berdir |
#54 | update-cache-key-value-1547008-54.patch | 42.6 KB | Berdir |
#54 | update-cache-key-value-1547008-54-interdiff.txt | 7.25 KB | Berdir |
#51 | update-cache-key-value-1547008-51.patch | 42.25 KB | Berdir |
Comments
Comment #1
Sutharsan CreditAttribution: Sutharsan commentedFixed white spaces and changed use of cache in theme.api.php documentation.
Comment #2
BerdirThis is not possible. Many cache backends like Memcache have no support for this.
There is an alternative to this, which is stop naming this cache and use it as a separate api that has nothing to do with caching. There are also plans to make the fetch_task:* part a separate API that is provided by Drupal core. Then it's just project update information and cache_update could be renamed to update_project_info or something like that.
Comment #3
Sutharsan CreditAttribution: Sutharsan commentedSurely I can work around this limitation. But I don't understand the difference between ::getPrefix() and the existing ::deletePrefix(). Or is deletePrefix() also a no-go for some backends?
Comment #4
BerdirdeletePrefix() is ugly, but can be implemented.
Memcache has no list/search API. You are limited to get/put/delete (and some stuff on top of that, like increment). So what the memcache module does to support deletePrefix() is store the prefix in an internal cache and when an item is requested, compare it against the stored prefixes and drop it if one matches (including things like counters/timestamps to know that an item was added later and so on).
This is the only usage of this, right?
We could replace this for now. For example be storing them in a single array. We fetch them all at once anyway, so it's not relevant. Or we could add a real API to Drupal core to handle unique queue items.
Because...
This is the real problem. We must never delete those fetch_task entries or we end up with duplicate fetch tasks in the queue (where duplicates can be 9000 queue items, as I've seen on large sites). My patch over at #1492188: Update module creates duplicate queue items ensures that by adding a NOT LIKE condition to the query in that function. This is not possible through the cache API.
That linked issue was originally about providing an API for doing this unique queue item thing but has drifted back to update.module as we found that workaround.
Comment #5
BerdirOpened #1548286: API for handling unique queue items as a new issue to deal with the fetch tasks.
Comment #6
Sutharsan CreditAttribution: Sutharsan commentedThere are two use cases of a getPrefix method. One described above for fetch_task:* and one for available_releases:*. Both cases can be handled by creating a dedicated cache for each of them. Of course when the queue system ensures unique items, the cache for fetch_tasks can be dropped. This patch:
Comment #8
BerdirThat doesn't work either, exactly the same problem :)
As I said, I think this should not use the cache interface at all but represent a custom API, that does not use the name cache. Two, actually, one for the unique item stuff and one for actually storing information.
Comment #9
Sutharsan CreditAttribution: Sutharsan commentedThe "update_release" cache is a real cache and should not be replaced. It stores the data of available releases as collected from updates.drupal.org up to 1 week. I agree that the "update_fetch" cache is a candidate to be replaced by #1548286: API for handling unique queue items. But I don't want to put this issue on hold for that.
This patch:
Comment #10
BerdirIt doesn't make sense to have a cache implementation that can do more than the interface. If at all, this would need an interface that defines that additional method and a separate factory method, e.g. cache_non_volatile() that would check that the configured class does implement that interface.
But as said before, I don't think this should be a cache ( in the way Drupal uses that word aka something invoked through cache()), it could as well be a table with some API functions. That could even mean using a class that extends from the database cache, but shouldn't be invoked through cache() because it does not need to be pluggable IMHO. If you have a high performance website, you're not going to have the update.module enabled in the productive environment anyway.
We might even be able to use the new config system for this instead of a database table, although I'd like to see how it performs with 2-300 projects. But apart from that, it's possible that config() already provides everything (or close to that) that we need. Including getAll().
Comment #11
Sutharsan CreditAttribution: Sutharsan commentedInstead of extending the cache backend classes, #1202336: Add a key/value store API may be an alternative. It is being developed as part of CMI to serve as a storage for status variables in Drupal (example: timestamp when cron run last; 'cron_last'). It is persistent storage, but it does not have a time stamp of when the data was added to the storage.
Postponed for #1202336: Add a key/value store API.
Comment #12
fubhy CreditAttribution: fubhy commented#1175054: Add a storage (API) for persistent non-configuration state (which #1202336: Add a key/value store API got merged into) is now fixed. And yes, it actually does make sense to put this into a k/v store.
Comment #13
Berdir#1642062: Add TempStore for persistent, limited-term storage of non-cache data is actually an even better match because all data stored by update.module has explicit expiration handling.
Comment #14
BerdirOk, here is an initial implementation of this.
- To get things started, update_storage() is currently hardcoded. I don't know how to implement that properly, it doesn't make sense to add this to the DIC, obviously. Note that they KeyValueStoreWithOwnerFactory from the tempstore issue is a class but the returned storage is just as hardcoded.
- Using 3 different collations to avoid the wildcard stuff. update, update_available_releases and update_fetch_task. This also makes whole functions unecessary although I might have deleted a thing or two too much, especially in _update_get_cached_available_releases(), the last_fetch information probably needs to be set on write time, that is not yet done.
- Adds a KeyValueStorageInterface::deleteAll() + implementations to be able to clear whole collections.
- It is mindblowing how update.module explains every single cache get call in length, it obviously is very proud of it's caching abilities! Rewritten lots of them to stop talking about caches, cache ids, {cache_update} and so on.
One patch against the latest tempstore patch and one against 8.x to be able to run the tests.
Comment #15
BerdirUpdated patch that respects the collection for deleteAll().
Comment #16
BerdirAnd this fixes the missing fetch_item key. Sorry testbot. interdiff is against the patch in #14.
Comment #18
BerdirAs suspected in the other issue, set() on the expirable implementation doesn't work. Worked around it for the moment, not sure if this is right, certainly the @return isn't correct anymore now. We could possibly also use state() for the fetch tasks.
Comment #19
BerdirComment #20
fubhy CreditAttribution: fubhy commentedThat doesn't seem to be the right diff :)
Comment #21
BerdirYeah, had to rebase my branch. The branch against the tempstore patch should actually be correct, though.
Comment #22
pounardThe deleteAll() method is something to be thought of, will it be possible to implement easily for every target backends?[EDIT: As berdir pointed me out on IRC, actually if getAll() exists this is not a new problem, forget that!] Aside of that, we indeed need it.Comment #23
fubhy CreditAttribution: fubhy commentedGood intention, however I would prefer db_drop_table() here :).
While we are at it we should try to stop abusing hook_cache_flush().
Considering that 'update_fetch_task' never automagically expires it should not be considered an 'expirable' anyways. So either we leave it right here or actually put it into state(). Problem there would be that we lose the collection namespacing and the ability to use ->getAll().
Comment #24
BerdirFixed the db_delete() and added some comments.
Keeping it there isn't really possible, because the function then returns implementations of different interfaces, which is weird and can't be documented. So we'd need to add another function. state() doesn't work because we need the getAll(), yes.
the cache flush hook is weird, but I don't think we can or should cover that in this issue. As discussed in IRC, we might even change the purpose of that hook so that it's actually a correct usage.
As before, first patch is the one that should be reviewed, the other one is to execute the tests.
Comment #25
fubhy CreditAttribution: fubhy commentedConsidering what we discussed in IRC I think that this is the only part that we have to discuss / fix before this patch is ready.
Adding another function for the 'update_fetch_task' is probably a good temporary workaround for this issue. However, we are not in a rush with this issue so maybe we will find a better, final solution before commiting this.
You could remove the "return array();" from hook_flush_cache() as that is just redundant noise even more so once we fixed the cache bin registration issue.
Comment #26
pounardAren't you missing some new statements? Why isn't this into the DIC? Maybe I'm misreading the patch.
Comment #27
BerdirBecause I wasn't sure if and how it should be added to the DIC.
Discussed it with chx yesterday and the correct way to do it is to add two factories to core, one for keyvaluestore and one for expirable keyvaluestore. Then Core can register that through the DIC and update.module can use it. I think.
Also re-rolled to include the latest tempstore patch and changed state() to use the same factory. We should also update the keystorewithowner factory and pass a reference to the keyvalue.expirable service instead of database but I wanted to wait with that until that other patch is commited, because it will make re-roll's considerable harder if they decide to change the class or so.
What's not so nice is that it's quite a bit more code, but it's code that others can use too and we now have 3 chained get() calls: drupal_container()->get('service')->get('collection')->get('key'). That's just mad :) Could be solved by adding two helper functions, something like keyvaluestore($collection)->get('key). Also, the factory currently re-creates a storage class for each call, not sure if we should or are even allowed to do some sort of caching there.
Comment #29
BerdirOh. It failed because it can't find the database service, which means we must move that down there too. Or remove it can call it statically. But it actually makes more sense to move it down, I already found out that KeyValueStorage is available earlier than the database, which is wrong.
Comment #30
BerdirNoticed the above patch actually didn't make sense because DatabaseStorage does get the connection passed in. So I was able to remove that. I think it is wrong, at least that we use db_query() and so on instead of Database::getConnection() but that's not our problem in this issue.
So, here's a re-roll with the usual bunch of patches.
Comment #32
BerdirOk, fixed the calls for update_fetch_item and replaced the container calls with hardcoded ones for the two cases that run in maintaince mode. Not really sure how to fix that other than moving the service definition including the database to boostrap.inc.
Suggestions welcome. But I think this should at least pass now.
Comment #33
tim.plunkett#32: update-cache-key-value-1547008-32-with-tempstore.patch queued for re-testing.
Comment #34
tim.plunkett#1608842: Replace list of bootstrap modules, enabled modules, enabled themes, and default theme with config added a KeyValueFactory, and I was trying to add one in #1804000: Convert implementations of KeyValueStoreInterface to annotated plugins. This is going to get messy.
Comment #36
BerdirThis is currently blocked on #1764474: Make Cache interface and backends use the DIC, we need the database changes in there to be able to refactor both implementations to use a database Reference and still be usable in maintaince mode. Same story for the form_state issue.
Comment #37
BerdirOk, here is a re-roll which includes the latest patch from the cache DIC issue.
No special cases anymore, because it's now possible to use the keyvalue.expirable service during maintainance mode and other low-bootstrap stages. Also unified the database storage implementations a bit more.
Comment #39
BerdirWow, haven't re-rolled this one in a long time :)
Much simpler now, just needs the deleteAll() method and maybe the service destructor. Let's see if it works without.
Comment #41
BerdirWould help if I would actully add the service to the core bundle. Also, the service destructor is now in, so with the tag.
Comment #43
BerdirNeed to create the key_value_expire table earlier to fix the upgrade tests.
Comment #45
BerdirAnd I of course also need to actually update the keyvalue expirable factory and implementation class. Code taken over from the form_state issue.
Comment #46
chx CreditAttribution: chx commentedDo we still need the hook_cache_flush implementation?
Comment #47
BerdirRight now, this is used to only run on update.php, which has a direct clear as well, so I'm not sure.
Comment #48
chx CreditAttribution: chx commentedWell update_info_page calls the clear unconditionally so why fire it on every cache flush while updating? It'd be even enough to do once all is done but this way, it's cleared every time that page loads. Also, that call you converted to deleteAll is bogus, it should be update_storage_clear I did research on this for #1764474: Make Cache interface and backends use the DIC and there it is fixed already but catch made it depend on this one...
Comment #49
BerdirNo, the deleteAll() in my patch here isn't bogus, it's the same code as update_storage_clear() :)
Comment #50
chx CreditAttribution: chx commentedOh I see you split the fetch tasks out! that's great. Still, can we remove the hook_cache_flush implementation :) ? It seems unnecessary as per above.
Comment #51
BerdirMore than happy to do that if you think it's fine :)
Also had to re-roll because update.settings.php was moved into a form class.
Comment #52
chx CreditAttribution: chx commentedI like it. Let's see whether catch does.
Comment #53
jibranAs confirmed with @berdir on IRC this should be Drupal::service() or Drupal::keyValueExpirable().
Comment #54
BerdirYes, I totally wanted to add that method, thanks for the reminder! Much nicer.
Comment #55
chx CreditAttribution: chx commentedEven better.
Comment #56
dwwIn principle, I love this issue. Huge thanks to everyone who's worked on it! I hated adding all that code to core in the first place, but I had no choice since at the time there was nothing like a non-volatile cache anywhere in core I could use. So generally, +100 to this.
Apologies for the drive-by-review, no time for a really thorough look, and I have nothing resembling a D8 test environment right now. I'm way out of the loop on D8 core development, so while I grok OOP in general, I'm not totally fluent in PHP's idiosyncrasies, and a lot of our new conventions and practices are still not 2nd nature for me.
The first thing of substance I noticed in the patch was that the logic in system_update_8023() to create a new {key_value_expire} table was:
It's not clear *why* it was moved, just that it was moved. I haven't read every comment in here closely, so maybe I missed it, but I don't really get it. If this table now has a generic name and is provided by core as the default implementation for a generic key/value storage service, why isn't system.module responsible for creating the table on upgrades from D7?
If this is in fact a good idea, I'd appreciate an explanation, especially in the form of a new patch that expands that comment with a reason, not just what happened. ;) Perhaps also a comment in
update_fix_d8_requirements()
, too.Also, not sure what this line has to do with this patch:
That's inside core/modules/update/update.authorize.inc which is supposed to depend on the bare minimum of core's codebase to make #606592: Allow updating core with the update manager feasible. It's not clear why we need that or are adding that here, and if we're going to add a new dependency on all that code, it should probably happen in a separate issue, no?
If/when I can review everything else, I will. Meanwhile, please don't wait to commit if everyone else is happy. As I said, I'm hugely in favor of doing this in general, and am happy to clean up any small details in documentation and even minor implementation details as a followup. However, it seems like it'd be important to get the DB update code "right" before commit. Maybe it's already right and it's only a doc issue. I don't want to set to needs work (or even downgrade to needs review) over these concerns, but I think a new patch would be good before catch spends time on this. I'd rather this was assigned to Berdir, but I can't do that directly.
Thanks again!
-Derek
Comment #57
Berdir@dww: Thanks for the review!
update_fix_d8_requirements() is a bit misleading, it has nothing to do with update.module, it's a function from update.php/core upgrade process to deal with changes that need to happen before Drupal can be fully bootstrapped during the upgrade process. Specifically, this is required because the update info page attempts to clear some of update.module's collections. Slightly extended the comment, there are already multiple very similar comments in other update functions, so it's not a new pattern.
Not sure how the use slipped in there, possibly because Netbeans did that automatically as it's listed in one of the @param. It's not a new dependency and it should probably be added as there multiple @param's/function arguments that seem to receive a FileTransfer object but that's not something to deal with here. Removed.
Comment #58
dwwThanks for the re-roll!
Duh, sorry. It's been so long I forgot about core's own built-in namespace collision between update.php and update.module and all the resulting fun that entails! ;) Whoops. I thought based on my quick skim this was something being triggered in update.install, not includes/update.inc (which no one would necessarily know means update.php, not update manager -- *sigh*).
Duly noted on the FileTransfer dependency. I guess that's right, we already have to use that code, it's just how we declare the dependency has changed (and perhaps been broken) in D8. But yes, different issue/patch to fix that.
If I get a chance to review more, I will, but as I said, probably anything I'd uncover at this point could be fixed in a follow-up issue/commit, so I'l leave it RTBC and let catch decide what to do next. ;)
Thanks again,
-Derek
Comment #59
podarokLooks nice
#57 +1 RTBC
Comment #60
catchThis looks great. I still think we could use #1548286: API for handling unique queue items instead of KeyValue directly for tracking the unique queue items but this gets us closer to that rather than further and unblocks other stuff.
Committed/pushed to 8.x.
dww please open follow-ups if you have them!