Problem/Motivation
We currently have some issues with the {key_value_expire} table. Basically, the main problem is that deleting rows is very slow because the records are so large. That's probably additionally slowed down in our case because we have a database cluster.
We're on an older core version, so the problem is bigger for us, but I think it still wouldn't hurt to improve this.
Proposed resolution
@DamZ suggested to change the primary key to (expire, collection, name). The primary problem seems to be that InnoDB always needs to re-order the data for the primary key and needs to move huge amounts of data around. When he first part of the PK is the expire column, then that never has to change, we can just always append data.
I guess we'd add a unique key on collection, name (or name, collection, not sure what's more performant) then and remove the index on the expire column and the all index then.
Remaining tasks
User interface changes
API changes
Data model changes
Indexes need to be removed/added. Especially for the PK, that can be a very slow operation if there's a lot of data already.
Comment | File | Size | Author |
---|---|---|---|
#41 | interdiff.2510438.36-41.txt | 1.88 KB | longwave |
#41 | 2510438-41.patch | 2.48 KB | longwave |
Comments
Comment #1
Fabianx CreditAttribution: Fabianx as a volunteer commentedSetting to major, if expire is not in the PK, that is a problem for the query parser IIRC.
To clarify:
- Do we have any key on 'expire' right now?
Overall: +1 to the approach and +1 to getting it in before supporting upgrade path as that could get hairy on production.
Best to delete the table and have key-value-expire recreate it automatically (except its installed by system module still ... so that won't work ... ).
Comment #2
catch@berdir, would you mind seeing what happens if you drop the 'all' key? We don't have any queries that need that as far as I can see.
Comment #3
catchSo I'd be up for fixing that properly, but also the update could just delete and recreate manually for now.
Comment #4
BerdirYes, deleting that index seems to help a bit, don't have too reliable numbers as it's not very constant.
Comment #6
cilefen CreditAttribution: cilefen commentedI am triaging Major issues at New Orleans. Nothing has changed about this situation. This is a major performance issue at the database level. I've tagged it on the assumption that this would be a releaseable change. It seems that way based on the conversation.
Comment #7
xjmThanks @cilefen for helping triage majors at the sprint. Updating issue credit.
Comment #8
catchDiscussed this with @xjm, @alexpott, @webchick and @cottser and agreed this is definitely major due to the severe scalability issue.
Comment #16
catchThis is still valid as far as I can see.
In #2510438-2: Remove 'all' index from {key_value_expire} - serves no purpose and negatively impacts performance I suggested dropping the 'all' key - this would just need system_schema() updating and a straightforward update hook to drop the index. IMO this could still be backported to 8.9.x
We could then open a follow-up to discuss changing the primary key altogether. #2664322: key_value table is only used by a core service but it depends on system install would simplify the update there since we could simply drop the existing table in the update. That kind of change I think we'd need to do only in a minor release, so 9.1.x only.
Comment #17
quietone CreditAttribution: quietone as a volunteer commentedComment #18
johnwebdev CreditAttribution: johnwebdev commentedWorking on this :)
Comment #19
johnwebdev CreditAttribution: johnwebdev commentedComment #20
jungleCould be
dirname(__DIR__, 3) . '/fixtures/update/drupal-8.8.0.bare.standard.php.gz',
Comment #21
jungleSorry, the patch is empty, Thanks @johnwebdev for the ping!
Comment #22
catchRe-titling.
Comment #23
catchJust on the original suggestion here as opposed to the current patch:
If we did this, we'd have no database-level constraint preventing duplicates for
collection, name
- so assuming we want to keep that, we'd need to make a new unique index oncollection, name
. Not sure that would necessarily help with the deletion issue unless it's specifically the primary key causing the problem here.Comment #24
BerdirI created this issue a long time ago. Back then, form_state storage/cache was still written for every new form, which caused _massive_ amounts of data being written to that table, non-stop. So 99% of the records being written were large and had a pretty stable, steadily increasing expiration.
Back when Damien explained this to me he said that having the expire in there would allow MySQL to manager the primary key much more efficiently, as I understand it now, it has to reorganize it on delete.
That it hasn't been a severe issue anymore for a long time, so IMHO I'm fine with that smaller improvement and also happy to move this down to a normal issue, I don't think it is major anymore.
Comment #26
larowlanBased on #24
Comment #27
catchGiven #24 I don't think we need the follow-up here for the primary key change at all.
Comment #28
quietone CreditAttribution: quietone as a volunteer commentedNeeded reroll due to #2664322: key_value table is only used by a core service but it depends on system install.
Comment #29
quietone CreditAttribution: quietone as a volunteer commentedComment #30
daffie CreditAttribution: daffie commentedCould we test first for the existence of the table. The function dropIndex() does not like it when you try to drop an index from a non-existing table.
Comment #31
quietone CreditAttribution: quietone as a volunteer commentedSure. Test added.
Comment #32
daffie CreditAttribution: daffie commentedThe removed index is not used.
There update function added to the patch for existing sites.
There is testing added to the update function.
For me it is RTBC.
Comment #33
catchThis should be system_update_9201().
However, I'm not sure this needs to be a hook_update_N() since it's only dropping an index rather than changing the way data is scored in the schema. If we made it a post update, it would also help us avoid running into #3108658: [policy] Handling update path divergence between 11.x and 10.x.
Comment #34
ravi.shankar CreditAttribution: ravi.shankar at OpenSense Labs commentedUpdated as per comment #33.
Comment #35
daffie CreditAttribution: daffie commentedThis function should be changed to a hook_post_update_NAME() function as requested by @catch in comment #33.
Comment #36
ankithashettyUpdated patch in #34 addressing #35. Thanks!
Comment #37
quietone CreditAttribution: quietone as a volunteer commentedThe link given by catch in #33 suggests changes that still need to be implemented here.
Comment #38
daffie CreditAttribution: daffie commentedThe update function has been changed to post update function and that was what @catch wanted in comment #33.
Therefore back to RTBC.
@quietone: If I am missing something, then please explain what.
Comment #39
quietone CreditAttribution: quietone as a volunteer commented@daffie, thanks for asking. I reread the issue referred to and learned that I misread the Proposed Resolution, substituting hook_post_update for hook_update which is wrong. So, yes, the changes requested in #33 are complete now.
Comment #40
catchThis should be moved to system.post_update.php and also named something like 'system_post_update_remove_key_value_expire_all_index()' - since there's no numbering of post updates.
Comment #41
longwaveComment #42
johnwebdev CreditAttribution: johnwebdev commentedComment #44
catchCommitted 1fb62cc and pushed to 9.2.x. Thanks!