Updated: Comment #N

Problem/Motivation

Since the introduction of the key_value_expire table @Cottser and I noticed during profiling using xhprof-kit. that each run of the profilier on a requesting a page with a form on it that the profiler would consistently get slower and slower on each bbranch run. We looked for the problem in the profiling results and it indicated a PDO execution getting worse and worse and so we tracked down the table and it turned out that key_value_expire would balloon out of control.

A temporary solution was to truncate that table after every run. See pull request #4

Proposed resolution

Provide a way to clean the expired values. Possibly other solutions.

Remaining tasks

n/a

User interface changes

n/a

API changes

n/a

Files: 
CommentFileSizeAuthor
#18 key-value-expirable-garbage-collection-optimization-2182423-18.patch2.02 KBBerdir
PASSED: [[SimpleTest]]: [MySQL] 63,549 pass(es).
[ View ]
#6 add-expire-index-2182423-1.patch427 bytesBerdir
PASSED: [[SimpleTest]]: [MySQL] 63,336 pass(es).
[ View ]

Comments

Cottser’s picture

Title:key_value_expire doesn't get reset and baloons the PDO execution time for form pages» key_value_expire doesn't get reset and balloons the PDO execution time for form pages

Thanks for filing this @joelpittet!

joelpittet’s picture

key_value_expire seems to balloon really huge on 100 views of a /node/add/article form page. ~5.5MB from phpmyadmin.

Each run generates two entries, one for form and the other for form_state
~28K and ~8K respectively.

For comparison on table sizes:
router 480 KiB
cache 480 KiB
cache_views_info 464 KiB
cache_config 432 KiB

joelpittet’s picture

Berdir’s picture

Well the table will obviously grow, as every form is saved there, including all objects like the node object in there.

I don't know why it should affect execution time though?

The values in that table are supposed to expire after 6h.

joelpittet’s picture

They seem to grow the PDOStatement::execute's time on every loop of loading the page. I double checked and that table is indexed so I'm not really sure why that would have such an effect as well... Since the form isn't being executed, should those form and form_state really be stored?

Maybe the keyvalue store can be defered to after submit? *wild stab in the dark*

Berdir’s picture

Status:Active» Needs review
StatusFileSize
new427 bytes
PASSED: [[SimpleTest]]: [MySQL] 63,336 pass(es).
[ View ]

It also has nothing to do with the entity system, that is how every form works, most just don't have that many objects in them.

I did run that page 1k times, what I did notice is that the delete statement is a) executed twice and b) does get slow, because there's no useful index for just the expiration column.

So, try this patch, this speeds those queries up for me.

It's also overkill to call delete every time we set something, we should probably apply the same pattern and have an explicit clean-up method that is called in system_cron(), just like the cache.

Berdir’s picture

The forms are executed and we need to store that information there, it's needed for ajax requests like the image upload.

joelpittet’s picture

Thanks @Bedir I think that shaved ~1ms off, I'll do a few more tests and post some results.

Berdir’s picture

Ah, we can't clear them from the outside as we have no knowledge about which collections exist and they could be in separate tables. :(

Noticed that the factory does potentially create multiple storages for a collection every time it is requested, leading to multiple delete query. That's however not the case here, we do have multiple collections.

Only 1ms? The difference is 2x 10ms for me and I have a pretty well optimized MySQL installation, apart from the setCache() inserts, every select query is <0.01ms for me.

Berdir’s picture

Note that you either need to re-install or create an index on the expire column manually, otherwise it won't have an effect.

Berdir’s picture

Another thought would be to introduce a random check so that we only do this on 5% of the requests, or something like that? And possibly make the flag static? Or only call the first storage, as we know about the implementation in the factory and that they act on all collections?

effulgentsia’s picture

There are several issues involved here.

1. We're putting both $form and $form_state into the kv, but we should only need to put $form_state in there. I'm working on a patch for that in #2183275: Use cache for $form, kv for $form_state.

2. We have too much stuff in $form_state on the first step of a form (e.g., when just doing a GET of node/add/article). At some point, we'll need issues to optimize that. The same problem exists in D7, in both core and contrib. For example, #1512568: Managing cache_form when Fivestar appears in Teaser.

3. Once 1 and 2 are done, it would be good to change AJAX forms from needing to store the $form_state generated during the initial GET request of the page server-side to being able to store it client-side. That would massively cut down on the kv growing huge and solve issues like #774876: Form cache entries can expire before their host page's cache entry.

Fixing all 3 of the above will take some time. In the meantime, adding an index per #6, and clearing the kv manually or via a script when profiling seem like good ideas.

joelpittet’s picture

@Berdir yeah sorry, that was silly of me. This time I checked both repos @HEAD and fresh installs and did a better test for this: ab vs xhprof diff.

Without Patch
ab -n 500 -c 1 -C session:key http://d8.dev/node/add/article

Requests per second:    1.68 [#/sec] (mean)
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Total:        534  595  33.0    589     933

With Patch
ab -n 500 -c 1 -C sessionprof:keyprof http://d8prof.dev/node/add/article

Requests per second:    1.76 [#/sec] (mean)
Connection Times (ms)
              min  mean[+/-sd] median   max
Total:        520  568  34.5    561     918

So more like ~27ms better:)

@effulgentsia thank you for looping us into those related issues!

effulgentsia’s picture

It occurs to me that when profiling, you could probably get away with a DELETE from key_value_expire WHERE collection IN ('form', 'form_state'). It means you'd lose all information from in-progress, uncompleted, multistep forms, and won't be able to submit any AJAX forms without first redoing a GET of the original page, but during profiling, that's probably ok, right? This might be better than a full TRUNCATE, in case there's other kv collections that contain useful data relevant to what you're profiling. If you want to do it this way, do we also need an index on collection by itself?

joelpittet’s picture

The way we did TRUNCATE kinda works but we did it before running the LOOP and not in the LOOP so at least the profiling is on even par with each other (kinda), but means that every run through the loop is slowing down by ever increasing PDOstatement times enough that choosing the fastest run will likely be the second run through the loop on each branch. Is DELETE faster than TRUNCATE? maybe we should do that.

We've had no problems to speak of or worries with AJAX in profiling yet. A primed cache good for real world scenarios.

Berdir’s picture

I can't imagine that the delete queries add a measurable overhead with the index in place, shouldn't matter much anymore if you have 100 or 10k rows in there.

Yes, moving the form into cache will remove one of the DELETE queries. So would my idea to only call destruct() once from the factory.

sun’s picture

only call destruct() once from the factory

That sounds like the better approach to me (in case I understand it correctly; code would certainly help :)).

Berdir’s picture

StatusFileSize
new2.02 KB
PASSED: [[SimpleTest]]: [MySQL] 63,549 pass(es).
[ View ]

Here's what I meant.

Berdir’s picture

Going even further, we could also define that key value backends are responsible for clearing expired entries on their own. So we could really just add a call to the key value expire database factory service from system_cron() and don't have to worry about it at runtime.

And if a module provides a different backend that does not do it's own expiration handling, it has to implement hook_cron().

sun’s picture

Status:Needs review» Reviewed & tested by the community

This makes sense, thanks.

I'd recommend to move forward with these ready changes first. We can look into further optimizations in a follow-up issue.

Berdir’s picture

webchick’s picture

Status:Reviewed & tested by the community» Fixed

Awesome catch!

Committed and pushed to 8.x. Thanks!

Status:Fixed» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.