Problem/Motivation
Once #3346765: Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests lands it will be feasible to add database query logging to PerformanceTestTrait, this would allow us to count the number of database queries on certain pages (hot and cold caches too etc.), add assertions for them, then tighten the tests when the number of queries drops, or be warned if it's going to increase (which might be fine, or might be a regression).
Steps to reproduce
Proposed resolution
Add a performance_test module to core.
Enable this module in PerformanceTestBase.
Add a query logging implementation using the database API's new query events system.
In a post response task, designed to run last, write the logged query events to a key/value entry.
In PerformanceTestTrait, check the key value and add the query count to the PerformanceData value object.
Queries are collected incrementally until they're checked - this means that if a page causes asset, image style, or AJAX controllers to be requested too, the queries from those will be added to the log.
The number of queries is not 100% deterministic in the currrent test coverage, this might be interactions of things like cache collectors which could vary by the order in which http requests finish on the tested site. We can add greater than and less than assertions to start with and potentially address this later. But also asserting on a range will still lets us adjust that range if core goes lower or higher than it due to a change.
Remaining tasks
Already using this to verify #3395776: Make POST requests render cacheable and it found a missing change in that issue.
Additionally, via working on the two issues I think I've found a caching bug/shortcoming in Views #3400596: Views adds cache contexts inconsistently.
Once this is committed, we can build on it (probably without any significant changes, just additions) to add database queries to opentelemetry traces too #3377657: Add database query spans to otel traces. They'll then be browsable on http://grafana.tag1demo.tag1.io/d/teMVIdjVz/umami?orgId=1&refresh=30s&fr...
User interface changes
API changes
Data model changes
Release notes snippet
Issue fork drupal-3352851
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
- 3352851-query-events
changes, plain diff MR !5268
- 3352851-db-query-assertions
compare
Comments
Comment #2
catchComment #4
catchWe can probably use the brand new API from #3313355: Allow the database query log to be dispatched as log events to collect the queries. Bit of a question on how best to get that data into phpunit from the tested site, also whether we only implement this for functional tests or also kernel tests.
Comment #6
catchComment #7
catchCouple of things to figure out, but the general concept is here.
1. Need to see if the full backtrace was intentional in the database logger, if it was, add a workaround here.
2. There are always variations of a handful of queries within these tests so I've had to use >= <= comparisons for the assertions. Will open a follow-up to investigate exactly why. Already opened #3399970: Umami content is all created in the same second but same problem with the standard profile so it's not only that.
3. We're currently writing more to the log than we need just to get a count, but doing so enables sending data to open telemetry later. I will probably do that in a separate issue/MR to see how much work is involved, and maybe move it here if it turns out to be easy.
Comment #8
mondrakeOne inline comment in the MR.
Comment #9
catch@mondrake I think if we do that, it should be its own issue since it's making changes to the actual database event API then (even if smaller ones), the question then becomes should it be a blocker to this issue (making this PP-1) or a follow-up to make things more flexible after this lands?
Comment #10
mondrake#9 I certainly would not be blocking this for that; followup (if wanted) is OK. Rather fringe anyway.
Comment #11
catchUpdated the issue summary.
Comment #12
catchThis can be added to the existing CR, so tagging for needs CR updates: https://www.drupal.org/node/3366904
Comment #13
catchComment #14
fabianx commentedRTBC - Code looks good to me, fixes an important oversight from the last MR related to this and overall enhances the performance testing capabilities.
Only caveat is that only pages loaded via the Mink driver can be performance tested in that way right now, but that is by design and hence the perfect scope and hence fine for this issue.
Comment #15
xjmComment #16
xjmNifty. Bumping to major -- this is an important gap in our test coverage currently and historically.
Posted some CS nitpick suggestions on the MR. Additionally, I think the code changes should document both the
@todoand the "why" behind the specific ranges of query counts used. Seems very arbitrary ATM.I think the CR updates are also outstanding? The relevant CR(s) should also be attached to this issue.
Thanks!
Comment #17
catch@xjm we can't update already-published CRs until issues are committed, otherwise the CRs would be misleading. Or that's always been my understanding of 'needs change record updates'.
I applied the suggestions.
Will add the @todo and the why comments next time I'm properly looking at this, I don't like the 'why' very much but it's the reality of HEAD and the reason we need stuff like this in the first place.
Comment #18
catchAdded the additional docs/todos and opened #3402610: Allow assertions on specific cache IDs in performance tests.
Comment #19
fabianx commentedBack to RTBC!
Comment #20
catchAdded the CR to this issue, and since it's only a one line addition to the code examples, went ahead and updated it now.
Comment #21
poker10 commentedSeems like the last pipeline job (https://git.drupalcode.org/project/drupal/-/jobs/359970) failed with:
We probably need to increase the max. number of expected queries in
StandardPerformanceTest::testNoJavaScript().Otherwise I think the MR looks good!
Comment #22
catchOof yeah that is the drawback of the current state, but the follow-up to this in #3377657: Add database query spans to otel traces should help us diagnose what the differences are since you'll be able to see which queries ran between different requests in the traces.
Since this is a one character change, moving back to RTBC. Having to specify the range means we can end up with random failures once this is committed too - but we can always easily fix them by adjusting the range up or down. And I hope we can make things a lot more predictable in follow-ups with the tests or whatever's causing the variation in core too.
Comment #23
catchNow have a working implementation on the follow-up to this at #3377657: Add database query spans to otel traces - only a dozen or so additional lines of code to get devel-esque query reports into the tempo traces (see screenshot in the issue if that sentence doesn't make much sense).
Comment #24
alexpottDiscussed with @catch. We had a think about the asserting on ranges of query numbers and think that maybe this can be improved by separating cache queries from other queries. Then we can assert against a fixed count for other queries and a range for cache queries.
Comment #26
andypostThere's common approach to collect this numbers https://symfony.com/doc/current/profiler.html
Moreover it's used by the webprofiler contrib module, the implementation to collect queries
DatabaseDataCollector.phpExtra dev dependency could be added even into minor release
Comment #27
catchI've pushed a commit that does #24, but working on that made me realise with only a small bit of extra work, we can cover all of #3396196: Separate cache operations from database queries in OpenTelemetry and assertions. We'll eventually want to do something to allow similar when running redis as the cache backend, but that's only going to help custom performance tests on real sites and won't be made harder by splitting the queries like this, might even be easier since we could probably swap out the logic while keeping the API the same.
So leaving at needs work a bit more to try that out.
@andypost the way that webprofiler collects the queries and the way that the performance_test module added here collects the queries are the same - it relies on the database statement events (although it has a nifty way to enable events on all connections which is a good idea we should borrow).
The difference is that it's using Symfony profiler to store and render the events as you point out. We could re-use the storage, but that will result in more code more than less because there's a fair amount of boilerplate to implement in the profiler implementations which is similar to the database event information, as well as for things like rendering profiles which we can't really do in core tests (except maybe as gitlab artifacts? but then you don't get the ability to browse like you do with grafana).
So I'd need a bit of convincing that it'd actually be a net benefit.
Comment #28
catchDiscussed this with @alexpott in slack.
The ranges on the database queries mean that it's going to be hard to show regressions in the current state, while I think we could have deferred that to a follow-up, we had an idea to separate out cache and other database queries from each other, since about 98% of the variation is from cache collector writes. This lets us assert on smaller ranges everywhere, in some cases using assertSame(), although I haven't switched from ranges to same for those cases yet in case results are different on gitlab.
This pre-empts #3396196: Separate cache operations from database queries in OpenTelemetry and assertions a bit, but we can keep the same API for assertions if we do that and just change how the data is collected (i.e. via a cache backend decorator, so it works for memcache and redis too). The basic logic of filtering out cache queries would need to stay if we do that, so that we don't duplicate between the two and regress the more specific assertions here.
This will not only make the tests less random, but also more generally useful - e.g. we'll be able to see if a caching fix removes a database query and adds a cache get, whereas without separation, this would be the same number of database queries.
I've added the is_a() check, slightly awkward due to the double backslashes in the backtrace output.
Comment #29
catchI think I've got things as precise as I can for the initial pass:
- now more aggressively checks that nothing from previous requests is finishing off before it starts collecting data
- added some pre-warming drupalGet() calls to try to stabilise the site a bit more before making requests. This means we're testing slightly different things (i.e. not a completely cold cache), but that's fine given we're going from zero coverage in the first place.
- switched to assertSame() for the assertions that seem to be actually predictable now, which makes it easier to see which still have to assert a range.
Comment #30
fabianx commentedBack to RTBC!
One thing is that we should open a follow up issue to generally track cache events in the caching layer and not in the database layer.
As the performance data is neutral to where it comes from , this is straightforward:
- Collect cache events
- Subscribe to it
- Add to performance data
- Ignore DB cache things instead
Comment #31
catchThat issue is here: #3396196: Separate cache operations from database queries in OpenTelemetry and assertions.
Once we have that (we don't have cache events, but we can decorate the services to get this), it'll be a case of completely skipping the database queries to avoid duplication - which just means we can simplify the cache-special casing added here. I was originally worried about duplication between the two things, which was the reason for tackling this first, but this issue has answered how to handle that so should be easy enough.
Comment #32
catchNow using this on #3395776: Make POST requests render cacheable and it's successfully picking up additional cache gets after submitting the login form (also additional database queries, at least one of which I've got a theory about but need to investigate more).
Comment #33
alexpottCommitted and pushed b1a8736022 to 11.x and fdf0228fb6 to 10.2.x. Thanks!
Backported to 10.2.x as this is a test-only change and could prove useful to contrib and custom testing in 10.2.x