Problem/Motivation

Drupal core doesn't have any automated performance testing, but it's been sorely needed since at least 2009 when I opened #638078: Automated performance testing for core.

Since then, a lot has changed - at that time we were just in the process of adding SimpleTest coverage to core; now we're using PHPUnit. However, the problems of uncaught performance regressions, like #3327856: Performance regression introduced by container serialization solution, and of having to manually test performance improvements and fixes, are still with us.

In general, unless a change is very likely to be performance-sensitive, we don't require manual performance testing before commit - even with Lighthouse/devtools/XHProf it's quite a lot of work to set up a test case. And even when we do manual performance testing, people often get it wrong (for example, comparing two requests where one hits cold caches and the other warm caches).

We sometimes do require manual performance testing for improvements/fixes - but this only validates the improvement; it rarely stops regressions. In fact, regressions are often introduced by changes that look 'innocent' from a performance perspective.

There is a lot of potential performance data we can collect, but broadly it falls into two categories:

1. Absolute performance data - how many queries are executed during a particular request, how many HTTP requests are made by a particular page, etc. For these, we can potentially set a fixed number and fail if a test deviates - although issues that change the number without actually being regressions (like an extra article with an image on the Umami front page) will require the expected values to be adjusted each time. Even these might need adjusting for different database engines, for example.

2. Relative performance data - time to first byte, PHP execution time, largest contentful paint, memory usage. These all vary depending on the environment the test runs on, and also vary across runs in the same environment.

Ideally, we'd collect as much data as possible, potentially with hard fails for things in the 'absolute' category, and graphing or much wider thresholds for the 'relative' category - see the sketch below.
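To make the distinction concrete, here's a hypothetical PHPUnit sketch - the two getter helpers and the expected numbers are purely illustrative, not an existing core API:

```php
<?php

declare(strict_types=1);

namespace Drupal\Tests\demo_umami\FunctionalJavascript;

use Drupal\FunctionalJavascriptTests\WebDriverTestBase;

/**
 * Hypothetical example only: getDatabaseQueryCount() and
 * getTimeToFirstByte() are illustrative names, not an existing core API.
 */
class ExamplePerformanceTest extends WebDriverTestBase {

  protected $defaultTheme = 'stark';

  public function testFrontPagePerformance(): void {
    $this->drupalGet('<front>');

    // Absolute metric: assert an exact value and fail hard on any change.
    // The expected number gets adjusted when a legitimate change alters it.
    $this->assertSame(38, $this->getDatabaseQueryCount());

    // Relative metric: environment-dependent, so assert only against a
    // deliberately wide threshold (or just record the value for graphing).
    $this->assertLessThan(2.0, $this->getTimeToFirstByte());
  }

}
```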

Steps to reproduce

Proposed resolution

There are various options for driving a browser and collecting data, but we already have one in core - PHPUnit plus the Mink/Selenium drivers, which give us access to a full browser. PHPUnit also lets everyone maintain any tests we add using an API/language they're familiar with, and potentially run them locally. Additionally, PHPUnit lets us add basic sanity checking to the test via non-performance assertions, so that functionality changes which might break the test can be picked up. Even if we end up using something else eventually, it's an easy start.

Current plan:

Add a base class PerformanceTestBase extending WebDriverTestBase.

The base class enables chromedriver performance logging and overrides ::drupalGet() to collect the performance log after each page request.

In this issue, collect the number of CSS, JavaScript and image requests, and add a basic Umami test asserting how many appear on the front page. A rough sketch of the base class is shown below.
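Roughly along these lines - a sketch only; the exact chromedriver capability handling and log parsing are assumptions, not the final implementation:

```php
<?php

declare(strict_types=1);

namespace Drupal\Tests;

use Drupal\FunctionalJavascriptTests\WebDriverTestBase;

/**
 * Sketch of the planned base class; details are assumptions, not the MR.
 */
abstract class PerformanceTestBase extends WebDriverTestBase {

  /**
   * Asks chromedriver to record a performance log for each page.
   */
  protected function getMinkDriverArgs() {
    $args = json_decode(parent::getMinkDriverArgs(), TRUE);
    // The second driver argument holds the desired capabilities.
    $args[1]['goog:loggingPrefs'] = ['performance' => 'ALL'];
    return json_encode($args);
  }

  /**
   * {@inheritdoc}
   */
  protected function drupalGet($path, array $options = [], array $headers = []) {
    $return = parent::drupalGet($path, $options, $headers);
    // Retrieve the chromedriver performance log for this page request.
    $log = $this->getSession()->getDriver()->getWebDriverSession()->log('performance');
    foreach ($log as $entry) {
      $message = json_decode($entry['message'], TRUE);
      // Count Network.responseReceived entries by MIME type here to get
      // the number of CSS, JavaScript and image requests for the page.
    }
    return $return;
  }

}
```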

Once this issue lands, there are multiple other types of performance data we can begin to collect:

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet


Comments

catch created an issue. See original summary.

catch’s picture

Status: Active » Needs work

Started adding the new test type so there's a base class to work in without polluting WebDriverTestBase; haven't started on getting the actual data out of Chrome yet.

Akram Khan’s picture

Added an updated patch and tried to fix the CCF on MR #3.

catch’s picture

Issue summary: View changes
catch’s picture

Issue summary: View changes
heddn’s picture

Status: Needs work » Needs review

I finally got the test running locally. While doing that, I stumbled upon core/tests/README.md, which needs to be updated to mention this new test type. But this is really great stuff.

smustgrave’s picture

Status: Needs review » Needs work
Issue tags: +Needs Review Queue Initiative, +Needs followup

One small comment about a commented-out line.

Added follow-up tag to address #7 unless it's agreed it can be added here.

catch’s picture

We should add the docs here, and also I think we need to update phpunit.xml.dist to add the new test type alongside the others in there.

Apart from that there's the general issue of actually putting the performance data somewhere accessible.

catch’s picture

Both @heddn and I were seeing 2.3-2.7 seconds for largest contentful paint. I cross-checked that against Lighthouse on a 10.1.x install and confirmed it's accurate, so I opened #3349447: Improve Largest Contentful Paint in Umami front page to improve it. Once we fix that, we can see whether the test picks up the improvement.

catch’s picture

Title: Support getting performance data from chromedriver » Add performance tests as a new phpunit test suite
Issue summary: View changes
Issue tags: +Needs framework manager review, +Needs frontend framework manager review, +Needs release manager review
catch’s picture

A couple of recent updates:

- Now sending time to first byte -> first paint and first paint -> largest contentful paint as spans to OpenTelemetry (see the sketch after this list).

- Added assertions for the number of stylesheets/scripts/images requested on the page. Not sure the image assertion is really viable, because the chromedriver browser window could potentially be non-deterministic.
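For reference, creating those spans with the open-telemetry/sdk package looks roughly like this - a sketch using an in-memory exporter rather than the real OTLP endpoint, with the timestamp variables assumed to come from the chromedriver performance log:

```php
<?php

use OpenTelemetry\SDK\Trace\SpanExporter\InMemoryExporter;
use OpenTelemetry\SDK\Trace\SpanProcessor\SimpleSpanProcessor;
use OpenTelemetry\SDK\Trace\TracerProvider;

// In-memory exporter for illustration; the MR sends to an OTLP endpoint.
$tracerProvider = new TracerProvider(new SimpleSpanProcessor(new InMemoryExporter()));
$tracer = $tracerProvider->getTracer('drupal-performance-test');

// $request_start, $first_paint and $lcp are assumed to be epoch-nanosecond
// timestamps extracted from the chromedriver performance log.
$span = $tracer->spanBuilder('firstPaint')
  ->setStartTimestamp($request_start)
  ->startSpan();
$span->end($first_paint);

$span = $tracer->spanBuilder('largestContentfulPaint')
  ->setStartTimestamp($first_paint)
  ->startSpan();
$span->end($lcp);

$tracerProvider->shutdown();
```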

andypost’s picture

Meanwhile, the PECL auto-instrumentation extension has been packaged: https://github.com/open-telemetry/opentelemetry-php-instrumentation

Looks like it could be useful for measuring other tests.

catch’s picture

I've added an additional test method plus supporting code. We now collect the number of scripts, stylesheets and images via Chrome network requests. Testing the Umami front page with a logged-in user who has access to tours, I'm able to get a proper failure if I locally revert #2695871: Aggregation creates two extra aggregates when it encounters {media: screen} in a library declaration - i.e. the test expects four stylesheets, but with that issue reverted there are six. A sketch of the test is below.

Added image support in there, but (ironically) lazy image loading makes it hard to assert on a specific number of images, because it depends on what the browser engine decides to load, and that appears to be slightly variable - sometimes I get 19, sometimes 20. Not sure what, if anything, to do about that yet - we can always send that data to OpenTelemetry even if it's not viable for assertions.
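The Umami test sketched, as mentioned above - a reconstruction of what's described here, with getStylesheetCount() as an assumed accessor name rather than the MR's actual API:

```php
<?php

declare(strict_types=1);

namespace Drupal\Tests\demo_umami\FunctionalJavascript;

use Drupal\Tests\PerformanceTestBase;

/**
 * Reconstruction of the test described above; accessor names are assumed.
 */
class PerformanceTest extends PerformanceTestBase {

  protected $profile = 'demo_umami';
  protected $defaultTheme = 'umami';

  public function testFrontPageTour(): void {
    $user = $this->drupalCreateUser(['access tour']);
    $this->drupalLogin($user);
    $this->drupalGet('<front>');
    // Four aggregated stylesheets with #2695871 fixed; reverting that
    // issue produces six, which makes this assertion fail.
    $this->assertSame(4, $this->getStylesheetCount());
  }

}
```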

catch’s picture

catch’s picture

Now that it's possible to do useful assertions in phpunit (although I still need a failing patch that fails in the right way), I've split the OpenTelemetry part of this out to #3352459: Add OpenTelemetry Application Performance Monitoring to core performance tests and #3352389: Add open-telemetry/sdk and open-telemetry/exporter-otlp as dev dependencies.

catch’s picture

catch’s picture

Status: Needs work » Needs review

I think we're approaching a committable patch here so marking needs review.

catch’s picture

Issue summary: View changes
catch’s picture

Title: Add performance tests as a new phpunit test suite » Add PerformanceTestTrait for allowing browser performance assertions within FunctionalJavaScriptTests

Discussed with @heddn whether we could use a trait to add server-side metrics to functional/kernel and maybe unit tests.

I think this is doable. The chromedriver stuff in here wouldn't be relevant for those tests, but we can also make this a (different) trait instead of adding an entirely new test suite. The only complication is having to override ::setUp() to re-enable CSS/JS aggregation, so I moved that out to the Umami test. Also moved the Umami test into Umami itself.

catch’s picture

Issue summary: View changes
catch’s picture

OK, the issue with the recent test failures, which I wasn't seeing as dramatically locally, was that chromedriver stores the performance log until it's retrieved and then empties it out. So if you make HTTP requests in between calls to drupalGet(), you get the log for all of them, not just the drupalGet(). This is easily solved by requesting the performance log just before making the request, which resets it so that the next retrieval only contains what you expect. A sketch of the fix is below.
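A sketch of that fix - drain the log just before the request so the retrieval afterwards only covers this drupalGet() (same assumed plumbing as the base class sketch above):

```php
// Sketch: retrieving chromedriver's performance log also empties it, so
// fetch it once before the request to discard entries from any earlier
// HTTP requests, then fetch it again afterwards for just this page.
protected function drupalGet($path, array $options = [], array $headers = []) {
  $session = $this->getSession()->getDriver()->getWebDriverSession();
  // Discard log entries accumulated since the last retrieval.
  $session->log('performance');
  $return = parent::drupalGet($path, $options, $headers);
  // Now the log only contains entries for the page we just requested.
  $performance_log = $session->log('performance');
  // ... process $performance_log ...
  return $return;
}
```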

Attaching a fail patch that should fail properly this time - code from the MR plus a revert of #2695871: Aggregation creates two extra aggregates when it encounters {media: screen} in a library declaration - you'll be able to see the two extra HTTP requests causing the test to fail. We can apply the same pattern to database queries and assert on those, since those counts will be consistent across environments.

Status: Needs review » Needs work

The last submitted patch, 22: 3346765-fail.patch, failed testing. View results

catch’s picture

Status: Needs work » Needs review

1) Drupal\Tests\demo_umami\FunctionalJavascript\PerformanceTest::testFrontPageTour
Failed asserting that 4 is identical to 2.

That's what we wanted with the fail patch.

Back to needs review.

heddn’s picture

Status: Needs review » Reviewed & tested by the community

I'm good w/ this. Marking RTBC. But it still looks like we need several more reviews, which hopefully can happen soonish.

catch’s picture

Untagging for release manager review since there's been a +1 from @quietone on the general concept elsewhere, and any dependency additions have now been split out of here. However, I think the test still needs framework and front-end framework reviews - it's a light patch, but it has implications as we keep going.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 22: 3346765-fail.patch, failed testing. View results

catch’s picture

Status: Needs work » Reviewed & tested by the community

Hiding the fail patch so it doesn't get retested and restoring RTBC.

After doing some extra testing, I don't think we can add assertions on an exact image count, because chromedriver seems slightly non-deterministic there. However, we might be able to figure out the maximum by running the test many times, seeing at what level it fails, and then using assertLessThan() - see the sketch below. With OpenTelemetry we can also send the number of image requests as a trace attribute.
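An upper-bound assertion could look something like this - the ceiling of 21 is a placeholder to be established empirically, and getImageCount() is an assumed accessor:

```php
// Placeholder ceiling found by repeated runs; getImageCount() is an
// assumed accessor, not an existing API.
$this->assertLessThan(21, $this->getImageCount());
$this->assertGreaterThan(0, $this->getImageCount());
```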

That should definitely be in a follow-up (which I'll open next week), question is whether to leave the supporting code in here or also move that to the follow-up.

heddn’s picture

I don't feel like we need to test it here in this patch. We could move that bit out into the follow-up. Or we could do an assertMoreThan(1) or something that would be fairly resilient to test changes.

mondrake’s picture

Some type-hinting nits; leaving at RTBC.

Wim Leers’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +front-end performance

I surfaced some concerns. 😅

catch’s picture

Issue summary: View changes

I looked at converting NoJavaScriptAnonymousTest and there is one problem:

We could maybe disable those modules, but I opened #3354344: [PP-1] Convert NoJavaScriptAnonymousTest to use PerformanceTestTrait since it won't be a direct conversion of just the test.

catch’s picture

Status: Needs work » Needs review

Also opened #3354347: Add xhr and BigPipe assertions to PerformanceTestTrait for the xhr and BigPipe assertions.

If @Wim agrees with the two follow-ups I think we're back to needs review here.

Wim Leers’s picture

Status: Needs review » Needs work

#32: hm, that makes me wonder whether we should do this not as a trait, but as a new base class, because that'd allow us to not do \Drupal\FunctionalJavascriptTests\WebDriverTestBase::installModulesFromClassProperty(). But based on #20 that was highly intentional: to allow non-functional tests to use this too. But … I see no such uses yet?

That'd also allow every performance test to not have to worry about overriding ::setUp() to re-enable CSS/JS aggregation?

#33: Definitely agreed. 👍

Marking Needs work for the question at the start of this comment plus the missing documentation for the used webdriver arguments.

catch’s picture

Title: Add PerformanceTestTrait for allowing browser performance assertions within FunctionalJavaScriptTests » Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests
Issue summary: View changes
catch’s picture

Status: Needs work » Needs review

hm, that makes me wonder whether we should do this not as a trait, but as a new base class, because that'd allow us to not do \Drupal\FunctionalJavascriptTests\WebDriverTestBase::installModulesFromClassProperty(). But based on #20 that was highly intentional: to allow non-functional tests to use this too. But … I see no such uses yet?

I think this and the demo_umami setUp() override are the clincher - we need a base class, although 'just' a base class this time, not a new test type as I originally started with.

We can still put any logic shared with functional tests, like database query logging, into a trait - rough sketch below. I've made that change in the MR and hopefully addressed the other points too.
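The rough shape of that split, as mentioned above - names and details assumed:

```php
<?php

declare(strict_types=1);

namespace Drupal\Tests;

use Drupal\FunctionalJavascriptTests\WebDriverTestBase;

/**
 * Sketch: logic shared with Functional/Kernel tests, such as database
 * query logging, lives in the trait.
 */
trait PerformanceTestTrait {
  // Server-side metric collection shared across test types would go here.
}

/**
 * Sketch: the base class wires the trait into WebDriverTestBase.
 */
abstract class PerformanceTestBase extends WebDriverTestBase {

  use PerformanceTestTrait;

  protected function setUp(): void {
    parent::setUp();
    // Re-enable CSS/JS aggregation, which the test base classes disable,
    // so performance tests measure what production sites actually serve.
    \Drupal::configFactory()->getEditable('system.performance')
      ->set('css.preprocess', TRUE)
      ->set('js.preprocess', TRUE)
      ->save();
  }

}
```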

catch’s picture

The base class approach now allows #3354344: [PP-1] Convert NoJavaScriptAnonymousTest to use PerformanceTestTrait to pass without having to do its own overrides.

heddn’s picture

Status: Needs review » Reviewed & tested by the community

All feedback addressed! LGTM.

heddn’s picture

Status: Reviewed & tested by the community » Needs work
Wim Leers’s picture

That was a random failure.

This looks superb, but I have a few trivial nits and one semantic question 😊 This is on the verge of RTBC for me.

Do we really need to update core/tests/README.md like #7 and #8 were saying? 🤔

catch’s picture

Do we really need to update core/tests/README.md like #7 and #8 were saying?

Ah, no - that was when I was considering a full new test type, over and above a base class; we don't need it for just a base class.

Doesn't that mean we should move this out of core/modules/system and into core/profiles/standard for consistency?

I thought about that but it felt out of scope... don't really mind either way.

Wim Leers’s picture

Issue summary: View changes
Status: Needs work » Reviewed & tested by the community
Issue tags: -Needs followup

I do think there's a good reason to move NoJavaScriptAnonymousTest (see comment on MR), but I defer to @catch on that, and it obviously shouldn't block commit.

I think this is an exciting step forward to avoid future front-end performance regressions 🤩 And it enables exciting next steps:

Let's do this? 😄

catch’s picture

Wim Leers’s picture

catch’s picture

Split image handling out to #3361833: Add image request counts to PerformanceTestBase - it works, but it's not entirely clear how to assert on it. That makes the diff here a dozen lines smaller. Leaving at RTBC because these are pure removals into a follow-up.

catch’s picture

Issue summary: View changes
Wim Leers’s picture

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

catch’s picture

This issue is now doing a lot less than when I opened it, so moving the needs x review tags over to #3352459: Add OpenTelemetry Application Performance Monitoring to core performance tests which is the one with more implications. There is other dependent work following this issue that doesn't involve opentelemetry, so that can be assessed on its own merits when we get there.

Wim Leers’s picture

Would still love to see this one land! 😊

larowlan’s picture

Issue credits

  • larowlan committed 2fc7e23e on 11.x
    Issue #3346765 by catch, heddn, Akram Khan, Wim Leers, mondrake: Add...
larowlan’s picture

Title: Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests » [needs backport] Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests
Version: 11.x-dev » 10.1.x-dev

Pushed to 11.x; have asked RMs whether this should go into 10.1.x too.

larowlan’s picture

Added a bare-bones CR; feel free to expand: https://www.drupal.org/node/3366904

larowlan’s picture

Title: [needs backport] Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests » Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests
Version: 10.1.x-dev » 11.x-dev
Status: Reviewed & tested by the community » Fixed
andypost’s picture

Thank you! I bet it needs a follow-up to write some docs; moreover, it could be useful for custom development.

Issues from the summary can now be unpostponed.

heddn’s picture

Glad to see this land!

andypost’s picture

Looking at the CI logs, I see that no new tests are executed:

PHP Notice: Undefined index: Drupal\Tests\system\FunctionalJavaScript\NoJavaScriptAnonymousTest in /opt/drupalci/testrunner/src/DrupalCI/Build/Artifact/Junit/JunitXmlBuilder.php on line 82

Moreover, this code fails on PHP 8.3: #3366843: Fix tests broken on PHP 8.3

andypost’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.