Overview

https://git.drupalcode.org/project/gitlab_templates/-/work_items/3572380 was recently merged for all contrib modules using the default DA templates. The Canvas implementation extends the default templates and adds its own jobs (which is totally fine); however, some of those jobs always run, no matter what.

Checking the usage over the last week alone, Canvas is on par with Drupal core:
[Screenshot: CI usage]

This is probably due to the volume of issues and MRs being worked on each day, so again, the above is not a bad thing, but it is something to think about.

Lastly, I see that there is already a "*manual-rule", which is exactly the approach taken in gitlab_templates.

Proposed resolution

I think the idea would be to review which jobs that currently do not use the "*manual-rule" could start using it.
I don't know enough about the whole pipeline to know which job is relevant where, so I'll let the people who know suggest possible improvements, if any.
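
As a rough sketch (all names below are made up; the actual *manual-rule definition and its conditions live in gitlab_templates and may differ), gating one of the extra jobs behind such a rule could look like this:

.manual-rule: &manual-rule
  when: manual
  allow_failure: true

extra-validation-job:          # hypothetical custom job added on top of the templates
  stage: validate
  script:
    - npm run validate:extra   # hypothetical command
  rules:
    - *manual-rule             # job appears in the pipeline but only runs when triggered by hand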

User interface changes

If implemented, some pipeline jobs would need to be manually triggered.

Comments

fjgarlin created an issue. See original summary.

fjgarlin’s picture

Also, seeing the per-job breakdown (see the images), there are many cypress and playwright jobs that reach the 30 min timeout, so these jobs are likely to be run again (either automatically or manually). Is there a way to parallelize these jobs so they take max 10-15 min?
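
One option might be GitLab's parallel keyword combined with Playwright's --shard flag; this is a sketch only, with an assumed job name:

playwright-tests:      # hypothetical job name
  parallel: 6          # GitLab starts 6 copies of this job
  script:
    # GitLab exposes CI_NODE_INDEX and CI_NODE_TOTAL to parallel jobs;
    # Playwright then splits the spec files across the shards.
    - npx playwright test --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

Cypress has no built-in equivalent of --shard, so splitting it would mean an explicit spec list per shard or an external orchestrator.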

penyaskito made their first commit to this issue’s fork.

penyaskito’s picture

One impediment we have is GitLab's lack of support for exclude/negation rules.

e.g. we could easily reduce the number of jobs with something like:

> if ONLY packages/* change, skip X jobs.

GitLab supports:
> if packages/* change, run X jobs.

Flipping the condition would force us to maintain (and hardcode) an allowlist, meaning every new file type or directory could become a silent trap where a job that shouldn't be skipped gets skipped.
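
To make the limitation concrete (the job name and paths are illustrative):

# Supported today: run the job when anything under packages/ changes.
e2e-tests:
  rules:
    - changes:
        - packages/**/*

# Not supported: "skip the job when ONLY packages/* changed". There is no
# negated `changes:` rule, so inverting the logic means hand-maintaining an
# allowlist of every other path that should still trigger the job.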

penyaskito’s picture

Component: … to be triaged » Project management
Status: Active » Needs review

Needs review for !944; that's an easy win both for team speed and for reducing the number of timeouts (aka people hitting retry).

Can't claim this would help with the DA spend though, as there will be extra jobs per MR. Not sure which one outweighs the other.

fjgarlin’s picture

1 job at 30 minutes equals 6 jobs at 5 minutes in total runner time, but the former gets "retried" if it goes over the 30-minute limit, so the split is still an improvement. Every bit counts.

fjgarlin’s picture

wim leers’s picture

wim leers’s picture

Status: Needs review » Reviewed & tested by the community

RTBC'd the first MR.

  • penyaskito committed 464e13b6 on 1.x
    chore: #3585979 CI: Use 6 shards for playwright to avoid 30m timeouts...
penyaskito’s picture

Status: Reviewed & tested by the community » Needs work

Merged !944. Leaving as NW as there's more to fix here.

penyaskito’s picture

wim leers’s picture

Assigned: Unassigned » justafish

I am not convinced that the script in !995 really helps: it's a single line we're expected to modify. I defer to @justafish.

wim leers’s picture

Status: Needs work » Needs review
wim leers’s picture

wim leers’s picture

After #3529128: Speed up PHPUnit on CI; stop relying on drupal.org composer template lands, I think we should consider making:

  1. PHPUnit (11.2) run on all DBs
  2. PHPUnit (11.3) run only on SQLite

Or vice versa.

Thoughts?

EDIT: to clarify, I mean for merge commits (aka push to 1.x). Currently both 11.2 and 11.3 run all tests on all 4 DBs.
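
Roughly, with assumed job and variable names (not the actual canvas CI config):

phpunit-11-2:
  parallel:
    matrix:
      - _TARGET_DB_TYPE: [mysql, mariadb, postgres, sqlite]   # all databases

phpunit-11-3:
  variables:
    _TARGET_DB_TYPE: sqlite                                   # a single database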

wim leers’s picture

@justafish's #3586660: CI: Add additional caching to GitLab CI pipelines should also make a difference here: same # of CI jobs, but they should run faster.
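
For illustration only (the paths and key are assumptions, not what #3586660 actually implements), job-level caching in GitLab CI looks roughly like:

cache:
  key: "$CI_COMMIT_REF_SLUG"   # one cache per branch
  paths:
    - vendor/                  # Composer dependencies
    - node_modules/            # npm dependencies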

wim leers’s picture

Assigned: justafish » Unassigned
Status: Needs review » Postponed (maintainer needs more info)
File attached: 4.69 MB (new)

@fjgarlin: Yesterday's last commit had an absolutely terrifying CI pipeline. Why? See attached GIF. Pipeline URL: https://git.drupalcode.org/project/canvas/-/pipelines/808898

In part due to our PHPUnit CI jobs taking >30 mins. That's fixed as of today: #3529128: Speed up PHPUnit on CI; stop relying on drupal.org composer template landed.

However, it also appears to be in part due to infra instability. I've seen this happen many times in the past. But perhaps now is the time to investigate it? It manifests like this:

………
  - Drupal\Tests\canvas\Unit\DataType\ComponentInputsTest
  - Drupal\Tests\canvas\Unit\UiFixturesValidationTest
Test run started:
  Thursday, April 30, 2026 - 00:36
Test summary
------------
WARNING: Event retrieved from the cluster: 0/6 nodes are available: 2 node(s) were unschedulable, 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
WARNING: Event retrieved from the cluster: 0/7 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }, 2 node(s) were unschedulable, 4 node(s) didn't match Pod's node affinity/selector. preemption: 0/7 nodes are available: 7 Preemption is not helpful for scheduling.
WARNING: Event retrieved from the cluster: The node was low on resource: memory. Threshold quantity: 100Mi, available: 102296Ki. Container helper was using 503912Ki, request is 0, has larger consumption of memory. Container database was using 193996Ki, request is 0, has larger consumption of memory. Container build was using 8420012Ki, request is 0, has larger consumption of memory. Container chrome was using 1748Ki, request is 0, has larger consumption of memory. Container selenium was using 246008Ki, request is 0, has larger consumption of memory. 
WARNING: Event retrieved from the cluster: Container runtime did not kill the pod within specified grace period.
WARNING: Event retrieved from the cluster: error killing pod: [failed to "KillContainer" for "build" with KillContainerError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded", failed to "KillContainer" for "chrome" with KillContainerError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded", failed to "KillContainer" for "helper" with KillContainerError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded", failed to "KillContainer" for "database" with KillContainerError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded", failed to "KillContainer" for "selenium" with KillContainerError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded", failed to "KillPodSandbox" for "70149220-9ffa-4954-b4cb-4795fbd8400c" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
Uploading artifacts for failed job
00:00
Cleaning up project directory and file based variables
00:00
ERROR: Job failed (system failure): pod "gitlab-runner/runner-s8ex1x2yj-project-19391-concurrent-11-ug7cktfg" is disrupted: reason "TerminationByKubelet", message "The node was low on resource: memory. Threshold quantity: 100Mi, available: 102296Ki. Container helper was using 503912Ki, request is 0, has larger consumption of memory. Container database was using 193996Ki, request is 0, has larger consumption of memory. Container build was using 8420012Ki, request is 0, has larger consumption of memory. Container chrome was using 1748Ki, request is 0, has larger consumption of memory. Container selenium was using 246008Ki, request is 0, has larger consumption of memory. "

https://git.drupalcode.org/project/canvas/-/jobs/9597857

Any idea what's going on? Is this a known d.o GitLab CI problem? Is it something Canvas is doing?

fjgarlin’s picture

Oh wow, that is indeed horrifying! However, the timestamps coincide with yesterday's security update to the CI runners (https://drupal.slack.com/archives/C51GNJG91/p1777507813295379) so I think, in a way, it was just bad timing.

If this happens again at a time when there aren't any updates happening, then we definitely need to investigate. But from the above output, it seems that the jobs were running and the containers were simply killed, which most likely led to the jobs being re-run (if not automatically, then manually, hence the fully green pipeline).

If this was postponed just based on this, I guess it can be unpostponed.

wim leers’s picture