Problem/Motivation
When a merge request modifies too many files (around 1000) the 📔 Spell-checking job fails with "Argument list too long" errors.
$ export MODIFIED=`git diff --name-only refs/heads/$TARGET_BRANCH|while read r;do echo "$CI_PROJECT_DIR/$r";done|tr "\n" " "`
$ echo $MODIFIED | tr ' ' '\n' | yarn --cwd=./core run -s spellcheck:core --no-must-find-files --file-list stdin
/scripts-125737-332819/step_script: line 262: /usr/bin/tr: Argument list too long
/scripts-125737-332819/step_script: line 262: /usr/bin/yarn: Argument list too long
The maximum length of arguments is determined by getconf ARG_MAX. There are some docs about this in GitLab CI at https://docs.gitlab.com/ee/ci/variables/#argument-list-too-long
Aside: do we need a gitlab or ci component for these issues?
Steps to reproduce
https://git.drupalcode.org/issue/drupal-3399388/-/jobs/332819
Proposed resolution
Instead of piping the list of files to tr and again to yarn we can write the list to a file read it in the yarn command. Something like this:
git diff --name-only refs/heads/$TARGET_BRANCH|while read r;do echo "$CI_PROJECT_DIR/$r";done > ./core/modified.txt
yarn --cwd=./core run -s spellcheck:core --no-must-find-files --file-list modified.txt
Remaining tasks
- Determine optimal command formatting / file location
- Update .gitlab-ci.yml
User interface changes
N/A
API changes
N/A
Data model changes
N/A
Release notes snippet
N/A
Issue fork drupal-3401988
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
fjgarlin commentedNote that the "git diff" command might not work as expected if the MR needs rebasing.
#3401971: Test-only job shouldn't require constant rebases to detect which files were changed. needs to change the
git diffcommand.git diff ${CI_MERGE_REQUEST_DIFF_BASE_SHA}seems to be working well, so maybe, if we need to update the command here, we can use the same.Comment #3
acbramley commentedI just hit this on https://git.drupalcode.org/project/drupal/-/merge_requests/5500
As you can see, there aren't many files changed at all. The issue was that the fork's 11.x branch was way out of date. That doesn't make sense to me since the MR's target is origin's 11.x. Why does it matter where the tip of the fork's 11.x branch is pointing?
Pushing 11.x to the fork fixed it, but then the 11.x branch shows up on the issue (which can easily be hidden now, but still shouldn't be required).
Comment #5
fjgarlin commentedI made an MR based on my suggestion in #2.
Comment #6
smustgrave commentedSeems like a simple enough change
Comment #7
acbramley commentedTested the change here https://git.drupalcode.org/issue/drupal-3412420/-/jobs/651480 which errored out but passed the pipeline.
Comment #8
smustgrave commentedCould argue almost critical as some MRs aren't running now
Comment #9
acbramley commentedBumping the depth to 50 (as per recommendation via slack) has fixed it https://git.drupalcode.org/issue/drupal-3412420/-/jobs/651854
Comment #10
mstrelan commentedI agree that we should make this change, but it's not addressing the issue that was actually reported here. We would still have a fail when an MR actually does touch that many files (such as #3399388: Add declare(strict_types=1) to all Kernel tests).
Comment #11
acbramley commentedFor those hitting this (where you DONT have lots of files changed) - a temporary fix is to push the latest 11.x branch to your fork:
Where FORK is your issue's fork (i.e
drupal-123123You'll then need to manually hide the 11.x branch on your issue in the Issue fork section.
Comment #12
fjgarlin commentedRe #10, why not looping through the files changed and run the spellcheck per file? Even if we make more calls, it should be safer, and for small MRs, the impact shouldn't be too big.
Comment #13
mstrelan commentedWe don't need to loop because the issue summary already shows an example using a temp file.
Comment #14
fjgarlin commentedOh! My bad, I didn't read that part. Will revert the changes.
Comment #15
fjgarlin commentedUsing the suggested solution (thanks @mstrelan!) and the fetching fix, things are working as expected. See https://git.drupalcode.org/issue/drupal-3401988/-/jobs/654322
MR is ready for review: https://git.drupalcode.org/project/drupal/-/merge_requests/6209/diffs
Comment #16
andypostComment #17
wim leers👀 Ran into this too, I thought this was a recent regression 😅, when it clearly is not.
Comment #18
drummI ran this patch on the private fork used for testing security patches and confirmed it works as expected there.
Comment #19
wim leersRan into this again in a core issue at #3317769: Add support for linking to entities in CKEditor 5.
Comment #20
acbramley commentedBig +1 to get this committed, it's biting me on almost every issue I touch at the moment.
Comment #21
wim leersAgain at #3379104-31: Add a "CKEditor 5 nightly" GitLab CI job.
Comment #22
longwaveAdded some nitpicks to the shell script.
Comment #23
fjgarlin commented@longwave - your suggestion differs from the one proposed in the issue summary and #13. Both work, so it's just a matter of agreeing to which one to choose.
If there are 100 files changed:
- The pipe+sed+yarn would trigger the command will trigger the spellcheck command 100 times for a single file each time
- The keeping changed files in a file+yarn would trigger the spellcheck command 1 time for the 100 files with just one argument
I'm fine either way but it'd be great to agree on the approach to do before making any more changes.
Comment #24
smustgrave commentedSeen a few more issues get hit by this today #3347343: Add Views EntityReference filter to support better UX for exposed filters which is a large issue that's close. If this fixes it can we merge this and tweak in a follow up?
Comment #25
longwave@fjgarlin Why would
git diff | sed | yarntrigger it 100 times? The--file-list stdinargument takes the entire list directly from stdin, we're not using xargs.The suggestion is the same as https://github.com/streetsidesoftware/cspell/issues/1850#issuecomment-10... except that we need to do some fixing up of the paths because package.json isn't in the repo root.
Comment #26
fjgarlin commentedThat’s just me not knowing enough about bash.
So won’t that pass “too many arguments” again if many files change? Or does it count as just one argument?
Comment #27
longwaveIn fact testing locally we can simplify it to:
which fixes the directory issue using relative instead of absolute paths.
Comment #28
longwavehttps://docs.gitlab.com/ee/ci/variables/#argument-list-too-long only mentions this affecting variables, I think the problem before was $MODIFIED was too long - here we aren't using variables or arguments at all, just pipes and stdin/stdout.
Comment #29
fjgarlin commentedI thought that the limitation was from cspell, not gitlab!!
I'm going to try the suggested fix in the last failing reported MR.
Comment #30
fjgarlin commentedTesting the suggested fix on an MR that was failing and is not rebased (#3347343: Add Views EntityReference filter to support better UX for exposed filters):
Reason. The fork does not even have 11.x. This is not the problem on this issue.
I needed to push the 11.x branch, and the pipeline now works, but then I can't reproduce the issue here.
Trying to find another MR.
Comment #31
fjgarlin commentedNext #3379104: Add a "CKEditor 5 nightly" GitLab CI job.
Will apply the suggested fix in this issue's MR now.
Comment #32
fjgarlin commentedThanks for the explanations about the different variations of the commands.
I think this is ready for review again. #31 shows how it fixes it in a failing MR.
Comment #33
fjgarlin commentedEverything is rebased too.
Comment #34
longwaveCommitting from NR as this is affecting multiple merge requests.
Committed and pushed 7986c7c8d6 to 11.x and e874b9620b to 10.2.x. Thanks!
Comment #37
norman.lol#11 did not help unfortunately, but in the end a rebase did.
Comment #39
ressa#3082958: Add gitignore(s) to Composer-ready project templates just got hit by this, even though the MR only modifies a few files ...
https://git.drupalcode.org/issue/drupal-3082958/-/pipelines/108333/failures