Problem/Motivation

When working with Drupal, it is common to apply patches to core or contrib modules in order to fix a bug or test someone else’s code change before the change is committed and included in a release.

This is relatively straightforward when working with patch files, but when utilizing GitLab’s merge request (MR) functionality, the problem is that the diff URL is not stable. Once multiple commits are pushed to an MR, it becomes complex to generate a patch file that does not include either a single commit or the entire changeset of the MR.

Steps to reproduce

Given an MR (https://git.drupalcode.org/project/commerce_migrate/-/merge_requests/3), it's possible to get a diff or patch file for the entire MR changeset by appending a .diff or .patch extension:

However, the following scenario can occur:

  1. Alice finds a bug. She creates an issue fork on drupal.org, pushes her branch 1234-alice, and creates a MR to fix it. It contains more than one commit.
  2. Alice pushes an additional commit to the MR.
  3. GitLab generates a diff for the MR as well as exposes a patch file at (https://gitlab.com/gitlab-org/gitlab/-/merge_requests/31527.patch) so Alice can include it via whatever patch workflow she chooses.
  4. Clare reviews the MR but sees what she thinks is a potential improvement. She pushes a change to the MR’s 1234-alice branch.
  5. Sarah finds this issue in the issue queue and wants to test the fix. She accesses the patch generated by GitLab. It contains the entirety of Alice and Clare’s changes.
  6. Sarah determines that Clare’s changes actually break the fix proposed in the MR. She wants to test just Alice’s original changes. GitLab is able to provide a patch file for a specific commit or the entire MR, but not a subset of the commits on an MR.

Proposed resolutions

  1. GitLab does have the ability to display a diff between two specified commits, but this view does not support generating a patch file via appending .patch to the URL. @moshe has created a GitLab issue to propose this functionality: Allow .patch to be appended to any Compare URL. There are some similar proposals as well including "Compare" View: Export compare data in .diff format.
  2. This could be positioned as a GitLab bug and pursued in that way.

Note: During the GitLab Acceleration Initiative meeting on 9/8/21 it was discussed and generally agreed upon that we should avoid potential resolutions that resulted in “special snowflaking the architecture” such as generating patch files that get posted back to the D.O. issue as GitLab activity.

Workarounds

  1. If you'd like a static patch file for an MR at a given point in time, create a patches folder in your project next to your global composer.json, download the diff into this folder and then use composer-patches to apply it from there: "Example patch entry from file": "patches/drupal.ISSUE.SHA.brief-description.patch"
  2. Use composer-patches' fix for Patches from drupal.org merge request URLs are dangerous?: Add support for patch checksums. If the MR changes, the new patch won't be applied because the checksum doesn't match. This will have the effect of breaking building that run into this, which will avoid the stability and security problems. This is included in version 2. However, at the time of this writing, v2 hasn't been released yet.
CommentFileSizeAuthor
#40 image.png30.14 KBbbrala
#39 3Su2AuOUmT.png225.75 KBthursday_bw
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

joachim created an issue. See original summary.

tedbow’s picture

@joachim thanks for writing up the issue.

I am not sure it is possible but I think solution A) would be the best. Gitlab already supports url for comparing any 2 commits https://git.drupalcode.org/issue/drupal-3041885/-/compare/a3d2c5da7304e9...
just not adding the ".patch" suffix.

My suggestion in B) was just a way you could get fixed patch url right now with what gitlab currently supports.

The benefit of A) is that different people could use different points in the merge request. If a previous commit worked for them but then later commits added for some reason didn't work. This would be important if the final fix hasn't been figured out yet. With solution B) you would need a branch for every commit someone wanted to use.

drumm’s picture

Issue summary: View changes

Adding a link to https://gitlab.com/gitlab-org/gitlab/-/issues/15593, which should provide static diffs from GitLab’s compare view.

Building Drupal-specific conventions to work around this isn’t great for contributors, it is extra knowledge to remember and extra work to do. The best practice for site owners right now is to keep a copy of the patch in your codebase, as documented at https://www.drupal.org/docs/develop/git/using-git-to-contribute-to-drupa.... (I did just update this to be more straightforward about what to do, rather than saying using the MR patch can be used, but is a bad idea.) This also removes a bit of dependency on the internet, which is always good.

frob’s picture

Edit: I wrapped the URLs in code tags because it is the structure of the URLs that matter in my comment.

There might be something that can be done using the interdiff features in the Merge Request.

For example from issue #3050384: Provide a starterkit theme in core. When clicking the "Compare with previous changes" link you end up at a URL which looks like https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs?dif...

By adding .patch to the end of the URL you end up with a raw output.

https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs.patch?diff_id=12372&start_sha=78c4942113baa0ce7535d6a84e1c72d92617e817

However, another parameter can be send as end_sha giving us:

https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs?diff_id=5899&start_sha=78c4942113baa0ce7535d6a84e1c72d92617e817&end_sha=9b336ec0d148dca3db58c6f151bc4909a22cfebc

Which allows us to create raw patch files from any two commit hashes. Such as:

https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs.patch?start_sha=43d6b052030e795f5bcb97f70f7ddc8313c93ba9&end_sha=f7dd26aa69017add3bc67a36603c9b40a41f6369

Notice in that I have removed the parameter diff_id. I don't know the significance of this parameter but it doesn't seem necessary.

I have not tested this method in a composer.json patches section yet because I just worked this out after reading the issue title. MR patch files is something we use internally so I was alarmed when I read the issue title.

joachim’s picture

@frob that looks good! Thanks for figuring this out!

> I have not tested this method in a composer.json patches section yet because I just worked this out after reading the issue title.

Should work -- I've used MR .diff URLs with Composer Patches. As long as the content if a diff, it'll work.

> MR patch files is something we use internally so I was alarmed when I read the issue title.

Sorry for the panic! :)

I'm struggling a bit to get the right URL.

For instance, with #3201414: display_inclusive for tax rates on order items isn't migrated, causing incorrect prices, how do I get to the URL for just the diff to the first commit -- imagining that I'd made a composer patch with the MR at that point.

frob’s picture

I removed the diff_id part and nothing happened. I expect it to be vestigial.

tedbow’s picture

re #5

I'm struggling a bit to get the right URL.

I am not sure the method in #4 works

If compare the files produced by these 4 URLs

When I download these I get all the same contents which makes be think start_sha and end_sha are actually processed and this only ever produces the diff of the merge request. So it would change if another commit was added.

It would be good if some else to could confirm that these 4 URLs give the same results.

Dave Reid’s picture

I confirmed that all four URLs generate the same MD5 hash of their contents:

➜  Dave Reid set URL="https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs.patch?start_sha=43d6b052030e795f5bcb97f70f7ddc8313c93ba9&end_sha=f7dd26aa69017add3bc67a36603c9b40a41f6369"
➜  Dave Reid curl -sL $URL | md5sum | cut -d ' ' -f 1
aece17a5c667d94d11aef9f01dfe6de9
➜  Dave Reid set URL="https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs.patch?start_sha=43d6b052030e795f5bcb97f70f7ddc8313c93ba9&end_sha=d9a224d1169819c2fd6ee0ca8d4781772154c4b6"
➜  Dave Reid curl -sL $URL | md5sum | cut -d ' ' -f 1
aece17a5c667d94d11aef9f01dfe6de9
➜  Dave Reid set URL="https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs.patch"
➜  Dave Reid curl -sL $URL | md5sum | cut -d ' ' -f 1
aece17a5c667d94d11aef9f01dfe6de9
➜  Dave Reid set URL="https://git.drupalcode.org/project/drupal/-/merge_requests/262/diffs.patch?start_sha=yyyyyy"
➜  Dave Reid curl -sL $URL | md5sum | cut -d ' ' -f 1
aece17a5c667d94d11aef9f01dfe6de9

I'm feeling like we might need a different approach to this, because wouldn't we also have the same issue with Github PR urls in the format (e.g. https://github.com/php/php-src/pull/296.patch).

Would an alternative be for composer-patches to keep track of a hash of the file contents and re-patch when it changes?

Would an alternative be for there to be some kind of "capture the current progress of this MR into a patch and post it back to the Drupal.org issue" button? I thought there was supposed to be patches posted back automatically when the MR feature for Drupal.org was being talked about, but it's possible it was de-scoped for effort.

drumm’s picture

I thought there was supposed to be patches posted back automatically when the MR feature for Drupal.org was being talked about, but it's possible it was de-scoped for effort.

Yes, that was probably mentioned as a possibility with issue workspaces, and would have been de-scoped. The merge request patches GitLab generates are great for the intended purpose of collaboration and testing. Needing to deploy a patched codebase to production is always non-ideal, and deploying a dynamic patch is even more non-ideal.

joachim’s picture

> because wouldn't we also have the same issue with Github PR urls in the format

True, but most of the things we patch in the Drupal world are core and contrib, which are on gitlab. At least, that's my experience.

> Would an alternative be for composer-patches to keep track of a hash of the file contents and re-patch when it changes?

It would make composer-patches a fair bit slower, as it would have to re-download all of the patches every time it checks things.

Also, I don't know when a major change like that would get into composer-patches.

frob’s picture

Sorry to get everyone's hopes up earlier. That is odd, the UI makes it seem like it should be giving us a limited patch. Have we reached out to gitlab to see if this is an upstream bug?

drumm’s picture

The upstream issue in GitLab is https://gitlab.com/gitlab-org/gitlab/-/issues/15593, there are a couple duplicates or near-duplicates, so that could change. The UX making it look like #4 is possible might be considered a new bug.

tedbow’s picture

Does anybody know if there is any kind of backwards compatibility promise with gitlab URL patterns?

If it became common practice to use these URLs in composer.json files and somehow a new version of gitlab changed those URL, like changed start_sha to start, it could disrupt most Drupal sites. Not sure if that is a worry but just thought I would ask.

drumm’s picture

Not that I know of. GitLab has made changes to UI URLs, although those are more flexible. We do have some ability to put redirect rules in at the load balancer level, if it is worthwhile to provide that service.

Pasqualle’s picture

The url change can disrupt but should not break Drupal sites, as I hope nobody deploys a site after composer error..

Pasqualle’s picture

Thinking about this more, as I see the only stable solution would be to post the patches back to the issue (as issue fork branches are deleted, and there is no diff without the issue git branch).

It does not have to be automated, it can be a button on the issue which attaches a new patch file based on the last issue-fork commit. If same patch is already attached then it should not create a new patch file. There can be a link next to the patch file which shows the nice commit diff in GitLab in case the MR (issue fork git branch) still exists.

It seems Drupal lived with patch files so long, that it can't live without them any more..

douggreen’s picture

The patch is there, it's just not exposed in the UI.

For example, I wanted to use #3207832: Active menu item not showing as title if it has no children's merge request https://git.drupalcode.org/project/menu_block/-/merge_requests/4. There's a "download" link in the UI, and if you look at the HTML, I found https://git.drupalcode.org/project/menu_block/-/merge_requests/4.patch, which I can now use with composer patches.

tunic’s picture

#17 That URL gives the patch for the current merge request. If someone adds any commits to that MR that URL would return a different patch.

douggreen’s picture

I'm not arguing against having this feature, it would be nice to have. I'm pointing out we can already almost do everything without gitlab changes.

* To track ongoing commits, use the repository in composer. This is still problematic, in that it doesn't get changes made to the dev branch, and it doesn't allow you to apply two merge requests simultaneously
* To pull a specific patch, use composer patches and the patch URL as above.

drumm’s picture

To pull a specific patch, use composer patches and the patch URL as above.

This is not recommended. All Drupal.org users have free access to collaborate on and push to all issue fork branches. In the long run, someone will push a buggy commit, or worse. Always keep a local copy of the merge request patch you are deploying, and have composer patches apply that local copy.

frob’s picture

I am curious, does Composer cache the patches used in composer-patches?

douggreen’s picture

@drumm, @tunic claims that the patch URL is to a specific patch, just as if I used the numbered patch currently attached. Is @tunic wrong? I don't see this as buggy or recommended. I am pointing out that you can mimic our current best practice today for gitlab MR's. Today I point to a specific patch file on d.o. Tomorrow I can point to a specific MR patch file.

The part that I think is not recommended is to use the branch in composer, which can include any commit from any user, and that is indeed risky, even riskier than using a dev branch of a project. But still, some user's may wish to do this.

joachim’s picture

@douggreen as far as we can tell, diff URLs on gitlab are either for:

- a single commit, which is no good if the MR branch has more than one commit
- a branch, which is no good as the MR branch could have more commits added to it in future

The best solution at the moment is indeed for people to download diffs, commit them to their codebase, and refer to those files in their composer.json.

This should be documented somewhere so people aren't caught out by this!

Pasqualle’s picture

drumm’s picture

#17 That URL gives the patch for the current merge request. If someone adds any commits to that MR that URL would return a different patch.

That is correct. There merge request patch URL is the current patch for the merge request. A merge request is metadata for merging a branch. When someone pushes commits to that branch, the content at the merge request patch URL changes, adding the commits to return a different patch.

As far as I know, GitLab does not have a way to return a diff for a static start and end commit. GitLab only provides the patch files as noted in #23.

frob’s picture

What is the new mechanism for generating a test on d.o with MRs? I am curious if it could be hooked somehow to automatically upload the generated patch to d.o? More than likely I expect CI is just checking out the fork though.

drumm’s picture

GitLab has a hidden ref for the current merge result, assuming the merge request is current mergeable, which DrupalCI uses.

For deployments, keeping a copy of a patch file with your site’s codebase is really the best solution. This keeps your deployment stable, and reduces dependence on Drupal.org.

jdleonard’s picture

I just ran into this limitation. I'd vote for either A or B.

markdorison’s picture

For deployments, keeping a copy of a patch file with your site’s codebase is really the best solution. This keeps your deployment stable, and reduces dependence on Drupal.org.

Based on what I understand the current state of GitLab capabilities, I agree this is the recommended/best practice solution.

I am as big of a GitLab/MR fan as anyone, but with this particular use case where pre-GitLab a patch file was readily available, it has actually become a harder workflow. It would be really fantastic if we could make it easier to obtain the patch file when an MR is submitted or updated. Is that something that is possible/feasible? I am imagining a patch file being included in the activity data that is already coming back from GitLab to the issue with each commit.

DamienMcKenna’s picture

The best practice for site owners right now is to keep a copy of the patch in your codebase,

If you're downloading a module from d.o why should you assume you can't also download a patch file?

I don't agree with this policy as it causes more problems than it solves.

rcodina’s picture

What @markdorison suggests would be great:

I am imagining a patch file being included in the activity data that is already coming back from GitLab to the issue with each commit.

hestenet’s picture

Title: Gitlab MRs are unsuitable for use with Composer Patches - need a way to make a branch stable » Gitlab MRs are unsuitable for use with Composer Patches - need a way to make a branch stable in order to have static patches
Issue summary: View changes

Updating the issue summary

If you're downloading a module from d.o why should you assume you can't also download a patch file?

You *can* download a patch file... and then commit it to a patches directory in your codebase. It should absolutely not be best practice to hot-link to an externally hosted file for repeatable deployments. It's an accident of history that people started using regular patch files on d.o that way, but it was never a great idea (your deployments fail when d.o is down..)

This is what the current official documentation recommends:
https://www.drupal.org/docs/develop/git/using-git-to-contribute-to-drupa...

However, if folks are truly insistent on using that workflow, you should chime in on the GitLab issue in order to be able to get static patch links:
https://gitlab.com/gitlab-org/gitlab/-/issues/15593 - this issue may actually be the better one to chime in on: https://gitlab.com/gitlab-org/gitlab/-/issues/217206

I will also add it to the Drupal Migration GitLab issue to ask for prioritization.

hestenet’s picture

hestenet’s picture

Issue summary: View changes
hestenet’s picture

Issue summary: View changes
nicxvan’s picture

Whether we link to the file on d.o. in composer or not the static patch file is still useful to see which point in time other projects have vetted as well.

markdorison’s picture

However, if folks are truly insistent on using that workflow

I don't think this issue is really about whether people hot-link a patch or commit it to their repo. I believe it should be about making sure accessing a patch file is easy, whether it be on the day an MR is created or months later possibly after others have committed to the MR. The state this is currently in makes it difficult (at best).

You *can* download a patch file... and then commit it to a patches directory in your codebase.

This is true if the MR was just created and no other new commits have been made to it, but potentially false or very difficult in other scenarios.

In my opinion, the issue title should either be changed or maybe these two threads should be addressed as separate issues?

xurizaemon’s picture

It should absolutely not be best practice to hot-link to an externally hosted file for repeatable deployments.

Is there a significant difference between downloading core or contrib components as part of a build, and downloading a static patch to apply changes to same?

Agree that the docs linked to in #32 refer to local patches. The composer-patches README is an example of docs that points people the other way.

I've advocated to encourage devs to upload patches as a default behaviour; my position is that if we encourage saving patches to a repo, time pressure may encourage people to skip the important upstreaming part of their patch workflow. It's easy for devs under time pressure to omit that important step, or defer it in the hope that they'll follow up "later".

EDIT: I get it now - malicious code injection vector. I've upgraded my position :)

I know changing Gitlab's behaviour would be a significant effort, and has to be done in the linked issue - not here. I do think it's good for us to acknowledge that use of patch URLs is both common practice and (IMO, at least) can be supportive of working open by getting patches out of local repos and into a community space.

thursday_bw’s picture

FileSize
225.75 KB

Please excuse me if this has been covered. It is already possible to link to a static version of a diff on Gitlab, the URLs take a diff_id parameter.

Here's an example of how I have achieved it:

            "drupal/core": {
                "It is possible to overflow the number of items allowed in Media Library (https://www.drupal.org/project/drupal/issues/3059955)": "https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55477"
            }

Note the diff_id parameter.

Each time a merge request receives a push a new version of the diff is created so you can link directly to a snapshot of the merge request at any point.

So you can take a diff page such as https://git.drupalcode.org/project/webform/-/merge_requests/43/diffs use the 'versions' drop down select box on that page to find the various diff versions and form the appropriate URL.

Here's the formula to form the URL of a diff version.
https://git.drupalcode.org/project/webform/-/merge_requests/43/diffs append .patch then append ?diff_id=<my diff id>

Here's a screenshot that shows exactly where to locate the diffs version.
Annotated screenshot demonstrating versions dropdown select box

So this is a Developer Experience problem rather than a functional one. It IS possible to use a merge request with composer patches, it's not very obvious how and one needs to form the URL .. but it can be done relatively easily.

I recommend we work towards modify the UI to make this URL exposed directly and displayed prominently on the merge request, and update relevant documentation in the mean time.

bbrala’s picture

FileSize
30.14 KB

My testing earlier this week resulted in the same conclusion (see relevant slack thread). When you use a patch that has a refererring diff_id the result seems static

Did 3 commits and linked the changes with the diff_id in url: https://git.drupalcode.org/project/drupal/-/merge_requests/1165?diff_id=...

Then did an interactive rebase resulting in 2 commits, this got a new diff_id: https://git.drupalcode.org/project/drupal/-/merge_requests/1165/diffs?di...

After that both "branches" are still available.

gitlab screenshot

Guess we need to do some more rigurious testing on this?

Testplan

  1. Make a branch with a merge request. For Patch filename use the following filename: [issue]-[diff_id]-[latest commit hash].patch
  2. Commit and push once to the branch.
  3. Download the patch file from the changes tab (patch 1), make sure a diff id is present.
  4. Commit twice and push to the branch.
  5. Download the patch file (patch 2) from the changes tab and generate an interdiff
  6. Rebase and remove a commit in the middle, force push to the branch.
  7. Download the patch file (patch 3) from the changes tab with the 2 commits and generate interdiff against previous patch
  8. Download the patch file (patch 4) from the changes tab with the first diff_id and diff with patch 2 and patch 3.

I think if we do those steps we should be able to prove once and for all static diff's are possible. Anything missing in those steps?

If this is true, we should indeed work towards making these URLs accessible. Either through gitlab, or perhaps after every push in the issue itself, since it seems a push results in a new diff id.

tunic’s picture

Not sure if I'm doing it well, but those two URLS:

https://git.drupalcode.org/project/drupal/-/merge_requests/1165/diffs.patch?diff_id=73814
https://git.drupalcode.org/project/drupal/-/merge_requests/1165/diffs.patch?diff_id=73820

show the same exact patch, but I understood they should point to different patches.

bbrala’s picture

You are right. I wonder what i was looking at then when i tested... @bevan.wishart@gmail.com could you also verify just to be sure and link to your test branch so we can get this cleared up one and for all?

thursday_bw’s picture

Edit: After further testing it turns out the diff_id is considered by the .patch endpoints afterall. Read this comment and the next one for the two comparison tests.

OK. After talking with bbrala I did some deeper testing and can show that the URLs with .patch appended do ignore the diff_id parameter. So those formulated urls I mentioned above are not usable.

Here's the script and it's result to show the comparison of two patches of an MR.

Compared IDS: 55467 and 55477 - these patches are the same.

user@user-workstation:~/gitlabtest$ cat gitlabttest.sh 
wget https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55467 -O 55467.patch
wget https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55477 -O 55477.patch
if cmp -s "55467.patch" "55477.patch"; then
        echo "Patches are the same";
else
        echo "Patches are different";
fi
user@user-workstation:~/gitlabtest$ bash gitlabttest.sh 
--2021-09-10 23:04:59--  https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55467
Resolving git.drupalcode.org (git.drupalcode.org)... 151.101.2.217, 151.101.66.217, 151.101.130.217, ...
Connecting to git.drupalcode.org (git.drupalcode.org)|151.101.2.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: '55467.patch’

55467.patch                             [ <=>                                                              ]  44.57K  --.-KB/s    in 0.01s

2021-09-10 23:05:00 (3.35 MB/s) - '55467.patch’ saved [45644]

--2021-09-10 23:05:00--  https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55477
Resolving git.drupalcode.org (git.drupalcode.org)... 151.101.194.217, 151.101.130.217, 151.101.66.217, ...
Connecting to git.drupalcode.org (git.drupalcode.org)|151.101.194.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: '55477.patch’

55477.patch                             [ <=>                                                              ]  44.57K  --.-KB/s    in 0.01s

2021-09-10 23:05:01 (3.40 MB/s) - '55477.patch’ saved [45644]

Patches are the same
thursday_bw’s picture

I realised the previous test didn't strictly enough ensure the downloaded patches were indeed different on the end point.
That was a comparison of the 'latest' version and 'version 4'.. this following script compares version 1 and 4 and proves that the diff_id is considered when gitlab returns the patch.

Compared IDS: 55467 and 40320 - these patches differ

user@user-workstation:~/gitlabtest$ cat gitlabttest.sh 
wget https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55467 -O 55467.patch
wget https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=40320 -O 40320.patch
if cmp -s "55467.patch" "55477.patch"; then
	echo "Patches are the same";
else
	echo "Patches are different";
fi
user@user-workstation:~/gitlabtest$ bash gitlabttest.sh 
--2021-09-10 23:38:23--  https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55467
Resolving git.drupalcode.org (git.drupalcode.org)... 151.101.2.217, 151.101.66.217, 151.101.130.217, ...
Connecting to git.drupalcode.org (git.drupalcode.org)|151.101.2.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: '55467.patch’

55467.patch                             [ <=>                                                              ]  44.57K  --.-KB/s    in 0.02s

2021-09-10 23:38:24 (2.75 MB/s) - '55467.patch’ saved [45644]

--2021-09-10 23:38:24--  https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=40320
Resolving git.drupalcode.org (git.drupalcode.org)... 151.101.194.217, 151.101.130.217, 151.101.66.217, ...
Connecting to git.drupalcode.org (git.drupalcode.org)|151.101.194.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: '40320.patch’

40320.patch                             [ <=>                                                              ]  44.57K  --.-KB/s    in 0.01s

2021-09-10 23:38:24 (3.07 MB/s) - '40320.patch’ saved [45644]

Patches are different
tunic’s picture

There's a typo in the second script. The second wget downloads the patch to the file "40320.patch". However, the "if" statement compares the first file with "55477.patch", but this "55477.patch" file doesn't exist because the second wget created "40320.patch", not "55477.patch".

A fixed script returns the patchs are the same:

cat gitlabttest.sh 
wget https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55467 -O 55467.patch
wget https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=40320 -O 40320.patch
if cmp -s "55467.patch" "40320.patch"; then
        echo "Patches are the same";
else
        echo "Patches are different";
fi    
./gitlabttest.sh 
--2021-09-13 09:20:20--  https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=55467
Resolving git.drupalcode.org (git.drupalcode.org)... 151.101.134.217
Connecting to git.drupalcode.org (git.drupalcode.org)|151.101.134.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘55467.patch’

55467.patch                                               [ <=>                                                                                                                   ]  44,57K  --.-KB/s    in 0,01s   

2021-09-13 09:20:21 (3,21 MB/s) - ‘55467.patch’ saved [45644]

--2021-09-13 09:20:21--  https://git.drupalcode.org/project/drupal/-/merge_requests/760/diffs.patch?diff_id=40320
Resolving git.drupalcode.org (git.drupalcode.org)... 151.101.134.217
Connecting to git.drupalcode.org (git.drupalcode.org)|151.101.134.217|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘40320.patch’

40320.patch                                               [ <=>                                                                                                                   ]  44,57K  --.-KB/s    in 0,01s   

2021-09-13 09:20:21 (3,23 MB/s) - ‘40320.patch’ saved [45644]

Patches are the same  
thursday_bw’s picture

awesome. Thanks for spotting that and getting us an answer. The .patch endpoints do not consider the diff_id parameter.

bbrala’s picture

That is the bottom line, and could actually be considered a bug in gitlab imo.

Probably the culprit is the merge_requsts_controller.rb that only servers a diff or patch for the mergerequest and doesn't use the same logic as the diffs_controler.rb.

merge_requests_controller.rb:


      format.patch do
        break render_404 unless @merge_request.diff_refs

        send_git_patch @project.repository, @merge_request.diff_refs
      end

      format.diff do
        break render_404 unless @merge_request.diff_refs

        send_git_diff @project.repository, @merge_request.diff_refs
      end

But that after a 5 minute glance at the code of gitlab.

markdorison’s picture

Title: Gitlab MRs are unsuitable for use with Composer Patches - need a way to make a branch stable in order to have static patches » GitLab Merge Requests Unable to Generate Incremental Patch Files
Issue summary: View changes

Updated the issue title and description to summarize/encapsulate where this discussion has ended up here and in the GitLab Acceleration Initiative meeting from 9/8/21.

xurizaemon’s picture

Issue summary: View changes
hestenet’s picture

GitLab is hosting a quarterly hackathon on 22nd and 23rd of September 2021, so if anyone is feeling handy with Ruby and wants to see if they can fix this in the upstream, this might be a good opportunity.

https://about.gitlab.com/community/hackathon/

This issue (and the related ones in the comments) maybe the place to start:
https://gitlab.com/gitlab-org/gitlab/-/issues/217206

bbrala’s picture

Yesterday I checked out the gitlab code and was looking through it, but my limited ruby experience will make this hard lol. I only really dabbled in Ruby a few years ago to build an Hyper-V provider in Vagrant, but couldn't get very far in the hour i spent.

Technically it does seem possible as a mentioned in an ealier post here.

RoSk0’s picture

RE the merge request version and getting a stable patch out of it - looks like still not possible as I just faced with this on the https://git.drupalcode.org/project/drupal/-/merge_requests/1779/diffs . Both pairs , diff and patch , produce the same output for both versions.

Found https://gitlab.com/gitlab-org/gitlab/-/issues/27447 reporting this bug with no updates in two years.

achap’s picture

FYI there is a MR in composer-patches that will use a checksum to see if the patch from the url changed. Yet to be commited though.

https://github.com/cweagans/composer-patches/pull/388

xurizaemon’s picture

Issue summary: View changes

(fixed reference link to Slack discussion in ID)

bbrala’s picture

About that composer patches patch. If you want to help, posting op using an emoji to show the interest will probably help. :)

Murz’s picture

Want to report here a very big security issue when people use links to patch files from Gitlab, like this https://git.drupalcode.org/project/drupal/-/merge_requests/1779/diffs in composer.json patches.

Any Drupal user can get write access to the issue's branch and push any code to it. Even malicious code! And this code will be automatically downloaded by the composer in pipelines and deployed on production!

But returning back to the .patch files is a very big step back!

So I've created a new issue in GitLab tracker (possibly a duplicate, but fresh and with a more worried description): https://gitlab.com/gitlab-org/gitlab/-/issues/373246

drumm’s picture

Murz - Please read the issue summary and do not file duplicate issues.

This is also much more of a client/user-side issue than server-side. Even if its possible to get a static diff from GitLab, I expect the dynamic one will still be more findable. Unless someone knows to look for the specific diff, they won’t use it.

dmitry.korhov’s picture

Diff on top of each commit can resolve the issue but workaround with checksum is ok as well

Murz’s picture

workaround with checksum is ok as well

The workaround with checksum will fix the security problem, but will cause breaking the composer install process instantly, when the diff file got even any minor change, instead of keeping available and applicable the previous version of the patch in Drupal.org.

So after any update in the MR all developers, that use the previous version of the patch, will get broken pipelines and will have to urgently fix them!

And the proper stable workaround here is to only manually save the needed version of the diff file locally and add it to the project's CVS.

Even if its possible to get a static diff from GitLab, I expect the dynamic one will still be more findable. Unless someone knows to look for the specific diff, they won’t use it.

This described problem can be resolved on the user side via rendering warning messages around the link to the "always latest" diff, and making the static link to the current latest commit more visible, than the link to the dynamic diff.

Eduardo Morales Alberti’s picture

Maybe this is not a Gitlab side problem. As we always did, maybe the solution is to post the patches in the issue.

There are webhooks for each commit, maybe the issue should automatically generate a patch of each commit of the MR, it could be a possible solution because is possible that GitLab will never solve it, and this can be done by the Drupal community.

I know that this solution can poisoning the issue with to many patches if there are to many commits, so we can add a button for generating the patch only for the commit authors, for example.

Murz’s picture

There are webhooks for each commit, maybe the issue should automatically generate a patch of each commit of the MR, it could be a possible solution because is possible that GitLab will never solve it, and this can be done by the Drupal community.

That's a good workaround for the problem, that can be implemented already now, without waiting the GitLab side. Thanks for the idea!

John Pitcairn’s picture

so we can add a button for generating the patch only for the commit authors, for example

That assumes commit authors who forget will be responsive to requests from others to do so, before another MR is made, right? It would be better if anyone could generate that, IMO.

gordon’s picture

@eduardo-morales-alberti yes I think that this is the best idea.

I am a Drupal dev and system implementor and use the composer-patches to implement fixes. So using uploaded patches is just so easy to do. Where as working with the a MR branch is so much more difficult and also applying this to a branch which is not the dev branch is just very difficult.

Options I have considered is:-

1. being able to download a patch generated from the gitlab user interface. However this doesn't seem possible.
2. Pin the release to a commit, this is possible but it means you are locked to the version when the branch was created, and you can't apply multiple issues.

If we created patches it would solve a lot more issues with the using of the patches in system deployments.

1. Having a patch means you may be able to apply to a stable version.
2. you can apply multiple patches to the same project/module
3. Have the posibility to use patches with older version of a module.
4. Makes it easier for people who do not use gitlab to look at patches and see what the changes are.
5. Some people uploading patches or using existing patches are doing this as a patch of a larger project who have dedicated testers and are doing more intense testing that what most developers will do.

FYI I still upload patches to d.o because I am generally working with clients to fix an issue, and uploading a patch means that it will get fixed upstream and the fix is not required to be maintained by my client.

colan’s picture

Issue summary: View changes

Added section for workarounds to the IS:

  1. If you'd like a static patch file for an MR at a given point in time, download the diff, and then upload it as an attachment to a comment on the issue. This patch file can then be used by Composer in composer.json. The problem is that this option adds more comments to the issue, which don't directly help solve it (i.e. noise).
  2. Use composer-patches' fix for Patches from drupal.org merge request URLs are dangerous?: Add support for patch checksums. If the MR changes, the new patch won't be applied because the checksum doesn't match. This will have the effect of breaking building that run into this, which will avoid the stability and security problems. This is included in version 2. However, at the time of this writing, v2 hasn't been released yet.
joachim’s picture

> If you'd like a static patch file for an MR at a given point in time, download the diff, and then upload it as an attachment to a comment on the issue.

I download the diff and commit it to the project repo in a /patches folder, and then specify the patch in composer.json as a relative path.

FeyP’s picture

Issue summary: View changes

I download the diff and commit it to the project repo in a /patches folder, and then specify the patch in composer.json as a relative path.

This is the way to go. I don't think we want to promote that people upload patches of various states of MRs just to apply them to composer. That would just add more noise to the issue and complicate finding the latest code to review/commit. Updated the IS.

ressa’s picture

Thanks for sharing your simple and pragmatic solution @joachim, and @FeyP for adding it in the IS.

nicxvan’s picture

While I agree that adding the file to a comment can add noise, I think my point in #36 still stands. Having a way to track which points in time the community has tested is helpful.

When there is a patch submitted for each change it's easier to do an interdiff if there was a regression. If you want to do that purely with people downloading the diff themselves, you have to compare the comment time with commit history and guess when they downloaded their version of the diff.

It also helps in another way, for projects that are in maintenance mode sometimes a patch is only looked at again when it fails to apply in CI / CD. If I have a patch from an early comment in an issue that no longer applies and there are three patches between that and the newest I can check the comments more clearly with the progress of the patch and testing. If it's just an MR and everyone has been downloading the diff as they go correlating the comment history with the commit history is more difficult.

Another thing that makes point in time patching helpful is if a new patch has regressions. Imagine the following scenario:

Patch-1 (Patch I initially applied fixing my issue)
Patch-2 (New progress)
Patch-3 (Regression in patch)

If I am using Patch 1 and it no longer applies, if there are point in time patches I can see patch 3 has a regression from other user comments and I can attempt to apply patch 2 that other community members have stated still works. If it's an MR only, I again have to try to guess which commit put the regression in and apply that change instead.

I don't think generating a static patch per commit or push is helpful, but being able to click a button to generate a patch in time automatically would be a huge improvement ( assuming there is some check to ensure you're not generating duplicate patches which is unhelpful in all scenarios).

joachim’s picture

> When there is a patch submitted for each change it's easier to do an interdiff if there was a regression. If you want to do that purely with people downloading the diff themselves, you have to compare the comment time with commit history and guess when they downloaded their version of the diff.

I recommend using a filename for the patch in a project repo that contains all of: Drupal project name, issue number, and SHA of the branch tip at the point the patch was made. E.g. 'drupal.ISSUE.SHA.brief-description.patch.

Patches that people upload are *probably* fresh, but it's a bit fiddly to compare the date of the comment the file is on with the dates in the git commit log. AND if the MR gets rebased -- as it should be -- don't the commits get new dates?

ressa’s picture

Issue summary: View changes

It would be awesome if patches were offered as files in the format of drupal.ISSUE.SHA.issue-title-lower-cased-and-shortened.patch. That would probably require adjusting Gitlab code though, which may not be possible ...

joachim’s picture

Yeah, my impressions is that github doesn't allow much customisation -- it's why we have this issue in the first place!

I meant that I manually rename the patch file.

Though... it's something that could be automated as a Composer command.

nicxvan’s picture

@joachim, that solution still works just on individual / team for finding that info. It still loses the community aspect of testing and knowing what point in time others have tested.

colan’s picture

☝️