Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Problem/Motivation
Make any changes to core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
, then
$ git diff
diff --git a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
index 847c49ff93..9d5a50f855 100644
Binary files a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php and b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php differ
It looks like it's null bytes in the serialized string in \Drupal\Tests\jsonapi\Functional\MenuLinkContentTest::testLinkOptionsSerialization()
that are causing the problem, though they are valid in a PHP serialized string from what I understand.
One of the consequences is that the patch there could not apply https://www.drupal.org/pift-ci-job/1647366
Proposed resolution
See #26 and #39.
Remaining tasks
- Idenfiy files recongized as binary files
A command optional:
rm -rf core && git diff | grep "Binary files" && git checkout .
- Wrap string contains
\x00
with double quotes, instead of single quotes - Replace \x00 with \0
- Use
git diff --text
to get the patch.
User interface changes
API changes
Data model changes
Release notes snippet
Comment | File | Size | Author |
---|---|---|---|
#39 | 3126906-39.patch | 2.06 KB | jungle |
Comments
Comment #2
jungleComment #3
jungle- here
+ there
Comment #4
jungleLet's reproduce it by CI, to prove that it's not only happening on my local
Comment #5
jungleProved, see https://dispatcher.drupalci.org/job/drupal_patches/41639/console
Comment #6
jungleInstead of
echo 'drupal' > core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
Append whitespace to the file
echo ' ' >> core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
By doing this to simulate making changes to the file.
Comment #7
jungleStill the same
Comment #8
jungleTesting using
git diff --text
Comment #9
junglegit diff --text
worksComment #10
jungleAfter chatting with @hansfn on slack, I decided to reopen this and submit a patch by adding
core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php -text diff
to.gitattributes
Comment #13
AndyF CreditAttribution: AndyF at Fabb commentedI was curious and looked into it a little and thought I'd just record my findings in case it's useful:
file
command and it will describe MenuLinkContentTest.php as data.cat -v
.\Drupal\Tests\jsonapi\Functional\MenuLinkContentTest::testLinkOptionsSerialization()
that are causing the problem.Thanks!
Comment #14
jungleComment #15
AndyF CreditAttribution: AndyF at Fabb commentedThanks @jungle!
I think
-text
actually tells git the file isn't a text file, thediff
flag alone might be enough.Do you think it makes sense to add a new section to .gitattributes for individual files?
I ran a check on 8.9.x to see if any other files might cause us problems in future:
Comment #16
jungleTRUE!, I was wrong. Thank you for pointing out!
It's a good idea for me as long as it works. And if so, we need to change the scope of this issue.
Comment #17
AndyF CreditAttribution: AndyF at Fabb commentedHmmn, investigating further, it seems the culprit is the php diff driver, the following seems to prevent the file being considered binary:
Comment #18
jungleOh, no. from the comment in
.gitattributes
. I think the way below is correct? Correct me if I were wrong.It asks MenuLinkContentTest to include binary diff, and it's text diff in fact.
Comment #19
jungleRe #17 yes, but it failed when meets MenuLinkContentTest.php
Comment #20
AndyF CreditAttribution: AndyF at Fabb commented@jungle, IIUC the comment from #18 is explaining the two flags:
-text
diff
We only want to ensure we have a diff rather than "binary files differ" so we just want the one
diff
flag. FWIW I've been referring to https://git-scm.com/docs/gitattributes#_generating_diff_text.In fact, it looks like you can tell git that the php diff driver should always assume text by configuring
diff.php.binary
, eggit config diff.php.binary false
, but that's local config.For me if I have the change in #17 (ie change
diff=php
todiff
) then it correctly displays a textual diff on MenuLinkContentTest.php.Thanks!
Comment #21
jungleAgain let's the CI do the talk :p
Comment #22
jungleThe CI output of #21 can be found here https://dispatcher.drupalci.org/job/drupal_patches/41818/console, and the first command supposed to be
diff index.php index.1.php
In short,
This should work proposed by @AndyF.
Thank you @AndyF
Comment #23
junglegit diff --binary
generated patch may work, but it's not human-readable, and hard to review at least by using dreditor.To minimize the potential effects a not quite sure what the benefit of using
diff=php
compared withdiff
,. Adding a section to maintain special files is reasonable and good to me.
For example
Comment #24
AndyF CreditAttribution: AndyF at Fabb commentedIf we allow that the user has to do something their end to deal with the issue, then I think running
git config diff.php.binary false
once for your repo might be better? I'm not sure if there might be negative consequences to that, it feels fairly safe (famous last words!)From my reading of the docs it affects what the "hunk header" is eg.
@@ -1340,6 +1340,22 @@ protected function elementTriggeredScriptedSubmission
and also where to split words if using--word-diff
(it could be more), so I definitely think it's not something we want to change lightly.Yeah, I don't see a neater way to do it myself without any per-user configuration. If we do go down that route I think we ought to include the snippet used to generate the list in a comment, though note it was run (from bash) on Linux; it might not work on Darwin (I've had issues before with core tools like
find
).Thanks!
Comment #25
xjmWow, this is a weird one! 🤔
Comment #26
longwaveInstead of changing .gitattributes, can we change the test itself? We can switch the string to use double instead of single quotes, and then we can escape the null bytes - and then git shouldn't consider this as a binary file any more?
The attached patch has an attempt at this, in both binary form and text form - the binary form might be needed for commit but the text form is easier to review. I also added an extra assertion that the serialized string can be unserialized again.
Comment #28
jungleLooks like this does not work as well. As longs as there are null bytes in the file, no matter quoting it with single quotes or double quotes, git can not recognize it as non-binary by default.
I would suggest adding a section to .gitattributes to handle special cases still.
Changing the title to expand the scope
Comment #30
jungleUsed a command form SO to find the binary files (see https://stackoverflow.com/a/30690662), and added them into the .gitattributes file
BTW,
cat -v
is a useful command mentioned by @AndyF on slack.Comment #31
jungleComment #32
AndyF CreditAttribution: AndyF at Fabb commentedI reckon that method can work, it's just the change needs to be committed before it will start working (currently the null byte is right there in HEAD, so still impacts the diff). Personally I wonder if it's not a little error-prone to be trying to convert serialized output from a literal to a double-quoted string; but this is all a bit edge-casey tho, I calculate 7 out of 11,906 core PHP files affected.
Just to be stupidly pedantic :p I think strictly speaking we're actually telling git not to use the PHP diff driver for those explicit files.
Thanks!
Comment #33
longwave> Looks like this does not work as well.
It won't work in the patches in #26 because the source still contained NULL bytes at that point. But after that is committed there are no real NULL bytes in the file - they are all escaped inside strings - so future patches against this should always consider it text. This seemed to work locally when I tested it.
> Personally I wonder if it's not a little error-prone to be trying to convert serialized output from a literal to a double-quoted string
This is why I added the unserialize assertion, if you then deliberately corrupt the serialized string the assertion fails (maybe not in all cases, but it's still a reasonable test I think).
Comment #34
jungleThanks, @longwave and @AndyF!
Here is a patch, per #33 or #32 or #26, it's supposed working, but did not, to me.
Steps to verify:
git diff
, it tells that it's a binary file still.The two files found in #15 but not found in #30 are working as expected, so did not use them for testing.
Comment #35
longwave#34 isn't right, as well as changing from single to double quotes you need to properly escape the NULL characters by replacing them with \0.
Something like this will do the conversion:
but I tried this and it seems there are a few missed strings in #34 that need converting to double quoted strings first.
Comment #37
longwaveTrying again.
Comment #39
jungleThe way @longwave proposed should work. Steps to verify.
git diff
, text-like diff info shows.Todo:
\x00
with double quotes, instead of single quotesgit diff --text
to get the patch.Updating IS.
Thank you @longwave!
Comment #40
junglerm -rf core && git diff | grep "Binary files" && git checkout .
By using the command above, found the following files need fixing:
8.8.x:
8.9.x:
9.0.x:
9.1.x:
Patch #39 is applicable to 9.0.x and 9.1.x.
Need a patch for 8.x still.
Comment #41
longwaveI wonder if this is worth even fixing in those 8.9.x fixtures, they seem unlikely to change again. Maybe we should just fix MenuLinkContentTest across all branches and leave it at that?
Comment #42
jungleI am ok with that. and we know what's the root cause and how to avoid it. If so, please set it to RTBC, and let's hear from the committer(s).
@longwave, thanks for your help!
Comment #43
jungleQueued testings against all 4 branches.
Comment #44
longwaveRetitling as we changed scope here and marking RTBC.
Comment #45
xjmNice sleuthing on this one. Saving issue credit.
Comment #50
xjmI confirmed that once this patch is committed locally, the diff is detected as text. Since I couldn't run a proper regex diff because of the "binary data", I also read the whole serialized string to confirm that what's changing is double quotes and escaped null bytes, and nothing else.
I agree with leaving the fixtures alone; they're legacy now and less important to be able to diff than an actual test class.
I committed this to 9.1.x and cherry-picked it to 9.0.x, but couldn't cherry-pick it to 8.9.x because it couldn't merge a "binary file". (Another reason it's good we're fixing this!) So I applied the patch and committed again to 8.9.x, which I was then able to cherry-pick to 8.8.x (since this is a test improvement to an individual test, it's eligible for backport).
Tricksy little null bytes!
Comment #51
jungleBye-bye Little Null Bytes! 👋
🙏 Thanks to @AndyF, @longwave, and @daffie on slack, for helping. Thank you @xjm for committing! 🍻