Problem/Motivation

Make any changes to core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php, then

$ git diff
diff --git a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
index 847c49ff93..9d5a50f855 100644
Binary files a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php and b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php differ

It looks like it's null bytes in the serialized string in \Drupal\Tests\jsonapi\Functional\MenuLinkContentTest::testLinkOptionsSerialization() that are causing the problem, though they are valid in a PHP serialized string from what I understand.

One of the consequences is that the patch there could not apply https://www.drupal.org/pift-ci-job/1647366

Proposed resolution

See #26 and #39.

Remaining tasks

  1. Idenfiy files recongized as binary files

    A command optional:

    rm -rf core && git diff | grep "Binary files" && git checkout .
    
  2. Wrap string contains \x00 with double quotes, instead of single quotes
  3. Replace \x00 with \0
  4. Use git diff --text to get the patch.

User interface changes

API changes

Data model changes

Release notes snippet

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

jungle created an issue. See original summary.

jungle’s picture

Issue summary: View changes
jungle’s picture

Issue summary: View changes

- here
+ there

jungle’s picture

Let's reproduce it by CI, to prove that it's not only happening on my local

jungle’s picture

13:39:50 diff --git a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
13:39:50 index 847c49f..9e0d04b 100644
13:39:50 Binary files a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php and b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php differ

Proved, see https://dispatcher.drupalci.org/job/drupal_patches/41639/console

jungle’s picture

@daffie: With the echo command you are adding the text "drupal" to the START of the file. Now the file starts with "drupal<?php" and not with "<?php".

Instead of echo 'drupal' > core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php

Append whitespace to the file

echo ' ' >> core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php

By doing this to simulate making changes to the file.

jungle’s picture

FileSize
308.76 KB

Still the same

jungle’s picture

Testing using git diff --text

jungle’s picture

Status: Active » Closed (won't fix)

git diff --text works

jungle’s picture

Status: Closed (won't fix) » Needs review
FileSize
699 bytes
286 bytes

After chatting with @hansfn on slack, I decided to reopen this and submit a patch by adding core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php -text diff to .gitattributes

The last submitted patch, 10: 3126906-10-test-only.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

Status: Needs review » Needs work

The last submitted patch, 10: 3126906-10.patch, failed testing. View results

AndyF’s picture

Issue summary: View changes

I was curious and looked into it a little and thought I'd just record my findings in case it's useful:

  • You can also test with the file command and it will describe MenuLinkContentTest.php as data.
  • You can identify non-printing characters with cat -v.
  • It looks like it's null bytes in the serialized string in \Drupal\Tests\jsonapi\Functional\MenuLinkContentTest::testLinkOptionsSerialization() that are causing the problem.
  • Null bytes are valid in a PHP serialized string from what I understand.

Thanks!

jungle’s picture

Status: Needs work » Needs review
FileSize
676 bytes
AndyF’s picture

Thanks @jungle!

+++ b/.gitattributes
@@ -59,3 +59,4 @@
+core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php  -text diff

I think -text actually tells git the file isn't a text file, the diff flag alone might be enough.

Do you think it makes sense to add a new section to .gitattributes for individual files?


I ran a check on 8.9.x to see if any other files might cause us problems in future:

find \( -name '*.php' -o -name '*.module' -o -name '*.install' -o -name '*.profile' -o -name '*.inc' -o -name '*.test' -o -name '*.theme' \) -not -path '*/vendor/*' | xargs grep -lnPa '\x00' 
./core/modules/content_moderation/tests/fixtures/update/drupal-8.4.0-content_moderation_installed.php
./core/modules/workspaces/tests/fixtures/update/drupal-8.6.0-workspaces_installed.php
./core/modules/rest/tests/fixtures/update/drupal-8.rest-rest_post_update_resource_granularity.php
./core/modules/media/tests/fixtures/update/drupal-8.4.0-media_installed.php
./core/modules/layout_builder/tests/fixtures/update/layout-builder-tempstore.php
./core/modules/hal/tests/fixtures/update/drupal-8.rest-hal_update_8301.php
./core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
./core/modules/system/tests/fixtures/update/drupal-8.language-enabled.php
jungle’s picture

I think -text actually tells git the file isn't a text file, the diff flag alone might be enough.

TRUE!, I was wrong. Thank you for pointing out!

Do you think it makes sense to add a new section to .gitattributes for individual files?

It's a good idea for me as long as it works. And if so, we need to change the scope of this issue.

AndyF’s picture

Hmmn, investigating further, it seems the culprit is the php diff driver, the following seems to prevent the file being considered binary:

diff --git a/.gitattributes b/.gitattributes
index a37894e8e4..1c8834ba65 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -29,7 +29,7 @@
 *.map     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
 *.md      text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
 *.module  text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
-*.php     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
+*.php     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff
 *.po      text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
 *.profile text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
 *.script  text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2
jungle’s picture

# Define binary file attributes.
# - Do not treat them as text.
# - Include binary diff in patches instead of "binary files differ."

Oh, no. from the comment in .gitattributes. I think the way below is correct? Correct me if I were wrong.

+++ b/.gitattributes
@@ -59,3 +59,4 @@
+core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php  -text diff

It asks MenuLinkContentTest to include binary diff, and it's text diff in fact.

jungle’s picture

Re #17 yes, but it failed when meets MenuLinkContentTest.php

AndyF’s picture

@jungle, IIUC the comment from #18 is explaining the two flags:

-text
Do not treat them as text.
diff
Include binary diff in patches instead of "binary files differ.

We only want to ensure we have a diff rather than "binary files differ" so we just want the one diff flag. FWIW I've been referring to https://git-scm.com/docs/gitattributes#_generating_diff_text.

In fact, it looks like you can tell git that the php diff driver should always assume text by configuring diff.php.binary, eg git config diff.php.binary false, but that's local config.

Re #17 yes, but it failed when meets MenuLinkContentTest.php

For me if I have the change in #17 (ie change diff=php to diff) then it correctly displays a textual diff on MenuLinkContentTest.php.

Thanks!

jungle’s picture

Again let's the CI do the talk :p

jungle’s picture

  1. +++ b/core/drupalci.yml
    @@ -5,53 +5,32 @@ build:
    +        commands:
    +          - 'diff index.php'
    

    The CI output of #21 can be found here https://dispatcher.drupalci.org/job/drupal_patches/41818/console, and the first command supposed to be diff index.php index.1.php

In short,

-*.php     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff=php
+*.php     text eol=lf whitespace=blank-at-eol,-blank-at-eof,-space-before-tab,tab-in-indent,tabwidth=2 diff

This should work proposed by @AndyF.

A path to which the diff attribute is set is treated as text, even when they contain byte values that normally never appear in text files, such as NUL.

-- from the link in #20

Thank you @AndyF

jungle’s picture

@alexpott: I use git diff --binary to generate patches and then you can ignore this… at least the patches apply. Your fix looks good.

23:07:54 git diff --binary core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
23:07:54 diff --git a/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php b/core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
23:07:54 index 847c49ff93d579ada017d090f8533f6fb5492dd2..8bd2740060fb9d8e30fb57725eef8cfa7c11e1eb 100644
23:07:54 GIT binary patch
23:07:54 delta 13
23:07:54 UcmccU(BQD)w>)bt7gsG804QVxxc~qF
23:07:54 
23:07:54 delta 75
23:07:54 zcmZp0xahFqxBTR8DRKLt)Z&8tyy8?1BLf2+1+U`#JS(e|)a3k>R1K9Ps9bzfeoCdL
23:07:54 Sjsjdj)0#^`p_YrQmJ0wFbQy;L 

git diff --binary generated patch may work, but it's not human-readable, and hard to review at least by using dreditor.

To minimize the potential effects a not quite sure what the benefit of using diff=php compared with diff,
. Adding a section to maintain special files is reasonable and good to me.

For example

core/modules/content_moderation/tests/fixtures/update/drupal-8.4.0-content_moderation_installed.php diff
core/modules/workspaces/tests/fixtures/update/drupal-8.6.0-workspaces_installed.php diff
core/modules/rest/tests/fixtures/update/drupal-8.rest-rest_post_update_resource_granularity.php diff
core/modules/media/tests/fixtures/update/drupal-8.4.0-media_installed.php diff
core/modules/layout_builder/tests/fixtures/update/layout-builder-tempstore.php diff 
core/modules/hal/tests/fixtures/update/drupal-8.rest-hal_update_8301.php diff
core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php diff
core/modules/system/tests/fixtures/update/drupal-8.language-enabled.php diff
AndyF’s picture

git diff --binary generated patch may work, but it's not human-readable, and hard to review at least by using dreditor.

If we allow that the user has to do something their end to deal with the issue, then I think running git config diff.php.binary false once for your repo might be better? I'm not sure if there might be negative consequences to that, it feels fairly safe (famous last words!)

not quite sure what the benefit of using diff=php compared with diff

From my reading of the docs it affects what the "hunk header" is eg. @@ -1340,6 +1340,22 @@ protected function elementTriggeredScriptedSubmission and also where to split words if using --word-diff (it could be more), so I definitely think it's not something we want to change lightly.

Adding a section to maintain special files is reasonable and good to me.

Yeah, I don't see a neater way to do it myself without any per-user configuration. If we do go down that route I think we ought to include the snippet used to generate the list in a comment, though note it was run (from bash) on Linux; it might not work on Darwin (I've had issues before with core tools like find).

Thanks!

xjm’s picture

Wow, this is a weird one! 🤔

longwave’s picture

Instead of changing .gitattributes, can we change the test itself? We can switch the string to use double instead of single quotes, and then we can escape the null bytes - and then git shouldn't consider this as a binary file any more?

The attached patch has an attempt at this, in both binary form and text form - the binary form might be needed for commit but the text form is easier to review. I also added an extra assertion that the serialized string can be unserialized again.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

jungle’s picture

Title: core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php is recognized as a binary file » A few .php files are recognized as binary files
Version: 9.1.x-dev » 8.9.x-dev

Instead of changing .gitattributes, can we change the test itself? We can switch the string to use double instead of single quotes, and then we can escape the null bytes - and then git shouldn't consider this as a binary file any more?

Looks like this does not work as well. As longs as there are null bytes in the file, no matter quoting it with single quotes or double quotes, git can not recognize it as non-binary by default.

I would suggest adding a section to .gitattributes to handle special cases still.

Changing the title to expand the scope

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

jungle’s picture

$ diff <(git grep -Ic '') <(git grep -c '') | grep '^>' | cut -d : -f 1 | cut -d ' ' -f 2-

Used a command form SO to find the binary files (see https://stackoverflow.com/a/30690662), and added them into the .gitattributes file

BTW, cat -v is a useful command mentioned by @AndyF on slack.

I was looking for characters that might cause programs to think it wasn't text. From man cat
-v, --show-nonprinting
use ^ and M- notation, except for LFD and TAB

jungle’s picture

Title: A few .php files are recognized as binary files » A few .php files are recognized as binary files by git
AndyF’s picture

Looks like this does not work as well. As longs as there are null bytes in the file, no matter quoting it with single quotes or double quotes, git can not recognize it as non-binary by default.

I reckon that method can work, it's just the change needs to be committed before it will start working (currently the null byte is right there in HEAD, so still impacts the diff). Personally I wonder if it's not a little error-prone to be trying to convert serialized output from a literal to a double-quoted string; but this is all a bit edge-casey tho, I calculate 7 out of 11,906 core PHP files affected.

+++ b/.gitattributes
@@ -59,3 +59,13 @@
+# Tells git to treat following file(s) as non-binary file explicitly,

Just to be stupidly pedantic :p I think strictly speaking we're actually telling git not to use the PHP diff driver for those explicit files.

Thanks!

longwave’s picture

Version: 9.1.x-dev » 8.9.x-dev

> Looks like this does not work as well.

It won't work in the patches in #26 because the source still contained NULL bytes at that point. But after that is committed there are no real NULL bytes in the file - they are all escaped inside strings - so future patches against this should always consider it text. This seemed to work locally when I tested it.

> Personally I wonder if it's not a little error-prone to be trying to convert serialized output from a literal to a double-quoted string

This is why I added the unserialize assertion, if you then deliberately corrupt the serialized string the assertion fails (maybe not in all cases, but it's still a reasonable test I think).

jungle’s picture

Thanks, @longwave and @AndyF!

Here is a patch, per #33 or #32 or #26, it's supposed working, but did not, to me.

Steps to verify:

  1. Apply the patch to 8.9.x
  2. Commit it on the local
  3. Make some changes to one of 6 files following from #30,
  4. Run git diff, it tells that it's a binary file still.
+++ b/core/assets/scaffold/files/gitattributes
@@ -59,3 +59,13 @@
+core/modules/hal/tests/fixtures/update/drupal-8.rest-hal_update_8301.php                        diff
+core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php                               diff
+core/modules/layout_builder/tests/fixtures/update/layout-builder-tempstore.php                  diff
+core/modules/rest/tests/fixtures/update/drupal-8.rest-rest_post_update_resource_granularity.php diff
+core/modules/system/tests/fixtures/update/drupal-8.language-enabled.php                         diff
+core/modules/workspaces/tests/fixtures/update/drupal-8.6.0-workspaces_installed.php             diff

The two files found in #15 but not found in #30 are working as expected, so did not use them for testing.

  1. core/modules/content_moderation/tests/fixtures/update/drupal-8.4.0-content_moderation_installed.php
  2. core/modules/media/tests/fixtures/update/drupal-8.4.0-media_installed.php
longwave’s picture

#34 isn't right, as well as changing from single to double quotes you need to properly escape the NULL characters by replacing them with \0.

Something like this will do the conversion:

find core -name '*.php' | xargs grep -lPa '\x00' | xargs -n1 sed -i 's/\x00/\\0/g'

but I tried this and it seems there are a few missed strings in #34 that need converting to double quoted strings first.

The last submitted patch, 34: 3126906-34.patch, failed testing. View results

longwave’s picture

FileSize
213.7 KB

Trying again.

Status: Needs review » Needs work

The last submitted patch, 37: 3126906-37.patch, failed testing. View results

jungle’s picture

Issue summary: View changes
Status: Needs work » Needs review
FileSize
2.06 KB

The way @longwave proposed should work. Steps to verify.

  1. Apply the patch in #26 https://www.drupal.org/files/issues/2020-04-30/3126906-26-text.patch
  2. Commit it
  3. Make some changes to file core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
  4. Run git diff, text-like diff info shows.

Todo:

  1. Idenfiy files recongized as binary files
  2. Wrap string contains \x00 with double quotes, instead of single quotes
  3. Replace \x00 with \0
  4. Use git diff --text to get the patch.

Updating IS.

Thank you @longwave!

jungle’s picture

Issue summary: View changes
Status: Needs review » Needs work

rm -rf core && git diff | grep "Binary files" && git checkout .

By using the command above, found the following files need fixing:

8.8.x:

  1. core/modules/hal/tests/fixtures/update/drupal-8.rest-hal_update_8301.php
  2. core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
  3. core/modules/layout_builder/tests/fixtures/update/layout-builder-tempstore.php
  4. core/modules/rest/tests/fixtures/update/drupal-8.rest-rest_post_update_resource_granularity.php
  5. core/modules/system/tests/fixtures/update/drupal-8.language-enabled.php
  6. core/modules/workspaces/tests/fixtures/update/drupal-8.6.0-workspaces_installed.php

8.9.x:

  1. core/modules/hal/tests/fixtures/update/drupal-8.rest-hal_update_8301.php
  2. core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php
  3. core/modules/layout_builder/tests/fixtures/update/layout-builder-tempstore.php
  4. core/modules/rest/tests/fixtures/update/drupal-8.rest-rest_post_update_resource_granularity.php
  5. core/modules/system/tests/fixtures/update/drupal-8.language-enabled.php
  6. core/modules/workspaces/tests/fixtures/update/drupal-8.6.0-workspaces_installed.php

9.0.x:

  1. core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php

9.1.x:

  1. core/modules/jsonapi/tests/src/Functional/MenuLinkContentTest.php

Patch #39 is applicable to 9.0.x and 9.1.x.

Need a patch for 8.x still.

longwave’s picture

I wonder if this is worth even fixing in those 8.9.x fixtures, they seem unlikely to change again. Maybe we should just fix MenuLinkContentTest across all branches and leave it at that?

jungle’s picture

I am ok with that. and we know what's the root cause and how to avoid it. If so, please set it to RTBC, and let's hear from the committer(s).

@longwave, thanks for your help!

jungle’s picture

Status: Needs work » Needs review

Queued testings against all 4 branches.

longwave’s picture

Title: A few .php files are recognized as binary files by git » MenuLinkContentTest.php is recognized as a binary file by git
Status: Needs review » Reviewed & tested by the community

Retitling as we changed scope here and marking RTBC.

xjm’s picture

Nice sleuthing on this one. Saving issue credit.

  • xjm committed 7590d5b on 9.1.x
    Issue #3126906 by jungle, longwave, AndyF: MenuLinkContentTest.php is...

  • xjm committed c9ef4c2 on 9.0.x
    Issue #3126906 by jungle, longwave, AndyF: MenuLinkContentTest.php is...

  • xjm committed d77c396 on 8.9.x
    Issue #3126906 by jungle, longwave, AndyF: MenuLinkContentTest.php is...

  • xjm committed c723533 on 8.8.x
    Issue #3126906 by jungle, longwave, AndyF: MenuLinkContentTest.php is...
xjm’s picture

Version: 8.9.x-dev » 8.8.x-dev
Status: Reviewed & tested by the community » Fixed

I confirmed that once this patch is committed locally, the diff is detected as text. Since I couldn't run a proper regex diff because of the "binary data", I also read the whole serialized string to confirm that what's changing is double quotes and escaped null bytes, and nothing else.

I agree with leaving the fixtures alone; they're legacy now and less important to be able to diff than an actual test class.

I committed this to 9.1.x and cherry-picked it to 9.0.x, but couldn't cherry-pick it to 8.9.x because it couldn't merge a "binary file". (Another reason it's good we're fixing this!) So I applied the patch and committed again to 8.9.x, which I was then able to cherry-pick to 8.8.x (since this is a test improvement to an individual test, it's eligible for backport).

Tricksy little null bytes!

jungle’s picture

Bye-bye Little Null Bytes! 👋

🙏 Thanks to @AndyF, @longwave, and @daffie on slack, for helping. Thank you @xjm for committing! 🍻

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.