Problem/Motivation
On libxml 2.9.14 there is a difference in DOMDocument::loadHTML() behavior. The failures are not present in PHP 8.3 with libxml 2.9.10 (tested in Debian bullseye vs bookworm).
Currently, Drupal PHP 8.3 image is using libxml 2.9.14, which causes failures in D7 tests:
https://www.drupal.org/pift-ci-job/2790709
https://www.drupal.org/pift-ci-job/2790712
Similar issue was investigated in D10, but it seems like the solution was to replace DOMDocument parsing with another library, which seems difficult for D7 in this phase.
See the full history here: #3383577: TextSummaryTest:testLength() fails on some libxml versions
Steps to reproduce
Run tests on PHP 8.3 with libxml 2.9.14
Proposed resolution
Find a workaround how to fix this changed behavior to have green tests again.
Remaining tasks
User interface changes
API changes
Data model changes
Release notes snippet
Issue fork drupal-3397882
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
andypostDrupal 10.2 solved the issue via #2441811: Upgrade filter system to HTML5
probably it needs some wrapper which is checking for libxml version or relax test somehow
Comment #3
poker10 commentedAn answer from libxml2 team is that this was a deliberate change, so we need to find workaround as @andypost mentioned.
See: https://gitlab.gnome.org/GNOME/libxml2/-/issues/615
Comment #4
andypostLooks like checking for libxml version in test looks the most viable solution
Comment #5
poker10 commentedYes, we can fix the test, but I am not sure if this change in libxml2 could not have a potential impact on the real core functionality in some edge cases (and that we will not hide some problems when we use this workaround). Other than that, I am not sure how many different "characters" are affected by this change - so far we know that parsing
<is problematic, but there can be more I think.Comment #6
joseph.olstadThere's been 15 releases of libxml2 since 2.9.14
any chance that this issue is fixed in https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.12.3 v2.12.3?
Comment #7
poker10 commentedWe can try it, but I do not think so, because when I raised this question here: https://gitlab.gnome.org/GNOME/libxml2/-/issues/615, the answer was it is not a bug, but a deliberate change in the library.
Comment #8
joseph.olstadah ok, ya just looked at your issue 615 on github, so the new versions would likely not have changed from the 2.9.14
Funny they couldn't make it work for both html4 and html5
Comment #9
andypostThere's bigger changes are coming in Dom PHP extension https://wiki.php.net/rfc/opt_in_dom_spec_compliance
Comment #10
andypostComment #12
poker10 commentedWe discussed this issue with @Fabianx on Slack and we agreed it would be the best to just skip the testing of this edge-case for the LIBXML version >= 2.9.14. It seems like the change in libxml2 have not caused other issues with D7, as other tests are not failing.
I have created a MR here with the proposed fix (it just skips the testing of
"<"when tested withfilter_htmlcorrectorin case the LIBXML version is >= 2.9.14. It was possible to do this in several ways, but I think this approach has the smallest impact (unfortunatelly it seems not easily possible to remove the array element entirely, as the for loop is testing the lenght by array key position).All current tests seems to be green with this change: https://git.drupalcode.org/project/drupal/-/pipelines/170137 . Also the PHP 8.3 test is without this failure (only the session one is still present after applying the fix from there): https://www.drupal.org/pift-ci-job/2893113 .
For illustration, if we combine fix from this issue and fix from the other one (session failure), the 8.3 testing is green: https://git.drupalcode.org/project/drupal/-/pipelines/170105 (see the META issue).
Moving to Needs review.
Comment #13
joseph.olstad100% green here for 8028
Comment #14
poker10 commentedComment #16
mcdruid commentedAgree that this looks like the most sensible pragmatic solution.
Also LOL at the dry humour in https://gitlab.gnome.org/GNOME/libxml2/-/issues/474
Thanks!
Comment #17
joseph.olstadthis ? lol