Change record status: 
Project: 
Introduced in branch: 10.2.x
Introduced in version: 10.2.0
Description: 

Before Drupal 10.2.0, the methods of the \Drupal\Component\Utility\Html utility class and the Drupal input filter system used the libxml2 DOMDocument parser, which only supports XHTML.

Drupal 10.2.0 now uses the HTML5-PHP library to parse and output HTML5 instead. This affects the following methods:

  • \Drupal\Component\Utility\Html::load()
  • \Drupal\Component\Utility\Html::serialize()
  • \Drupal\Component\Utility\Html::normalize()

Drupal input filters that use these methods, such as "Limit allowed HTML tags and correct faulty HTML", now also output HTML5.

As an example, the following would previously be output as XHTML:

<p>
  Example text
  <br />
  <img src="sites/default/files/image.jpg" />
</p>

In HTML5, br and img are void elements and are written without the trailing slash, so this is now output as:

<p>
  Example text
  <br>
  <img src="sites/default/files/image.jpg">
</p>
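
The two snippets above differ only in how the void elements are written. Where a test or script still holds expected markup in the old XHTML form, a small text transformation may be enough to bring it in line. The following is a minimal Python sketch of that idea, not part of Drupal or HTML5-PHP: the function name xhtml_voids_to_html5 is hypothetical, and the regular expression assumes attribute values contain no < or > characters. The same pattern should carry over to PHP's preg_replace() with minor syntax changes.

```python
import re

# Void elements as listed in the HTML Living Standard.
VOID_ELEMENTS = (
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
)

# Matches a self-closing void tag such as <br /> or <img src="x" />.
# Group 1 is the tag name, group 2 the attributes (if any).
# Assumption: attribute values contain no "<" or ">" characters.
_VOID_SELF_CLOSING = re.compile(
    r"<(" + "|".join(VOID_ELEMENTS) + r")\b([^<>]*?)\s*/>",
    re.IGNORECASE,
)


def xhtml_voids_to_html5(markup: str) -> str:
    """Rewrite XHTML-style void tags (<br />) to the HTML5 form (<br>).

    Hypothetical helper for updating expected strings; it is a plain
    text substitution, not an HTML parser.
    """
    return _VOID_SELF_CLOSING.sub(r"<\1\2>", markup)
```

With this sketch, xhtml_voids_to_html5('<br />') gives '<br>', while a self-closed non-void tag such as <div /> is left untouched.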

This should not have any effect in most cases, but code or tests that match exact XHTML output may need to be updated to expect the HTML5 equivalent. Some notable changes:

  • libxml2 previously stripped newlines from input in some places. For example, the line break between the image and the caption in filter-caption.html.twig was stripped. Line breaks are now preserved correctly in all cases.
  • Escaped newline characters are fully normalized, e.g. &#13; will be normalized to \n.
  • <br></br> was previously normalized to a single line break <br /> but a change in the HTML5 parsing spec means this is now normalized to two line breaks <br><br>. This matches what modern browsers do anyway if they see this HTML input.
  • <script> tags were previously normalized so the script was surrounded by CDATA markers, e.g.
    //<![CDATA[
    ...
    //]]>
    

    This was only required when treating the XHTML as an XML document. HTML5 does not require it, so the CDATA markers are no longer added.
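
Expected markup stored in tests or fixtures may still contain the commented CDATA markers described in the last item. As a rough illustration, the Python sketch below removes marker lines of exactly the form shown above (//<![CDATA[ and //]]>, each on its own line). The function name strip_script_cdata_markers is hypothetical, and other marker styles are not handled.

```python
import re

# Matches a whole line consisting of a commented CDATA marker,
# i.e. "//<![CDATA[" or "//]]>", including its trailing newline.
# Assumption: the markers appear exactly in this form, one per line.
_CDATA_MARKER_LINE = re.compile(
    r"^[ \t]*//(?:<!\[CDATA\[|\]\]>)[ \t]*\r?\n?",
    re.MULTILINE,
)


def strip_script_cdata_markers(markup: str) -> str:
    """Drop the //<![CDATA[ and //]]> lines, keeping the script body."""
    return _CDATA_MARKER_LINE.sub("", markup)


old_expected = "<script>\n//<![CDATA[\nalert(1);\n//]]>\n</script>"
new_expected = strip_script_cdata_markers(old_expected)
```

Here new_expected holds the script element with only the alert(1); line between the tags. Markup without marker lines passes through unchanged.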

Impacts: 
Module developers
Themers

Comments

kopeboy:

The last sentence, "Use the extra " /" inside the tag to maintain XHTML 1.0 compatibility", should be removed, and the example changed to: Text with <br>line break