The _filter_html_image_secure_process() filter in FilterHtmlImageSecure checks for image tags in content. Is short, it does this:
<?php
$html_dom = Html::load($text);
$images = $html_dom->getElementsByTagName('img');
foreach ($images as $image) {
// Perform checks and replacements.
}
$text = Html::serialize($html_dom);
return $text;
?>
The Html::serialize
is always performed, even if the content does not have images. This leads to double serializing when the 'Correct faulty and chopped off HTML' filter is enabled as well, which currently results in HTML like this: text. Note the duplicate xml:lang
attributes. The double attribute tag bug is tackled in another issue (#1333730: [Meta] PHP DOM (libxml2) misinterprets HTML5), but while debugging #2346989: [PP-1] 'Restrict images to this site' leads to incorrect HTML with language attribute, I found this issue.
Comment | File | Size | Author |
---|---|---|---|
#5 | interdiff-2665684-2-5.txt | 478 bytes | dimaro |
#5 | filterhtmlimagesecure-2665684-5.patch | 564 bytes | dimaro |
#2 | restrict_images_to-2665684-2.patch | 639 bytes | BarisW |
Comments
Comment #2
BarisW CreditAttribution: BarisW at LimoenGroen commentedAnd here is the patch that simply returns the source text if it does not contain an image tag.
Comment #3
BarisW CreditAttribution: BarisW at LimoenGroen commentedComment #4
Wim LeersI think even the comment is unnecessary TBH :)
I also don't see why this could only go into 8.1 since this is an absolutely harmless optimization.
Comment #5
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedRemove unnecessary comment as described in #4.
Comment #6
Wim LeersComment #7
alexpottSeems like a test could be written considering core provides both filters. And I;m not sure what the patch does to fix the double serialising if both filters are enabled and there are images...
In fact if there are images I'm not sure how we can prevent the double serializing.
Comment #8
aerozeppelin CreditAttribution: aerozeppelin commentedIn this statement ,
$images would contain a 'DOMNodeList' object, so instead of
it would be appropriate to check for
Comment #9
heykarthikwithumy ide is showing an error (Closing tag name missing) on filter.module file, line number 744. which is from this commit http://cgit.drupalcode.org/drupal/commit/?id=8fe50513608c519f36ee42f658f...
Comment #12
dimaro CreditAttribution: dimaro at La Drupalera by Emergya commentedAdd test against 8.2.x on #5.
Comment #16
Ivan Berezhnov CreditAttribution: Ivan Berezhnov as a volunteer and at Drupal Ukraine Community for Levi9 commentedComment #23
larowlanStill needs work for tests here
Comment #24
smustgrave CreditAttribution: smustgrave at Mobomo commentedAfter testing it out I was not seeing the double encoding. Think the issue summary needs to be updated.
Currently I don't think it's an issue. If there are images it loops through them. If not it doesn't.
Comment #26
DanielVezaI was rerolling this because on the surface it does make sense to skip all the processing if there is no images.
I had one thought while I was doing it that if we aren't calling
Html::serialize
anymore, things likescript
won't be stripped anymore since that happens in the serialize function. So if there is a text format with only this filter enabled, scripts will get through.In which case we could change the early return to be
return Html::serialize($text);
. But I'm not sure if thats worth doing since the code skipped by the early return would be minimal.Interested in hearing thoughts.