#374441: Refactor Drupal HTML corrector (PHP5) added some code to load text into a DOM object and then convert it back to a string.
As shown by #362972: Filters should remove "rel" attributes instead of just adding rel="nofollow" and #16161: Move Read More link to end of node content, exactly the same code will be needed by any filter that wants to alter the HTML without regexp voodoo.
Given that, here is a patch that creates two functions to load and serialize the DOM. The phpdoc probably needs improvement.
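For readers without the patch at hand, here is a rough sketch of what such a pair of helpers could look like. It is not the patch itself: the names example_dom_load() and example_dom_serialize() are hypothetical, and the wrapper markup around the snippet is an assumption. It only uses the standard DOMDocument API.

```php
<?php
// Hypothetical helper names for illustration; the functions in the actual
// patch may be named and implemented differently.

/**
 * Parses an HTML snippet into a DOMDocument.
 */
function example_dom_load(string $text): DOMDocument {
  $document = new DOMDocument();
  // Assumption: wrap the snippet in a full document that declares UTF-8,
  // so the parser does not guess the encoding.
  $html = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head>'
    . '<body>' . $text . '</body></html>';
  // User-supplied HTML is often tag soup, so parser warnings are suppressed.
  @$document->loadHTML($html);
  return $document;
}

/**
 * Converts the body of a DOMDocument back into an XHTML snippet.
 */
function example_dom_serialize(DOMDocument $document): string {
  $body = $document->getElementsByTagName('body')->item(0);
  $output = '';
  // Serialize only the children of <body>, dropping the wrapper markup.
  foreach ($body->childNodes as $child) {
    $output .= $document->saveXML($child);
  }
  return $output;
}

// Usage: a filter adds rel="nofollow" to links without any regular expression.
$document = example_dom_load('<p>See <a href="http://example.com">this');
foreach ($document->getElementsByTagName('a') as $link) {
  $link->setAttribute('rel', 'nofollow');
}
$result = example_dom_serialize($document);
print $result . "\n";
```

With helpers like these, a filter works on DOM nodes between the load and serialize calls, and the unclosed tags in the input are closed by the parser along the way.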
Comments
Comment #1
tic2000 commented:
And a patch with a one-line HTML corrector.
Comment #2
tic2000 commented:
And now a patch that removes the first regexp.
I hope that not many kittens are killed in the process.
Comment #4
tic2000 commented:
I saw that coming. New patch. It should fix the exceptions.
Comment #5
tic2000 commented
Comment #6
tic2000 commented
Comment #7
tic2000 commented
Comment #8
Damien Tournoud commented:
Thanks, tic2000.
A few nitpicks:
^ That should be filter_dom_serialize()
There is a blank line missing between the two sentences.
Let's rewrite the second one as "The resulting XHTML snippet will be properly formatted to be compatible with HTML user agents."
Comment #9
tic2000 commented
Comment #10
Damien Tournoud commented:
Yay! Thanks.
Comment #11
Dries commented:
Committed this to CVS HEAD. Thanks!
I'm marking this 'code needs work' because I feel it is lacking some isolated tests -- or at least, maybe we should consider refactoring some of the existing tests. If you disagree, feel free to mark this 'fixed'.
Comment #12
tic2000 commented:
What should they test? They are just wrapper functions. I mean, the load wrapper just calls DOMDocument::loadHTML(). I don't think we need a test for that.
The serialize function is a little more complex, but I think the tests for the HTML corrector cover that.
Comment #13
dropcube commented:
Tests in FilterUnitTest->testHtmlCorrectorFilter() cover that.
Comment #14
tic2000 commented