Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Let's try to find out.
Comment | File | Size | Author |
---|---|---|---|
#22 | parse-xml-1.patch | 13.87 KB | c960657 |
#13 | validation-exceptions-all-in-one.patch | 5.46 KB | Damien Tournoud |
#12 | validation-exceptions-all-in-one.patch | 5.48 KB | Damien Tournoud |
#8 | 402254-validation-exceptions.patch | 4.12 KB | Damien Tournoud |
#3 | 402254-validation-exceptions.patch | 917 bytes | Damien Tournoud |
Comments
Comment #1
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #2
mr.baileysI'm unsure what exactly this patch is about... I arrived here while searching for duplicates for #402356: XHTML validator needed in SimpleTest..
Can you explain a bit what the goal behind this patch is? Also, is simpletest the correct component for this (most of the changes seem to be in the filter module)
Comment #3
Damien Tournoud CreditAttribution: Damien Tournoud commentedComment #4
Damien Tournoud CreditAttribution: Damien Tournoud commented@mr.baileys: I uploaded the wrong patch. I wanted to know how many exceptions are hidden by the '@' before the loadHTML().
Comment #6
mr.baileysGotcha...
I don't think loadHTML will validate XHTML though unless validateOnParse is set (which will cause some other issues), so it's not really XHTML validation, more HTML validation. It will raise errors on invalid HTML like unclosed or mismatched tags, so still nice to have. One example of the errors against the XHTML spec it probably won't report is the use of duplicate IDs.
I'm working on full-blown XHTML validation in #402356: XHTML validator needed in SimpleTest. but it's proving more difficult than I originally thought.
Comment #7
Damien Tournoud CreditAttribution: Damien Tournoud commented@mr.baileys: could we first try to concentrate on fixing the issues identified by the test bot above?
Comment #8
Damien Tournoud CreditAttribution: Damien Tournoud commentedFolded #399488: Invalid markup generated by l(). inside, and prevented meta check when we are not on a HTML page (should fix Aggregator exceptions).
Comment #9
Dries CreditAttribution: Dries commentedInteresting patch! My tests are still running but it seems to have identified quite a few problems already. Just like I hoped/expected. This is certainly a good first start.
Comment #11
Damien Tournoud CreditAttribution: Damien Tournoud commented#405086: Custom 404 pages displayed twice was discovered by this.
Comment #12
Damien Tournoud CreditAttribution: Damien Tournoud commentedMerged #405086: Custom 404 pages displayed twice in.
Comment #13
Damien Tournoud CreditAttribution: Damien Tournoud commentedLet's try again.
Comment #15
Damien Tournoud CreditAttribution: Damien Tournoud commentedI opened the following issues, that cover most of the exceptions triggered during the tests:
#405238: Duplicate ID 'new' in forums breaks XHTML validation (patch attached, just needs reroll)
#405258: The blog test uses HTML assertion on RSS content
#405248: Invalid entities in user cancel
#405246: Several duplicate IDs on the comment module
#405284: Duplicate IDs in the contact form
#405270: Strange duplicate IDs during node preview and when a node is displayed multiple times on a page
#405292: Invalid entity in test results
Comment #16
Damien Tournoud CreditAttribution: Damien Tournoud commentedThose, and of course:
#405086: Custom 404 pages displayed twice
and #399488: Invalid markup generated by l().
Comment #17
Damien Tournoud CreditAttribution: Damien Tournoud commentedOh, and #405298: Invalid entity in add a language
Comment #18
Dave ReidComment #19
boombatower CreditAttribution: boombatower commentedGood to see HTML cleanup, subscribe.
Comment #20
Damien Tournoud CreditAttribution: Damien Tournoud commentedSeveral of the "duplicate IDs" issues are probably duplicates of #111719: FAPI fails to check auto-generated IDs for uniqueness - XHTML validation fails..
Comment #21
Damien Tournoud CreditAttribution: Damien Tournoud commentedSorry about the tag change.
Comment #22
c960657 CreditAttribution: c960657 commentedI don't know if it has been discussed, but we could parse the page using the XML parser instead of the more forgiving HTML parser. This will not validate the page, but it will report if the page is not well-formed XML.
Note that all browsers parse the pages as HTML, not XML (unless we change the Content-Type to application/xhtml+xml). With PHP's XML and HTML parser it looks like the resulting DOMs are identical (in a browser, the DOMs may be slightly different), except that the XML parser imports the elements into the proper XHTML namespace, so I don't think this will be a problem.
This patch is just a proof of concept.
Comment #23
Damien Tournoud CreditAttribution: Damien Tournoud commented@c960657: I really don't see the point. The main difference between loadHTML() and loadXML(), except the root of the DOM, is that loadHTML() will issue warnings, while loadXML() will fail the loading completely.
loadHTML() is really what we need, as it does proper XHTML validation without failing to load the tree altogether.
Comment #24
c960657 CreditAttribution: c960657 commented@Damian Tournoud: loadXML() also complains about unclosed tags (e.g.
<br>
), unquoted attribute values (<div id=abc>
), attribute minimization (<option selected>
) and tags using the wrong case (<div>...</DIV>
). These are allowed in HTML but not in XHTML.Comment #25
Dries CreditAttribution: Dries commentedI've been thinking about this patch and it _might_ be best to just commit it, break the test bot for a day or two, and collectively fix the notices over the next couple of days.
Sometimes, it is good to create efforts that are laser-focused and to step away from the convential path for a while. It might be the best and fastest way to get these 174 notices fixed in a short time frame, and to significantly advance our test framework.
I'd happy to be bold, break the bot, and put all other patches on hold until these 174 notices are fixed.
Thoughts?
Comment #26
ejort CreditAttribution: ejort commentedsubscribe
Comment #27
boombatower CreditAttribution: boombatower commentedAs I always ask, please please let me know before you do this....instead of after (like usual).
Comment #28
Dries CreditAttribution: Dries commentedTagging.
Comment #29
rcross CreditAttribution: rcross commentedI guess Dries decided not to make his bold move on this. Still interested in seeing how this progresses. Probably will be something post code freeze i guess.
Comment #30
effulgentsia CreditAttribution: effulgentsia commentedSubscribing from #750008: Add test for HTML ID uniqueness which was marked a "duplicate" of this one. That issue just adds a test for id uniqueness and is showing quite a few failures of that in HEAD.
Comment #31
Damien Tournoud CreditAttribution: Damien Tournoud commented@effulgentsia: see #15 and #16 for a bunch of issues (some of them related to ID uniqueness). I marked the other issue as a duplicate because this one has a larger scope (and already studied two different ways to validate the markup during testing).
Comment #32
ohnobinki CreditAttribution: ohnobinki commented+
Comment #33
mgiffordMaybe this isn't needed in D8 as it's all HTML5.
Comment #34
mgiffordComment #47
DamienMcKennaComment #48
longwaveI'm sure this is completely outdated now, we support HTML5 everywhere over XHTML now, simpletest.module is long gone, and it's not clear to me from reading the issue or comments exactly what this was meant to solve.