Problem/Motivation
This is a followup from #3409587: [10.2 regression] RSS feeds invalid due to .
RSS feeds are now valid but have a warning on the W3C feed validator:
This feed is valid, but interoperability with the widest range of feed readers could be improved by implementing the following recommendations.
line 6, column 42: description should not contain HTML: &amp;
<description>Training &amp;amp; Events</description>
Steps to reproduce
Create a feed in views with an RSS channel description that contains an ampersand, e.g. "Training & Events". Channel description is a field in the Feed:Style options setting section in views.
Checking the feed output against https://validator.w3.org produces a warning for the channel description line: "description should not contain HTML: &amp;". The RSS feed literally contains &amp;amp;, which is parsed into the human-readable text &amp;. It should contain &amp;, which is parsed as the human-readable text &.
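The difference between a double-escaped and a single-escaped ampersand can be checked outside Drupal. The following is a minimal Python sketch, not part of the original report: the two feed strings are illustrative stand-ins rather than actual Views output, and channel_description() is a hypothetical helper. It parses each variant with the standard library and returns the text a feed reader would see after XML decoding.

```python
import xml.etree.ElementTree as ET

# Illustrative feeds (assumptions, not actual Drupal output).
DOUBLE_ESCAPED = (
    '<rss version="2.0"><channel>'
    "<description>Training &amp;amp; Events</description>"
    "</channel></rss>"
)
SINGLE_ESCAPED = (
    '<rss version="2.0"><channel>'
    "<description>Training &amp; Events</description>"
    "</channel></rss>"
)


def channel_description(feed_xml: str) -> str:
    """Return the channel description text after XML entity decoding."""
    root = ET.fromstring(feed_xml)
    return root.findtext("./channel/description")


print(channel_description(DOUBLE_ESCAPED))  # Training &amp; Events
print(channel_description(SINGLE_ESCAPED))  # Training & Events
```

The first result still contains an HTML entity after XML parsing, which is what the validator flags; the second is plain text.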
Proposed resolution
The \Drupal\Core\EventSubscriber\RssResponseRelativeUrlFilter::transformRootRelativeUrlsToAbsolute() method processes all RSS feed description elements as markup. However, RSS has two different kinds of description elements: item description elements, which according to the RSS specs are interpreted as markup, and channel description elements, which are interpreted as human-readable. So that method should skip channel description elements.
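As a rough illustration of the proposed behavior, here is a minimal Python sketch. It is not the Drupal implementation (which is PHP): the function name absolutize_item_descriptions, the regular expression, and the sample base URL are all assumptions made for this example. The point it shows is the selection step: only item description elements are rewritten, and the channel description is never touched.

```python
import re
import xml.etree.ElementTree as ET

# Matches href="/path" or src="/path", but not protocol-relative "//host".
ROOT_RELATIVE = re.compile(r'\b(href|src)="/(?!/)')


def absolutize_item_descriptions(feed_xml: str, base_url: str) -> str:
    """Make root-relative URLs absolute in item descriptions only."""
    root = ET.fromstring(feed_xml)
    # Item descriptions hold markup. The channel description is plain
    # text, so the path below deliberately does not select it.
    for description in root.findall("./channel/item/description"):
        if description.text:
            description.text = ROOT_RELATIVE.sub(
                lambda m: f'{m.group(1)}="{base_url}/', description.text
            )
    return ET.tostring(root, encoding="unicode")


feed = (
    '<rss version="2.0"><channel>'
    "<description>Training &amp; Events</description>"
    '<item><description>&lt;a href="/node/1"&gt;Training &amp;amp; Events'
    "&lt;/a&gt;</description></item>"
    "</channel></rss>"
)
print(absolutize_item_descriptions(feed, "https://example.com"))
```

Selecting with the path ./channel/item/description rather than every description element is the whole design choice here: the channel description passes through the function with its single level of escaping intact.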
Remaining tasks
User interface changes
API changes
Data model changes
Issue fork drupal-3424768
- 3424768-problematic-xml-characters changes, plain diff MR !6842
Comments
Comment #2
cilefen commented:
So it should be &amp;, correct? See https://www.w3.org/TR/REC-xml/#dt-chardata
Comment #3
mfb commented:
What I found when trying to reproduce this issue is that Views outputs
<channel><description>Training &amp;amp; Events</description></channel>
as the feed description. However, it is recommended to output
<channel><description>Training &amp; Events</description></channel>
My interpretation of what the feed validator is saying is that a feed description should be semantically considered plain text, and thus a user-entered feed description should be encoded once to be rendered in the feed description XML element:
<channel><description>Training &amp; Events</description></channel>
An item description, on the other hand, could be semantically considered markup, so a user-entered item description would be encoded once to render valid markup from the entered text, and a second time to be rendered in the item description XML element:
<item><description>Training &amp;amp; Events</description></item>
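The two encodings described in this comment can be sketched in Python as follows. This is an illustration only: the helper names channel_description_xml and item_description_xml are hypothetical, and the sketch assumes the user-entered text is plain text in both cases.

```python
from html import escape as html_escape
from xml.sax.saxutils import escape as xml_escape


def channel_description_xml(text: str) -> str:
    """Channel description is plain text: escape once, for XML."""
    return f"<description>{xml_escape(text)}</description>"


def item_description_xml(text: str) -> str:
    """Item description is markup: escape to HTML, then again for XML."""
    markup = html_escape(text, quote=False)  # plain text to HTML markup
    return f"<description>{xml_escape(markup)}</description>"


print(channel_description_xml("Training & Events"))
# <description>Training &amp; Events</description>
print(item_description_xml("Training & Events"))
# <description>Training &amp;amp; Events</description>
```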
Comment #4
mfb commented:
This does seem to be pretty heavily related to #3409587: [10.2 regression] RSS feeds invalid due to after all, although it's a separate bug in the same code.
It appears that \Drupal\Core\EventSubscriber\RssResponseRelativeUrlFilter::transformRootRelativeUrlsToAbsolute() is operating on channel description elements, but it should not, as these are considered to be human-readable plain text. It should only be operating on item description elements, which are considered to be markup.
Comment #6
mfb commented:
@OMD, can you test my attempted fix in MR 6842? If it resolves the warning, then we can update the issue summary, add a test, and reroll as a merge request on the 11.x branch.
Comment #7
cilefen commented:
The & is being escaped twice?
Comment #8
mfb commented:
@cilefen Yes, that's what I found when trying to reproduce the issue. If we confirm that the issue is basically the opposite of the title then I will update it :)
Comment #9
mfb commented:
Added unit test coverage for & in channel description element.
Comment #10
smustgrave (at Mobomo) commented:
From reading the issue summary provided, I believe @mfb you were correct in your assumption.
So can the issue summary be updated to match? Also, the MR should probably be pointed to 11.x.
Thanks.
Comment #11
mfb
Comment #12
smustgrave (at Mobomo) commented:
Thanks @mfb!
Ran the test-only feature here https://git.drupalcode.org/issue/drupal-3424768/-/jobs/983276 which showed the test failure.
The change to the loop makes sense and fixes the issue following the scenario described.
Comment #13
longwave commented:
The findings here match the comment in template_preprocess_views_view_rss().
Committed and pushed e8db570e86 to 11.x, 3ff7664833 to 10.3.x, and e851b33905 to 10.2.x. Thanks!