Problem/Motivation
Try, for example, with https://www.drupal.org/project/entityreference_dragdrop, whose info.yml file says:
name: Entity Reference Drag & Drop
As for tests, I can see a theme whose name contains special characters, presumably for testing this (modules/block/tests/modules/block_test/themes/block_test_specialchars_theme/block_test_specialchars_theme.info.yml), but we don't seem to have a module for this! So that means there's maybe no test coverage for the Extend page, even though that gets the escaping right.
Steps to reproduce
Proposed resolution
Decode the title just after the XML is fetched.
Remaining tasks
review
commit
User interface changes
Before

After

API changes
Data model changes
Release notes snippet
Comments
Comment #2
mb_lewis commentedThis patch includes a fix for the double encoding on the update page, however it is not the nicest solution, but hopefully might help shine some light on some of the underlying issues.
Essentially this issue is being caused by the module title being escaped both when the update information is fetched and when the title is changed to be a link using:
Link::fromTextAndUrl($project['title'], Url::fromUri($project['link']))->toString();Comment #3
afeijoI don't see the problem happening, I edited a module, added & to the name and description, but Drupal 8.8 shows it correctly at the Uninstall page.
Can you provide a screen shot? Or am I missing something here...
Comment #4
mb_lewis commentedRe: #3: This issue only appears on the update page, and only if the name of the module on drupal.org has an & (or presumably other symbols) in it. Unfortunately editing the modules
name:in the .info file won't replicate the issue as the name used on the update page comes directly from drupal.org.So to reproduce this you would need to find a module with a symbol in the name (for example: Entity Reference Drag & Drop) and install an outdated version of this module (or modify the version number in the modules .info) so that is shows up on the update page. You now should see that the symbols have been double encoded on the update page.
Comment #8
mb_lewis commentedComment #9
mitthukumawat commentedI have checked this issue with Entity Reference Drag & Drop module and can see Entity Reference Drag & Drop 2.0.0-beta1 on available updates.
I have applied the above patch also but still I can see the same module name.
Is there anything else to look on available updates page?
Comment #12
vikashsoni commented@mb_lewis i have applied the patch
But i didn't see any changes
for reference sharing screenshots .....
Comment #14
sonam.chaturvedi commentedVerified and tested patch#2 on 9.5.x-dev.
Patch applied successfully with 2 offsets.
However, patch #2 does not fix the issue. Moving to Needs Work.
Comment #16
quietone commentedBugsmash triage. This is still valid. Added the standard template to the IS.
Comment #17
ameymudras commentedComment #18
ameymudras commentedComment #19
smustgrave commentedDid not test.
Was previously tagged for tests which are still needed.
Comment #21
quietone commentedI don't know why the fix is made in two places. In my testing I found that having it only in the preprocess function was sufficient. Therefor, I have removed the change from the update manager. I also added a simple unit test. Also, screenshots in the Issue Summary.
Comment #23
smustgrave commentedThanks @quietone test case seems to show the problem so will remove that tag
Fix LGTM.
Comment #24
lauriiiI think it would be better to write an integration test for this to make sure that this continues to work if the input changes.
Why do we need to call
\Drupal\Component\Utility\Html::decodeEntitieshere? Shouldn't we wrap the title with\Drupal\Component\Render\MarkupInterfacewhere it gets escaped?Comment #25
quietone commentedIt looks like it is escaped in the release xml.
Comment #26
quietone commentedThe comment above is not correct.
The XML from d.o is
where it is escaped but not double escape. The extra escape happens in the fetching process, in stream_get_contents().
It took awhile to figure out how the update Functional tests work. Or, at least enough to add tests for this problem.
#24
1. This patch adds a new test module and update XML with double escaped characters. The additions to \Drupal\Tests\update\Functional\UpdateContribTest::testUpdateContribBasic test the text on the updates page.
2. I don't really understand this nor do I know about MarkupInterface. I opted to decode the title just after it is fetched.
The interdiff is between the success and fail patch. There is no interdiff with a previous patch because this is a new approach. Testing confirms that the screenshots are still correct.
Comment #28
smustgrave commentedAdditional test coverage looks good.
For #24
1 looks like additional test coverage has been added
2 decodeEntities is still there but a comment was added.
LGTM!
Comment #29
quietone commentedI'm triaging RTBC issues.
After reading the IS and comments I don't see anything left to do. I guess there is still a question of whether adding the double escape to the title in source XML is the correct way to test this but I didn't see another way. See #26
Leaving at RTBC.
Comment #31
quietone commentedNeither failure is related to the change here. Retesting.
Comment #32
smustgrave commentedAll green!
Comment #34
smustgrave commentedOdd I see all green.
Comment #35
xjmTwo things:
Honestly, this makes me quite nervous. I read the explanation in the comment, but still.
I mean, if the update data from d.o is compromised, the site has way worse problems than a little XSS in a title (i.e., full RCE). The same is true if the module author is malicious. But I'd still be inclined to harden it after the decode so that it's protected in case of future refactoring etc. This would also mean an additional test case for a
or something.
Has anyone tried the approach @lauriii suggested, since @quietone confirmed it was single-escaping in the XML? That would also allow us to sanitize it for disallowed markup and mark it as safe at the same time.
Comment #36
quietone commented35.1. Fixed.
I am not running tests because it is only a comment change.
Still needs work for 35, 1 and 2.
Comment #37
mfbAs far as I see, comment #25 correct, the bug appears to be on drupal.org
https://updates.drupal.org/release-history/save_edit/current has
<title>Save &amp; Edit</title>- this is double escaped.Comment #38
mfbI found the bug in project module release/project_release.drush.inc
simplexml handles the escaping for you, so this should be
$xml->title = $release->title;e.g.:
Prints
which is correct.
Comment #39
mfbHere's a patch for project module.
Comment #40
xjmHmm, nervous about that approach as well. :)
Comment #41
mfb@xjm well, all I can do is assure you that the property overloading used by project_release module, e.g.
$xml->title = $project->title;is how you instruct SimpleXML to escape some text and add it to the document.SimpleXML actually does have a way to add already-escaped text to a document, but this is not it. The code could be rewritten to use this method, but would be more verbose, and presumably slower since the escaping would be happening via the drupal function rather than C code.
Comment #42
drummMy concern would be if any version of Drupal (or other clients) double-decoded this data, so if resetting this to be technically correct would open them up to XSS/etc issues.
Comment #43
mfbI would say it's best to just fix the bug at the source, because if a client is HTML-decoding the data and treating it as trusted markup with no filtering, they would also have a security issue today. The key is that clients need to treat this data like any other untrusted, third-party, user-generated content; i.e. these fields should be HTML-encoded when being rendered in markup.
But if you really want to leave the XML as-is for backwards compatibility reasons, then yes you could specify that these fields have an extra round of HTML-encoding, meaning clients need to get the data from XML parser, do the extra round of HTML-decoding, then HTML-encode it again when rendering in markup. Seems silly (to me) but certainly works as a solution.
Comment #44
dwwComing here from #3493742: Double encoded ampersand in /admin/reports/updates which I was pinged about in Slack for a subsystem maintainer review.
From a little Git archeology, looks like this double-encoding was introduced in #1003764: Add an alter hook invoked while generating release-history XML files when @drumm converted this from manually putting the XML together into using SimpleXML. Apparently no one noticed that SimpleXML is doing another round of encoding for us.
I just did this:
So the source is definitely double-encoded. It seems wonky to say that core should have to decode twice to get the raw values.
I don't understand the "BC" concern here. The primary client for these feeds, update.module, was written to assume properly encoded markup in the XML feeds, and is now "broken" by the double encoding. If other clients are already decoding twice, and we stop double encoding, what's the harm?
Here's a little test script:
If I run this, I get:
We're still going to turn around and re-encode this when printing it in the available updates report. That happens via
core/modules/update/templates/update-project-status.html.twigright here:So Twig is going to escape this for us, and I don't see how this could be used as an attack vector.
IMHO, this is the right fix, not the extra decode being proposed at #3493742. I am a maintainer for project*, so I could just commit this, but I can't deploy the fix on d.o anymore. 😅
Comment #45
mfbYeah only bad code interpreting the feed could be a danger here, because untrusted third party strings need to be just that, untrusted, and run thru something like HTML Purifier or the filter built into drupal core; i.e. security issues shouldn't be possible regardless of what's in a third party string.
Comment #46
drummThanks for the research!
My only real concern is if core was ever double-decoding. And I hope every version of core does always filter on output, so this should be fine.
Comment #48
drummThis is now deployed and I rebuilt the XML for https://www.drupal.org/project/entityreference_dragdrop
Before rebuilding the XML for every project with an
&in its title, I'd appreciate a final confirmation that this is looking good in Drupal core now.Comment #49
dwwSweet, thanks!
Looks great to me on a local site still running 10.3.10 core. Just did:
Visited /admin/reports/updates on the site, and saw this:

Please rebuild all the projects with
&.Thanks so much!
-Derek
Comment #50
drummThat's now done