Problem/Motivation

Try, for example, with https://www.drupal.org/project/entityreference_dragdrop, whose info.yml file says:

name: Entity Reference Drag & Drop

As for tests, I can see a theme whose name contains special characters, presumably for testing this (modules/block/tests/modules/block_test/themes/block_test_specialchars_theme/block_test_specialchars_theme.info.yml), but we don't seem to have a module for this! So that means there's maybe no test coverage for the Extend page, even though that gets the escaping right.

Steps to reproduce

Proposed resolution

Decode the title just after the XML is fetched.

Remaining tasks

review
commit

User interface changes

Before

After

API changes

Data model changes

Release notes snippet

Comments

joachim created an issue. See original summary.

mb_lewis’s picture

Issue summary: View changes
Status: Active » Needs review
StatusFileSize
new1.08 KB

This patch includes a fix for the double encoding on the update page, however it is not the nicest solution, but hopefully might help shine some light on some of the underlying issues.

Essentially this issue is being caused by the module title being escaped both when the update information is fetched and when the title is changed to be a link using:
Link::fromTextAndUrl($project['title'], Url::fromUri($project['link']))->toString();

afeijo’s picture

I don't see the problem happening, I edited a module, added & to the name and description, but Drupal 8.8 shows it correctly at the Uninstall page.
Can you provide a screen shot? Or am I missing something here...

mb_lewis’s picture

Re: #3: This issue only appears on the update page, and only if the name of the module on drupal.org has an & (or presumably other symbols) in it. Unfortunately editing the modules name: in the .info file won't replicate the issue as the name used on the update page comes directly from drupal.org.

So to reproduce this you would need to find a module with a symbol in the name (for example: Entity Reference Drag & Drop) and install an outdated version of this module (or modify the version number in the modules .info) so that is shows up on the update page. You now should see that the symbols have been double encoded on the update page.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

mb_lewis’s picture

mitthukumawat’s picture

StatusFileSize
new15.18 KB

I have checked this issue with Entity Reference Drag & Drop module and can see Entity Reference Drag & Drop 2.0.0-beta1 on available updates.
I have applied the above patch also but still I can see the same module name.
Is there anything else to look on available updates page?

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

vikashsoni’s picture

StatusFileSize
new11.95 KB
new11.95 KB

@mb_lewis i have applied the patch
But i didn't see any changes
for reference sharing screenshots .....

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

sonam.chaturvedi’s picture

Status: Needs review » Needs work

Verified and tested patch#2 on 9.5.x-dev.

Patch applied successfully with 2 offsets.

$ git apply -v module-update-double-escaped-3069491-2.patch
Checking patch core/modules/update/src/Form/UpdateManagerUpdate.php...
Hunk #1 succeeded at 6 (offset 1 line).
Hunk #2 succeeded at 118 (offset 8 lines).
Applied patch core/modules/update/src/Form/UpdateManagerUpdate.php cleanly.

However, patch #2 does not fix the issue. Moving to Needs Work.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

quietone’s picture

Issue summary: View changes
Issue tags: +Bug Smash Initiative

Bugsmash triage. This is still valid. Added the standard template to the IS.

ameymudras’s picture

StatusFileSize
new1.89 KB
ameymudras’s picture

Status: Needs work » Needs review
smustgrave’s picture

Status: Needs review » Needs work
Issue tags: +Needs Review Queue Initiative

Did not test.

Was previously tagged for tests which are still needed.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

quietone’s picture

Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new1.87 KB
new1.21 KB
new2.07 KB
new783 bytes
new14.73 KB
new14.12 KB

I don't know why the fix is made in two places. In my testing I found that having it only in the preprocess function was sufficient. Therefor, I have removed the change from the update manager. I also added a simple unit test. Also, screenshots in the Issue Summary.

The last submitted patch, 21: 3069491-21-fail.patch, failed testing. View results

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs tests

Thanks @quietone test case seems to show the problem so will remove that tag

Fix LGTM.

lauriii’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/modules/update/tests/src/Unit/ProjectStatusTest.php
    @@ -0,0 +1,39 @@
    +        'title' => Html::escape('This & that'),
    

    I think it would be better to write an integration test for this to make sure that this continues to work if the input changes.

  2. +++ b/core/modules/update/update.report.inc
    @@ -5,6 +5,7 @@
    +use Drupal\Component\Utility\Html;
    
    @@ -149,6 +150,7 @@ function template_preprocess_update_project_status(&$variables) {
    +  $variables['title'] = Html::decodeEntities($variables['title']);
    

    Why do we need to call \Drupal\Component\Utility\Html::decodeEntities here? Shouldn't we wrap the title with \Drupal\Component\Render\MarkupInterface where it gets escaped?

quietone’s picture

It looks like it is escaped in the release xml.

<?xml version="1.0" encoding="utf-8"?>
<project xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Save &amp;amp; Edit</title>
  <short_name>save_edit</short_name>
  <dc:creator>himerus</dc:creator>
quietone’s picture

Issue summary: View changes
Status: Needs work » Needs review
StatusFileSize
new4.29 KB
new893 bytes
new5.34 KB

The comment above is not correct.
The XML from d.o is

<project>
<title>Save &amp; Edit</title>
<short_name>save_edit</short_name>
<dc:creator>himerus</dc:creator>
<type>project_module</type>
<supported_branches>8.x-1.</supported_bran

where it is escaped but not double escape. The extra escape happens in the fetching process, in stream_get_contents().

It took awhile to figure out how the update Functional tests work. Or, at least enough to add tests for this problem.

#24
1. This patch adds a new test module and update XML with double escaped characters. The additions to \Drupal\Tests\update\Functional\UpdateContribTest::testUpdateContribBasic test the text on the updates page.
2. I don't really understand this nor do I know about MarkupInterface. I opted to decode the title just after it is fetched.

The interdiff is between the success and fail patch. There is no interdiff with a previous patch because this is a new approach. Testing confirms that the screenshots are still correct.

The last submitted patch, 26: 3069491-26-fail.patch, failed testing. View results

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

Additional test coverage looks good.

For #24
1 looks like additional test coverage has been added
2 decodeEntities is still there but a comment was added.

LGTM!

quietone’s picture

I'm triaging RTBC issues.

After reading the IS and comments I don't see anything left to do. I guess there is still a question of whether adding the double escape to the title in source XML is the correct way to test this but I didn't see another way. See #26

Leaving at RTBC.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 26: 3069491-26.patch, failed testing. View results

quietone’s picture

Status: Needs work » Needs review

Neither failure is related to the change here. Retesting.

smustgrave’s picture

Status: Needs review » Reviewed & tested by the community

All green!

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 26: 3069491-26.patch, failed testing. View results

smustgrave’s picture

Status: Needs work » Reviewed & tested by the community

Odd I see all green.

xjm’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Security
  1. +++ b/core/modules/update/src/UpdateProcessor.php
    @@ -175,6 +176,9 @@ public function processFetchTask($project) {
    +        // The title is escaped in the XML from drupal.org and the fetch
    +        // process will encode it again. Therefor, it is decoded here.
    

    Two things:

    • Drupal.org is canonically capitalized when used as a sentence. (Or put otherwise, there is a website at https://drupal.org and it is called "Drupal.org".)
    • "Therefore" is misspelled.
  2. +++ b/core/modules/update/src/UpdateProcessor.php
    @@ -175,6 +176,9 @@ public function processFetchTask($project) {
    +        $available['title'] = Html::decodeEntities($available['title']) ?? '';
    

    Honestly, this makes me quite nervous. I read the explanation in the comment, but still.

    I mean, if the update data from d.o is compromised, the site has way worse problems than a little XSS in a title (i.e., full RCE). The same is true if the module author is malicious. But I'd still be inclined to harden it after the decode so that it's protected in case of future refactoring etc. This would also mean an additional test case for a

    Test module <script>confirm('DO YOU LIKE MY MODULE???');</script>
    

    or something.
     

  3. Has anyone tried the approach @lauriii suggested, since @quietone confirmed it was single-escaping in the XML? That would also allow us to sanitize it for disallowed markup and mark it as safe at the same time.

quietone’s picture

StatusFileSize
new833 bytes
new5.34 KB

35.1. Fixed.

I am not running tests because it is only a comment change.

Still needs work for 35, 1 and 2.

mfb’s picture

As far as I see, comment #25 correct, the bug appears to be on drupal.org

https://updates.drupal.org/release-history/save_edit/current has <title>Save &amp;amp; Edit</title> - this is double escaped.

mfb’s picture

I found the bug in project module release/project_release.drush.inc

    $xml = new SimpleXMLElement('<root></root>');
    $xml->title = check_plain($project->title);

simplexml handles the escaping for you, so this should be $xml->title = $release->title;

e.g.:

    $xml = new SimpleXMLElement('<root></root>');
    $xml->release->name = 'Save & Edit';
    echo $xml->asXml();

Prints

<?xml version="1.0"?>
<root><release><name>Save &amp; Edit</name></release></root>

which is correct.

mfb’s picture

Project: Drupal core » Project
Version: 11.x-dev » 7.x-2.x-dev
Component: update.module » Projects
Status: Needs work » Needs review
StatusFileSize
new5.63 KB

Here's a patch for project module.

xjm’s picture

Hmm, nervous about that approach as well. :)

mfb’s picture

@xjm well, all I can do is assure you that the property overloading used by project_release module, e.g. $xml->title = $project->title; is how you instruct SimpleXML to escape some text and add it to the document.

SimpleXML actually does have a way to add already-escaped text to a document, but this is not it. The code could be rewritten to use this method, but would be more verbose, and presumably slower since the escaping would be happening via the drupal function rather than C code.

drumm’s picture

My concern would be if any version of Drupal (or other clients) double-decoded this data, so if resetting this to be technically correct would open them up to XSS/etc issues.

mfb’s picture

I would say it's best to just fix the bug at the source, because if a client is HTML-decoding the data and treating it as trusted markup with no filtering, they would also have a security issue today. The key is that clients need to treat this data like any other untrusted, third-party, user-generated content; i.e. these fields should be HTML-encoded when being rendered in markup.

But if you really want to leave the XML as-is for backwards compatibility reasons, then yes you could specify that these fields have an extra round of HTML-encoding, meaning clients need to get the data from XML parser, do the extra round of HTML-decoding, then HTML-encode it again when rendering in markup. Seems silly (to me) but certainly works as a solution.

dww’s picture

Coming here from #3493742: Double encoded ampersand in /admin/reports/updates which I was pinged about in Slack for a subsystem maintainer review.

From a little Git archeology, looks like this double-encoding was introduced in #1003764: Add an alter hook invoked while generating release-history XML files when @drumm converted this from manually putting the XML together into using SimpleXML. Apparently no one noticed that SimpleXML is doing another round of encoding for us.

I just did this:

% wget https://updates.drupal.org/release-history/save_edit/current -O save_edit.xml
...
HTTP request sent, awaiting response... 200 OK
Length: 9702 (9.5K) [text/xml]
Saving to: ‘save_edit.xml’

save_edit.xml             100%[=====================================>]   9.47K  --.-KB/s    in 0s

2025-01-15 14:45:23 (23.8 MB/s) - ‘save_edit.xml’ saved [9702/9702]
% grep title save_edit.xml
<title>Save &amp;amp; Edit</title>

So the source is definitely double-encoded. It seems wonky to say that core should have to decode twice to get the raw values.

I don't understand the "BC" concern here. The primary client for these feeds, update.module, was written to assume properly encoded markup in the XML feeds, and is now "broken" by the double encoding. If other clients are already decoding twice, and we stop double encoding, what's the harm?

Here's a little test script:


$inputs = [
  'not_encoded' => '<title>Save & Edit</title>',
  'encoded_once' => '<title>Save &amp; Edit</title>',
  'encoded_twice' => '<title>Save &amp;amp; Edit</title>',
];

foreach ($inputs as $key => $value) {
  echo "------\n$key\n";
  echo "Raw: $value\n";
  $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
  echo "Decoded: $decoded\n";
  echo "Double decoded: " . html_entity_decode($decoded, ENT_QUOTES | ENT_HTML5, 'UTF-8') . "\n";
}

If I run this, I get:

------
not_encoded
Raw: <title>Save & Edit</title>
Decoded: <title>Save & Edit</title>
Double decoded: <title>Save & Edit</title>
------
encoded_once
Raw: <title>Save &amp; Edit</title>
Decoded: <title>Save & Edit</title>
Double decoded: <title>Save & Edit</title>
------
encoded_twice
Raw: <title>Save &amp;amp; Edit</title>
Decoded: <title>Save &amp; Edit</title>
Double decoded: <title>Save & Edit</title>

We're still going to turn around and re-encode this when printing it in the available updates report. That happens via core/modules/update/templates/update-project-status.html.twig right here:

<div class="project-update__title">
  {%- if url -%}
    <a href="{{ url }}">{{ title }}</a>
  {%- else -%}
    {{ title }}
  {%- endif %}
  {{ existing_version }}
  {% if install_type == 'dev' and datestamp %}
    <span class="project-update__version-date">({{ datestamp }})</span>
  {% endif %}
</div>

So Twig is going to escape this for us, and I don't see how this could be used as an attack vector.

IMHO, this is the right fix, not the extra decode being proposed at #3493742. I am a maintainer for project*, so I could just commit this, but I can't deploy the fix on d.o anymore. 😅

mfb’s picture

Yeah only bad code interpreting the feed could be a danger here, because untrusted third party strings need to be just that, untrusted, and run thru something like HTML Purifier or the filter built into drupal core; i.e. security issues shouldn't be possible regardless of what's in a third party string.

drumm’s picture

Thanks for the research!

I don't understand the "BC" concern here. The primary client for these feeds, update.module, was written to assume properly encoded markup in the XML feeds, and is now "broken" by the double encoding. If other clients are already decoding twice, and we stop double encoding, what's the harm?

My only real concern is if core was ever double-decoding. And I hope every version of core does always filter on output, so this should be fine.

  • drumm committed b8bedd4a on 7.x-2.x
    Issue #3069491 by quietone, mfb, ameymudras, dww, lauriii: Module names...
drumm’s picture

Status: Needs review » Reviewed & tested by the community

This is now deployed and I rebuilt the XML for https://www.drupal.org/project/entityreference_dragdrop

Before rebuilding the XML for every project with an & in its title, I'd appreciate a final confirmation that this is looking good in Drupal core now.

dww’s picture

StatusFileSize
new27.18 KB

Sweet, thanks!

Looks great to me on a local site still running 10.3.10 core. Just did:

composer require drupal/entityreference_dragdrop
drush en entityreference_dragdrop

Visited /admin/reports/updates on the site, and saw this:
Update status from 10.3.10 core showing project with ampersand in title correctly escaped / decoded

Please rebuild all the projects with &.

Thanks so much!
-Derek

drumm’s picture

Status: Reviewed & tested by the community » Fixed

That's now done

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.