Problem Statement:
When a string containing encoded HTML entities is passed through the `Xss::filterAdmin` function, the returned string contains a malformed HTML entity.

The returned HTML entity starts with one extra `&amp;`, so that, for example, `&amp;` becomes `&amp;amp;`.

What is causing this issue?
The following code fragment in the static function Xss::filter is causing this issue:

  // Defuse all HTML entities.
  $string = str_replace('&', '&amp;', $string);

Comments

adhariwal created an issue. See original summary.

adhariwal’s picture

Issue summary: View changes
cilefen’s picture

Priority: Major » Normal
goz’s picture

Status: Active » Closed (works as designed)

I don't know where you see this code:

 // Defuse all HTML entities.
  $string = str_replace('&', '&', $string);

In 8.4.x, it's

    // Defuse all HTML entities.
    $string = str_replace('&', '&amp;', $string);

http://cgit.drupalcode.org/drupal/tree/core/lib/Drupal/Component/Utility...

which works as expected.

Testing with:

echo \Drupal\Component\Utility\Xss::filter('something & something & else');

Result: something & something & else

Everything looks fine.

adhariwal’s picture


@GoZ: See these lines:
[Screenshot: incorrect code]

Here, inside str_replace, we search for "&" and replace it with its HTML entity (&amp;), which works well if the string does not contain any encoded HTML entities, as in your test. But that is not the case that reproduces this bug. Use a string that already contains encoded HTML entities, like the following:
[Screenshot: encoded string]
The str_replace output will then look like this:
[Screenshot: output]

I am not sure whether the rest of the code takes care of this situation.
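
To make the concern concrete, here is a minimal sketch of that first replacement run in isolation on an already-encoded string. The function name `defuse_only()` and the sample string are made up for illustration; this is only the single `str_replace()` step quoted in the issue summary, not the full `Xss::filter()` logic.

```php
<?php
// Hypothetical helper: only the "defuse" step, not the whole filter.
function defuse_only(string $string): string {
  // Every "&" is escaped, including one that already starts an entity.
  return str_replace('&', '&amp;', $string);
}

// Sample input that already contains an encoded entity.
echo defuse_only('Tom &amp; Jerry'), PHP_EOL;
// Prints: Tom &amp;amp; Jerry
```

Whether the final output is affected depends on what the remaining steps of the filter do with `&amp;amp;`.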

goz’s picture

Status: Closed (works as designed) » Active

Ok, let's open this again to discuss.

I'm really not sure it's a bug.

Can you explain in which case you have to send encoded HTML to Xss::filterAdmin()?
We need steps to reproduce in a real Drupal use case, not just a test of Xss::filterAdmin() with encoded HTML from php-eval.

adhariwal’s picture

@GoZ: https://www.drupal.org/node/2856598

- The above is the parent issue. When we rewrite a field's output in a view and use a field token to render the output, the above issue is reproduced.

This is the one case where I have seen it happen, but there will be more cases apart from Views.

Initially I thought it was related only to Views and submitted the patch in the https://www.drupal.org/node/2856598 issue.

Thanks.

pradeep22saini’s picture

Status: Active » Closed (works as designed)

@abhishek
Can you provide the use case, in terms of Views or other features, where this is occurring? It is still unclear in what context an encoded entity is passed.

pradeep22saini’s picture

Status: Closed (works as designed) » Active
pradeep22saini’s picture


@adhariwal
Please ignore the above comments. I am able to reproduce this issue in Views, as suggested.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.6.x-dev » 8.7.x-dev

Drupal 8.6.0-alpha1 will be released the week of July 16, 2018, which means new developments and disruptive changes should now be targeted against the 8.7.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.7.x-dev » 8.8.x-dev

Drupal 8.7.0-alpha1 will be released the week of March 11, 2019, which means new developments and disruptive changes should now be targeted against the 8.8.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.8.x-dev » 8.9.x-dev

Drupal 8.8.0-alpha1 will be released the week of October 14th, 2019, which means new developments and disruptive changes should now be targeted against the 8.9.x-dev branch. (Any changes to 8.9.x will also be committed to 9.0.x in preparation for Drupal 9’s release, but some changes like significant feature additions will be deferred to 9.1.x.). For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 8.9.x-dev » 9.1.x-dev

Drupal 8.9.0-beta1 was released on March 20, 2020. 8.9.x is the final, long-term support (LTS) minor release of Drupal 8, which means new developments and disruptive changes should now be targeted against the 9.1.x-dev branch. For more information see the Drupal 8 and 9 minor version schedule and the Allowed changes during the Drupal 8 and 9 release cycles.

Version: 9.1.x-dev » 9.2.x-dev

Drupal 9.1.0-alpha1 will be released the week of October 19, 2020, which means new developments and disruptive changes should now be targeted for the 9.2.x-dev branch. For more information see the Drupal 9 minor version schedule and the Allowed changes during the Drupal 9 release cycle.

Version: 9.2.x-dev » 9.3.x-dev

Drupal 9.2.0-alpha1 will be released the week of May 3, 2021, which means new developments and disruptive changes should now be targeted for the 9.3.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.3.x-dev » 9.4.x-dev

Drupal 9.3.0-rc1 was released on November 26, 2021, which means new developments and disruptive changes should now be targeted for the 9.4.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.4.x-dev » 9.5.x-dev

Drupal 9.4.0-alpha1 was released on May 6, 2022, which means new developments and disruptive changes should now be targeted for the 9.5.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 9.5.x-dev » 10.1.x-dev

Drupal 9.5.0-beta2 and Drupal 10.0.0-beta2 were released on September 29, 2022, which means new developments and disruptive changes should now be targeted for the 10.1.x-dev branch. For more information see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

Version: 10.1.x-dev » 11.x-dev

Drupal core is moving towards using a “main” branch. As an interim step, a new 11.x branch has been opened, as Drupal.org infrastructure cannot currently fully support a branch named main. New developments and disruptive changes should now be targeted for the 11.x branch, which currently accepts only minor-version allowed changes. For more information, see the Drupal core minor version schedule and the Allowed changes during the Drupal core release cycle.

pameeela’s picture

Status: Active » Postponed (maintainer needs more info)
Issue tags: +Bug Smash Initiative, +Needs steps to reproduce
Related issues: +#2856598: Views field rewrite replacement subtoken yields double encoded HTML entities

The related issue is now fixed, so I wonder if this can be closed? #2856598: Views field rewrite replacement subtoken yields double encoded HTML entities

There are no documented steps to reproduce, and the instance in the comments appears to be the same as the other issue, which was fixed.

longwave’s picture

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Indeed, can't reproduce this:

> \Drupal\Component\Utility\Xss::filterAdmin('&amp;');
= "&amp;"

Xss::filter() does defuse all entities initially, but then it converts named entities such as &amp; back again:

    // Defuse all HTML entities.
    $string = str_replace('&', '&amp;', $string);

    // Named entities.
    $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
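
As a rough, standalone sketch of why these two steps cancel out for an already-encoded named entity, the snippet below chains them outside Drupal. The function name `defuse_then_restore()` and the sample strings are hypothetical, and the real `Xss::filter()` does more than these two replacements.

```php
<?php
// Hypothetical helper chaining only the two steps discussed here.
function defuse_then_restore(string $string): string {
  // Step 1: escape every ampersand, so "&amp;" becomes "&amp;amp;".
  $string = str_replace('&', '&amp;', $string);
  // Step 2: where the escaped ampersand is followed by something shaped
  // like a named entity ("amp;", "lt;", ...), undo the extra escaping.
  return preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
}

// An already-encoded entity comes back unchanged.
echo defuse_then_restore('Tom &amp; Jerry'), PHP_EOL;
// Prints: Tom &amp; Jerry

// A bare ampersand is encoded exactly once.
echo defuse_then_restore('Tom & Jerry'), PHP_EOL;
// Prints: Tom &amp; Jerry
```

Under these assumptions, the intermediate `&amp;amp;` never reaches the caller, which matches the result reported above.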

Closing as cannot reproduce.