Tested in alpha-14

When creating a feed using Views > Format > Format: RSS Feed, Show: Fields, the titles for items are double encoded. For example:

Given the article title: This "cool" & "neat" article's title has HTML entities

RSS Feed - Fields:

<title>This &amp;quot;cool&amp;quot; &amp;amp; &amp;quot;neat&amp;quot; article&amp;#039;s title has HTML entities</title>
 

which incorrectly appears as "This &quot;cool&quot; &amp; &quot;neat&quot; article&#039;s title has HTML entities"

Double encoding occurs for apostrophes, ampersands and quotes. I have not tested other entities.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

dawehner’s picture

Issue tags: +VDC

.

Just adding a tag to find issues.

Charles Belov’s picture

Title: HTML entities in the item title are double encoded » HTML entities in the item title are double encoded for feeds using fields
dawehner’s picture

Priority: Normal » Major

So yeah the problem is that both views, FieldPluginBase::render() as well as autoescaping in views-view-row-rss.html.twig causes the issue.

Berdir’s picture

Status: Active » Needs review
FileSize
677 bytes

Took me a while to find this, the problem is that in FieldPluginBase::advancedRender(), for fields, we pass things through renderItems(), that assumes unsafe items and passes it through SafeMarkup::escape($item);

If we mark the return value of renderText() as safe, then it is only escaped in twig and works as expected. Not sure if this is the right approach?

dawehner’s picture

+++ b/core/modules/views/src/Plugin/views/field/FieldPluginBase.php
@@ -1165,7 +1165,7 @@ abstract class FieldPluginBase extends HandlerBase {
-          $items[] = $this->renderText($alter);
+          $items[] = SafeMarkup::set($this->renderText($alter));

We might solve that in a better way with #2280961: (Views)FieldPluginBase::advancedRender() calls SafeMarkup::set() on a string that it doesn't know to be safe but for now having a dedicated comment why we are doing it would be great. Also having a test that this doesn't break anything is probably required.

jhedstrom’s picture

Status: Needs review » Needs work
Issue tags: +Needs tests

Needs tests and a code comment as per #5.

mikeker’s picture

The patch in #4 didn't fix the issue for me with the attached view.

To repro:

  • Import the attached view
  • Add a node with the title: This "cool" & "neat" article's title has HTML entities
  • Navigate to /rss_test.xml

The title is double-escaped as noted in the original report.

mikeker’s picture

FileSize
4.68 KB

/me: attaches "attached view" and goes to lunch...

Berdir’s picture

Title: HTML entities in the item title are double encoded for feeds using fields » HTML entities/tags in the item title are double encoded for feeds using fields

Can you try it with the body and HTML tags in there instead of html entities? That's what I'm using it for, HTML entities might be a different problem that are not fixed by my patch.

mikeker’s picture

I'm not sure I understand what you're saying here... And I have to dash off to get my kid from school in about 1 minute. :)

I think the title and description fields are getting double check_plain()'ed. I tried using HTML entities and plain old HTML in the description field. In core\modules\views\views.theme.inc:

function template_preprocess_views_view_row_rss(&$variables) {
  $item = $variables['row'];

  $variables['title'] = String::checkPlain($item->title);
  $variables['link'] = check_url($item->link);
  $variables['description'] = String::checkPlain($item->description);
  $variables['item_elements'] = empty($item->elements) ? '' : format_xml_elements($item->elements);
}

But the title and description are already sanitized earlier as part of Views' rendering (at least as far as I can tell). I'll look at this again later this evening...

mikeker’s picture

Status: Needs work » Needs review
FileSize
1.05 KB

Attached patch runs the title, description, and (just for good measure) link fields through a Twig inline template to ensure they've been sanitized and escaped once and only once... I think. This is where I start to get fuzzy about how Twig and Drupal work -- specifically how Drupal is calling Twig escape functions (especially in light of the cautions here: http://twig.sensiolabs.org/doc/filters/escape.html).

At a minimum, this patch fixes the originally reported issue.

@Berdir, let me know if this fixes the issues you were seeing with HTML in RSS feed fields.

mikeker’s picture

Issue tags: -Needs tests
FileSize
4.23 KB

Added tests.

dawehner’s picture

+++ b/core/modules/views/views.theme.inc
@@ -926,10 +926,18 @@ function template_preprocess_views_view_rss(&$variables) {
+  // Ensure the text in these fields is properly sanitized and escaped.
+  foreach (array('title', 'description', 'link') as $field) {
+    $build = array(
+      '#type' => 'inline_template',
+      '#template' => '{{ temp }}',
+      '#context' => array('temp' => $item->$field),
+    );
+    $variables[$field] = \Drupal::service('renderer')->render($build);
+  }

Maybe a dump question, but why can't we do that in the template directly?

mikeker’s picture

Not a dumb question at all because we CAN do most of this in the template. Not sure what I was thinking...

Pretty much everything except:

$variables['item_elements'] = empty($item->elements) ? '' : format_xml_elements($item->elements);

which, as pointed out in #2296885: Remove format_xml_elements() needs to be... something. I'm looking into refactoring format_xml_elements() into the template but haven't had the time to dig into it yet.

mikeker’s picture

Status: Needs review » Closed (duplicate)
Related issues: +#2296885: Remove format_xml_elements()

Closing this as a duplicate of #2296885: Remove format_xml_elements() which fixes not only this issue but (hopefully) cleans up a pile of legacy XML processing.

karenann’s picture

--- removed my own post ---