I'm trying to index the entire entity view, and it seems to work.
But if I add the field to a view, the output is displayed with every HTML tag stripped.
So I have no div or span for labels, etc.

What's wrong?
I've tried indexing the field in both fulltext and string mode, and I've also tried disabling the tokenizer and HTML filter on the index.

Thanks,
Sergio

Comment | File | Size | Author
#4 | 2156023-4--views_strip_html.patch | 563 bytes | drunken monkey

Comments

drunken monkey’s picture

You are using Solr, I presume? Then you should disable the Tokenizer in any case. And the HTML filter would have to be disabled for the "Entity HTML output" field, too, for this to work.
Other than that, I don't know what might strip the HTML from the field, though. Did you completely reindex after disabling those processors for the index? Are there maybe any Views settings for the field that might remove the tags?

arrubiu’s picture

I've tried disabling the tokenizer and HTML filter on that field, and Views is not configured to strip HTML (yes, I'm using Solr).
Or do I have to disable the tokenizer entirely, not only for my field with the entity HTML?

arrubiu’s picture

I've tried disabling the tokenizer and HTML filter entirely, and nothing changes.

drunken monkey’s picture

Title: Complete entity view strips html tags » Don't strip HTML when displaying VIews fields
Project: Search API » Entity API
Version: 7.x-1.9 » 7.x-1.x-dev
Status: Active » Needs review
File size: 563 bytes

Ah, OK, it seems that this is actually caused by the Views field handler, which lives in the Entity API module. Moving this issue there.
The fix is very easy: tune down the escaping for Views field values to only remove scripts and styles, instead of restricting the content to a very small subset of HTML. Since this only displays internal data, I don't think that there's any risk involved here, at least not a realistic one. We might even consider dropping the filtering entirely and even allowing scripts and styles, if that's what's contained in the field. This would have to be discussed.

Anyways, patch attached. Please test whether it works for you (applied to the Entity API module).

If you want to use this feature, please also go to #1796110-13: Rendered entity fields show same result, test my patches there and appeal to fago to finally commit one of them. Otherwise, this feature will be gone with the next Entity API release!

arrubiu’s picture

It seems to work, but I have another problem with Search API.
I have a field (a term reference field) that is not displayed in the HTML output (the field is indexed).

arrubiu’s picture

However, applying the patch causes problems.
It also removes the "label" tag (used for language and other fields).
When the filter is removed entirely, "label" appears.

drunken monkey’s picture

None of this has anything to do with this patch, or the issue in general.
The fields that will appear in the HTML output are entirely configurable. They are the ones set to appear for the view mode selected in the "Complete entity view" data alteration (on the index's "Workflow" or "Filters" tab). I'm not sure I understand your last comment, but you can also configure field labels in the view mode settings ("Manage display" under Structure > Content types).

Also, as said, please bump [#7908833] if you are using this feature (displaying fields directly from Solr) and want it to stay.

arrubiu’s picture

The default output for a field label is:

<div class="field-label">My great label</div>

The label for the "language" of a node is:

<label>English</label>

With $this->sanitize_value('xss'), the "label" and "div" tags are removed.
With $this->sanitize_value('xss_admin'), the "label" tag is removed.
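
The difference between the two modes comes down to how wide the allow-list of tags is. The following is only a rough sketch of that idea in plain PHP: it uses the built-in strip_tags() as a stand-in for Drupal's filtering, and the function name example_allowlist_filter() and both tag lists are illustrative assumptions, not the lists Drupal actually uses.

```php
<?php
// Stand-in for allow-list filtering, using PHP's built-in strip_tags().
// Hypothetical helper; the real Drupal filters do considerably more than this.
function example_allowlist_filter($html, array $allowed_tags) {
  // strip_tags() accepts the allowed tags as a string such as "<a><em>".
  $allowed = '';
  foreach ($allowed_tags as $tag) {
    $allowed .= '<' . $tag . '>';
  }
  return strip_tags($html, $allowed);
}

$markup = '<div class="field-label">My great label</div><label>English</label>';

// Narrow list (in the spirit of the 'xss' mode): neither div nor label survives.
$narrow = example_allowlist_filter($markup, array('a', 'em', 'strong'));

// Wider list (in the spirit of 'xss_admin'): div survives, label does not.
$wide = example_allowlist_filter($markup, array('a', 'em', 'strong', 'div', 'span'));
```

In both cases the text content is kept and only the tags that are missing from the list disappear, which matches the behaviour described above.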

drunken monkey’s picture

Oh. OK, seems like the filter_xss_admin() documentation is just lying and the function will filter out all form elements, too.
However, I'm not really sure whether the correct fix here isn't to just not use form elements when there is no form. That's misusing HTML. Removing the filter function completely seems a bit too risky to do just for fixing that. We could of course substitute our own version of filter_xss_admin(), calling filter_xss() with an even more permissive element list. Or we could add an option for the amount of filtering that should be done – though end users usually won't be aware of the security implications, so that might not be such a good idea.
Leaving this open for comments for the moment. Unless fago wants to commit any of this, it's useless to post new patches anyways.

Also, as said, please go to #1796110: Rendered entity fields show same result if you want this feature to stay and help get the patch there committed.
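
For illustration, the "own version of filter_xss_admin()" idea mentioned in this comment could look roughly like the sketch below. It assumes Drupal 7's filter_xss($string, $allowed_tags) and a module called mymodule; both function names and the tag list are hypothetical and deliberately incomplete, and this is not the attached patch.

```php
<?php
/**
 * Returns a permissive tag list that also includes some form-related tags.
 *
 * Hypothetical helper: the list is an illustrative assumption and would
 * need a security review before use.
 */
function mymodule_permissive_tags() {
  return array(
    'a', 'abbr', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'div', 'dl',
    'dt', 'em', 'h2', 'h3', 'h4', 'i', 'img', 'label', 'li', 'ol', 'p',
    'pre', 'span', 'strong', 'table', 'tbody', 'td', 'th', 'thead', 'tr', 'ul',
  );
}

/**
 * Filters a string, keeping only the tags from the permissive list.
 */
function mymodule_filter_xss_permissive($string) {
  // filter_xss() is Drupal 7 core; its second argument is the allowed tag list.
  return filter_xss($string, mymodule_permissive_tags());
}
```

Scripts and styles stay off the list, so the output is still filtered, only less aggressively than with the default list.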

arrubiu’s picture

I think Views already provides a way to strip HTML through the UI, so it could be left to the developer to choose whether or not to strip HTML tags.

drunken monkey’s picture

Yes, you're right, it does. However, when removing all filtering by default we would default to unsafe settings, which is the opposite of what should usually be done.

arrubiu’s picture

Is it possible to add an option in Views that lets the user remove all filtering?

valderama’s picture

As far as I can tell, this issue is resolved with Entity 1.5, as filter_xss is no longer called in the field handler (entity_views_handler_field_text / EntityFieldHandlerHelper).

almc’s picture

Version: 7.x-1.x-dev » 7.x-1.5

I've hit the same issue. It took me hours to trace it back to the Entity API upgrade to version 1.5 (among the other ongoing changes I had to analyze).
I'm changing the version for the issue to the current stable release, 1.5, as it is even more critical that this happens in the stable version. It also seems to concern not only Search API indexed fields: it happens for custom entity properties that generate HTML markup, too. This used to work well in Views before the Entity API upgrade to 1.5. After the upgrade, all such HTML markup gets HTML-encoded/sanitized for output in Views.

almc’s picture

Priority: Normal » Major
Status: Needs review » Active

I also raised the priority to Major, as this issue is in the current stable version and may break many views that worked before (if their fields used HTML markup).

valderama’s picture

Priority: Major » Normal

@almc What I was saying is that an update to Entity 1.5 actually _solved_ the problem for me. You could try updating to the current dev version of Entity API and take a look at sites/all/modules/entity/views/handlers/entity_views_field_handler_helper.inc, in particular at the render_single_value function.

Also, I do not think it is major, but maybe someone who knows Entity API better could describe the expected behavior for HTML stripping or encoding of custom properties.

almc’s picture

Well, the current stable version of Entity API breaks the things mentioned in Views, and this can be quite unexpected and very confusing (and may take a long time to debug), as you don't expect Entity API to break such things. I wouldn't use the dev version in production, and for now I had to roll back to version 1.4 of Entity API, which took a lot of effort given the other changes accumulated since the upgrade.
The justification for the issue's criticality includes:

  1. it happens in the current stable version
  2. it breaks the workings of another module
  3. it breaks the visual representation for the user (views start looking ugly and intended functionality stops working, e.g. generated link markup is now shown HTML-sanitized, so the links no longer work for the user)
  4. it is difficult to find out that the upgrade causes the issue

So the upgrade to 1.5 becomes unusable and has to be reverted to keep the user experience from breaking.

What else might be needed for the issue to be considered major in the stable version of a mature and infrastructurally important module?

Also, regarding the commonality of the use case. The Views module is very popular, and custom entity properties are handy to increase flexibility in dealing with entities, including entity representation.
If someone decided to introduce in Entity API 1.5 the HTML sanitization for custom properties output in Views for the sake of security, that was not actually helpful. Because that's why those properties are custom - so the developer of a custom property can decide if he wants to HTML-sanitize its output or not, depending on the required functionality.

almc’s picture

Priority: Normal » Major

Anonymous’s picture

I too upgraded Entity from 1.3 to 1.6, and now all my custom output links are broken, as the tags get encoded. I too spent time debugging the Views module, only to find out that the breakage came from the Entity module.

We have a hook_entity_property_info_alter() implementation in which we use:

  // Generic action link for all kinds of content types
  // (event -> add to calendar, document -> download file(s), ...).
  $properties['generic_action_link'] = array(
    'label' => t('Generic action link'),
    'description' => t('Provide generic action link for all kind of CT (event -> add to calendar, document -> download file(s), ...).'),
    'type' => 'text',
    'entity views field' => TRUE,
    'getter callback' => 'mymodule_generic_action_link',
  );

In mymodule_generic_action_link we then return the link markup.

Edit: I think too many changes are happening for me to be comfortable trying to patch it, so I reverted the module to 1.3 and applied the security fix (from the commit http://cgit.drupalcode.org/entity/commit?id=724e719).

kevinquillen’s picture

Just to add on here, I solved this under SearchAPI and SearchAPI Views as an addition to a custom module I wrote for SearchAPI.

#2729453: Provide Views integration

Basically, using hook_views_data_alter(), I looked for my items and changed their handlers so that they can print HTML as a Views result.
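
A minimal sketch of what such a hook_views_data_alter() implementation might look like is shown below. It is not the code from the linked issue. The module name, the index table key search_api_index_my_index, the field key search_api_viewed and the handler class name mymodule_handler_field_html are all assumptions for illustration; the handler class itself is not shown and would have to exist and be registered in the module.

```php
<?php
/**
 * Implements hook_views_data_alter().
 */
function mymodule_views_data_alter(&$data) {
  // Assumption: Views table key for a Search API index named "my_index".
  $table = 'search_api_index_my_index';
  // Assumption: fields whose indexed value contains trusted HTML.
  $html_fields = array('search_api_viewed');

  if (empty($data[$table])) {
    return;
  }
  foreach ($html_fields as $field) {
    // Only swap the handler for fields that already define one.
    if (isset($data[$table][$field]['field']['handler'])) {
      // Hypothetical custom handler that outputs the stored HTML.
      $data[$table][$field]['field']['handler'] = 'mymodule_handler_field_html';
    }
  }
}
```

Restricting the swap to an explicit list of fields keeps every other field on its default, escaping handler.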

dcam’s picture

I had this issue as well. I'm working with a site that has a lot of <em> tags in the node titles due to a prevalence of species scientific names. I have a view that displays results from the Search API, which uses the Entity API for rendering. Unfortunately, entity_views_field_handler_helper::get_value() encodes the tags unconditionally. I followed @kevinquillen's example and created a small module that replaces the Entity API's text field handler with a custom one that decodes the HTML. It's not the best solution, but at least it's a viable workaround.

florianmuellerCH’s picture

This still happens: the problem is not fixed on Drupal 7.54 and Entity API 1.8. It can't be that such a central problem is still not fixed!

The patch worked well, but then the devs decided to move the sanitization to a wrapper, and now the problem is back.

florianmuellerCH’s picture

Title: Don't strip HTML when displaying VIews fields » Don't strip HTML when displaying Views fields
Version: 7.x-1.5 » 7.x-1.8

florianmuellerCH’s picture

Priority: Major » Critical

I can't stress enough how important this problem is, and it remains unfixed! Entity API is absolutely unusable with this bug!

We've installed Views module 3.18 and upgraded Entity API again from our old version (1.3) to 1.8. The problem was already present in 1.6. This really needs a fix!