Updated to #32.

Problem/Motivation

Some text filters, such as the media_filter from https://drupal.org/project/media need to be able to add both #attached assets and cache tags as well as just modifying a string.

Media filter allows the referencing of arbitrary media entities from a textarea, the render cache will need to be cleared for those as much as if it was an entity reference field.

Additionally, rendering an entity might result in #attached calls within the entity view triggered by the text filter, it's only possible for the string to get returned from that, not anything else at the moment.

Proposed resolution

  1. Add a processed_text element type.
  2. Refactor check_markup() to use this type.
  3. Add a FilterProcessResult class which is a simply setter-getter carrying the processed text, cache tags, attached and post render cache callbacks. I doubt we need more than the constructor taking these with everything but the processed text optional and then a __toString and three getters.

See #16 for more details.

Remaining tasks

None.

User interface changes

None.

API changes

Filter callbacks will return FilterProcessResult objects, which can optionally contain cache tags, assets to be attached and #post_render_cache callbacks to be executed.

CommentFileSizeAuthor
#78 interdiff-2217877-78.txt3.52 KBdamiankloip
#78 2217877-78.patch105.45 KBdamiankloip
#74 interdiff.txt6.95 KBWim Leers
#74 filter_assets_cache_tags_prc-2217877-74.patch108.63 KBWim Leers
#71 interdiff.txt1.43 KBWim Leers
#71 filter_assets_cache_tags_prc-2217877-71.patch105.09 KBWim Leers
#70 2217877-filter-assets-cache-tags-prc-70.patch101.98 KBDave Reid
#64 2217877-filter-assets-cache-tags-prc-64.patch101.77 KBDave Reid
#62 2217877-filter-assets-cache-tags-prc-62.patch101.83 KBDave Reid
#56 interdiff.txt1.69 KBWim Leers
#56 filter_assets_cache_tags_prc-2217877-56.patch107.08 KBWim Leers
#54 interdiff.txt1.01 KBDave Reid
#54 2217877-filter-assets-cache-tags-prc.patch104 KBDave Reid
#50 interdiff.txt6.57 KBDave Reid
#50 2217877-filter-assets-cache-tags-prc.patch103.97 KBDave Reid
#48 interdiff.txt6.35 KBWim Leers
#48 filter_assets_cache_tags_prc-2217877-47.patch105.22 KBWim Leers
#45 2217877-psr4-reroll.patch99.38 KBxjm
#42 interdiff.txt11.57 KBWim Leers
#42 filter_assets_cache_tags_prc-2217877-42.patch104.78 KBWim Leers
#39 interdiff.txt47.55 KBWim Leers
#39 filter_assets_cache_tags_prc-2217877-39.patch101.48 KBWim Leers
2217877_36.patch1.81 KBchx
#22 interdiff-remnants.txt6.88 KBWim Leers
#22 interdiff-fails.txt6.02 KBWim Leers
#22 interdiff.txt12.38 KBWim Leers
#22 filter_assets_cache_tags_prc-2217877-22.patch95.06 KBWim Leers
#18 interdiff-rest.txt57 KBWim Leers
#18 interdiff-3.txt4.81 KBWim Leers
#18 interdiff-2.txt7.08 KBWim Leers
#18 interdiff-1.txt23.06 KBWim Leers
#18 filter_assets_cache_tags_prc-2217877-18.patch84.8 KBWim Leers
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Berdir’s picture

It also affects the entity/field storage cache. See #597236: Add entity caching to core.

Wim Leers’s picture

Wim Leers’s picture

catch’s picture

Priority: Major » Critical
Issue tags: +beta target

This is almost certainly an API change (unless we do a mixed return value or something to keep bc). Bumping priority. I'd put in an API change for this after beta since there's a relatively limited number of text filters out there and it'll be a simple update for the ones that don't need to use the stuff added here.

Wim Leers’s picture

Assigned: Unassigned » Wim Leers
Issue tags: +Performance, +Spark, +sprint

On this.

This is a MUST HAVE for things like media filters, or any filter that references any entity. And we actually have at least one example of that in core: any image that is inserted via CKEditor is registered by editor.module's hook_entity_(insert|update|delete) implementations, but that doesn't allow us to set a corresponding cache tag; this would enable that.

Dave Reid’s picture

Issue tags: +Media Initiative
Dave Reid’s picture

Yes this is very, *very* much needed. I started to look at this solution before I saw this issue and got very discourage since the use of check_markup() is often so low-level that a render API isn't even close to the object. For example: https://api.drupal.org/api/drupal/core%21modules%21text%21lib%21Drupal%2... is deep inside TypedData.

Dave Reid’s picture

Title: Text filters should be able to add #attached and cache tags » Text filters should be able to add #attached, #post_render_cache, and cache tags

We also need to be able to attach #post_render_cache, not just #attached.

Dave Reid’s picture

This is also complicated by the fact that text fields perform check_markup() on field load, rather than when the field is viewed/rendered.

catch’s picture

Now that we have render caching for entities, we could skip that optimization I think. It was only done for performance - i.e. to avoid 30 check_markup() cache hits when there were thirty entities rendered. But since we now cache the entire HTML, we'd not be making those cache hits all the time.

Berdir’s picture

Maybe, but we can only do that once the #pre_render issue landed, we still call it right now..

Also, that brings up an interesting question that I'm pretty sure we're ignoring right now? Text filters can control if they can be cached or not, and if they can't, then that needs to bubble up somehow as well and disable the render cache?

Dave Reid’s picture

If filters can use post_render_cache, then I forsee no more need for filter having disabled caching. All the cases I could think about wouldn't need it anymore.

Berdir’s picture

Hm, not sure, there is for example the complexity of multiple filters that run in a given order and if one of them doesn't support caching, then a filter that runs later might change something so that a post render callback no longer works? I doubt that we can just drop the logic that we have there right now...

What might work is that we move the complete check markup processing to a post render callback automatically if one of them says it's not cacheable?

catch’s picture

I'd be fine with removing the ability to have uncacheable text filters, even if we lose a feature somehow.

#836000: Media filter should not prevent the core text format cache from being used is a classic example of a non-cacheable filter - it was disabled caching precisely because of the lack of cache tags and #attached support.

I can think of one (not contributed) filter that can and should be refactored to use #post_render_cache if it gets ported to 8.x

Something like the PHP filter may or not need to disable caching depending on what the PHP is, in that case you still have the option of completely disabling render caching altogether, or #2099137: Entity/field access and node grants not taken into account with core cache contexts to increase granularity. So the option isn't taken away entirely if someone genuinely has a use case that absolutely cannot be solved any other way.

Wim Leers’s picture

#15: that's super helpful, thank you!

Wim Leers’s picture

Had to get other stuff done first for a while, now actively working on this!

Wim Leers’s picture

Assigned: Wim Leers » catch
Status: Active » Needs review
FileSize
84.8 KB
23.06 KB
7.08 KB
4.81 KB
57 KB

In this first patch:

1. All caching is removed from check_markup(), FilterInterface::preprocess() and FilterInterface::process().
Because we have entity render caching now.
We must keep $langcode on those 3 function signatures, because that exists not only for caching purposes — see #319788: Pass language information to filters.
This should allow us to close #198855: check_markup() hardcodes cache expiration, #199759: Allow custom cache expiration for filters / text formats.
See interdiff-1.txt.
2. Remove a filter's ability to declare itself uncacheable
Because we need to add #post_render_cache support anyway, which allows you to be as granular as you like with uncacheable things, instead of preventing the entire processed text from being cached, and by extension containing entity and even the page.
Hence also removed filter_format_allowcache(), which had no more use.
This allows us to close #1191938: Stop allowing filters to disable caching.
See interdiff-2.txt.
3. Remove check_markup() upon loading of fields.
Enabled by the above.
Addresses #9 & #10.
BUT: we still need to verify that this removal does indeed not cause a performance regression!
Berdir suggested to not remove this in this issue, what do others think?
See interdiff-3.txt.
All of the above are about getting rid of old assumptions. Now let's move on to the new stuff.
See interdiff-rest.txt.
4. We have one example in Drupal core that needs assets: filter_caption.
i.e. when you set a data-caption attribute on an image, the associated CSS should be loaded; this is currently done on all pages, using hook_page_build().
In this patch, we delete that and load the attached asset, but only when >=1 caption is actually generated. Comes with updated test coverage.
5. We have one example in Drupal core that needs cache tags: inline images.
i.e. when you use CKEditor (or any other text editor) and insert an image (through EditorImageDialog), then a data-editor-file-uuid attribute is set on the <img /> tag. This is how the Text Editor module is capable of correctly tracking in which nodes a file is used without that file being referenced via a file field, but being referenced through a processed text field instead.
There is no filter yet in current Drupal 8 HEAD to also reflect this in the cache tags for that text field (and hence for the containing node and containing page), but that's what this issue is here to resolve; so we add a new EditorFileReference filter, along with test coverage.
6. The solution: '#type' => 'processed_text', FilterInterface::process() can return either a string or a render array, check_markup() is a wrapper for the simple cases.
All logic in check_markup() is moved to filter_pre_render_text, which is the #pre_render callback for the new '#type' => 'processed_text'. The "native environment" for a piece of processed text is in a Drupal render array, which can have #attached assets, #cache tags and #post_render_cache callbacks. The "atypical environment" for a piece of processed text is in tokens, e-mails, and so on, where cache tags, assets and #post_render_cache callbacks don't make sense. In this case, we just define a code>'#type' => 'processed_text' element, call drupal_render() on it and return that result, and voila: we have the same "string only" output that we have today.
Comes with test coverage at FilterAPITest::testProcessedTextElement(); all check_markup() tests continue to exist, this ensures we didn't regress.
On the filter implementation side, filters that need to #attached, #cache tags and #post_render_cache callbacks may now return a render array, with the filtered text in #markup. Filters that don't need this can continue to work as today.
Hence, I was able to reduce the number of calls to check_markup() from 7 to 4 (converted use cases: user signature on comments, TextDefaultFormatter, \Drupal\views\Plugin\views\area\Text; remaining use cases: CKEditorPlugin\Internal, EditorController:getUntransformedText(), TextProcessed, Drupal\views\Plugin\views\field\Markup and tests). The remaining use cases are valid use cases of check_markup() (i.e. only string of markup is needed), with the exception of the Markup views field, which I unfortunately was unable to figure out where to use or configure, so I couldn't convert it.
UserSignatureTest updated to verify the cache tag of the text format is bubbled up.
TextFormatterTest written (we had zero test coverage for the text formatters!), this ensures we didn't regress.

Whew. Hope that helps.

We could potentially do points 1, 2 and 3 in separate issues each. Hence the separate interdiffs for those, to make that splitting up easier.

Assigning to catch for a +1 or -1 on the general approach, since he opened the issue.


I think the biggest WTF is going to be this snippet, which occurs several time in the patch:

+        // @todo Document; similar to what we saw in https://drupal.org/node/2099131
+        drupal_render($build[$id]['signature'], TRUE);

See #2099131-177: Use #pre_render pattern for entity render caching in particular. In a nutshell: in the case of the "processed text" element type, we need to do the text processing in some sort of preprocessing stage, for which #pre_render is the only fit. However, because of this, cache tags are only added during the #pre_render stage. And usually, processed text is inside a field, which is inside an entity, which is "themed". A "themed" entity has a corresponding Twig template, which operates on a copy of the renderable array and calls render() whenever it wants to render a subtree of a renderable array into a Twig template. Which means that without the above hack, we unfortunately invoke the #pre_render callback of "processed text" elements on a copy of the render array, and hence set cache tags on that copy of the render array, hence the cache tags don't bubble up. The only solution for now is to call drupal_render() (to invoke all #pre_render callbacks) on the element, render() will detect that the element was already rendered and will just return the final markup to the Twig template.
This is very ugly, and solving this is very much out of scope.

Status: Needs review » Needs work

The last submitted patch, 18: filter_assets_cache_tags_prc-2217877-18.patch, failed testing.

Berdir’s picture

My arguments for not touching prepareCache() here were that it a) It makes the patch bigger as there are a bunch of not really related changes, removed/changed tests and so on. And b) apart from text fields, the only other use case for PrepareCacheInterface in core are date fields, and the same argument as we have here (it is no longer needed because it's supposed to make the formatter faster, which is now cached), so we could remove that too and possibly even remove the whole feature.

And that part certainly needs it's own issue, so we could just as well leave the text item stuff to be removed there as well :) but looks like you already removed it, so we can probably remove that here and discuss it further in a follow-up...

Didn't look at the patch yet, but what about #13? If filters themself need to add post_render_cache callbacks, it's impossible to control the filter order?

catch’s picture

I didn't review the patch in depth, but the plan looks good.

@berdir the only filters I've ever come across that needed to prevent caching were either doing a (bad) workaround for the problems this issue fixes, or were the PHP filter where anything goes anyway. Do you have a concrete use case that is covered now that wouldn't be covered by this patch? I can see that there's a small regression, I just can't see a case where it would matter in practice.

I think it would make sense to remove the prepareCache changes out (and do them for date formats at the same time) if we can. If for some reason the issue blocks this one we could bump that to critical if we want.

I worked on the original patch that added the check_markup() in hook_field_load() and it was purely for optimization of content listings with lots of text fields (i.e. comments) so that they wouldn't do an extra round trip to the cache for each call to check_markup(). This is obsolete with render caching. Same for the date formats.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
95.06 KB
12.38 KB
6.02 KB
6.88 KB

Fixed the test failures in #18 (see interdiff-fails.txt). In doing so, I found a few more remnants that should've been deleted (see interdiff-remnants.txt).

Only one test fail in #18 was tricky/interesting: that in EntityReferenceFormatterTest. It's tricky because we're rendering an entity (rendered in a #pre_render callback) that has an entity reference formatter that renders the entire referenced entity (also rendered in a #pre_render callback), which contains amongst others a processed text field (also rendered in a #pre_render callback). The


#20:

  • I wanted to show the complete consequences of this patch, and it was simple enough to do, so I did it here. But I kept it in a separate interdiff, so it's easy to split out into its separate issue if that's preferred. So far, I think it's fine to do it as part of this patch. It'll justify the removal of PrepareCacheInterface further :)
  • Regarding #13, i.e. the impossibility of controlling #post_render_cache callback execution order. It's true, this is impossible. #post_render_cache is designed for this exact purpose: in the final HTML, find placeholders and replace them. This means that if you previously had an uncacheable filter that for example generated a <img src="llama.jpg" data-caption="here is a funky dynamic, user-customized llama" />, that filter_caption would still be able to convert that into an actual captioned image. With this, that's no longer possible.
    But I think that's only natural — if you need to retain that capability (which I'd question was ever used — counterevidence is welcome of course), then the filter system would be fundamentally incompatible with the render cache, and hence a single filter can destroy the performance of any Drupal site. I think that therefore that is an undesirable feature.
Berdir’s picture

I wanted to show the complete consequences of this patch, and it was simple enough to do, so I did it here. But I kept it in a separate interdiff, so it's easy to split out into its separate issue if that's preferred. So far, I think it's fine to do it as part of this patch. It'll justify the removal of PrepareCacheInterface further :)

Well, it might not be *that* simple, because I guess it's now questionable if we even want to keep the ->processed computed property, it's only used in very few places (mostly token implementations and tests). encourages use of check_markup(), which I think is discouraged with this (?) and is a performance overhead (computed field properties are always created). And if we do, we could at least simplify it and remove the complexity that was added to support setting the cached value. But I can we can discuss that part also in the follow-up.

catch’s picture

Assigned: catch » Unassigned

Unassigning myself.

Dave Reid’s picture

This is really great and wonderful job Wim implementing this! It made total sense to me. I took a stab at implementing this for the entity_embed module protoype: https://github.com/drupal-media/entity_embed/pull/6/files

The only odd thing I found is that when using post_render_cache placeholder HTML inside filtered text, because of the Html::normalize call in FilterHtmlCorrector::process(), the normal placeholder of <drupal:render-cache-placeholder .. /> was converted to <render-cache-placeholder ... ></render-cache-placeholder>, so I had to perform string replacement twice in my post_render_cache callback to make sure I was covering both placeholders. I had my entity_embed filter which ran *after* the HTML corrector filter, so I'm not sure what was up - and I think this is a test condition we're missing here.

Wim Leers’s picture

#25: Aha — great catch! I'll just rename the tag from <drupal:render-cache-placeholder> to just <drupal-render-cache-placeholder> and that should fix it. FilterHtmlCorrector uses DOMDocument, which AFAIK doesn't support (XML) namespaces, which is probably the cause.

Leaving at NR for further feedback.

Also provided feedback on https://github.com/drupal-media/entity_embed/pull/6/files :)

slashrsm’s picture

Reviewed and tested (with entity_embed PR that @DaveReid mentioned) this patch and it looks good to me. However, note that I'm not super experienced field expert so I may have missed something.

yched’s picture

Double yay for the ability to remove the special handling of TextItem's 'processed' property and possibly the whole prepareCache() construct.

I agree with @Berdir that it would be best considered thoroughly in an issue of its own though. Although I never really liked it (or its D7 equivalent), the 'processed' property has the advantage of encapsulating the correct processing according to the field settings (either check_plain() or check_markup()). I guess that logic could move to a dedicated sanitize() method on TextItem though.

Dave Reid’s picture

@Wim: Be also sure to note that Html::normalize also converts the single tag into an opening and closing tag.

Berdir’s picture

Also, see text_filter_format_update() and text_filter_format_disable() (not sure if that hook still exists with that name?!), that exists because we cache the processed texts in the field cache.

If that just works for the render cache, then we can remove the two hook implementations, see also FilterSecurityTest.

chx’s picture

Assigned: Unassigned » chx

Assigning to myself only to make sure it appears on my TODO list; I do not intend to patch this but I do intend to security review this. Someone must take care of Drupal 8 security and apparently noone else does.

Edit: to clarify, please continue developing/discussing etc but do not RTBC or commit this until I have approved it. I know this is harsh but something must be done. I do not like this but I do not really want to find security holes complete with the exploit PoC in the relevant change notice ever again either.

chx’s picture

@@ -1337,8 +1337,9 @@ function template_preprocess_comment(&$variables) {
     $variables['user_picture'] = array();
   }
 
-  if (\Drupal::config('user.settings')->get('signatures') && $account->getSignature()) {
-    $variables['signature'] = check_markup($account->getSignature(), $account->getSignatureFormat(), '', TRUE) ;
+  if (isset($variables['elements']['signature'])) {
+    $variables['signature'] = $variables['elements']['signature']['#markup'];

You see, the same pattern plays a role in the title XSS so I am superb anxious: you are separating assigning a template variable from the security guard of it. With the current code flow you have no way to know whether this is filtered already or some dimwit put a raw variable in there. Let me get back to this in a bit.

All of the format's filters will be applied, except for filters of the types that are marked to be skipped. FilterInterface::TYPE_HTML_RESTRICTOR is the only type that cannot be skipped.

That's good, that was the first I thought of when I saw we can skip filters, but I'd word it like "All of the format's filters will be applied, except for filters of the types that are marked to be skipped. FilterInterface::TYPE_HTML_RESTRICTOR filters will always apply, they can not be skipped."

I would vastly simplify the code responsible for this:

  if (in_array(FilterInterface::TYPE_HTML_RESTRICTOR, $filter_types_to_skip)) {
    $filter_types_to_skip = array_diff($filter_types_to_skip, array(FilterInterface::TYPE_HTML_RESTRICTOR));
  }
// ...
  foreach ($filters as $filter) {
    // If necessary, skip filters of a certain type.
    if (in_array($filter->getType(), $filter_types_to_skip)) {
      continue;
    }
    if ($filter->status) {
      $text = $filter->prepare($text, $langcode);
    }
  }

You could save some continue and some array_diff both of which are prone to some really fun errors. array_diff seems really bad, as it compares on (string) $elem1 === (string) $elem2 and I do not even want to think of the consequences later. Nah, let's not play this sort of roulette:

  function _filter_needs_applying($filter, $filter_types_to_skip) {
    $type = $filter->getType();
    return $filter->status && ($type == FilterInterface::TYPE_HTML_RESTRICTOR || !in_array($type, $filter_types_to_skip));
  }
  foreach ($filters as $filter) {
    if (_filter_needs_applying($filter)) {
      $text = $filter->prepare($text, $langcode);
    }

Use a closure if you like that better. Separation of code because process needs it too.

In EntityReferenceFormatterTest

    entity_create('filter_format', array(
      'format' => 'full_html',

What happened here that it needs a full html filter format for testing? For posterity, this is the exact sort of question I'd like to see on issues. Even if you do not grasp the full code flow, this is an insecure text format entity so if it appears somewhere where previously it didn't exist then there is a chance something somewhere doesn't get filtered and that needs to be clarified at least, fixed at most. Maybe it's just because previously we didn't need the full blown entity in the test but ran individual filters or perhaps it is a quick shortcut in the test instead of listing tags -- whatever, it needs clarification.

filter_pre_render_text takes a #context which can contain langcode and ... and .... erm, nothing else. Why don't we use #langcode? If in the future more properties are neded, then we ... then we add more properties. What's the difference between #filter_types_to_skip and #langcode that #langcode needed to be stuffed inside a #context ?

Now the filter process can return a render array which is not rendered but processed by filter_pre_render_text. In the Middle Ages they were building cities where the streets were deliberately confusing to lead an invading army back to the walls. Are we trying to confuse the enemy here :P ? Why it doesn't return an object instead? Ie.

$text = (string) $result;
if ($result instanceof FilterProcessResult) {
  $all_cache_tags[] = $result->cacheTags();
  $all_attached_assets[] = $result->attached();
  $all_post_render_cache_callbacks[] = $result->postRenderCache();
}

Or, you know, if you were to remove the (string) cast then we would have an excellent way to check for being filtered -- do instead

if ($result instanceof FilterProcessResult) {
}
else {
  $text = new FilterProcessResult($result);
}

and then check in preprocess whether it's a FilterProcessResult and if not, then raise an exception, check_plain it, throw it out the window, whatever, just don't let it through. And when it actually becomes text then __toString will do the work automatically.

chx’s picture

Assigned: chx » Unassigned

I would appreciate a <h2>DO NOT COMMIT UNLESS CHX APPROVES<h2> in the issue summary but it'd be really obnoxious if I added that so I won't.

chx’s picture

Issue summary: View changes
chx’s picture

Issue summary: View changes
Dave Reid’s picture

Dave Reid’s picture

I think the next steps are:

  1. Need drupal_render_cache_generate_placeholder to return markup that wouldn't need to be changed by Html::normalize().
  2. Add test coverage for using a post_render_cache text format with the 'Basic HTML' text format (or a format that includes 'Limit allowed HTML tags' and 'Correct faulty and chopped off HTML'
  3. I'm guessing that if we get #2, we'll need to add the render cache placeholder tag to the list of default allowed HTML in \Drupal\filter\Plugin\Filter\FilterHtml.
Wim Leers’s picture

FileSize
101.48 KB
47.55 KB

#23/#28/#30: Ok, removed the changes in TextItemBase. Opened a new issue for doing this: #2274517: Remove \Drupal\Core\Field\PrepareCacheInterface.


#25/#26/#29: Changed <drupal:render-cache-placeholder /> to <drupal-render-cache-placeholder></drupal-render-cache-placeholder> (both tag name changed and closing tag added for compatibility with DOMDocument and therefore Html::normalize()).
Test coverage updated to ensure that this works correctly and prevent regressions.


#31/#32:

You see, the same pattern plays a role in the title XSS so I am superb anxious: you are separating assigning a template variable from the security guard of it. With the current code flow you have no way to know whether this is filtered already or some dimwit put a raw variable in there. Let me get back to this in a bit.

We talked about this in chat, and chx' only concern is idiot-proofing Drupal here. Of course you shouldn't be rendering things in the theme layer. But unfortunately, that happens.
Yet another reason for #2060783: Remove the preprocess layer./

That's good, that was the first I thought of when I saw we can skip filters, but I'd word it like "All of the format's filters will be applied, except for filters of the types that are marked to be skipped. FilterInterface::TYPE_HTML_RESTRICTOR filters will always apply, they can not be skipped."

This is completely out of scope; this is not new code, this is just moving around of new code. Please open a new issue for this; this patch is very big as is.

What happened here that it needs a full html filter format for testing?

The pre-existing test coverage was broken: the "Full HTML" text format was being used, but didn't exist! This issue merely uncovered that, and the code you cited is the fix.
Also, this does test not at all need Full HTML; it just needs a text format. It needs a text format to verify that the entity reference formatter bubbles up the cache tag of the text format of the body field of the referenced entity.

filter_pre_render_text takes a #context which can contain langcode and ... and .... erm, nothing else. Why don't we use #langcode? If in the future more properties are neded, then we ... then we add more properties. What's the difference between #filter_types_to_skip and #langcode that #langcode needed to be stuffed inside a #context ?

Because the langcode provides context to all invoked filter, but filter_types_to_skip is not at all related to the context and is not passed to filters. But, you're right, we could rename #langcode to #context at a later point if needed, this change is inappropriate.
Changed to #langcode as you suggested.

Now the filter process can return a render array which is not rendered but processed by filter_pre_render_text. In the Middle Ages they were building cities where the streets were deliberately confusing to lead an invading army back to the walls. Are we trying to confuse the enemy here :P ? Why it doesn't return an object instead? Ie.

I don't think it's any more confusing or different than other APIs, but I agree that this is better. Added FilterProcessResult.


#38:

  1. Done, plus text coverage.
  2. See above. The "Limit allowed HTML tags" filter (filter_html) must simply always run before any filter that adds #post_render_cache callbacks (and render cache placeholders).
  3. See above.

Status: Needs review » Needs work

The last submitted patch, 39: filter_assets_cache_tags_prc-2217877-39.patch, failed testing.

chx’s picture

About the fails, we struggle with them in the twig autoescape (funny how things come together although this object suggestion predated the autoescape reboot) -- WebTestBase::buildXPathQuery needed if (is_string($value) || $value instanceof SafeMarkup). In general, your enemies include json_encode not running __toString , we patched InsertCommand::__construct to read $this->html = (string) $html; which I suspect will solve some of your fails too and PHP being stupid beyond imagination and not casting array indexes although I am not sure whether any check_markup'd code ends up in array index but if it does you need a manual string cast and also strpos $needle needs a string cast (I even wrote a phpwtf entry for that one).

Wim Leers’s picture

Assigned: Unassigned » Wim Leers
Status: Needs work » Needs review
FileSize
104.78 KB
11.57 KB

I talked to chx in chat, and we concluded that his suggested security hardening from #32 for template_preprocess_comment() can't work because drupal_render() ensures that every instance of #markup is a string (therefore it's impossible to do a if ($el['#markup'] instanceof FilterProcessResult) check).
Fortunately, his Twig autoescape work will already ensure this is safe.

Fixed all test failures. This should be good to go.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Should be good.

Wim Leers’s picture

Issue summary: View changes
xjm’s picture

catch’s picture

Status: Reviewed & tested by the community » Needs work
  1. +++ b/core/modules/ckeditor/src/Plugin/CKEditorPlugin/Internal.php
    @@ -251,22 +252,42 @@ public function getButtons() {
        *   An array containing the "format_tags" configuration.
        */
       protected function generateFormatTagsSetting(Editor $editor) {
    -    // The <p> tag is always allowed — HTML without <p> tags is nonsensical.
    -    $format_tags = array('p');
    

    The caching added here looks a bit out of scope?

  2. +++ b/core/modules/comment/src/CommentViewBuilder.php
    @@ -131,6 +131,18 @@ public function buildComponents(array &$build, array $entities, array $displays,
    +        );
    +        // @todo Document; similar to what we saw in https://drupal.org/node/2099131
    +        drupal_render($build[$id]['signature'], TRUE);
    +      }
    

    We don't assign the result of drupal_render() to anything. Needs a comment if that's correct.

  3. +++ b/core/modules/editor/src/Plugin/Filter/EditorFileReference.php
    @@ -0,0 +1,58 @@
    + * )
    + */
    +class EditorFileReference extends FilterBase {
    +
    +  /**
    +   * {@inheritdoc}
    +   */
    +  public function process($text, $langcode) {
    

    I'd expect this to be done in the filter that also generates the image tag etc., why a separate filter?

  4. +++ b/core/modules/filter/filter.module
    @@ -98,10 +99,137 @@ function filter_element_info() {
    +
    +  // Collect all #post_render_cahe callbacks.
    +  if (isset($element['#post_render_cache'])) {
    

    cahe.

  5. +++ b/core/modules/filter/filter.module
    @@ -1144,10 +1201,3 @@ function filter_filter_secure_image_alter(&$image) {
    -
    -/**
    - * Implements hook_page_build().
    - */
    -function filter_page_build(&$page) {
    -  $page['#attached']['library'][] = 'filter/caption';
    -}
    

    Yay.

  6. +++ b/core/modules/filter/filter.services.yml
    @@ -1,11 +1,4 @@
     services:
    -  cache.filter:
    -    class: Drupal\Core\Cache\CacheBackendInterface
    -    tags:
    -      - { name: cache.bin }
    -    factory_method: get
    -    factory_service: cache_factory
    

    More yay.

  7. +++ b/core/modules/filter/src/FilterProcessResult.php
    @@ -0,0 +1,204 @@
    +
    +  /**
    +   * An array of associated assets to be attached.
    +   *
    +   * @see drupal_process_attached()
    +   *
    +   * @var array
    +   */
    +  protected $assets;
    +
    +  /**
    +   * The attached cache tags.
    +   *
    +   * @see drupal_render_collect_cache_tags()
    +   *
    +   * @var array
    +   */
    +  protected $cacheTags;
    

    This looks extremely close to HtmlFragment and CacheableInterface (see #2256373: Factor HtmlFragment out to an interface for example). Wondering why you went for a new class as opposed to returning an render array (then handling string/array return value) or extending one of those.

chx’s picture

Owie! The previous patch was a render array but it didn't go through drupal_render but a fixed list of properties was processed and so I recommended an object. It could've been a normal array instead of an object but an object is more D8-ish. A render array that doesn't get passed to drupal_render is just wrong.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
105.22 KB
6.35 KB
  1. No, it's definitely intentional and in scope. This was one of the very few remaining places where the cache_filter cache was actually useful. This code gets executed on every single page that has a text editor. Since cache_filter is gone, we now do our own caching.
  2. Comment added.
  3. Because it's independent of captioning/alignment. We always want cache tags to be set, but not every use case needs (or wants to allow) captioning and alignment. Hence it should be separate. (FYI: The current FilterCaption filter does both alignment and captioning, and is being split also, over at #2239413: Split FilterCaption into FilterAlign (for data-align) and FilterCaption (for data-caption).)
  4. Fixed
  5. :)
  6. :)
  7. I originally had the @return string|array strategy. But chx strongly opposed that:
    Now the filter process can return a render array which is not rendered but processed by filter_pre_render_text. In the Middle Ages they were building cities where the streets were deliberately confusing to lead an invading army back to the walls. Are we trying to confuse the enemy here :P ? Why it doesn't return an object instead?

    I honestly didn't consider HtmlFragment nor CacheableInterface because both of them allow both too much and too little: neither supports assets or #post_render_cache callbacks, and both support cache keys, cache max-age, etc. We really only want three things: processed text (which can be considered HTML, indeed), cache tags and #post_render_cache callbacks. Hence a new object.

Dave Reid’s picture

Yeah I kind of agree with catch. Having a new class just for here seemed odd. If we can't repurpose HtmlFragment, then I'd probably prefer a render array instead.

The main problem I have with the existing FilterProcessResult is that I can't just *add* assets, cache tags or post_render_cache callbacks. If my filter needs to process more than one item, I have to run a NestedArray::mergeDeepArray every time I want to call setCacheTags or setPostRenderCacheCallbacks because it's such an all or nothing call.

Dave Reid’s picture

Tweaked changes to FilterProcessResult which added support for addCacheTags(), addAssets(), and addPostRenderCacheCallback(). Also adds regression protection for drupal_render_cache_generate_placeholder() to ensure it will not be modified by Html::normalize() in the future.

Dave Reid’s picture

I think ultimately we want what FilterProcessResult to be is what is desired in #1843798: [meta] Refactor Render API to be OO, but that seems to be D9 material at this point.

Dave Reid’s picture

Status: Needs review » Needs work

The last submitted patch, 50: 2217877-filter-assets-cache-tags-prc.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
104 KB
1.01 KB

Fixed the NestedArray issues. The other fails seem unrelated, so let's give this a try.

Status: Needs review » Needs work

The last submitted patch, 54: 2217877-filter-assets-cache-tags-prc.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
107.08 KB
1.69 KB

The cause of those failures is $all_file_cache_tags not being an array of cache tags, but an array of arrays of cache tags :)

Just one other minor change.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

RTBC'ing to get feedback from catch, to see whether he likes Dave Reid's change, which addressed the only remaining concern.

catch’s picture

Assigned: Wim Leers » alexpott

I still don't like the class, but on the other hand if we change it again (i..e if CacheableInterface and friends were able to support what we need here) it's only going to affect the filter process API (which I'd be fine making changes to any time up until RC frankly).

Assigning to Alex for a second opinion.

catch’s picture

Category: Task » Bug report

Also this is a bug.

chx’s picture

On a somewhat related issue: check_markup must return an object after the twig autoescape patch; this object extending SafeMarkup is IMO no big deal.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 56: filter_assets_cache_tags_prc-2217877-56.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
101.83 KB

Re-rolled for 8.x only.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 62: 2217877-filter-assets-cache-tags-prc-62.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
101.77 KB

Fixes the regression in FilterTestPostRenderCache due to #2275679: Clean up render cache placeholder.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

Was un-RTBC'd for rerolling against HEAD changes only, so back to RTBC.

alexpott’s picture

Issue tags: +Missing change record

We need a change record.

alexpott’s picture

  1. +++ b/core/modules/editor/src/Tests/EditorFileReferenceFilterTest.php
    @@ -0,0 +1,120 @@
    +class EditorFileReferenceFilterTest extends DrupalUnitTestBase {
    

    DrupalUniteTestBase has been deprecated should KernelTestBase. This is very minor and if the point below is won't fix then we should ignore this.

  2. +++ b/core/modules/filter/src/FilterProcessResult.php
    @@ -0,0 +1,250 @@
    +  /**
    +   * Adds #post_render_cache callbacks associated with the processed text.
    +   *
    +   * @param callable $callback
    +   *   The #post_render_cache callback that will replace the placeholder with
    +   *   its eventual markup.
    +   * @param array $context
    +   *   An array providing context for the #post_render_cache callback.
    +   *
    +   * @see drupal_render_cache_generate_placeholder()
    +   *
    +   * @return $this
    +   */
    +  public function addPostRenderCacheCallback($callback, array $context) {
    +    $this->postRenderCacheCallbacks[$callback][] = $context;
    +    return $this;
    +  }
    

    Hmmm... a callable can be an array to call a method on a class. Does the post render cache system not support that?

Eric_A’s picture

+    $this->assets = drupal_merge_attached($this->assets, $assets);

Why the procedural wrapper?

EDIT: @Wim Leers, it's there since #56.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 64: 2217877-filter-assets-cache-tags-prc-64.patch, failed testing.

Dave Reid’s picture

Status: Needs work » Needs review
FileSize
101.98 KB

interdiff is failing to generate for me, so:

- Fixed merge conflicts
- Clarified that the post_render_cache callback is a string callback, and not a callable.
- Use NestedArray::mergeDeep in FilterProcessResult:: addAssets()
- Fixed the processed_text element had default langcode of 'und' but check_markup() had a default langcode of an empty string. Used an empty string in both places now.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
105.09 KB
1.43 KB

Fixed #67.1.

#68: Because that's the standardized way of merging attached assets. Which means that if that way changes, this instance will also be updated automatically.

#70: ugh… if you fix the merge conflicts first next time, then you can actually generate an interdiff just fine. Now you forced us to re-read through this entire >100 K patch :( Manually verified, hence back to RTBC.

damiankloip’s picture

  1. +++ b/core/includes/common.inc
    @@ -3662,7 +3661,7 @@ function drupal_render_cache_set(&$markup, array $elements) {
    + * @param string $callback
    

    Is this strictly true? I can understand we don't want to serialize objects here but you can call static methods etc.. in the form 'MyClass::method' or array('MyClass', 'method').

  2. +++ b/core/modules/ckeditor/src/Plugin/CKEditorPlugin/Internal.php
    @@ -251,22 +252,42 @@ public function getButtons() {
    +    if ($cached = \Drupal::cache()->get($cid)) {
    ...
    +      \Drupal::cache()->set($cid, $format_tags, Cache::PERMANENT, $format->getCacheTag());
    

    Should we cache this in the filter bin instead? Not sure why this needs to live in the default bin when filter has its own...

    This dependency should also be injected but looks like ckeditor plugins don't do this at all right now. So could create a follow up for that.

  3. +++ b/core/modules/editor/src/Plugin/Filter/EditorFileReference.php
    @@ -0,0 +1,55 @@
    +          $file = entity_load_by_uuid('file', $uuid);
    

    This should use \Drupal::entityManager()->loadEntityByUuid() instead. Previous comment also applies here about follow up to inject deps.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 71: filter_assets_cache_tags_prc-2217877-71.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
Issue tags: -Missing change record
FileSize
108.63 KB
6.95 KB

The sudden fail in #71 is due to #2183231: Make ContentEntityDatabaseStorage generate static database schemas for content entities getting committed between #70 and #71. Fixed in this reroll.


#72:

  1. #post_render_cache only accepts string callbacks — as you already indicate, a static method callable can be expressed in two different ways, and we only support one. Why? Because the callables are used as array keys, with the values being the different contexts in which they should be called.
  2. The filter bin is being removed as part of this patch. Agreed RE:dependency injection; fixed.
  3. Agreed; fixed.

Change notice written: https://drupal.org/node/2278483

damiankloip’s picture

Thanks, looks good. So if you use something other than a string callable it is going to tell you anyway - cool.

My only pick would be that a constructor should always come ahead of a create() method.

Wim Leers’s picture

Status: Needs review » Reviewed & tested by the community

#75: Okay, good :) Regarding that only nitpick: I ordered it in the same way as NodeSearch & UserSearch, so let's not delay this over that.

This can now go back to RTBC.

damiankloip’s picture

That would have taken a few seconds but ok.

damiankloip’s picture

FileSize
105.45 KB
3.52 KB

Sorry, making that small change anyway.

Dave Reid’s picture

Still confirming that #78 looks and works great!

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

Committed d5be5c0 and pushed to 8.x. Thanks!

I can't think of a better name for the class either and at least it is called exactly what it is.

  • Commit d5be5c0 on 8.x by alexpott:
    Issue #2217877 by Wim Leers, Dave Reid, damiankloip, xjm, chx | catch:...
Wim Leers’s picture

Assigned: alexpott » Wim Leers
Issue tags: -sprint

Hurray! :)

Published the change record draft: https://drupal.org/node/2278483

Wim Leers’s picture

I was able to close the following issues thanks to this one landing:

(Wow, that included two issues with node IDs below 1 million! Ancient issues!)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Gábor Hojtsy’s picture

Issue tags: -Media Initiative +D8Media

Fix media tag.