After #538164: Comment body as field there's very little reason to for filter module to maintain it's own dedicated cache - since all text field on entities will be handled by text.module which caches the results of check_markup() in the field cache instead. Block module will hopefully follow soon, then that's nearly all the cases of textareas with text formats attached which are regularly displayed.

Two options:

1. remove caching from filter.module entirely

2. Just use the default {cache} bin.

We've added a lot of caches in Drupal 6 and 7, it's time to start taking advantage of some of the higher level ones to clean out stuff like this.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

catch’s picture

Status: Active » Needs review
FileSize
4.35 KB

Here's a patch which simply removes {cache_filter} and the cache_set() and cache_get() in check_markup(). It also fixes a bug in HEAD where the _submit() handlers in filter.admin.inc weren't invoking hook_filter_update(). This will need to be fixed separately if this one doesn't get in.

Any module using a text field now gets check_markup() caching for free via field API. I can't think of many other cases where texts are run through check_markup() which are shown sufficiently often, or are sufficiently long, to warrant caching.

Status: Needs review » Needs work

The last submitted patch failed testing.

swentel’s picture

Status: Needs work » Needs review
FileSize
5.64 KB

Lot's of notices, $cache, $cache_id were still in function calls. Patch attached should make it nicer ...

Status: Needs review » Needs work

The last submitted patch failed testing.

swentel’s picture

Status: Needs work » Needs review
FileSize
9.42 KB

Another try, garbage collection wanted to clear filter_cache and a small removel in filter.test too.

hass’s picture

+

sun’s picture

I'm not entirely sure whether we want to do this. There are many modules in contrib that use filter formats on "fields" that still won't be fields in D7 - rather variables and stuff like that. Using those modules would mean to vastly decrease performance.

It's definitely debatable - and I see the following two options:

1) Flip the default for the $cache argument and alter the PHPDoc so it states that modules that use check_markup() for "custom" stuff should use $cache = TRUE to improve performance.

2) Remove the caching as proposed and defer caching to the respective modules. However, this option may mean that we'll see many more cache tables added by contributed modules.

In any case, this is an API clean-up, so we won't have to rush this into RTBC until tomorrow.

hass’s picture

I'm using the table in linkchecker, but I've used it only for the reason that check_markup() used it and I was made to clone check_markup() to add some customizations. Not sure if I can go the fields way, but I need to cache a custom filtered output for internal use.

Where does fields store the filtered data?

catch’s picture

Fields stores it in cache_field - see text_field_load().

I'd be happy going with sun's suggestion of reversing the 'cache' argument - and maybe just use the {cache} table for those custom fields, if we don't want to take the measure of removing it completely. We could do this as an interim step and then revisit the usage of it in D8 for a completely removal.

sun’s picture

Introducing a new tag for feature freeze: API clean-up.

Dries’s picture

I'd be OK removing the filter cache, because we shouldn't cache things twice for things that use the field API. Two things:

1. I'd be curious to learn what performance impact this patch has.

2. We'd need to update the PHPdoc to warn module developers that filters can be expensive to use, and to encourage module maintainers to do their own caching. Maybe that should go into the filter system introduction, i.e. the section explaining the filter system.

catch’s picture

We already don't cache things twice in field API - unless someone was to manually run check_markup() on the contents of a text field - see http://api.drupal.org/api/function/text_field_load/7 and the $cache argument to check_markup() - which was introduced with that patch. So this would be more cleanup than removing double-caching.

sun’s picture

My worry is that we presume that everything will become a field in D7, which may or may not be true - we'll see. Field API is quite a monster and it's possible that some contrib modules are straight ported to D7 without leveraging Field API (which is the only facility that caches filtered text on its own).

As long as we do not have a new Cache API (which unfortunately did not make it into D7), I disagree with the removal of the $cache parameter (and thereby also {cache_filter}).

We can flip the default value for $cache and skip caching by default.

We can still remove the built-in filter cache in D8 if it turns out that almost no module needs or wants to use it (and isn't able to cache on its own).

moshe weitzman’s picture

I guess I hesitate a bit on this too. The filter cache is one of the most transparent in all of drupal. And thats a good thing. I fear that this will discourage devs from calling check_markup(). Reckless, but it happens.

Tough call. If pressed, I vote to leave this cache in.

sun’s picture

Kick-start - needs filter system AND i18n team review.

catch’s picture

fwiw I agree with just flipping the default rather than removing the whole thing.

One question on sun's patch, what about using the block cache for custom blocks instead of the check_markup() cache?

sun’s picture

uhm, "the block cache" ? What's that?

sun’s picture

ok, I see. As long as custom blocks are not fieldable (#430886: Make all blocks fieldable entities), blocks may be cached (when configured), but we don't know upfront in hook_block_view() implementations whether the block will be cached.

That said, I wonder whether hook_block_view() implementations shouldn't get passed a $block object instead of just $delta. That way, we could attach the $cid and let implementations decide whether they need to cache anything on their own.

catch’s picture

I meant in block's own implementation of block_block_info(), set 'cache' to something (probably universal). http://api.drupal.org/api/function/block_block_info/7

On the other hand, this would require special handling of uncacheable filters same as we do in text_field_load(), so scratch that one.

sun’s picture

All that blocks this patch is another i18n team member review - I've set the $langcode argument for check_markup() in various places to something meaningful.

That said, we could also defer that to a separate issue, but I thought, while being there, and having to pass "something", I could straight go ahead and fix the missing language arguments.

Note that those are only used by replacement filters to determine, which language the replacement values may use.

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
10.42 KB

Removed those language codes == separate issue, if at all. In the meantime, I think they were wrong. $langcode should only be passed to check_markup() in case the original data really is language specific. Otherwise, filters will have to use the global $language anyway.

So this is RTBC from my POV.

Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
10.44 KB

Stupid me.

catch’s picture

Status: Needs review » Reviewed & tested by the community

Patch looks great, calls to check_markup() are going to reduce significantly in D7 contrib assuming field API uptake, as you can see by the precious few in core touched by this patch. So this is a good step to rationalizing some of our lower level caches in favour of higher level ones where appropriate.

catch’s picture

Title: Remove the filter cache » Make check_markup() not cache by default

er, more accurate title.

sun’s picture

Issue tags: +D7 API clean-up

Tagging absolutely critical clean-ups for D7. Do not touch this tag.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

I also like this as an interim step before D8.

However:

-  $data['content'] = check_markup($block->body, $block->format);
+  $data['content'] = check_markup($block->body, $block->format, '', TRUE);

Should we swap the $langcode and $cache parameters? I weep at how ugly that new call is.

sun’s picture

Status: Needs review » Reviewed & tested by the community
sun’s picture

This patch is required to move on with #562932: {filter_format}.cache is not saved

webchick’s picture

Ok, the summary of the IRC conversation that sun and I had is that no, they should not be switched, because each one of those calls should be passing in whatever language string that's most appropriate. Results need to be cached per language, because something like pirate.module might want the English translation of "Hello" to be "Yarrr" and the French translation of "Bonjour" to be "Le Yarrr" and similar (to pick a totally silly example).

The patch back in #15 attempted to pass in a sensible $langcode variable in each case. However, this really requires a longer, more over-reaching discussion on how best to do this, including involvement from the i18n team. Especially now that as of #282191: TF #1: Allow different interface language for the same path we have even more possible language contexts.

Therefore, dealing with $langcode has been moved to #601074: check_markup(): $format_id must be required, $langcode should default to LANGCODE_NOT_SPECIFIED; the idea being that fixing these calling functions is a "bug fix" vs. switching the default value of cache here is an "API change." I really don't like this kind of artificial separation at all, and would vastly prefer we fix it all here. However, sun pointed out that fixing the fact that $langcode isn't approrpriately used in core doesn't have anything to do with caching, which I guess is fair enough...

Anyway, I went ahead and committed this to HEAD, despite the above concerns, since it seems like a fairly reasonable middle ground between double-caching everything and removing these at-times redundant caching tables altogether and forcing all of contrib to use Field API (or its own separate caching table) to cache anything.

Please document the change and recommendation in the 6 -> 7 upgrade guide.

webchick’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs documentation
sun’s picture

Issue tags: -D7 API clean-up

.

samuelsov’s picture

Status: Needs work » Needs review

Updated section http://drupal.org/update/modules/6/7#check_markup_params (and for issue http://drupal.org/node/446518 at the same time). Documentation needs review.

#d7csmtl

sun’s picture

Status: Needs review » Fixed

Thanks! I've improved them slightly.

Status: Fixed » Closed (fixed)
Issue tags: -Needs documentation, -FilterSystemRevamp, -API clean-up

Automatically closed -- issue fixed for 2 weeks with no activity.