Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Blocks #606840: Enable internal page cache by default.
This fixes the 3 failures o/t parent issue in:
Drupal\search\Tests\SearchCommentTest
We must ensure that when search pages are cached in the internal page cache (or an external reverse proxy), they have a cache tag that is invalidated whenever the search index is updated. Otherwise, anonymous users will continue to see outdated search results.
Beta phase evaluation
Issue category | Bug because missing cache tags are breaking page caching. |
---|---|
Prioritized changes | The main goal of this issue is performance: it blocks #606840: Enable internal page cache by default. |
Disruption | Very little disruption. A method is added to the Search plugin interface, but it is given a default implementation in the base class. |
Comment | File | Size | Author |
---|---|---|---|
#26 | search_index_cache_tags-2460911-26.patch | 9.29 KB | Wim Leers |
Comments
Comment #1
Wim LeersSearch index are per-plugin. So we set and invalidate cache tags on a per-plugin-search-index basis.
Comment #2
Wim LeersComment #3
Wim LeersComment #4
Wim LeersComment #5
Fabianx CreditAttribution: Fabianx for Drupal Association commentedIf it was not for the 'Needs tests' part, I would RTBC this now :).
Comment #6
BerdirInteresting. Approach looks good to me. Looks like we can't use the entity cache tag, because that is not always available?
https://www.drupal.org/project/search_api will need a similar issue I think, and it only has views integration to display search results. So I guess the query backend will need to add a similar cache tag, probably for the search index and server.
Speaking of that, search.module has views integration too. We should probably make sure that the corresponding cache tags also exist when a view with those filters is used?
Comment #7
Wim LeersPer #2461087-3: Add 'no_cache' route option to mark a route's responses as uncacheable (was: Cron run response should not be cacheable) & #2461087-4: Add 'no_cache' route option to mark a route's responses as uncacheable (was: Cron run response should not be cacheable), this hunk is unnecessary once #2461087: Add 'no_cache' route option to mark a route's responses as uncacheable (was: Cron run response should not be cacheable) lands. So let's postpone this on that issue.
Comment #8
Wim LeersPostponing per #7.
Comment #9
Wim LeersTests.
Comment #10
Wim Leers#2461087: Add 'no_cache' route option to mark a route's responses as uncacheable (was: Cron run response should not be cacheable) landed, this is now unblocked.
Straight reroll, with the hunk mentioned in #7 removed.
Comment #11
Fabianx CreditAttribution: Fabianx for Drupal Association commentedThis is an API change.
ROTFL :-D
Don't we need to run cron first to make this happen?
Leaving at Code needs review.
We need a beta eval and approval of the API / Interface change and a change record for the API change.
Comment #12
Wim LeersNo, the cache tags ensure that the displayed data is updated immediately.
Comment #13
BerdirOpened #2463715: Properly integrate with Drupal Core's caching system for search API.
Comment #14
jhodgdonThis patch may work, sort of, for NodeSearch, but it is not really quite right.
This is not necessarily correct. Although NodeSearch uses the plugin ID as $type, that is not necessarily what all other plugins will do. In particular, I am aware of one contrib module that, when ported to 8, will be using a different $type for each *instance* of the plugin, not one shared by this plugin type. So this line needs to change to ask the plugin for what $type it is using.
The other thing is that NodeSearch and potentially other plugins are also removing data from the search index in other places. You've got an invalidate in search_index_clear, but ..
Actually, wouldn't it be simpler to just put the invalidate in all of the functions in search.module that are adding or removing from the index? This would be a *lot* easier. These functions are:
search_index()
search_index_clear()
These are the only two ways that plugins using the index are supposed to be adding/removing data from the search index, so if you invalidate there, you'll be gold. Then you don't have to worry about changing the API for the search plugins.
Comment #15
Fabianx CreditAttribution: Fabianx for Drupal Association commentedI agree with #14, thanks for chiming in.
Comment #16
jhodgdonAs one of the maintainers of the D8 search module, it's my job to chime in. ;)
Comment #17
jhodgdonAs one other note, not all Search plugins even use the index. Take a look at UserSearch as one example that doesn't.
So in SearchController, you need to ask the active search plugin instance which $type, if any, it's using in the search index, and invalidate that. I suggest adding a method to SearchInterface to get this information, implementing it with a '' or NULL return in the base class SearchPluginBase, and then overriding this in NodeSearch to return 'node_search'.
If you'd like me to take this over and make a patch, let me know...
Comment #18
jhodgdonHm. So, in the case of a Node search, the following things would change which nodes are displayed as search results on a given results page:
- Updates to any node -- this issue should catch these, at least after cron runs, because they'll result in node reindexing. Language, published/unpublished status, taxonomy terms, and author directly go into the query in some cases.
- Any update to a node that affects node access, or turning on/off a node access module.
- Updates to factors that can affect search rankings. In Core, these are (in addition to editing the node, which will eventually filter into an update to the search index): number of comments, sticky flag, promoted flag, and number of times the node has been viewed if the Statistics module is turned on.
And then the page output should also depend on the display of the specific nodes that are in the search results.
User search is much simpler. It just matches on user name (and email address if an admin is searching), and the query also filters out blocked users (unless an admin is searching). So basically if any user account is edited, all user search pages should probably be invalidated.
Is that being taken care of elsewhere?
Comment #19
Wim Leers#14: thanks for the review! Super helpful pointers :)
Did this. But while doing this, I noticed that
search_index
actually callssearch_index_clear()
… so really, we only need tag invalidation insearch_index_clear
!This also means no API change is needed anymore :)
See
interdiff-14.txt
.#17: Done.
See
interdiff-17.txt
.#18:
Indeed, the patch already handles that.
Installing/uninstalling modules clears the render cache.
Updates to nodes affecting node access: see
below.#2241249: First step in making search results pages cacheable: add the associated SearchPage's cache tag took care of that already, and that is exactly what the existing
SearchPageCacheTagsTest
test is verifying to be working.(Convince yourself by adding
debug($tags)
inCache::invalidateTags()
, modify the factors, save.)See
below.See
below.Regarding rendered node cache tags: for search modules + cache tags, we've only had:
search_page
config entities (which contain the ranking factors)You've pointed out:
This is neither about the ranking factors, nor about the index, but about the representation of what is listed. That's clearly a different scope than this issue — this issue is specifically about re-indexing. So, yes, let's take care of that elsewhere: the solution for that is independent of what we do here. No need to delay this issue (and the issues this issue blocks) for that.
I opened #2464409: Search results should bubble rendered entity cache tags and set list cache tags.
To take care of it:
node_list
anduser_list
cache tags, respectively.NodeSearch::prepareResults()
does; rendering should actually happen only when the entire page is being rendered. If that turns out to be tricky in Search module's architecture, then we just need to be able to set the cache tags we need to be associated on individual search results, so that when thesearch-result.html.twig
template is rendered, they are bubbled.Comment #22
Wim LeersTestbot failure, re-tested.
Comment #23
Fabianx CreditAttribution: Fabianx for Drupal Association commented$plugin->getType();
and should check for NULL :)
Is this the same $type as $plugin->getType()?
This confuses me ...
Comment #24
Wim Leers#23:
NodeSearch
callssearch_index()
with$type = $this->getPluginId()
, which is passed on tosearch_index_clear()
. It'd be better ifNodeSearch
would use$this->getType()
instead of$this->getPluginId()
in the 3 places where we're actually passing a type, but I don't want to do that here to keep changes minimal, and because then the comments in those places don't make sense anymore. SeeNodeSearch::(indexNode|indexClear|markForReindex)
.)Comment #25
jhodgdonThis all looks correct to me now. Great work!
A few minor things I think should be addressed:
This can just be if($type) I think?
Also if $type is actually NULL, then I think we would need to invalidate *all* cached search pages with any $type?
Again, why !== NULL, why not just if($plugin->getType()) ?
Perhaps this is better as:
Returns the search index type this plugin uses.
and for the @return docs:
The type used by this search plugin in the search index, or NULL if this plugin does not use the search index.
Probably better (or in addition) to have @see to search_index() rather than search_index_clear() ?
Saying something in a "No" assertion is "expected" is a bit odd. ;)
How about making the first line:
Asserts that a cache tag is absent from the current page.
And then change $expected_cache_tag to just $cache_tag
and in the docs for the param: The cache tag to check.
Comment #26
Wim LeersThanks for the review! :)
Also added a generic
search_index
cache tag, and expanded the test coverage.WebTestBase::assertCacheTag()
. All your other requested changes have been applied.Comment #27
jhodgdonLooks great to me now! Assuming test bot agrees, I think this is ready to go. Thanks!
Comment #28
jhodgdonNeeds beta evaluation added to summary though...
Comment #29
Wim LeersBeta evaluation added! :)
Comment #30
jhodgdonComment #31
Wim Leers#30: oh, right, thanks for that correction!
Comment #32
webchickLooks good, thanks!
Committed and pushed to 8.0.x.
Comment #34
jhodgdonI had a moment of doubt about this, but just verified it is working correctly. Phew!
What I wanted to verify was that if I added my own Node search page at admin/configure/search/pages, that the cache tags are correct. And they are: it still has search_index:node_search in there, plus a config item for the particular machine name of the search page config. User searches are also working well. So, all good!
But I'm wondering if we should put this in the tests too? The reason I am wondering about this is that the default config entity for NodeSearch has machine name 'node_search', which happens to match the plugin type ID for the NodeSearch plugin. So, my worry that made me take a look at this was that in place of using the plugin type ID string 'node_search' in the search_index cache tag, we could have been mistakenly using the individual config entity's machine name (which is the same)... Should we do something about this in the tests or not worry?
Comment #35
Wim LeersHurray for the automatically generated config entity cache tags! Thanks to that, it's actually difficult to cause the problem you are describing :)
But, yes, let's add test coverage for that. I didn't even know you can configure additional NodeSearch pages!
If you can file a follow-up, I'll roll a patch.
Comment #36
jhodgdonThat sounds like a good division of labor to me! I'll also (of course) volunteer to review the patch. :) Filed:
#2465111: Add tests for caching on user-added Search page