since cache_comment is never cleared, and since items are entered into cache_comment with no expiry, logged in users with no additional roles (those that benefit from comment caching) NEVER see new comments after the comment is originally cached.

since memcache doesn't support wildcards, and since there's one entry in the cache per page, there's no easy way to expire all the relevant comment_cache entries.

since users can choose the number of comments per page (at least on my site) i suspect that advcache is also breaking this feature of comments - who ever views the node first imposes their comments/page settings on all subsequent users.

the easy thing to do would be to clear_cache_all('cache_comment') on each new comment, but that makes the comment caching not so useful on a busy site.

an intermediary step would be to cache comments using cache_minimum_lifetime to calculate an expiry (this is still not great, since users will not see their comments after the post them)

i suspect that the correct solution is to cache the entire comment in one chunk, and then deal with the paging after the comment is grabbed from cache. that way there's a single entry per comment tree that we can invalidate easily.

alternately, we could cache only individual comments and construct the tree from whatever cids comment.module calls up.

i will disable comment caching for now.

CommentFileSizeAuthor
#4 comment_cache_5.patch11.77 KBfirebus
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

robertDouglass’s picture

Thanks for testing and reporting. This patch is admittedly poorly tested. I'll work on a better strategy.

firebus’s picture

things are definitely a notch slower without the comment caching, so i think i'd like to work on it if that's okay with you...

looking at my report, my suggestion to cache each comment separately is crazy.

the idea of caching the whole comment tree is also problematic. if there are a lot of comments, then you're loading a lot of data in to php. and you'll have to cache the comments with one kind of sorting, and then potentially resort the array to suit another users sort preferences.

perhaps a nice compromise would be to cache the comment tree once for each kind of sorting. that's still a lot of data in php, but it should be easy to select the range after loading it, and easy to issue 4 calls to clear_cache_all when invalidating a comment (there's 4 different ways to sort right? asc and desc combined with flat and threaded? maybe we only need two and can handle asc and desc with array functions)

we also might want a 'cached comment limit' - if there are more than X comments, or if the array is bigger than X, don't bother caching, just do things the hard way. as long as we're caching most threads, hopefully there will be enough free resources on the db server to handle the biggies.

robertDouglass’s picture

Yes, absolutely do work on this. It isn't something that I have scheduled for myself at the moment. I think that only real performance testing will tell you which approach scales the best. Start with caching the whole comment tree and seeing how many comments one needs before you get diminishing returns. It will most likely be a lot of comments. Make sure to test in a db cache environment as well as a memcache cache environment, though. It would be a shame to work hard coming up with a solution that is faster in memcache but slower on the db, or vice versa.

firebus’s picture

FileSize
11.77 KB

okay, here's my first cut.

i decided not to cache the entire comment tree - comment.module warns against it, and it would involve more of a rewrite than i'm willing to do, and probably more of a rewrite that would ever be accepted into core.

instead i cache an array of keys that have been cached for a specific node. for each node, we create a cache entry for nid-::hash.
each time we cache_set something to cache_comment, we also push the cache_key on the array.

then, when we need to invalidate the comment cache for a specific node, we can iterate over that array.

this concept might be useful for cache_node invalidation as well, vs the current method of clearing every possible key for every possible role combination.

i added the cache_comment clear routine to the places where cache_clear_all() is currently called:

  • comment_save (new and edit)
  • comment_confirm_delete_submit
  • comment_admin_overview_submit
  • comment_multiple_delete_confirm_submit

it's basically the same block of code in each location. this should really be pulled out as a function in advcache.module, and perhaps hooked in with comment_invoke_comment instead of being patched into the comment.module code.

i cleaned up one major bug.
previously, we had one cache entry per page, not respecting the comment settings of the user (comments per page, asc or desc, flat or threaded). i extended the cache_key format to cache separately for different order, mode and comments_per_page settings:

$cache_key = 'nid-' . $node->nid . '::cid-' . $cid . '::' . $mode . '-' . $order . '-' . $comments_per_page . '-' . $page;

i cleaned up a couple of other minor bugs:

  • don't cache the pager if $pager is entry (eg, if there's only one page of comments)
  • don't cache_set anything unless there's a $cache_key defined - previously we were cache_setting always, resulting in entries in cache_comment that were never used
  • cache_set single comments. previously we were calling cache_get for a single comment, but never cache_set so there were never any hits for single comments

i'm supplying this as a patch against the 5.2 comment.module, rather than as a patch against comment_cache.patch because i'm too tired to think about patching patches right now :) apologies for not having upgraded to 5.3 yet.

also note that in my comment.module, i've made some changes to handle memcache. the original lines are commented out, and my new calls to cache_set and cache_get might need to be modified to work without memcache.

firebus’s picture

Status: Active » Needs review

updating status. sorry.

firebus’s picture

Title: advcache doesn't' clear cache_comment when a new comment is added » cache_comment does not respect user comment_control settings.

i just realized, of course, that the reason advcache wasn't clearing cached comments on updates is because it depends on using a wildcard to clear cached comment keys, and i'm using memcache which does not support wildcards.

my patch should work both with and without memcache, but will cost some extra overhead in the non-memcache use case.

naturally, all of the cache deleting should be handed in advcache_comment instead of in the patch to the comment module. i'll roll a new patch for that when i can.

firebus’s picture

Assigned: Unassigned » firebus
Status: Needs review » Needs work

the pager doesn't respect the user's locale.

if a comment tree is cached in one locale, then a user in another locale will see the wrong language. we should probably be able to extend cache_key to handle this (at the expense of caching more objects).

i'm not sure if this is just locale, or if it's related to i18n. if it's locale only, we should certainly support that.

firebus’s picture

okay, i have a better method of caching pager that should fix the locale problem.

when you run a pager_query, 3 global variables are set.
these global variables are used when you theme a pager or do other pager related stuff.

previously we were caching the built pager, which has a side effect of caching the locale of the user who built it.

instead, what you want to do when you cache a pager is to cache the three global variables. then on a cache hit, you set the three globals. no changes to any other pager calls (theme_page, etc.) are necessary.

here's how to cache_set the pager:

          // cache the pager
          global $pager_page_array, $pager_total, $pager_total_items;
          $pager->pager_page_array = $pager_page_array;
          $pager->pager_total = $pager_total;
          $pager->pager_total_items = $pager_total_items;
          cache_set($pager_key, 'cache_comment', $pager);

here's how to cache_get the pager:

          // Get the pager
          global $pager_page_array, $pager_total, $pager_total_items;
          $pager = $pager_cache->data;
          $pager_page_array = $pager->pager_page_array;
          $pager_total = $pager->pager_total;
          $pager_total_items = $pager->pager_total_items;

i haven't written logic yet to build the pager separately from the comment tree, and i still need to move my comment cache invalidation routines out to advcache.module. once i do i'll roll a new patch.

robertDouglass’s picture

I have a question about this whole thread: are you absolutely sure that you're not adding code to comment module that couldn't be added to advcache.module in advcache_comment instead? It's very important that we keep these patches as non-invasive as possible.

It's customary to roll patches with the -u flag (diff -u) for readability.

ilmaestro’s picture

Firebus - are there any updates on your comment caching work? I'm interesting in how you went about handling cached comments, taking into account paging, as I want to do something similar for a module I've written that involves paged content.

robertDouglass’s picture

Bump. @firebus: still interested in this?

firebus’s picture

yes, i'll have patches this week for all my comment and forum related work. i know i've said that before :) i think it's really true this time...

firebus’s picture

there are three different issues rolled into one on this ticket, which will make things hard to test.

-memcache support
-user comment controls
-pager support

i will probably close this and create separate issues for controls and memcache, and submit a patch to http://drupal.org/node/217186 for the pager