memcached supports data values up to 1MB but not larger. Views, for example, regularly tries to cache values larger than 1MB, and this breaks. Size detection should be implemented so that we never try to write something to memcached that is too large.
Comment | File | Size | Author |
---|---|---|---|
#128 | 435694-127-d8-memcache-large-items.patch | 11.04 KB | japerry |
#9 | memcache_watchdog_errormsg-435694-9.patch | 644 bytes | achton |
#10 | memcache_watchdog_errormsg-435694-10.patch | 632 bytes | glennpratt |
#13 | memcache_watchdog_errormsg-435694-13.patch | 722 bytes | glennpratt |
#14 | memcache_watchdog_errormsg-435694-14.patch | 814 bytes | pdrake |
Comments
Comment #1
catch: I'm not sure it's possible to measure the size of a variable accurately - or at least I've not seen code that does that.
IMO this is a bug in Views (and CCK, theme registry, variables, Field API and all the other places that have massive individual cache items), but it'd be good to document it at a minimum.
Comment #2
catch: http://groups.drupal.org/node/116764
http://groups.google.com/group/google-appengine-python/browse_thread/thr...
I'd like to see us catch when objects are too large (memcache should throw some kind of error) and log this. If we throw enough errors around, people should then run memcached with -I, or file bug reports against the modules with too-large items.
Comment #3
achton: Subscribing.
Comment #4
mikeytown2 CreditAttribution: mikeytown2 commented: Any way for memcache to catch the error and drop it in watchdog? Have a special call just for this occasion, kind of like this one I made: #1011000: Catch & report errors thrown by $memcache->connect.
Comment #5
achton: I'd love to see this happen, as I think >1MB cache sets can potentially be a nasty performance hit, memcache or not (it was for us).
Although I cannot replicate the use case scenario for #1011000: Catch & report errors thrown by $memcache->connect. and don't quite have that patch running as intended (no error message is returned in the Exception object), my initial feeling is that this will be difficult to do.
As far as I can tell, dmemcache_set() calls the set() class function, which maps to a C equivalent. I skimmed the C code in php_memcached.c, and AFAICT the related function (php_memc_handle_error) either returns 0 or -1. So no specific error message will ever bubble up to PHP and Drupal for us to use in watchdog.
Besides this, my tests show that the memcached log only records a message about the item size being too large if you start the daemon with -vv. It then writes something like "Writing an error: Too large."
Comment #6
catch: Right, memcache fails silently for this, at least as far as PHP is concerned.
I have seen posts around the internet where people run strlen() on the serialized value and throw an error if it's over a certain length. I could possibly live with this as a debug option but not as standard behaviour; I'm not sure how that would play with the fact that we try to compress the items, too, though.
I would rather see people's effort going into fixing some of the root causes of this - there are pending patches at http://drupal.org/project/issues/search/drupal?issue_tags=memory - many of those have not had much review.
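The strlen()-on-the-serialized-value idea mentioned in #6 can be sketched roughly as follows. All names here are hypothetical illustrations, not the module's API, and as noted above this only estimates the uncompressed size:

```php
<?php
// Hypothetical helper: detect (and report) writes that would exceed
// memcached's default 1 MiB item limit, measured on the serialized value.
// If compression is enabled this over-estimates the size actually sent.
define('DEMO_MEMCACHE_MAX_BYTES', 1048576);

function demo_check_size($key, $value) {
  $size = strlen(serialize($value));
  if ($size > DEMO_MEMCACHE_MAX_BYTES) {
    // In Drupal this would be a watchdog() call; plain error_log() here.
    error_log(sprintf('Cache set of %s would fail: %d bytes > %d byte limit.',
      $key, $size, DEMO_MEMCACHE_MAX_BYTES));
    return FALSE;
  }
  return TRUE;
}
```

Because of the serialize() cost on every write, this fits better as a debug option than as default behaviour, which matches the reservation expressed in #6.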
Comment #7
podarok: subscribe
memcache does not store large items even if the memcached daemon is started with -I 32M (i.e. configured to store items under 32MB),
so it is a code behaviour
Comment #8
podarok: The same on 7.x.
Comment #9
achton: In accordance with #6, I've started issuing watchdog warnings on my installation to help me identify when cache sets fail. See the attached patch for 6.x.
There is still the mentioned issue about cache item size when using compression, which I have not addressed.
This procedure may be the best we can get, since memcached does not let us know specifically when a cache_set fails due to item size, just that it fails.
Comment #10
glennpratt CreditAttribution: glennpratt commented: Thanks for the patch, achton. Works for me:
http://www.localhost:8080|1312474790|memcache|127.0.0.1|http://www.localhost:8080/devel/php|http://www.localhost:8080/devel/php|1||Cache set attempt failed. Key: cache_size_test. Cache bin: cache. Item size (uncompressed): 3385 KB.
Re-roll without trailing whitespace attached. Also, bumped status to error, seems like an error when a cache set fails, though I guess that could be debated.
Comment #11
glennpratt CreditAttribution: glennpratt commented: Just as an FYI, this patch has been very informative. We were sort of assuming cache_set was working most of the time, but our logs are piling up with failures, some of which don't appear to be caused by the object size.
Comment #12
glennpratt CreditAttribution: glennpratt commented: Note to self: update the patch to report the error code when possible:
http://www.php.net/manual/en/memcached.getresultcode.php
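A sketch of what #12 is pointing at: with the PECL Memcached client, `Memcached::getResultCode()` (and `getResultMessage()`) reveal why a set failed, e.g. `RES_E2BIG` for an over-sized item. The wrapper name here is hypothetical; it returns NULL when the extension is unavailable:

```php
<?php
// Hypothetical wrapper: log *why* a set failed using the PECL Memcached
// extension's result codes, instead of failing silently.
function demo_set_with_diagnostics($key, $value) {
  if (!extension_loaded('memcached')) {
    return NULL;
  }
  $mc = new Memcached();
  $mc->addServer('127.0.0.1', 11211);
  if (!$mc->set($key, $value)) {
    // Memcached::RES_E2BIG ("item too large") is the code of interest here.
    error_log(sprintf('set failed for %s: code %d (%s)',
      $key, $mc->getResultCode(), $mc->getResultMessage()));
    return FALSE;
  }
  return TRUE;
}
```

The older PECL Memcache extension has no equivalent of getResultCode(), which is why the patches handle the two clients differently.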
Comment #13
glennpratt CreditAttribution: glennpratt commented: Re-rolled patch; changed the watchdog call to look more like other calls in core, removed em tags, etc.
The primary change adds the error message if we have Memcached.
Comment #14
pdrake CreditAttribution: pdrake commented: Depending on where we are in the bootstrap, t() may not yet be defined. I re-rolled the watchdog error message patch to check whether t() exists before using it.
Comment #15
catch: Let's drop the t() usage altogether here and just log in English. I don't see this ever getting translated. Alternatively, we could use get_t().
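The bootstrap-safe pattern discussed in #14 and #15 can be sketched like this (the function name is a hypothetical illustration): use t() when the bootstrap has progressed far enough for it to exist, and otherwise log in plain English with a simple placeholder substitution.

```php
<?php
// Hypothetical helper: translate when t() is available, otherwise fall back
// to a plain strtr() substitution so early-bootstrap logging still works.
function demo_log_message($string, array $args = array()) {
  if (function_exists('t')) {
    return t($string, $args);
  }
  // Early in the bootstrap t() may not be defined yet.
  return strtr($string, $args);
}
```

Drupal's own get_t() does essentially this selection, which is why it comes up as the alternative.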
Comment #16
pdrake CreditAttribution: pdrake commented: Unfortunately, this apparently can occur at times when module.inc hasn't been included yet, so it errors when watchdog tries to run module_implements(). Two options: don't log if the module_implements() function does not exist, or require_once module.inc so that we have the required functions. I'm leaning toward the latter - do you have a preference, catch?
Comment #17
mikeytown2 CreditAttribution: mikeytown2 commented: You can use a shutdown function for watchdog. This patch was committed to memcache: #1011000-9: Catch & report errors thrown by $memcache->connect.
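The shutdown-function idea can be sketched as follows (hypothetical names, not the committed patch): queue messages during early bootstrap and only call watchdog() at shutdown, when the full API is available.

```php
<?php
// Hypothetical sketch: defer watchdog logging to a shutdown handler so it
// runs after the bootstrap has loaded module.inc and friends.
function demo_queue_error($message) {
  static $registered = FALSE;
  $GLOBALS['demo_errors'][] = $message;
  if (!$registered) {
    register_shutdown_function('demo_flush_errors');
    $registered = TRUE;
  }
}

function demo_flush_errors() {
  foreach ($GLOBALS['demo_errors'] as $message) {
    // By shutdown time watchdog() should exist; guard anyway.
    if (function_exists('watchdog')) {
      watchdog('memcache', $message, array(), WATCHDOG_ERROR);
    }
  }
}
```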
Comment #18
pdrake CreditAttribution: pdrake commented: mikeytown2, thanks for the suggestion. Updated patch attached.
Comment #19
bibo CreditAttribution: bibo commented: Subscribing.
If we could log when the 1M limit is restricting caching, it just might solve (or help pinpoint) huge performance problems that I'm seeing on a complex site (or several).
It seems in our case the problems are caused by a few key cache components that never make it to memcache (and, as you know, the database cache is not used either); it's slowing the site more than I like to admit, and has been for a long while.
Several serialized cache blobs are over 1M, probably even when gzipped. For example:
cache
***********
theme_registry:main_theme_name [BLOB - 8.1 MiB] 0 1316556483 1
theme_registry:admin_theme_name [BLOB - 7.8 MiB] 0 1316553845 1
cache_menu
***********
links:admin_menu:tree-data:abbcf0407c9c6cc16599da6... [BLOB - 1.5 MiB] 0 1316534351 1
cache_content
***************
content_type_info:fi [BLOB - 3.4 MiB] 0 1316538815 1
Comment #20
fgm: The patch adds these warnings within dmemcache_set() and lets it return the error status, but the problem is exacerbated by the fact that set() discards that status by invoking dmemcache_set() without assigning its result anywhere. The commit mentioned in #1011000: Catch & report errors thrown by $memcache->connect. appears not to touch this part of the code.
Comment #21
drewish CreditAttribution: drewish commented: Marked #1706172: Object to big to be cached, memcache does not inform properly as a duplicate.
This is actually still a problem in the 7.x branch. I found my site was regenerating Views' field list on every page load because the cache value was greater than 1MB and it was hitting the floor with no notice. It seems like back in the day there was a watchdog message, but it was firing too many of them: #338522: watchdog() in dmemcache_set is dangerous!. We should do something like log one watchdog message at the end of the request with all the failed keys.
Comment #22
mgifford: Is this what the error would look like?
Notice: MemcachePool::set() [memcachepool.set]: Server memcache1 (tcp 1213, udp 0) failed with: SERVER_ERROR object too large for cache (3) in dmemcache_set() (line 44 of dmemcache.inc).
Comment #23
helmo CreditAttribution: helmo commented: @mgifford: Yes, that's what it looks like for me.
Adding -I 5M to the memcached startup options solved it for me.
I already got this message via the watchdog, so I'm not sure the patch is still needed.
Comment #24
Phizes CreditAttribution: Phizes commented: Sorry to pollute the issue queue, but using PECL Memcached with igbinary to serialize data results in the stored data being up to 18% smaller than using PHP's serialize(). I'm not sure how this plays with compression, but it may help someone if they are close to the 1MB mark. I can't find my benchmarks on this, and it was a while ago that I checked, so there may be a problem with my recall. You would need to benchmark for your specific use case.
(As I recall there was also a small improvement in performance with using igbinary, though this was on PHP 5.3; I have not compared them on PHP 5.4.)
Comment #25
markpavlitski CreditAttribution: markpavlitski commented: PECL::Memcache (3.0.6) produces a notice when dmemcache_set() fails.
The attached patch throws a warning when dmemcache_set() fails using PECL::Memcached.
Result codes were lifted from here:
http://stackoverflow.com/questions/9042883/php-memcached-extension-resul...
http://bazaar.launchpad.net/~tangent-trunk/libmemcached/1.2/view/head:/l...
Comment #26
johan.gant CreditAttribution: johan.gant commented: Regardless of how much a module/developer tries to throw into a single cache item, I do think that if memcache isn't able to successfully store that value it should complain. I hit this issue on the 6.x version and found it was silently failing.
Would be lovely if this could be addressed.
Comment #27
markpavlitski CreditAttribution: markpavlitski commented: @johan.gant - Hopefully the patch in comment #25 will address this (for D7 at least). Once a solution is approved we can look at backporting it to D6.
Comment #28
sd46 CreditAttribution: sd46 commented: I've had a look at the patch in #25. The main issue I see is that logging of these messages was originally included in the Drupal 5 version of the module, but was subsequently removed for spamming the database with hundreds of messages; see #338522: watchdog() in dmemcache_set is dangerous!. Would it be sensible to be able to turn logging of these messages on or off, or to have some other method of keeping it from sending repeated messages?
Comment #29
markpavlitski CreditAttribution: markpavlitski commented: @FlakMonkey46 You're right; the patch in #25 introduces a regression and results in a large number of watchdog calls if the memcached instance is unavailable.
I'd consider this a bug since D7 core caching doesn't have this issue and memcache is supposed to be a drop-in replacement.
How about the attached patch which adds chunking support for large objects if the initial write fails?
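The chunking idea can be illustrated with a simplified sketch (hypothetical names, not the patch itself): split the serialized value into pieces that each fit under the item limit, record the piece count in a small header, and reassemble on read.

```php
<?php
// Simplified sketch of splitting an over-sized serialized value into pieces
// below memcached's limit. A real implementation also needs headroom for key
// and metadata overhead, and cleanup of the pieces on delete (see #33).
define('DEMO_MAX_PIECE', 1048576);

function demo_split($serialized, $max = DEMO_MAX_PIECE) {
  $pieces = str_split($serialized, $max);
  // In the module, each piece would be stored under a derived key such as
  // "$cid.piece.$i", with a header item recording the piece count.
  return array('piece_count' => count($pieces), 'pieces' => $pieces);
}

function demo_join(array $header) {
  return implode('', $header['pieces']);
}
```

The header/pieces separation is what later comments refer to as "multi-piece" items, and it is why deletes and cache clears need special handling.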
Comment #30
arantxag CreditAttribution: arantxag commented: Subscribing.
Comment #31
Phizes CreditAttribution: Phizes commented: There is no need to post 'Subscribing' type comments any more; there is a follow button in the top right hand area of the issue summary. For more info, see: Stop subscribing, start following
Comment #32
arantxag CreditAttribution: arantxag commented: I'm stupid. I was looking for this button some minutes before writing "subscribing"...
Thank you :)
Comment #33
marcingy CreditAttribution: marcingy commented: Quick scan; the initial thing that springs to mind is: don't we also need to deal with cache clears for 'multi-part' keys?
Comment #34
markpavlitski CreditAttribution: markpavlitski commented: @marcingy Good catch. While #29 would have still functioned on a cache clear, there could be large cache entries left sitting around in memcache.
This patch cleans up these extra chunks of data when deleting a key or clearing the cache.
I've also added a test case to check the large values work with cache_get/cache_set.
Comment #35
SocialNicheGuru CreditAttribution: SocialNicheGuru commented: The patch no longer applies to the May 24, 2013 dev version.
Comment #36
markpavlitski CreditAttribution: markpavlitski commented: @SocialNicheGuru The patch in #34 still appears to apply cleanly to the latest 7.x-1.x-dev branch. Please note it's a -p1 patch. Can you retry with the latest dev release?
Comment #37
SocialNicheGuru CreditAttribution: SocialNicheGuru commented: Thanks for the tip. Can someone share what the difference is between 'git apply' and 'patch -p1'?
Comment #38
anavarre: #37: https://drupal.org/patch/apply should answer your question.
Comment #39
wwhurley CreditAttribution: wwhurley commented: Attached is a re-rolled patch to apply to the most current dev version.
Comment #40
markpavlitski CreditAttribution: markpavlitski commented: The patch in #39 doesn't apply (looks like a -p7 patch?).
@wwhurley have a look at the patch creation guide https://drupal.org/node/707484 for how to create a -p1 patch.
Comment #41
markpavlitski CreditAttribution: markpavlitski commented
Comment #42
arantxag CreditAttribution: arantxag commented: The patch in #40 has errors: it doesn't change the variable name from $result to $rc. Here you can find the reviewed patch.
Comment #43
Jeremy CreditAttribution: Jeremy commented
Comment #44
Elijah Lynn: Here is a patch on #42 that improves the docblocks to better meet our comment standards (https://www.drupal.org/coding-standards/docs).
Edit: Sorry for adding #42 again; I cannot delete it.
Comment #45
Jeremy CreditAttribution: Jeremy commented: The main thing I'd like to see added here is logging, so we know when this is happening. Generally it's a bad sign that we're caching items >1M, and something that should be fixed in whatever module is doing it. So, ideally we'd log it and chunk it both (perhaps making both configurable, so you could do either/or/both).
Comment #46
Elijah Lynn: Does anyone have a View they could paste that generates >1MB objects? I would like to test this patch more fully and help add logging. Maybe I can figure it out on my own, but if you have one please let me know. I will post back if I come up with one.
Comment #47
mdupont: For information, I've seen objects >1MB on complex sites with a lot of fields and database tables. The field_info:instances cache and schema cache were therefore ignored by memcache, impacting performance.
Comment #48
Elijah Lynn: Hiding an accidental patch from #44.
Comment #49
bibo CreditAttribution: bibo commented: I'm not sure if people are generally aware of it, but the memcached developers introduced a new configuration option some time ago for increasing the maximum item size limit (MAXITEMSIZE). This was introduced in memcached 1.4.2.
More info: http://www.alphadevx.com/a/387-Changing-the-maximum-item-size-allowed-by...
Having the memcache module inform about problems with data values over 1MB would be a nice feature. Maybe it could also live as a manually triggered check in memcache_admin(?). Raising the limit by altering the memcached configuration is luckily not too difficult anymore.
Comment #50
SocialNicheGuru CreditAttribution: SocialNicheGuru commented: When installing memcache I get the following:
"t/item_size_max.t .... 1/7 Item max size cannot be less than 1024 bytes.
t/item_size_max.t .... 2/7 Cannot set item size limit higher than 128 mb.
t/item_size_max.t .... 3/7 WARNING: Setting item max size above 1MB is not recommended!
Raising this limit increases the minimum memory requirements
and will decrease your memory efficiency.
WARNING: Setting item max size above 1MB is not recommended!
Raising this limit increases the minimum memory requirements
and will decrease your memory efficiency."
Comment #51
SocialNicheGuru CreditAttribution: SocialNicheGuru commented: The solution in #49 did not work for me; the variable was not defined. I am on Ubuntu 14.04, PHP 5.5.9.
When I disable memcache, caching is done by drupal.
I take a look at my database and I see these as the major cache culprits:
cache 10 m
cache_advagg_info 1.6m
cache_bootstrap 2m
cache_field 14 m
cache_form 1.5 m
cache_rules 1.5m
cache_views 11.6 m
Running memcached --help gives:
-I: Override the size of each slab page. Adjusts max item size (default: 1mb, min: 1k, max: 128m)
(That's an uppercase 'I', not a lowercase 'L'.) I need to use the -I option for each one in my config file.
Comment #52
mikeytown2 CreditAttribution: mikeytown2 commented: @SocialNicheGuru
The thing to look for is a row in the database that is over 1MB in size, not the total table size. So, for example, the cache_advagg_info table has a lot of rows, but each row should be less than 1kB in size.
Comment #53
achton: @Elijah #46 (and others): To generate >1MB objects easily, install Panels and insert a view (or other content) which generates a large bunch of HTML on a Panels page. Then enable page caching for that Panels page, making it cache the generated HTML directly in the default cache bin. That should make it easy to test.
Side note: This will also expose issues with your MySQL server, if it has the default max_allowed_packet size of 1MB (when caching to database).
Comment #54
torgosPizza: I have this issue in D7 on our homepage and catalog pages, which are Search API/Solr/Views and Panels/Panelizer based. (We use a view mode for our product entities that's been Panelized, and the homepage is a Page Manager page built in Panels.) I would consistently get WSOD errors when attempting to load either of these pages after the Panels cache had expired.
After using this patch, this does not seem to be the case - the pages load without any WSOD and relatively quickly (especially considering they would often not load at all). This is of course only observational data but the patch seems promising for those of us who don't want to sacrifice performance for an increase in memory allocation.
Thanks!
Comment #55
killes@www.drop.org CreditAttribution: killes@www.drop.org commented: Jeremy, I wonder why you want to have logging here. There is no way to really "fix" the issue.
As an alternative to splitting up the cached item, one could let the memcache extension handle this.
http://php.net/manual/en/memcache.setcompressthreshold.php
The Memcached extension does not seem to have a similar setting, however.
I've seen this issue not only for Views but also for cache-schema and cache_bootstrap-hook_info, so a fix would need to be in core. I am not sure what it should look like, and I don't think it should be dealt with in core, since the size limit is a memcache issue.
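The compression alternative linked above works like this with the PECL Memcache extension: values over a threshold are transparently compressed before being sent to the server, which may keep some items under the 1MB limit. The host/port are illustrative assumptions:

```php
<?php
// Client-side compression config for the PECL Memcache extension, as an
// alternative to splitting items; it only helps for compressible payloads.
if (extension_loaded('memcache')) {
  $memcache = new Memcache();
  $memcache->addServer('127.0.0.1', 11211);
  // Compress values larger than 20000 bytes when compression saves >= 20%.
  $memcache->setCompressThreshold(20000, 0.2);
}
```

As noted, the newer Memcached extension has no per-client equivalent of setCompressThreshold(); it exposes compression through its OPT_COMPRESSION option instead.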
Comment #56
Jeremy CreditAttribution: Jeremy commented: Logging will be configurable, so it will be possible to disable it. The reason for the logging is that when cache items grow this big it's almost certainly a bug in and of itself (throwing everything and the kitchen sink into the cache in one big lump is generally not an efficient choice). That said, the reality is there are many instances of cache items getting this big, so we need to properly support it. While compression is also an option that can help, I fully intend to add support for auto-splitting the cache item in the next release.
Comment #57
joelpittet: Here's a SQL command to find those big cache tables, and some values.
I'm going to try the patch in #44. @Jeremy, is that the one you are looking at adding with the configuration?
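The command itself from this comment was attached rather than inlined, so here is a hypothetical reconstruction of the kind of query involved (the helper name is invented for illustration): list individual cache rows that exceed memcached's default 1 MiB item limit, biggest first.

```php
<?php
// Hypothetical reconstruction: build a query that finds cache rows larger
// than the default 1 MiB memcached item limit, largest first.
function demo_big_rows_sql($table, $limit_bytes = 1048576) {
  return sprintf(
    'SELECT cid, LENGTH(data) AS bytes FROM %s WHERE LENGTH(data) > %d ORDER BY bytes DESC',
    $table, $limit_bytes
  );
}

// In Drupal 7 this could be run against each cache table, e.g.:
// db_query(demo_big_rows_sql('cache_views'));
```

This complements #52's advice: it is individual row sizes, not total table sizes, that matter for the memcached limit.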
Comment #58
joelpittet: I did get this nice little error after applying the patch in #44:
PHP Notice: unserialize(): Error at offset 1048047 of 1329813 bytes in memcache/dmemcache.inc on line 292.
Comment #59
adammalone: Following on from the comment in #57, here's a small drush script that has been used before to find those same large cache objects in the database: https://gist.github.com/typhonius/a3c52dab92bf07b7ce34
Comment #60
Fabianx CreditAttribution: Fabianx commented: I learned at DrupalCon that this took a BIG Drupal site down during their launch.
With Drupal 7 sites having grown to >300 modules and such, we cannot expect this to work out of the box anymore, and as increasing the memcached item size limit is not always an option, this is critical.
Also, cache_get() silently fails for critical cache entries (because the cache_set() failed), which is a big concern, too.
Comment #61
mikeytown2 CreditAttribution: mikeytown2 commented: I do agree with the critical here; this issue can be deadly. We've actually stopped using memcache because of this issue.
Comment #62
Jeremy CreditAttribution: Jeremy commented: This does not affect everyone, so I'm dropping it to major, but tagging it as a 7.x-1.4 blocker (it didn't actually get into 7.x-1.3, which was released tonight), which I fully plan to respect. Assigning to myself, as this is one of my top priorities with this module.
Comment #63
Jeremy CreditAttribution: Jeremy commented: (removing duplicate)
Comment #64
joelpittet: While it didn't outright solve the problem I had, it at least made it much less explosive, and it was helpful in tracking down those large cache items, which I moved to Redis. A module_implements() bug in core was uncovered as well because of this patch in its current state.
Thanks for adding it as a 1.4 blocker, @Jeremy.
Comment #65
mikeytown2 CreditAttribution: mikeytown2 commented: @joelpittet
Link to issue?
Link to issue?
Comment #66
joelpittet: @mikeytown2 whoops, sorry: #496170-34: module_implements() cache can be polluted by module_invoke_all() being called (in)directly prior to full bootstrap completion
So far so good on removing that patch from core on my system. I'm fairly sure the patch is not the cause of the problem, but it does certainly contribute to exposing it.
Comment #67
mcdruid: https://www.drupal.org/project/memory_profiler now includes some code which can be used to identify objects which are too large for cache without having to move cache bins into the database:
http://cgit.drupalcode.org/memory_profiler/tree/README_memcache_profiler...
It's mostly harmless to run on a prod site for a short while to identify the problematic objects, but almost certainly better to use it with syslog rather than dblog.
Comment #68
Jeremy CreditAttribution: Jeremy commented: I've done some initial testing of the latest patch, with the intention of committing this functionality soon. I've confirmed it works as designed, but still need to do some additional profiling before I'll commit it. I've made a few modifications, including:
Feedback and additional testing welcome.
Comment #69
joelpittet: I was still having some issues with the previous patch, though trivial compared to without it. Sometimes the pieces/chunks wouldn't match sizes on these larger items. Maybe someone else who is running this patch could keep an eye on their logs?
@Jeremy any chance you could toss up the interdiff as well, maybe the size issue was resolved in your cleanup?
Comment #70
Jeremy CreditAttribution: Jeremy commented: @joelpittet interdiff attached, sorry for the oversight. My cleanup would not have fixed your problem, it didn't change the functionality in question.
Can you more fully describe the problem you've had? Do you get any error messages? Or what sorts of problems do you see?
I will tweak my test script to make smaller steps in object size to see if perhaps there's a small window where we are still generating pieces that are too large. Alternatively, we allow an estimated 512 bytes for metadata, perhaps we need to be more precise or add a larger margin.
Comment #71
joelpittet: @Jeremy it's a bit tricky to remember as I switched quickly to store those larger bins in Redis to mitigate any further issues.
All I remember is that the sizes or number of chunks didn't match up and it was throwing a notice or error message in the logs. Didn't notice any errors in the page.
Thanks for the interdiff, it would be nice to have those semi-automatic #2233521: Auto-generate interdiffs :)
This patch looks like quite the improvement. Hopefully someone can drop it in and give it a test out.
Comment #72
torgosPizza: I'll be testing this on production today and over the weekend. Thanks for the updated patch!
Comment #73
catch: This looks encouraging to me with the logging. I was concerned about us silently working around the limit, but we silently fail now, so working + logging sounds great.
Didn't review the patch in lots of depth yet but also didn't spot anything obvious.
Comment #74
Jeremy CreditAttribution: Jeremy commented: @joelpittet with additional testing I did find some edge cases where the earlier patch would fail. I tracked this down to the memcached server reporting "error: MemcachePool::set(): Server n.n.n.n (tcp 11211, udp 0) failed with: SERVER_ERROR out of memory storing object". I've updated the attached patch to handle this gracefully and now all my tests pass consistently.
For what it's worth, I'm using the attached quick test script to test various sized cache items -- I've been tuning the $minimum and $maximum values and the block size (bs=), then running it with "drush scr test-script.php". It was with a block size of 512 and a min/max of 4095 that I first duplicated this bug. If you're not familiar with dd, I recommend you don't experiment with this test script as it has potential to delete your hard drive if you misuse it.
Comment #75
Jeremy CreditAttribution: Jeremy commented: I've run some load tests comparing the before/after of applying this patch. Overall performance isn't changed much, however there is a huge spike in "delete_misses" and some penalty from this. The misses are expected from the new code in dmemcache_delete() required to best-effort cleanup multi-piece cache items, but I'd like to consider methods for avoiding this additional network traffic if possible.
This is latest -dev without this issue's patch applied (the average of 3 load tests):
And this is with the patch applied:
The load test processes 21.1 pages per second in either case, so this may be acceptable overhead.
Comment #76
Fabianx CreditAttribution: Fabianx commented: This is great!
I shuddered at the use of a global here at first, but given that a static, a drupal_static(), or a static class member are all kind of global too, and it's a very limited use case, this is fine.
This has the potential that all multi objects that are LARGE are fighting in expiration with small objects, which might be unimportant.
What I mean is:
They end up in the same slab as the small ones.
Perhaps adding some random / configurable padding would solve that?
Yeah, it is too bad we don't have set_multiple in the PECL memcache extension ...
Is there no timer_stop or timer_reset function that could be used to do the same here?
This is not protected against max-length, should run through dmemcache_key again or have the same logic.
As discussed, having delete-item misses for every non-multi item is not a good trade.
We discussed (and Jeremy brought up these ideas):
a) making this at least configurable and maybe turning off by default
b) Storing all multi keys in a global variable, similar to wildcard cache clearing.
In any case we should ensure that supporting multi keys, while helpful to not bring a site down, is not affecting the normal operation of non-multi keys.
Reviewed, looks already great! Some pointers to discuss / change, but overall almost RTBC :).
Comment #77
catch: Instead of watchdog, could we trigger_error()? Assuming sites have their error reporting set up correctly, developers would see that directly, and production just gets watchdog anyway (but as a PHP notice instead).
Comment #78
Jeremy CreditAttribution: Jeremy commented: @Fabianx, thanks for the thorough review! Very helpful.
#2: I've moved the global into a static.
#3: It will be interesting to see if this is ever a problem. For now, I'm opposed to adding additional network traffic and cache data just to try and load balance our slabs.
#5: This is the only way I've found to re-use the same timer name without getting an accumulative time.
#6: Thanks, fixed.
#7: I'll work on this in the next patch, but I felt the above changes were sufficient for re-review, and best to see alone.
@catch, I think this applies to ALL (or at least most) of the 'watchdog' calls in dmemcache.inc, as they are all errors. As such, this is best handled in a separate followup issue that addresses them all for consistency.
Comment #79
Fabianx CreditAttribution: Fabianx commented: Unfortunately this implementation is not correct.
I think it should be just a normal static.
But to fix it, it would need to:
- pass the function name to drupal_static_reset();
- use the pattern exactly as used in e.g. module_implements();
- not return the data, since it's stored in the static anyway.
Comment #80
Jeremy CreditAttribution: Jeremy commented: Good point; the error handler is hardly a high-performance function. Okay, simplified it by making it a normal static.
I've also implemented logic to avoid the extraneous DELETEs. I've run a load test to confirm they're gone, and profiled to confirm it properly fires when it's supposed to. During implementation I realized there's potential for a slow memory leak here, so I added some garbage collection. Typically this would live in hook_cron(), but we don't generally require people to enable memcache.module, so instead I fire this off at most once every 24 hours when deleting another multi-piece cache item - this will prevent the garbage collection from firing at all if there are no large items in the cache.
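The garbage collection described here might look roughly like this (hypothetical names; the real patch tracks and stores the piece cache differently): each multi-piece cid is recorded with its expiry, and stale entries are dropped when GC runs.

```php
<?php
// Conceptual sketch: the piece cache maps cid => expiry timestamp for each
// multi-piece item. GC drops entries whose expiry has passed; 0 means
// "never expires", matching Drupal's CACHE_PERMANENT value.
function demo_gc_piece_cache(array $piece_cache, $now) {
  foreach ($piece_cache as $cid => $expires) {
    if ($expires != 0 && $expires < $now) {
      unset($piece_cache[$cid]);
    }
  }
  return $piece_cache;
}
```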
Comment #81
Jeremy CreditAttribution: Jeremy commented: It's a little easier to review if I attach the patches.
Comment #82
joelpittet: @Jeremy, because it's somewhat related to your last patch, and because it's fun: https://t.co/rC8gbo8Jis - GC performance improvements and lots of animated GIFs :)
Comment #83
Jeremy CreditAttribution: Jeremy commented: During an IRC discussion @catch pointed out that my previous patch would lead to an infinite loop if the variable array got larger than 1MB. So instead we maintain the piece_cache in memcache itself (accepting that it's possible to lose it, in which case memcache will eventually auto-expire the orphaned pieces).
Comment #84
Jeremy CreditAttribution: Jeremy commented: Whoops, accidentally uploaded an earlier interdiff. Attaching the proper one.
Comment #85
catchOnly found a couple of issues:
I'm not sure this needs to be restricted to once per day. It happens only when multi-piece items are deleted, and then the additional cost is just retrieving the item from drupal_static(), the foreach, and the timestamp comparison (if we only logged when something is actually garbage collected) - unless something is actually garbage collected, which should be infrequent. That seems very cheap and would save the variable_set(). Also, while I don't think it's a problem due to the 24-hour timer, this is another case where it would be possible to trigger a variable_set() from a variable_set(), since variable_set() does a cache clear and the variable cache could be multi-part.
Minor, but I'd rather see $e as $expires. When I see $e I always think of an Exception.
Comment #86
Jeremy CreditAttribution: Jeremy commented: Thanks @catch! I've fixed both, and also confirmed that this doesn't break when the __dmemcache_piece_cache itself grows to >1MiB.
Comment #87
mikeytown2 CreditAttribution: mikeytown2 commented: I like where this is headed, not using variable_set().
Comment #88
killes@www.drop.org CreditAttribution: killes@www.drop.org commented: Tested this a bit and saw watchdog entries like:
Spent 751.16 ms splitting 10.42 MB object into 11 pieces, cid = stage-cache_views_data-file_view%3Apanel_pane_5%3Aresults%3A9fb2367a069c76985dc346ff2c36a5ed
for pages where I'd get the mempool warning before.
Comment #89
Fabianx CreditAttribution: Fabianx commented: Not a blocker, but this timer_start()/timer_read() pair is not protected against the timer not being reset when calling timer_start().
---
Besides this little nit, this is RTBC from my side and killes tested :).
Comment #90
Jeremy CreditAttribution: Jeremy commentedTwo changes:
@fabianx I realized that if I don't call timer_stop() and just call timer_start() again, there's no accumulation of time, so I cleaned up the timer logic accordingly. I would appreciate a review on the recursion handling, especially if you have better ideas on how to solve this.
@killes thanks for testing! Please be sure to update to this latest patch, which protects against the edge case of __dmemcache_piece_cache growing too large (which would require a massive number of very large cache items, and is therefore rather unlikely).
Comment #91
killes@www.drop.org CreditAttribution: killes@www.drop.org commentedThe new patch works just as well as the previous one for me.
Comment #93
Jeremy CreditAttribution: Jeremy commentedI ran another load test today and found that while recent changes solved the problems with excessive DELETE misses, on websites with no over-sized cache pieces it was adding a significant number of GET misses. Fortunately, this was easily solved by initializing the __dmemcache_piece_cache variable the first time we try and load it and find it empty. Load tests confirm that there is now no measurable overhead with this patch on websites that don't have over-sized cache pieces.
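The miss-elimination fix described here - writing an empty piece cache the first time we look it up and find nothing - can be sketched as follows (Python for illustration; the function name and key are hypothetical):

```python
def get_piece_cache(cache):
    """Return the piece cache, initializing it on the first lookup so that
    sites with no over-sized items don't take a GET miss on every request."""
    piece_cache = cache.get('__dmemcache_piece_cache')
    if piece_cache is None:
        piece_cache = {}
        # Write the empty value back: subsequent lookups hit instead of miss.
        cache['__dmemcache_piece_cache'] = piece_cache
    return piece_cache
```

After the first request, every later call finds the (possibly empty) entry and no longer registers as a miss in memcached's stats.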
@killes, thanks for testing this on a real-world website!
@markpavlitski, thanks for the original patch to split large items into pieces!
The patch has landed:
http://cgit.drupalcode.org/memcache/commit/?id=2245e77
Next, it needs to be ported to the 6.x branch, then we'll look at merging it into the 8.x branch.
Comment #94
Fabianx CreditAttribution: Fabianx commentedChecked the recursion handling: That is fine.
Good catch on the cache get misses, testing helps :).
Congratulations everyone!
Comment #95
jonloh CreditAttribution: jonloh commentedAny chance this will be backported to Drupal 6 soon? ;)
Comment #97
Jeremy CreditAttribution: Jeremy at Tag1 Consulting commentedBackported to Drupal 6:
Needs to be ported to D8, if not already done.
Comment #98
ressa CreditAttribution: ressa at Ardea commentedWas this patch ever ported to D8?
Comment #99
Andre-B@ressa It doesn't look like it, at least not the README parts. But this shouldn't stop you from using larger cache bins (at least not judging by the D8 code; I still have to test this, though).
Comment #100
fuzzy76 CreditAttribution: fuzzy76 commentedThe D6 backport in #96 is broken. The constant REQUEST_TIME is not defined in the D6 branch. 6.x-1.11 gives me these:
Unsure if this should affect the status of this issue or not, so I'm not changing it.
Comment #102
Jeremy CreditAttribution: Jeremy at Tag1 Consulting commented@fuzzy76 -- thanks for reporting this! 6.x fix committed.
Comment #103
solution33r CreditAttribution: solution33r as a volunteer commentedHas this been ported to D8?
Comment #104
rbayliss CreditAttribution: rbayliss at Last Call Media commentedHere's a first pass at a D8 implementation. A lot's changed since the D7 version, so the approach is slightly different here. This approach creates a single parent item that references N child entries.
Currently missing from this patch:
* Ability to configure the chunk size (it's currently a constant on the class).
* Logging of set operations for items that are too large.
* Detection of "item too large" errors - right now we're assuming that any item that failed to set is failing because it's too large.
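The parent/child approach described in #104 can be sketched as follows (Python for illustration; the `CHUNK` constant, cid format, and function names are all hypothetical stand-ins for the patch's internals):

```python
CHUNK = 1024 * 512  # assumed chunk size; a constant on the class in the patch

def set_multipart(cache, cid, data: bytes):
    """Store a single parent item that references its N child entries."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    child_cids = []
    for i, chunk in enumerate(chunks):
        child_cid = f'{cid}:chunk:{i}'
        cache[child_cid] = chunk
        child_cids.append(child_cid)
    cache[cid] = {'multipart': True, 'children': child_cids}

def get_multipart(cache, cid):
    """Reassemble a multipart item; treat any missing child as a miss."""
    item = cache.get(cid)
    if isinstance(item, dict) and item.get('multipart'):
        pieces = [cache.get(c) for c in item['children']]
        if any(p is None for p in pieces):
            return None  # a child was evicted: fail gracefully as a cache miss
        return b''.join(pieces)
    return item
```

Evicting any single child invalidates the whole item, which is the desired graceful failure mode.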
Comment #105
rbayliss CreditAttribution: rbayliss at Last Call Media for Commonwealth of Massachusetts commentedWe've been using the patch from #104 in production for a while now on a large D8 site. It works very well, except that we've seen some strange behavior occurring around deployments. The problems all have the same pattern: Missing views field handlers and/or missing metatags, occurring during peak traffic deployments, which can be resolved by clearing caches. Our biggest cache entries are the views_data and token_info caches
I'm hypothesizing that we've got a race condition occurring where two SET operations are happening on a single large cache item at once, causing an inconsistency between the child entries (e.g. the first child entry and the second child entry belonging to different versions of the same parent entry). I'm adjusting the patch to add a unique identifier to the child CIDs so each multipart write is more atomic. This shouldn't affect GET performance or behavior at all. Worst case scenario is that we end up with extra data in the cache, but that should be flushed out eventually anyway.
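The unique-identifier fix might look roughly like this sketch (Python for illustration; the token format and cid layout are hypothetical). The key idea is that the parent is written last and points at exactly one generation of children, so two concurrent writers can never interleave:

```python
import uuid

def set_multipart_atomic(cache, cid, chunks):
    """Give each multipart write a unique token so concurrent SETs of the
    same cid never mix their children."""
    token = uuid.uuid4().hex
    children = []
    for i, chunk in enumerate(chunks):
        child_cid = f'{cid}:{token}:{i}'
        cache[child_cid] = chunk
        children.append(child_cid)
    # Parent written last: readers always see a complete, consistent set.
    cache[cid] = {'children': children}
```

Children from a superseded write become unreachable garbage, which (as the comment says) is harmless and gets evicted eventually.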
Comment #106
rbayliss CreditAttribution: rbayliss at Last Call Media for Commonwealth of Massachusetts commentedCode style cleanups.
Comment #107
Josh Waihi CreditAttribution: Josh Waihi as a volunteer commentedDo we need to do any validation that the items come back in the same order as requested in $childCIDs?
The 7.x version also checks for a response of MEMCACHED_E2BIG (37) on a failed attempt, whereas this patch just assumes an item is too big if it failed to be inserted. This approach potentially creates 1 additional request to memcache if the failure wasn't due to the item being too big. This could be avoided if the array count from splitItems() was tested to be > 1.
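The distinction being made here - only falling back to the split path when the server explicitly reported "item too large" - can be sketched like this (Python for illustration; the fake client exists only to make the example self-contained, and `MEMCACHED_E2BIG = 37` is libmemcached's result code as quoted above):

```python
MEMCACHED_E2BIG = 37  # libmemcached's "item too large" result code

class FakeClient:
    """Minimal stand-in for a memcached client, for illustration only."""
    def __init__(self, limit=8):
        self.store, self.limit, self._rc = {}, limit, 0

    def set(self, key, value):
        if len(value) > self.limit:
            self._rc = MEMCACHED_E2BIG
            return False
        self.store[key] = value
        self._rc = 0
        return True

    def result_code(self):
        return self._rc

def set_with_split(client, key, value):
    """Split only when the server said E2BIG, not on any failure."""
    if client.set(key, value):
        return True
    if client.result_code() != MEMCACHED_E2BIG:
        return False  # some other failure: don't waste a second request
    pieces = [value[i:i + client.limit] for i in range(0, len(value), client.limit)]
    for i, piece in enumerate(pieces):
        client.set(f'{key}:{i}', piece)
    return True
```

Checking the result code avoids retrying the write in pieces when the real problem was, say, a connection error.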
Comment #108
rbayliss CreditAttribution: rbayliss at Last Call Media for Commonwealth of Massachusetts commentedThanks for the review! That's a great question about the order of the items. It actually looks like both the Memcache and Memcached backends guarantee order - Memcached through the use of the Memcached::GET_PRESERVE_ORDER, Memcache through some additional logic in DrupalMemcache::getMulti(). From observing this on our site, I'm not seeing unserialize errors, so this seems to be working well.
I'm going to think a little on the way we should deal with failures - the method you've mentioned should work, but the cleanest way might be checking for MEMCACHED_E2BIG.
Comment #109
Josh Waihi CreditAttribution: Josh Waihi as a volunteer commentedEnsuring the order is preserved was the big one, and it sounds like it's a non-issue. At this stage, it's probably a bigger issue that this is not in the 8.x module than it is to modify the failure behaviour further. IMO, we could address that in a follow-up issue. RTBC'ing.
Comment #110
markus_petrux CreditAttribution: markus_petrux commentedDoes not apply. Will post an updated patch for alpha7 asap.
Comment #111
markus_petrux CreditAttribution: markus_petrux commentedPatch updated
Comment #112
Josh Waihi CreditAttribution: Josh Waihi as a volunteer commented@markus_petrux, why were testLargeItem() and testLargeItemFailsGracefullyWhenChunkDeleted() removed?
Comment #113
markus_petrux CreditAttribution: markus_petrux commented@Josh: Since the location of the tests has been changed, I forgot to adapt that part of the patch. Sorry for that. I do not have now an environment to test this now.
Comment #114
markus_petrux CreditAttribution: markus_petrux commentedPatch updated. No tests included - I do not really know how to write them. Sorry.
Comment #115
solution33r CreditAttribution: solution33r as a volunteer commentedWe have addressed the problems presented in comment 114 - thanks @markus_petrux for your work!
So, here was the main issue - after much testing, we determined that the chunk size of split items was too large to fit inside slab 42 (which should accept items up to 1MB). This is due to some per-item overhead. As we wrote in the comments:
This patch worked for us locally, but we would appreciate the maintainers' input. This is a very important feature to add to the memcache module, as many sites are using a combination of views, multilingual, and search facets - all which have the potential to create rather large cache objects.
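The per-item overhead issue described above can be illustrated with a small sketch (Python for illustration; the overhead figure is an assumption, not memcached's exact accounting). A stored item's slab footprint is more than its raw value - the key, flags, CAS value, and item header all count - so the chunk size has to leave headroom below the 1MB slab limit:

```python
ITEM_LIMIT = 1024 * 1024  # memcached's default maximum item size (slab 42 ceiling)
OVERHEAD = 512            # assumed per-item overhead: key, flags, CAS, item header

def effective_chunk_size(key: str, overhead=OVERHEAD):
    """Leave headroom so key + metadata + data still fit in the largest slab."""
    return ITEM_LIMIT - len(key.encode()) - overhead
```

Splitting at exactly 1MB of data would produce items that exceed the slab once their metadata is added, which is why the naive chunk size failed.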
Comment #116
ndobromirov CreditAttribution: ndobromirov at FFW commentedI do not like the commented out code from the patch.
Either add the service in the class and log DEBUG-level messages all the time, or have a config option that can be overridden from settings.php to enable/disable this logging. It is just not feasible for this to live in the code that way; it makes it VERY impractical if you need to debug in production, since you would have to hack a contributed module and redeploy just to start debugging.
Comment #117
ndobromirov CreditAttribution: ndobromirov at FFW commented
Comment #118
rhristov CreditAttribution: rhristov at Bulcode commentedApplying a new patch created from #115 with added logging functionality suggested in #116.
Comment #119
vurt CreditAttribution: vurt as a volunteer and at Computer Manufaktur GmbH commentedI did some tests. The patch applied fine - but it does not seem to do anything. Or maybe I do not understand what it should do.
At the end of MemcacheBackend::set(), splitItem() should be called and a new MultipartItem created. But I did not find any conditional in the code testing whether the data is bigger than MAX_CHUNK_SIZE.
In my tests with big data (that should be split), the split was never called, because the memcache->set() before it was successful and so the function returned before reaching the split part.
The code seems to rely on the memcache server to return an error - which mine did not do...
When I built in a size test on the data length and omitted the normal call to memcache->set() when the data is bigger than MAX_CHUNK_SIZE, it seems to work. At least it writes this to the watchdog:
debug memcache Split item views_data:de into 8 pieces
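The proactive size check described in this comment - splitting by length up front instead of waiting for the server to reject the write, which some servers apparently do silently - can be sketched as (Python for illustration; `MAX_CHUNK_SIZE` and the cid layout are hypothetical):

```python
MAX_CHUNK_SIZE = 1024 * 512  # assumed constant on the backend class

def set_item(cache, cid, data: bytes):
    """Split proactively by size; return the number of pieces written."""
    if len(data) <= MAX_CHUNK_SIZE:
        cache[cid] = data
        return 1
    # Over-sized: skip the normal set and go straight to the split path.
    pieces = [data[i:i + MAX_CHUNK_SIZE] for i in range(0, len(data), MAX_CHUNK_SIZE)]
    for i, piece in enumerate(pieces):
        cache[f'{cid}:{i}'] = piece
    cache[cid] = {'pieces': len(pieces)}
    return len(pieces)
```

This avoids depending on the server returning an error for over-sized writes, which (as observed above) cannot be relied upon.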
Comment #120
vurt CreditAttribution: vurt as a volunteer and at Computer Manufaktur GmbH commentedJust wanted to add that I did tests with a large Drupal site and there are not only big views cache entries:
I guess the memcache compression saves many sites from a bigger impact.
Comment #121
janusman CreditAttribution: janusman at Acquia commentedHere is a small testing script you can run from the shell: https://github.com/janusman/memcachetesting
It does test the splitting of large objects.
(cd to your Drupal install and then type bash /path/to/the/drush-test.sh to run the script.)
Comment #122
vurt CreditAttribution: vurt as a volunteer and at Computer Manufaktur GmbH commentedSorry for the delay...
I used your nice test script and all tests run fine when the patch is applied.
I even added a content check to the tests to be sure. I extended the return line in storeAndFetchTest():
I don't know what was wrong with my last setup - the patch works as intended!
Comment #123
ndobromirov CreditAttribution: ndobromirov at FFW commentedBased on the above comments moving up to RTBC.
Comment #124
viappidu CreditAttribution: viappidu as a volunteer commentedShouldn't this one be committed? (Trying to clean up my composer....)
Comment #125
japerryMarking needs work. From what I see here, if you use memcache only as your database backend, a new release of memcache would break your site since you're missing the service.
So there are four options I potentially see here:
1.) Make a 3.x version of memcache so customers can manually update their settings.php to work with the memcache service correctly
2.) Inject the settings config option statically (i.e. bad practice - the container service not being wired via dependency injection)
3.) Figure out how to include settings.php automatically through the module.
4.) Remove the debugging option and make it a separate issue
I'm looking at option 2 so we don't have to roll out a new major version... or take the debugging piece out and make a new issue for it.
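Option 2 - reading the setting statically rather than adding a constructor argument - can be illustrated with this sketch (Python for illustration; the class names and the `memcache_debug` setting name are hypothetical stand-ins for the D8 module's actual code):

```python
class Settings:
    """Minimal stand-in for a static settings holder."""
    _store = {}

    @classmethod
    def get(cls, name, default=None):
        return cls._store.get(name, default)

class MemcacheBackend:
    def debug_enabled(self):
        # Static call: no new required constructor argument, so existing
        # settings.php container definitions keep working unchanged.
        return bool(Settings.get('memcache_debug', False))
```

The trade-off is exactly the one named in the comment: backward compatibility for hand-wired containers, at the cost of bypassing dependency injection (hence the TODO to inject it properly in 3.x).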
Comment #126
MiroslavBanov CreditAttribution: MiroslavBanov as a volunteer and at FFW commentedI am confused about comment 125 above, but I guess it is about the introduction of a new required argument to the MemcacheBackend constructor?
Though I still don't get what this means:
Normally you use memcache as a cache backend, right?
And this isn't the first time an argument is added with no version bump - see for example #2996620: Fix clock skew tolerance in deleteAll; make it configurable [Follow-up: Add mechanism to support invalidateAll and deleteAll]
I also don't know exactly in what scenarios you would have issues upgrading. To my knowledge it's when you don't install the module but add the cache backend to the container through a code snippet in settings.php. There may be other scenarios I am not aware of. If so, I would like to know about them :)
Comment #127
japerryThis patch puts the debug components in a static call. While I see that the constructor was changed in that patch, as more and more places are using memcache, the likelihood of sites breaking due to a required settings.php change isn't worth it, especially just for debugging.
I also added a TODO to change this to be injected properly when a 3.x version is created.
Comment #128
japerry
Comment #130
japerrySince no one had an issue with the debug helper function, declaring this issue committed!