MySQL has a default max_allowed_packet size of 1MB. Memcache has a default slab size of 1MB.

This means that, without special configuration, any cache_set() of an item larger than 1MB may fail to be inserted into the cache. This is happening in the wild; see #218187: Views cache too large and #1011614: Theme registry can grow too large for MySQL max_allowed_packet and memcache default slab size for examples.
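For reference, both limits are configurable. A sketch of how an affected site might raise them (the values here are illustrative, not recommendations; SET GLOBAL requires the SUPER privilege, and the -I option requires memcached 1.4.2 or later):

```sh
# MySQL: check the current packet limit, then raise it for the running
# server. Persist it in my.cnf under [mysqld], e.g. max_allowed_packet = 16M.
mysql -e "SHOW VARIABLES LIKE 'max_allowed_packet'"
mysql -e "SET GLOBAL max_allowed_packet = 16777216"

# Memcached: the 1MB item/slab limit can be raised with -I.
memcached -I 8m -m 64
```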

#402896: Introduce DrupalCacheArray and use it for drupal_get_schema() just landed, which aims to let us cache things a bit more intelligently, but I think a note in the API documentation would make sense.

Comments

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Makes total sense to document this part.

dries’s picture

Status: Reviewed & tested by the community » Needs work

The last sentence doesn't read properly: 'take care ensure'.

catch’s picture

Status: Needs work » Needs review
Attached: new patch, 754 bytes

Re-rolled with some tweaks to the comment.

catch’s picture

Issue tags: +Performance

Tagging.

dawehner’s picture

Mh, this linebreak confuses the human brain.

Just from reading it, it seems fine again, but don't trust me.

brianV’s picture

+++ b/includes/cache.inc
@@ -119,6 +119,9 @@ function cache_get_multiple(array &$cids, $bin = 'cache') {
+ *   Some storage engines only allow objects a maximum of 1MB in size to be
+ *   stored by default. When caching large arrays or similar, take care to
+ *   ensure $data does not exceed this size.
  * @param $bin
  *   The cache bin to store the data in. Valid core values are 'cache_block',

Perhaps 'up to a maximum of 1MB'?


catch’s picture

Issue tags: +Needs backport to D7
catch’s picture

Re-rolled with 'up to a maximum of'.

catch’s picture

Attached: new patch, 760 bytes
berdir’s picture

Status: Needs review » Reviewed & tested by the community

This looks good now I think.

dries’s picture

+ *   Some storage engines only allow objects up to a maximum of 1MB in size to
+ *   be stored by default. When caching large arrays or similar, take care to
+ *   ensure $data does not exceed this size.

In a way, it would be better if the cache system handled this. Checking the size of an object or array could be really fast or really slow, depending on the PHP internals; I don't know. If slow, it would obviously be a bad idea. But if it is very fast, then it may be better to move that check into the cache system.

catch’s picture

The idea isn't to check the size every time you set $data (especially since 1MB isn't actually a hard limit; it depends on configuration), it's just to be careful not to dump massive amounts of data into the cache and assume it'll work. If you checked the size and then decided not to write to cache, that'd be worse than throwing errors, since it fails silently in that case. And the main issue with failing to write like this isn't the PDO exception, it's the cache miss on every request (usually the first thing to go is the theme registry cache, which takes about one second to build).

There's a patch against Memcache to log when writes fail and include the size of the item: #435694: >1M data writes incompatible with memcached (memcached -I 32M -m32). That patch isn't ready to go in yet, but we could potentially do something with a try/catch in individual cache backends.

Not sure if this means we should revise the comment again or not so leaving status as is for now.

sun’s picture

Status: Reviewed & tested by the community » Needs review

Note: Didn't read the issue, only the patch.

TBH, as someone not too familiar with alternative backends, I mostly know of MySQL's max_allowed_packet setting (which I'm not even sure is related here [?]); the comment doesn't really make clear to me when or how exactly I'm going to run into the issue, and lastly, what I'm supposed to do if users suddenly start to report bugs against my implementation...

However, I do understand that it's not really something we can do much about within Drupal -- except for perhaps a (vague) hook_requirements() check...?

So overall, if you really think that this phpDoc change is worth reading for someone unfamiliar with the problem space and helps to solve related issues, then let's get this back to RTBC. Otherwise, let's think twice.

---

Now that I read the issue, I see that max_allowed_packet is indeed related. However, my #1 trouble spot with that in the past was that you can easily exceed the default limit with core's very own menu system's menu tree cache alone. Whether you encounter the issue depends on how many modules you've installed and how many links there are in particular trees.

This documentation (phpDoc) is added in a place where it gets read and matters for Drupal developers - not really for webmasters and site builders.

I feel like I didn't manage to express my points clearly, but I hope you get the point. :-/

setvik’s picture

This issue just bit me in the butt. I have a lot of views, the sum of which was > 1MB. Every call to views_get_view() called hook_views_data() to rebuild all the views on the site, because cache_set() was failing without returning or logging any kind of error message.

in DrupalDatabaseCache::set(), the current error handling is:

catch (Exception $e) {
  // The database may not be available, so we'll ignore cache_set requests.
}

I'm on mysql and get the following for $e->errorInfo if I add some debugging code to DrupalDatabaseCache::set():

array(
  0 => '08S01',
  1 => 1153,
  2 => "Got a packet bigger than 'max_allowed_packet' bytes"
)

1153 appears to be the MySQL error code for exceeding the max_allowed_packet value.
PostgreSQL doesn't appear to have an equivalent max_allowed_packet limit (http://www.mail-archive.com/pgsql-general@postgresql.org/msg135090.html) and is apparently unaffected by large cache_set() attempts.

Would it make sense to add error handling here that checks for error 1153 (the MySQL max_allowed_packet error) and logs an error via watchdog() explaining that the cache_set() failed because max_allowed_packet was exceeded?

Or have DrupalDatabaseCache::set() return TRUE/FALSE (or the Exception object) depending on whether the cache_set() was successful or not?
(i.e. delegate the handling of this error to modules calling cache_set() that suspect they may exceed the default cache limit)...

I don't have my head wrapped around the implications of handling the error in the cache layer vs. the db layer vs. the module calling cache_set()...

But I think some kind of error handling (in addition to better documentation) would help users like me who wouldn't know where to look in the documentation but might think to check the error log. Should I open this as a separate issue?
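To make the suggestion in #14 concrete, here is a rough, untested sketch of what the catch block in DrupalDatabaseCache::set() could look like. watchdog() and the PDO errorInfo layout are real Drupal 7 / PDO APIs, but the exact wording, severity, and placement are hypothetical:

```php
try {
  // ... the existing merge query that writes the cache item ...
}
catch (Exception $e) {
  // MySQL error 1153: "Got a packet bigger than 'max_allowed_packet' bytes".
  if (isset($e->errorInfo[1]) && $e->errorInfo[1] == 1153) {
    watchdog('cache', 'cache_set() of %cid failed: the item exceeds the max_allowed_packet limit.', array('%cid' => $cid), WATCHDOG_ERROR);
  }
  // Otherwise the database may simply be unavailable, so ignore the failure
  // as before. Note that watchdog() itself writes to the database by default,
  // so it may also fail when the database is down.
}
```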

xjm’s picture

Attached: new patch, 780 bytes

Rerolled for core/.

valthebald’s picture

Status: Needs review » Needs work

If I was searching for the source of the cache slab size issue, documentation on cache_get_multiple() wouldn't be the first place I'd look. I would search the error log, the MySQL (or memcache) logs, the status report, then Google it, and only then check the core source code.
I think meaningful exception handling in DrupalDatabaseCache::set(), as suggested in #14, would be more helpful.

yesct’s picture

Status: Needs work » Needs review
Issue tags: -Performance, -Needs backport to D7

#15: cache-limit-1261846-15.patch queued for re-testing.

Status: Needs review » Needs work
Issue tags: +Performance, +Needs backport to D7

The last submitted patch, cache-limit-1261846-15.patch, failed testing.

sandipmkhairnar’s picture

Attached: new patch, 812 bytes

Rerolled for core

marvil07’s picture

Status: Needs work » Needs review

let testbot test

valthebald’s picture

I'm still not convinced that documenting the possible 1MB limit in the code is the best, or a sufficient, idea.
I would prefer either complementing or replacing it with proper exception handling as suggested in #14.

marvil07’s picture

Status: Needs review » Reviewed & tested by the community

As mentioned, different cache implementations will have different responses, so we can either log the failure (as suggested in #14) or report it back to the caller.

In both cases, it probably involves changing the cache API interface, unless it is done internally, which is not an option for D8 unless we change all cache backends to use a recovery method (we do not have a central place, i.e. function, to change).

So, I agree we have to fix that, but a hint in the documentation cannot really hurt, so I'm marking this RTBC and I'll open a follow-up to figure out a solution, probably an extra method on the cache interface to act on the set failure state, though I guess that will involve an API change asking every place where set() is called to check whether it happened correctly.
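For the follow-up, one possible shape of such a change (purely hypothetical; core's DrupalCacheInterface::set() currently returns nothing) would be having set() report success so callers can react to oversized writes:

```php
// Hypothetical sketch only, not the actual core API.
interface SizeAwareCacheInterface {

  /**
   * Stores data in the cache.
   *
   * @return bool
   *   TRUE if the item was stored, FALSE if the backend rejected it, for
   *   example because it exceeds the backend's maximum item size.
   */
  public function set($cid, $data, $expire = CACHE_PERMANENT);

}
```

The trade-off marvil07 describes is that every existing caller would then be expected to check the return value, which is the API change in question.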

yesct’s picture

@marvil07 yeah. go ahead and make the follow-up now. We'll discuss there the actual approach to take.

webchick’s picture

Version: 8.x-dev » 7.x-dev
Component: cache system » documentation
Status: Reviewed & tested by the community » Patch (to be ported)

I was going to assign this to catch, but since it's catch's patch that won't really work. ;)

I can understand some of the concerns raised above about whether or not the people to whom this applies will ever actually see it, but OTOH I certainly don't think adding these docs by itself hurts anything. So this looks good to go to me. Changing the component to documentation in case jhodgdon wants to retroactively chime in here and/or roll back the patch, though.

Committed and pushed to 8.x. Marking for 7.x backport. But let's make sure we get that follow-up created to deal with this problem in a more holistic manner.

yesct’s picture

Status: Patch (to be ported) » Reviewed & tested by the community

#1945712: Fix 1MB maximum size limit for cache_set() created. (just real quick. needs fixing the issue summary there.)

yesct’s picture

Status: Reviewed & tested by the community » Patch (to be ported)
jhodgdon’s picture

Formatting -- the new docs should have been wrapped to 80-char lines with the existing docs, but that's not a huge deal. The docs themselves look reasonable to me.

star-szr’s picture

Issue tags: +Novice

The backport would be a great novice task.

chrisjlee’s picture

Status: Patch (to be ported) » Needs review
Attached: new patch, 755 bytes

Docs backport attempt. First backport yay.

star-szr’s picture

Thanks @chrisjlee! I'm thinking the D7 backport should also patch DrupalCacheInterface::set() (also in cache.inc) which has roughly the same docs for the $data param, but would like confirmation on that before we roll another patch.

chrisjlee’s picture

Attached: new patches, 720 bytes and 1.31 KB

@Cottser: Yeah, that makes sense. I went ahead and patched that anyway, just in case.

star-szr’s picture

@chrisjlee - Sure, makes sense, either patch can be committed. Thanks!

jhodgdon’s picture

Status: Needs review » Needs work

The wrapping in this patch needs to be fixed. For example:

  *   serialized before insertion. Strings will be stored as plain text and are
  *   not serialized.
+ *   Some storage engines only allow objects up to a maximum of 1MB in size to
+ *   be stored by default. When caching large arrays or similar, take care to

If this is a new paragraph, leave a blank line. If it's not, wrap it with the previous text.... Oh wait, it's part of a @param documentation -- so wrap with the previous text.

elijah lynn’s picture

amitgoyal’s picture

Status: Needs work » Needs review
Attached: new patches, 1.34 KB and 901 bytes

Please review the updated patch with fixes as per #33.

jhodgdon’s picture

Status: Needs review » Needs work

This paragraph needs some wrapping attention too:

    *   Strings will be stored as plain text and not serialized.
+   *   Some storage engines only allow objects up to a maximum of 1MB in size to
+   *   be stored by default. When caching large arrays or similar, take care to
+   *   ensure $data does not exceed this size.
elijah lynn’s picture

Status: Needs work » Needs review
Attached: new patches, 2.46 KB and 2.22 KB

Here is a wrapping re-roll, I also added some (optional)s to two of the params.

jhodgdon’s picture

Status: Needs review » Needs work

Really?

+   * @param int $cid
    *   The cache ID of the data to store.

I do not think $cid is really an integer, is it? Aren't cache IDs strings? Anyway, let's please just keep this patch to the issue at hand.

pushpinderchauhan’s picture

Status: Needs work » Needs review
Attached: new patches, 425 bytes and 2.37 KB

IMO, there is no need to declare the $cid datatype; leaving it out keeps it consistent with the other parameters in the same function, i.e. $data and $expire.

Also, $cid is a string, not an int.

Please review updated patch.

jhodgdon’s picture

Status: Needs review » Reviewed & tested by the community

Thanks, this looks fine.

elijah lynn’s picture

Thanks, I didn't realize that, learned something new!

jhodgdon’s picture

Status: Reviewed & tested by the community » Fixed

Thanks all! Committed to 7.x.

  • jhodgdon committed 90de4ec on 7.x
    Issue #1261846 by catch, dawehner, brianV, Berdir, sun, xjm,...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.