If the strlen() of $full_key is greater than 250, the resultant key send to memcached can contain non-safe characters due to the absence of urlencode wrapping $prefix and $bin in dmemcache.inc on line 456.

See below:

  if (strlen($full_key) > 250) {
    $full_key = $prefix . $bin . '-' . sha1($key);
  }

In addition, this logic does not account for the combined prefix being greater than 210 bytes long, which would cause the length of $full_key to be greater than 250 bytes.

Lastly, there is no watchdog support to help administrators determine that their prefix or bin names are too long.

I have a patch that I am putting the finishing touches on right now and will attach to this issue a bit later this morning.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tzakrajs’s picture

Status: Needs work » Needs review
FileSize
2.52 KB

I have included the aforementioned patch file based on the latest development version in the git repo.

Let me outline what is included:

1. If the size of the entire urlencoded key is greater than 250 bytes go to 2, else go to 6.
2. If the size of the urlencoded prefix (ex: 'prefix-bin-') is greater than 210 bytes, go to 3, else go to 6
3. If the size of the urlencoded prefix is equal to or greater than 250 bytes, go to 4, else go to 5
4. Invoke watchdog as WATCHDOG_ERROR with "Key prefix and bin are too long to allow unique key" go to 6
5. Invoke watchdog as WATCHDOG_WARNING with "Key prefix and bin are too long to allow full SHA1 key" go to 6
6. Truncate the entire unencoded key character-by-character until the entire encoded key is not greater than 250 go to 7
7. Return the entire encoded key.

tzakrajs’s picture

BMDan’s picture

tzakraj's patch should have the following line deleted:

+    if ($encoded_key) { echo 'heh'; }

It is just cruft left over from testing. That minor issue notwithstanding, I've reviewed the patch's logic and offer my +r, for whatever it's worth.

SocialNicheGuru’s picture

the patch fails:

git apply _m*patch
fatal: corrupt patch at line 61

patch -p1 < _m*h
patching file dmemcache.inc
patch unexpectedly ends in middle of line
Hunk #1 FAILED at 439.
1 out of 1 hunk FAILED -- saving rejects to file dmemcache.inc.rej

BMDan’s picture

Here's an updated patch, fresh against a git clone. Note that this applies against 7.x-1.x, not master. If you'd like me to reroll against master, let me know.

BMDan’s picture

Assigned: tzakrajs » Unassigned

Unassign; tzakrajs is no longer available.

Patch needs review.

markpavlitski’s picture

Status: Needs review » Needs work

This patch cuts characters off the end of the key until it is 250 bytes or under. As the prefix length increases, the number of unique bytes available for the key decreases, until all keys are just the first 250 bytes of the prefix.

With a key prefix over 210 bytes, you will experience random WSODs where cache collisions are occurring.

I'd suggest trimming characters from the end of the key prefix instead of the key.

I'd also suggest using the register_shutdown_function() method for recording error messages, as watchdog may not yet be functional.

markpavlitski’s picture

Status: Needs work » Needs review
FileSize
3.36 KB

Attached patch includes the following changes:

  • Adds static caching of encoded keys. On a fresh D7 install I got as many as 23 calls for the same key during a cache rebuild.
  • Logs errors using the register_shutdown_function() technique.
  • Picks the smaller of the original key and sha1 encoded key, rather than assuming sha1 is smaller.
  • Trims the prefix rather than the key itself, to ensure the key remains unique.
markpavlitski’s picture

The attached patch takes a much simpler approach.

If the key is over 250 bytes, the fully prefixed key is hashed with sha256 (rather than just the cid).

This will guarantee that the key is always under 250 bytes, while still ensuring there aren't any key collisions.

There is also an update to README.txt to state that the total key length must be under 250 characters.

Jeremy’s picture

Status: Needs review » Needs work
FileSize
833 bytes

I like your simpler approach.

However, I don't like your change of hash algorithms. If anything, I'd consider changing to md5 instead of sha1 for performance reasons. Collisions with md5 would be possible, but highly unlikely with cid's imo. That said, in most cases we're probably hashing infrequently enough it doesn't matter which algorithm we use, but still, I see no advantage to sha256 in this use case.

To test, I wrote a quick script that hashes the same text strings 100,000 times with crc32, md5, sha1, sha256 and sha512 for comparison. It does this with increasingly large text strings -- results:

Hashing 32 characters...
  crc32: 0.29590201377869 seconds.
  md5: 0.36474585533142 seconds.
  sha1: 0.43390107154846 seconds.
  sha256: 0.7034809589386 seconds.
  sha512: 0.81006789207458 seconds.

Hashing 64 characters...
  crc32: 0.3405921459198 seconds.
  md5: 0.45874977111816 seconds.
  sha1: 0.55376219749451 seconds.
  sha512: 0.8125 seconds.
  sha256: 1.0382559299469 seconds.

Hashing 128 characters...
  crc32: 0.44771695137024 seconds.
  md5: 0.55433988571167 seconds.
  sha1: 0.65940308570862 seconds.
  sha512: 1.274346113205 seconds.
  sha256: 1.3972609043121 seconds.

Hashing 256 characters...
  crc32: 0.67021608352661 seconds.
  md5: 0.73162698745728 seconds.
  sha1: 0.9186680316925 seconds.
  sha512: 1.7371129989624 seconds.
  sha256: 2.1004288196564 seconds.

Hashing 512 characters...
  md5: 1.1053559780121 seconds.
  crc32: 1.1151471138 seconds.
  sha1: 1.4261789321899 seconds.
  sha512: 2.6576089859009 seconds.
  sha256: 3.536062002182 seconds.

Hashing 1024 characters...
  md5: 1.8577461242676 seconds.
  crc32: 2.0025329589844 seconds.
  sha1: 2.4622428417206 seconds.
  sha512: 4.4836809635162 seconds.
  sha256: 6.4158470630646 seconds.

Hashing 2048 characters...
  md5: 3.335862159729 seconds.
  crc32: 3.7688059806824 seconds.
  sha1: 4.4825189113617 seconds.
  sha512: 8.3543820381165 seconds.
  sha256: 12.189244031906 seconds.
markpavlitski’s picture

Status: Needs work » Needs review
FileSize
895 bytes

@Jeremy It looks like leaving as sha1 is the safest option.

Jeremy’s picture

Version: 7.x-1.x-dev » 6.x-1.x-dev
Status: Needs review » Patch (to be ported)

I cleaned up the patch by adding documentation, adding a test, and making the hashing algorithm configurable:
http://drupalcode.org/project/memcache.git/commit/1d4cd8b

Needs to be ported to 6.x-1.x.

markpavlitski’s picture

Status: Patch (to be ported) » Needs review
FileSize
3.33 KB

D6 patch.

BMDan’s picture

It strikes me that this line:

$full_key = urlencode(hash(variable_get('memcache_key_hash_algorithm', 'sha1'), $prefix . $bin . '-' . $key));

should be:

$full_key = urlencode(hash(variable_get('memcache_key_hash_algorithm', 'sha1'), $full_key));
Fabianx’s picture

Issue summary: View changes

I would propose the following both for D6 and D7:

diff --git a/dmemcache.inc b/trunk/dmemcache.inc
--- a/dmemcache.inc
+++ b/dmemcache.inc
@@ -403,7 +403,8 @@ function dmemcache_key($key = '', $bin = 'cache', $reset = FALSE) {
   // a longer key, hash it with sha1 which will shrink the key down to 40 bytes
   // while still keeping it unique.
   if (strlen($full_key) > 250) {
-    $full_key = $prefix . $bin . '-' . sha1($key);
+    $full_key = urlencode($prefix . $bin) . '-' . sha1($key);
+    $full_key .= '-' . substr(urlencode($key), 0, 250 - strlen($full_key) - 1);
   }
 
   return $full_key;

That should ensure the length is correct, but still give visibility to what is being hashed ...

  • Jeremy committed fc5df69 on 7.x-1.x authored by Fabianx
    Issue #1613622 by Fabianx, Jeremy: Include as much of key bin and name...
Jeremy’s picture

Thanks, committed to the 7.x branch after fixing an off-by-one error and improving the documentation. I also exposed memcache_key_max_length so anyone preferring to enforce shorter key lengths to optimize network traffic can do so.
http://cgit.drupalcode.org/memcache/commit/?id=fc5df69

This still needs a backport to 6.x-1.x-dev.

Jeremy’s picture

Status: Needs review » Patch (to be ported)
Jeremy’s picture

Status: Patch (to be ported) » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.