Problem/Motivation

We use a build strategy that keeps all builds, and symlinks the Drupal root to the latest build directory. This means that Xautoload's cache continues to load files from older builds, as those files still exist and therefore do not cause cache misses.

Proposed resolution

To be determined.

Remaining tasks

To be determined.

User interface changes

To be determined.

API changes

To be determined.

Data model changes

To be determined.

Intermediate workaround

See #15 and #18
The "smart database cache" seems to be unaffected by this. It is not shared between sites, and it is cleared every time it should.

As a cheap workaround, you could disable the other caches and only enable the "smart db cache".
Performance should actually be OK, if not better than with some of the other caches.

Comments

Xano created an issue. See original summary.

donquixote’s picture

In Drupal\xautoload\CacheManager\CacheManager, I find this code:

  static function create(DrupalSystemInterface $system) {
    $prefix = $system->variableGet(XAUTOLOAD_VARNAME_CACHE_PREFIX, NULL);
    $manager = new self($prefix, $system);
    if (empty($prefix)) {
      $manager->renewCachePrefix();
    }
    return $manager;
  }

The prefix is what APCu and other caches use to distinguish different Drupal instances.

Maybe it would be better to store the cache prefix in Drupal's cache, instead of the variable system. This way, it would be wiped automatically on cache clear.

And/or we could use the Drupal hash salt in some way… although I too often copy a settings.php to a different site, leaving the hash salt unchanged.

---------

This being said, maybe in your situation it helps to just restart the web server? Would this wipe APCu? I'm not sure.

donquixote’s picture

I find that renewCachePrefix() is called from Main::flushCache(), but that method is not called from anywhere.

So I am a bit surprised that no one has reported any problems with this so far.

donquixote’s picture

And I wonder if we should be spamming the APCu cache with more and more cache prefixes, or if we should wipe it instead.

donquixote’s picture

If we copy the database from one site to another, and the prefix is still in the database, then two sites would be using the same prefix.

Xano’s picture

What about responding to hook_flush_caches()? It's not really what the hook is meant for, but it is a somewhat weird hook and it's hard to imagine other code than drupal_flush_all_caches() invoking it.

donquixote’s picture

Yes.
I think Main::flushCache() was originally meant to be called from hook_flush_caches().
I don't know or remember why this is not happening.

This is how it could be done:

function xautoload_flush_caches() {
  // Implements hook_flush_caches().
  xautoload()->flushCache();
}

But I am a bit worried about garbage piling up from leaving old instances in the APCu cache.
Should we also call apcu_clear_cache()?
Doing this would also clear the cache for other Drupal instances… but maybe it is necessary.

donquixote’s picture

I found that for APCu and APC it is possible to delete only keys with a specific prefix, instead of deleting all keys. This might be better than changing the prefix each time and letting cached values pile up.
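
For illustration, a targeted delete could look roughly like this ($prefix stands for whatever prefix is in use; the snippet is a sketch, not code from the module):

// Delete only the entries whose keys start with our prefix.
if (function_exists('apcu_delete') && class_exists('APCUIterator')) {
  apcu_delete(new APCUIterator('/^' . preg_quote($prefix, '/') . '/'));
}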

But we need to make sure that multiple sites do not share the same prefix when the database was copied.

Which indicators can we use to distinguish between different sites?
Maybe:
- $drupal_hash_salt?
- Site path? (e.g. 'sites/default')
- DRUPAL_ROOT ?
- realpath(DRUPAL_ROOT) ?

See also #2755243: Multiple Sites, PHP Fatal Errors

donquixote’s picture

Category: Feature request » Bug report

Changing to bug report, since this can cause problems.

fuzzy76’s picture

hook_flush_caches() integration would be great. Just had an issue with a site not noticing changes to the code.

donquixote’s picture

I think for APCu we can flush selectively by prefix / wildcard.

For other cache types we can either flush system-wide, or we can change the cache cid each time, which can result in leftover data piling up in the cache.

@fuzzy76 ideas?
I will do this once we have a clear plan.

fuzzy76’s picture

Personally, I'd be fine with system-wide flushing.

donquixote’s picture

@fuzzy76: OK. I think from the available options this is the best choice.

The other question is: how do we prevent a cache prefix from being shared between copies of the site?
We could use the hash salt, or the site path, or both?
Currently it is a variable that is generated once and then kept. Not a good idea.

fuzzy76’s picture

Unfortunately, I don't have a deep enough understanding of this module to have an opinion on that. :) OTOH, I don't think the issue I'm having right now would benefit anyway. It was more of a duplicate of #2755243: Multiple Sites, PHP Fatal Errors

donquixote’s picture

I am working on it.

But I want to give a quick tip: I expect that the "smart database cache" does not suffer from this issue.
Because it uses Drupal's own cache, it will be cleared whenever the other caches are cleared.

donquixote’s picture

Regarding the cache clearing, I see the following challenges:

  • We do not always have access to all cache types. E.g. in a drush/cli process, we might not have access to the APCu cache. Then the next web request would still get the old data from APCu. So clearing the cache directly is not always an option.
  • Changing the cache prefix all the time would leave old data in memory, possibly spamming APCu memory over time. Not cool.
  • Multiple requests can happen simultaneously, and they might all want to write to the cache at different times during the request.
  • A long-running request with outdated PSR-4 mappings might still be running after the cache has been cleared in another request. And it might write outdated stuff to the cache.
  • Some caches, such as APCu, are shared between Drupal installations on the same machine. Other caches, e.g. the database cache, are unique to one Drupal installation.
  • Some caches act like key/value stores, where each write operation is for one individual class/file.
    Others (the "queued" ones) treat the entire map as one big array.

The approach I am currently following is two-fold:

  1. "unique site hash" to distinguish Drupal installations on the same machine.
  2. Invalidation instead of deletion.

The below is a "mind log" and might not reflect the solution I will finally choose.

The "unique site hash"

A "unique site hash", an md5() of a concatenation of:
- DRUPAL_ROOT
- conf_path()
- $GLOBALS['drupal_hash_salt']
- a uniqid() generated once and then stored in a variable, similar to the current 'xautoload_cache_prefix'.
- serialize($GLOBALS['databases'])

This may be a bit overkill. I am totally submissive to my own paranoia here.
E.g. the drupal_hash_salt should be unique in theory, but people do copy settings.php around, and then it is no longer unique.

This unique site hash can be expected to
- be unique per site, even if some parts were copied.
- survive cache clear, to prevent spamming memory with leftover data, but
- change whenever significant parameters of the site (db connection, directory, hash salt) are modified.
- be "salty" enough to not leak its individual components.
- be quick enough to compute in every request without any relevant performance impact.

This unique site hash is then used as a cache prefix in those caches that are shared across Drupal installations.
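
A minimal sketch of how such a hash could be computed; the function and variable names here are made up for illustration:

function _xautoload_unique_site_hash() {
  // One-time random component, stored similar to the current 'xautoload_cache_prefix'.
  $uniqid = variable_get('xautoload_site_uniqid', NULL);
  if (empty($uniqid)) {
    $uniqid = uniqid('', TRUE);
    variable_set('xautoload_site_uniqid', $uniqid);
  }
  // Concatenate the site parameters and hash them.
  return md5(implode('|', array(
    DRUPAL_ROOT,
    conf_path(),
    $GLOBALS['drupal_hash_salt'],
    $uniqid,
    serialize($GLOBALS['databases']),
  )));
}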

Invalidation

The idea is to have a system-wide variable (via variable_get()) that is renewed with uniqid() on cache clear.
This one is NOT used as a cache prefix.
Instead…

For "key/value store" caches:
- When the cached class loader is initialized, we load one specific "signature" key and compare it to the current value of the variable mentioned above. If it is empty or it does not match, we wipe the cache and write the variable into the "signature" key.
- The rest stays as it is.
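
Roughly, at class loader initialization (the $cache wrapper with get()/set()/wipe() is hypothetical):

$current = variable_get('xautoload_cache_signature', NULL);
$stored = $cache->get('signature');
if (empty($stored) || $stored !== $current) {
  // The cache was cleared elsewhere, or was never signed: wipe and re-sign.
  $cache->wipe();
  $cache->set('signature', $current);
}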

For "queued" caches where we store the entire map as one array:
- Whenever we write to the cache, we "sign" the cache with the value of this variable.
- Whenever we first load the cache in a request, we check if the "signature" still matches the current value of the variable. If not, we regard the value as outdated.
- Whenever we load the cache again later in the same request, we check the "signature" again. If it no longer matches, it means that another request has changed it, and we are lagging behind.
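
In sketch form, again with a hypothetical $cache wrapper:

// On write: sign the entire map with the current value of the variable.
$cache->set('classmap', array(
  'signature' => variable_get('xautoload_cache_signature', NULL),
  'map' => $map,
));

// On read: regard the data as outdated if the signature does not match.
$data = $cache->get('classmap');
if (!is_array($data) || $data['signature'] !== variable_get('xautoload_cache_signature', NULL)) {
  $map = array();
}
else {
  $map = $data['map'];
}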

donquixote’s picture

The "invalidation" stuff still has the flaw that long-running requests can write to the key/value store.
We cannot afford to check the "signature" on every write operation.

Alternative:

At least for the key/value caches, we should change the prefix AND wipe existing data.
But what about e.g. "drush cc all", if drush cannot access APCu? (I don't even know if this is an issue, but I think we have to consider that at least some cache types will have this problem.)
This is why we still need invalidation, as described above. But also change the prefix. And wipe on signature mismatch.

Maybe we should also change prefix for the "queued" caches.
If we also wipe on signature mismatch, we prevent the leftover data spamming effect.

This all said, we still need to distinguish different sites / installations that were copied.
I'd say it should be allowed to copy a site without even clearing the cache.
So the cache prefix will consist of the "unique site hash" + the "signature" that we renew on every cache clear.

Wiping should always wipe ALL data for the unique site hash, not just for the current "signature".
If this targeted wiping is not supported by the cache type, we simply wipe system-wide.
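
Putting the pieces together, roughly (reusing the hypothetical helpers from above):

// Stable per-site part, plus a part that is renewed on every cache clear.
$site_hash = _xautoload_unique_site_hash();
$signature = variable_get('xautoload_cache_signature', NULL);
$prefix = $site_hash . ':' . $signature . ':';

// Wiping targets everything under the site hash, across all signatures.
if (class_exists('APCUIterator')) {
  apcu_delete(new APCUIterator('/^' . preg_quote($site_hash, '/') . '/'));
}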

fuzzy76’s picture

Changing to the database cache fixed my problem. :)

Just be aware that Aegir changes the path to the site every time you migrate platforms, which you (ideally) do every time you install security fixes and module updates. So anything working on full file paths is likely to change regularly.

donquixote’s picture

Just be aware that Aegir changes the path to the site every time you migrate platforms, which you (ideally) do every time you install security fixes and module updates. So anything working on full file paths is likely to change regularly.

This will also restart the web server and clear APCu, right?
(Not sure if we can depend on this for all other cache types)

donquixote’s picture

What if we gave up on all the other caches and only used the "smart db cache"?
(Btw the name is a bit misleading, because this does not necessarily use the database, but whichever cache is wired up to be used by Drupal.)

The original reason for the other caches was that I wanted a key/value store, similar to Symfony's ApcClassLoader or XcacheClassLoader.

But it turns out that the incremental approach is likely even better.
We can expect from it the same performance as Composer's compiled class map. Or even better, because it omits class files that are not used.
But it is still as flexible as the key/value caches. Which means that you can work on your code without having to manually recompile the class map each time.

The only cost is that it takes O(log n) requests until the cache is completely "hot". Which is nothing.
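
A much simplified sketch of the incremental idea (illustrative only, not the module's actual API):

class IncrementalClassLoader {
  protected $map;
  protected $missed = array();

  function __construct() {
    // Load the class map built up by previous requests.
    $cached = cache_get('xautoload_class_map');
    $this->map = $cached ? $cached->data : array();
  }

  function loadClass($class) {
    if (isset($this->map[$class])) {
      require $this->map[$class];
    }
    elseif ($file = $this->discover($class)) {
      // Cache miss: remember the mapping for the write-back below.
      $this->missed[$class] = $file;
      require $file;
    }
  }

  function writeBack() {
    // At the end of the request, merge the new discoveries into the stored map.
    if ($this->missed) {
      cache_set('xautoload_class_map', $this->missed + $this->map);
    }
  }

  protected function discover($class) {
    // Stand-in for the regular PSR-0 / PSR-4 plugin lookup.
    return FALSE;
  }
}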

This said, I am still investigating and working on the approach that I explained before.

donquixote’s picture

The only cost is that it takes O(log n) requests until the cache is completely "hot". Which is nothing.

We could reduce this number by using a shutdown function, like the registry class loader in Drupal 7 core does.
So far I had the paranoia that this is generally more fragile than what xautoload is doing. E.g. what if another shutdown function needs more classes?
So I think the O(log n) is actually fine.

fuzzy76’s picture

As for #20, I believe it does for the default installation method, which is Apache with mod_php. But if you run PHP with FastCGI/FPM, the FastCGI daemon won't be restarted, so APCu probably won't be affected.

donquixote’s picture

Here is an attempt to fix this stuff.

donquixote’s picture

To test this, you would need to:

git remote add github https://github.com/donquixote/drupal-xautoload.git
git fetch github
git checkout 7.x-5.x-2627228-24-flush-caches

donquixote’s picture

I apologize for the overengineered architecture.
Maybe I will improve this.

bburg’s picture

Just adding some SEO sauce to this issue:

drupal capistrano cannot redeclare function node_help

This is a very difficult issue to pinpoint. I hope we get a resolution for it soon.

bburg’s picture

Running into this again, and I realized I never posted an update. The problem is that the site runs into errors with core files, and Drupal is never able to bootstrap far enough to run the processes needed to clear the xautoload cache. We ended up creating this alias:

alias truncatecache='drush sqlq "truncate cache; truncate cache_bootstrap; truncate cache_entity_file; truncate cache_entity_node ;truncate cache_entity_profile2 ;truncate cache_entity_registration_state ;truncate cache_entity_registration_type ;truncate cache_entity_taxonomy_term ;truncate cache_entity_taxonomy_vocabulary ;truncate cache_entity_user ;truncate cache_field ;truncate cache_filter ;truncate cache_libraries ;truncate cache_menu ;truncate cache_metatag ;truncate cache_page ;truncate cache_path ;truncate cache_rules ;truncate cache_token ;truncate cache_variable ;truncate cache_views ;truncate cache_views_data ;"'

Which worked at the time, but here I am deploying tonight, and it's not working this time.

bburg’s picture

I am able to force clearing the cache by temporarily adding return array(); at the beginning of DbCacheClassLoader::loadClassFiles(), clearing the cache, and then removing the line again.

Not ideal, but it will help get a hosed site back up and running.
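
For reference, the temporary change looks roughly like this (the method signature is assumed, not copied from the module):

protected function loadClassFiles() {
  // Temporary hack: pretend the cache is empty, forcing a full rebuild.
  return array();
  // ... original method body ...
}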

donquixote’s picture

@bburg Thanks for putting this back on my radar!
Sorry, I don't remember all the details of this issue; I need to read it again.

If I understand correctly, the problem occurs only on projects that share the same cache on different environments or symlinked build directories, right?

One thing you could try:
drush vdel xautoload_cache_prefix
or
DELETE FROM variable WHERE name = 'xautoload_cache_prefix';

I should review this logic again.
Perhaps the variable table is the wrong place to store this string.

To create a more permanent solution for this, it would be useful to have an "invariant" that can tell xautoload whether something has changed. I wonder which pieces of information need to be part of this invariant.

I was thinking of $drupal_hash_salt, but this may not be enough, especially in the cases discussed here.

donquixote’s picture

Would realpath(DRUPAL_ROOT) be a good invariant?

bburg’s picture

@donquixote: That would probably work for the varying build paths. We still use Capistrano, so real paths include a date stamp in the directory, e.g.

releases/20191113145647

The directory structure looks something like:

current -> /var/www/vhosts/site.www/releases/20191113145647
public -> /var/www/vhosts/site.www/current/public

Every build adds a new date-stamped directory in releases, and updates the symlinks to point to the new codebase.

You would likely have issues on Pantheon as well, which uses service container IDs in its paths, and those can vary.

bburg’s picture

Also, running

delete from variable where name like "xautoload_cache_prefix";

Doesn't resolve my duplicate function issues.