Problem/Motivation
YAML parsing is still one of the slowest components of the current cold cache performance.
This especially effects cache clears (e.g. container rebuilds or module installation), the peak memory usage and cache clears.
e.g.
Looking at profile of standard install we call FileStorage::read 1772 times but we only have 1244 yml files in all of core, so this would indicate that some files are being read and parsed twice.
Proposed resolution
- Files can be cached perfectly based on their mtime using a FileCache class
- The best backend to cache files in is APCu, but PHPStorage works too (for e.g. shared hosts not having APCu).
Benefits:
- Installation is around 10%-20% faster with FileCache.
- Testbot consistently goes down from 15.4 minutes to 14.2 seconds, so we shave off a minute for core testing
- After module installation the container rebuild and YAML parsing time for all plugin data is saved (5-7% performance improvement, 38 k less calls)
- Saves 18% performance, 152k less calls for container rebuild
- Allows pre-compiling YAML and other data
Remaining tasks
- Review
- Commit
User interface changes
- None
API changes
- Add FileCache component / class that is a generic low-level class to cache files, has static caching by default
- Add FileCacheBackendInterface based on the APCu interface to allow some pluggability and especially not hard-code us to APCu, which has drawbacks, too.
- Add FileCacheFactory class to add some pluggability similar to PhpStorageFactory
| Comment | File | Size | Author |
|---|---|---|---|
| #210 | 2395143.210.patch | 31.23 KB | alexpott |
| #210 | a-interdiff-of-sorts.patch | 576 bytes | alexpott |
| #205 | Screen Shot 2015-04-22 at 13.05.03.png | 336.56 KB | alexpott |
| #199 | 2395143-198-followup-followup.patch | 1.89 KB | znerol |
| #190 | 2395143-190-followup.patch | 568 bytes | amateescu |
Comments
Comment #1
larowlanComment #2
larowlanComment #3
dawehnerI'm not sure whether this is the best idea, given that the continues container rebuild might reset the file storage object.
I remember that @alexpott had some ideas about actual cached YML files, maybe there is an isse out there for it?
Comment #4
berdirNot sure how much this will help.. Maybe when installing with drush as it is a single request, but not when installing through the web UI.
What I was thinking about is a cache that uses apcu with a file modification date so that we can keep returning the parsed result.
Also note that we have other .yml files that are loaded differently.
Installing my install profile, I'm seeing 14k calls to Yaml::decode, 7.5k from FileStorage and 6.9k from YamlDiscovery (83 router builds).
Also, 93% of FileStorage parsing is config schema :( For that, I had that idea about instead of rebuilding the config schema every time we install a new module, that we can update it with just the files from one module.
Yaml::decode() took 100s, which is 30% of the whole install.
Comment #5
berdirQuick (and ugly and undocumented) proof of concept, not yet tested in my install profile, but with the standard profile:
HEAD
Patch:
So a 20% improvement when neither xdebug nor uprofiler are enabled, without using more memory.
(uprofiler documentation page says that speedstep is messing with their timers, need to have a closer look at that, see http://superuser.com/questions/454101/is-there-a-way-to-disable-intel-sp...)
Those numbers again nicely show the overhead (edit: of uprofiler/xhprof) of many function calls. HEAD does 40% more function calls according to uprofiler report.
Not yet tested with my custom install profile, I actually expect the difference there to be bigger, as the number of file reads grow with both more modules (more rebuilds), which also result in more files. Yaml::decode() was about the same percentage, so maybe. There is a lot of other stuff going on there as well (like default content).
Note: because apc cache is per process, every drush install command still has to parse all the files once, but that's just 4% instead of 38% (when uprofiler is enabled).
Comment #6
berdirWe can probably also use ApcuFileCache for other things, like annotions, AnnotatedClassDiscovery::getDefinitions() currently takes 2.5s/7% (with the patch applied).
Comment #7
chx commentedWell, the DependencyInjection YamlFileLoader was already doing this so this makes perfect sense.
Comment #8
anavarreComment #9
Anonymous (not verified) commentednice! love this idea.
Comment #10
catchLike this a lot.
Comment #13
berdirUh. That smells like php file stats caching... :(
Comment #14
berdirNo, it was just me being stupid :)
Here's an improved version.
- Tests should be fixed (I forgot to update the static when a file got written written)
- added function exists checks for set/delete
- I changed it so that the static cache is always used.
- I'm also using this for the annotation parsing, extension discovery and the yaml file loader (replaced the static cache, which allows us to remote the reset()).
Comment #16
moshe weitzman commentedIn this example, it looks like we are saving a YAML:decode for each subsequent file read which is good but ApcuFileCache::get() has to internally deserialize so I wonder how much we are saving. We have seen apc_fetch() come up on flame graphs. I see benchmarks in #5. Would be good to redo those now that the patch is more mature.
Comment #17
berdirI didn't do benchmarks but drush si was another 0.3 seconds faster, but that obviously didn't hit apc, just the static cache.
Yes, apc_fetch has some overhead, but yaml parsing has a lot of that. Fairly convinced that this is faster. We could also try to build getMultiple() and setMultiple() versions, but this could be quite a bit bigger.
Comment #18
berdirThose fails are related to the caching in the annotation parser, weird stuff is happening I think when we cache objects because they are suddenly by reference and changes survive rebuilds. Tried to serialize them to get rid of the references problem but no luck yet.
@beejeebus also has ideas for a getMultiple() implementation that uses a single apc_fetch(), he said he will post a proof of concept here.
I think the performance gains here make this major...
Comment #19
larowlanWe'll need a hook requirements here too right?
Comment #20
berdirFor what? apcu is optional, we're not going to require it. Unless you want to make it a recommendation?
For drush, it won't even make a difference, because it just lives as long as the drush process lives. And the current implementation uses the static cache even if apcu is not available.
Comment #21
Anonymous (not verified) commentedhere's a patch that just fixes the !function_exists('apc_fetch') bit, leaving the rest of the patch as is to see what the bot thinks.
multi-get code is still in process, i fell down a rabbit hole looking ExtensionDiscovery and had to drink my way out.
Comment #22
Anonymous (not verified) commentedComment #24
yched commentedThis looks promising, but naming this "FileCache" is a bit confusing. This is doesn't cache the actual file content, but arbitrary PHP data "related to" (e.g. extracted from) a file.
Some consuming code uses this to cache the structures parsed from annotations in PHP class files,
Some consuming code uses this to cache the structures parsed from YAML files...
But the data being stored, and how it derives from the file, is entirely up to the consuming code. Thus, it seems there should be at least a notion of separate bins or namespaces, since there could be cases where different consumers might want to store different data about the same file ?
Comment #25
Anonymous (not verified) commentedhere's a first go at multiget.
it's ugly, and there are some bits to make tests pass i don't like.
re. #24 - yes, i guess that could be true.
Comment #26
larowlan@berdir yeah I didn't have php-apc on my 5.4 VM so it caused a fatal, but I think @beejeebus has fixed (although I have it installed now)
Comment #28
Anonymous (not verified) commentedberdir pointed out a silly typo:
Comment #30
berdirInterdiff is against my last patch, included the getMultiple() but changed a few things.
I removed the caching from the annotation parser, somehow that doesn't work yet, so trying to get this passing again without that. But fixed the yaml loader parsing, forgot the set() call. Down to 18s now with this, without the annotation parsing.
Apart from that, I also have the idea mentioned in the @todo. that we can cache built definition objects instead of just the parsed yaml in the YamlFileLoader, might be a bit ugly, though ;)
Comment #32
berdirFixed the test fails.
the PermissionHandlerTest is a bit annoying, that is using the vfs:// stream wrapper, but realpath() doesn't support stream wrappers (don't ask), so the cache id was always just the prefix and we had fun overlaps.
Comment #34
berdirHm, those tests worked for me, could be that I was running them in the UI.
Test run time was 15m30, HEAD seems to be at 18m right now, so that seems to save 2m30.. not bad.
Comment #36
babruix commentedConfirming, those test passed on my local, too.
Comment #38
berdirSo the problem is that filemtime is only in seconds. So if a file is written multiple times *and* read *and* that happens in a different context (cli vs web requests) so that set() doesn't work.
The test works from the UI, because there the test also runs in apache and can affect the same apc cache.
Adding a sleep(1) in there at the right time. Given that this is the only place ( I think there was a second that didn't fail this time. We also need to make sure to run this a few times, so that we don't have random fails), I think that's OK. We're calling sleep() in other tests as well. And it's not something that will happen in non-test scenarios I think.
Comment #39
catchIf mtime is tricky we could look at a hash of file_get_contents() maybe.
Comment #40
chx commentedCrazy people would extend the public stream wrapper so that the url_stat of it would return a counter of stream_write calls and use this file system for the test. But we aren't crazy, are we.
Comment #41
berdirWell, realname() doesn't support stream wrappers, so no, I don't think we can do something like that :)
Comment #42
catchBumping this to critical since it blocks any meaningful profiling of cold cache performance and it definitely won't be the last thing to look at.
Comment #43
dawehnerDo we really have to take care of the delete example? For things like a simpletest process you actually don't want to remove the entries between runs. Should we maybe just reset the static cache?
What about other usecases:
library.ymland the containerYamlFileLoader.php,Comment #44
berdirYes, we do need delete(), for config, mostly.
We don't delete entries at the end of test runs, we only do it when the file we are working with is actually deleted.
Not sure what you are trying to say here :)
Comment #45
dawehnerAh gotcha.
Ups. I was trying to express/ask whether we should cache the
.library.ymland.services.ymlfiles as well.Maybe especially for the container rebuild we could safe a bit, but it turned out, I was just too stupid to fine it. So what about
the library case, see
\Drupal\Core\Asset\LibraryDiscoveryParserComment #46
berdirYeah, those two examples are already token care of. Sure, can add it to LibraryDiscoveryParser as well. One place that does not yet use it unfortunately is the annotation parser. I tried to do it in an earlier patch and had *very* nasty issues with object references. There are tests that set flags to dynamically alter entity type objects, and then those alters survived rebuilds.
Comment #47
larowlanIf #1920902: Add a Drupal Yaml wrapper so we can default to PECL Yaml component if it is available was in would we still need this? Still waiting on a testbot.
Comment #48
berdir@larowlan: The yaml extension is a much faster, but there's no comparison with an apc_fetch() :) Also, this is not limited to Yaml, we could also use it for the annotation parsing for example, if we can make it work.
We can cache arbitrary data based on files. It might be possible to cache service definition objects for example, instead of just the parsed yaml structure.
Comment #49
catchYes some sites may have PECL but not APC, but some sites may have APC and not PECL. Also even with both, as Berdir points out APC will be faster for warm caches and PECL will be faster for cold. So both issues seem complementary to me.
Comment #50
yched commentedThen bumping on my comment in #24 :-)
Comment #51
berdirMaybe, I'm not sure if it will ever be a thing to parse the same file in two different ways? Do you have any specific use cases in mind?
Anyway, I'm not opposed to that, how do you imagine that to work?
$cache = new ApcuFileCache('routing');
$cache->get($filename)?
?
Comment #52
yched commentedWell, it feels weird to have an API that implicitly assumes that for any file there's only one "kind of info" that anyone might ever need to extract and cache.
Yep, I guess just adding a string parameter in the construct would do the job ?
Could be named $bin (as in the "regular" Cache API) - but maybe we precisely want to avoid that since it's an entirely different API ?
or $context (as in t()) ?
Comment #53
berdirOk, converted to that pattern.
Comment #55
znerol commentedMaybe just touch() the file with a modification date in the past instead of sleeping here.
Comment #56
berdirI can try that. I suspect the failing test is another one of those that fails sometimes, based on the exact time scenario. We have a lot of bored bots right now,I will post this patch a few times in a test issue and see if I can make it fail in other tests as well. Don't want to introduce a dozen random fails :)
Also, I will open a follow-up to look into caching parsed annotations, as that seems like a tougher problem than it looked like at first. I imagine that could produce another pretty neat speed up.
Comment #57
berdirOk, fixed that fail.
I thought about touch(), but that would require me to find out the name of the file.
Also posting 10 patches to #2082049: Ignore me: Berdir vs. Simpletest, to see if there are random fails. Did the same with the previous patch, there it was just the same one as here, plus some random fails in UserAdminTest, but that was a HEAD random fail.
Comment #59
Anonymous (not verified) commentedplease, no commit credit, thanks.
Comment #60
alexpottComment #61
catchDiscussed this on the maintainer call and agreed it should stay critical since it more or less blocks further work on #2289201: [Meta] Make drupal install and run within reasonable php memory limits so we can reset the memory requirements to lower levels.
Comment #62
berdirI think the patch is complete, the issue summary needs to be updated.
There are also at least two follow-ups to create: 1. Use this for the annotation parsed, proved to be harder than other cases, due to object references I think. 2: Maybe we can optimize services.yml parsing further by caching built definition objects instead of just the yaml, similar to other cases. Would need some refactoring in that class, though.
I have been using this patch since I posted the first version in my project, and as far as I can see, it resulted in zero problems and makes a *huge* difference for the install time, exponentially increasing with the number of modules.
With my install profile that installs ~110 modules, the install time difference is ~285s vs ~200s, that's a ~30% improvement. That improvement is not free, there is a 6% increase in memory usage (168M -> 179M) and 3.7% for the peak memory usage (195M -> 202M, again no idea what the difference is exactly in this case). IMHO, that is well worth it.
It would also be great if someone else could try to verify my numbers, and also with the web installer, possibly with blackfire.io?
Comment #63
fabianx commentedI spoke with catch:
This are my current concerns:
1. It feels wrong to hardcode this to ACPU without any interface, PHPStorage would work as well, I think it is better to abstract the cache storage out itself and have this use a kind of proxy pattern.
Especially this feels problematic, because it hardcodes us to one implementation.
2. I would propose to use something like $settings['yaml_filestorage_cache_class'] that can be used to configure this as settings are available very early before the container is booted / build.
3. yched's point seems valid to me, though I don't know yet how to integrate the prefix properly, will investigate.
I would like to take the remainder of this over - if I may. Berdir, could you please confirm that this would be okay for me to do or if you continue to work on this. Thanks.
Comment #64
berdirI'm OK with making this pluggable, although I'm not sure how much a phpstorage backed solution would help, as we still doing a lot of IO, i guess that would only perform in combination with an opcache.
As mentioned before, this is very low-level, as you said, the only way to make it pluggable would be through $settings. Maybe we could just change the hardcoded new $class with a SomeClass::factory(), that would make it easier to add pluggability in a follow-up issue? I'd prefer that this doesn't get delayed with implementing and testing different implementations.
I'm not worried about the WebTestBase change, we have various hardcoded assumptions there.
Comment #65
webchickSorry, just doing some house-cleaning. Per #61, this is a blocker to other critical issues getting resolved so adding the tag.
Comment #66
fabianx commentedThat is fair.
I will do that:
- Create a SomeClass::factory() then its pseudo-pluggable
The WebTestBase would also use the SomeClass::factory().
I plan on working on this tomorrow and Sunday to get this issue to MVP to get in.
(This is part of my accelerate grant.)
Thanks,
Fabian
Comment #67
fabianx commentedApologize for the delay, I got distracted by the also-very-important cache contexts discussions in the 'warm' cache performance realm.
Here is the new plan that I am in the process of finishing to implement:
- Create FileStorageCacheFactory with ::create()
- FileStorageCacheFactory::create() checks settings (optional), but if not set, auto-detects apc_fetch and uses APCCacheBackend, else falls back to PHP storage CacheBackend
- Have FileStorageCache take a generic 'CacheBackendInterface'
- Move all hardcoded apc_fetch / apc_store to $this->cache->get(), $this->cache->set(), $this->cache->delete()
This solves nicely all concerns and ensures non-APC users are profiting, too. It also makes it more generic.
Comment #68
berdirThis needs to be in Drupal\Component, the cache is Drupal\Core.
Comment #69
fabianx commentedUploading the new patch:
- Introduced generic FileCache class doing static caching, added protected cacheFetch, cacheStore, cacheDelete functions
- Introduced FileCacheFactory that is informed by the Kernel, which class to use. That solves the Core vs. Component dilemma nicely.
- Renamed all new ApcuFileCache(..) -> FileCacheFactory::get(...)
- Added PhpStorageFileCache and had ApcuFileCache extend FileCache instead.
Ensured that only if APCu is enabled, the ApcuFileCache is used. For normal mixed APC, user fragmentation is a big problem and will lead to worse performance.
APCu added the apcu_enabled() function for this case and we should be using it throughout core for auto detection instead of checking apc_fetch.
I think whats in DrupalKernel right now belongs in Settings, but besides that I am pretty happy with the implementation as we don't leave users not having PHP 5.5 / APCu / opcache out.
Comment #70
berdirWill review, as said above, I'm not convinced that adding the phpstorage backed implementation here is a good idea. It will need profiling. We're talking about *a lot* of files, are you sure that phpstorage is really worth for caching the .info files, and so on? I think cache writes to phpstorage are expensive, likely more expensive than parsing yml files.
And if you don't have apc, then loading from phpstorage is also not that fast.
I'm happy to have an implementation that shows that it works, but I'm still -1 to adding that here and not in a follow-up, where we can profile that it is actually worth doing.
Comment #71
alexpottThis issue will potentially save a tonne of schema file parsing #2432791: Skip Config::save schema validation of config data for trusted data
Comment #72
fabianx commentedIndeed, I currently ponder on the static cache as it makes memory way worse and that affects the installer (that I am trying to fix) big time.
Comment #73
berdirI know, but without that, it won't help much on e.g. drush installation, unless you enable apc on cli (which would then just be a glorified static cache as well, and it would use memory too :))
why not let the testbot run this, btw?
Comment #74
fabianx commentedI played around with all use cases the FileCache will ever have to be used for and used a configuration array similar to PhpStorage now.
I shortly played with multiple cache backends, but now decided for a ChainedFileCacheBackend instead.
This means you can do e.g.:
where the file_cache_ro would in this case be a pre-generated cache item.
I played also around with a FileCacheCollector class, but that is a follow-up as the locking and end-of-request writing is more tricky than one thinks (and probably not even needed).
But I found it important to allow easily in the future to do:
$this->fileCache = FileCacheFactory::get('extension_discovery', [ 'class' => '\Drupal\Component\Cache\FileCacheCollector' ]);
which makes the module info be just one cache fetch.
Neither PhpStorageFileCacheBackend nor ChainedFileCacheBackend need to be discussed here, but they have been included to show that indeed all use-cases are covered.
This can now also nicely be used to generate caches and pre-seed the system cache/ for cold caches.
Pre-seeded cache/ saves 7-16 s (of 75 s) during installation for me.
Comment #76
fabianx commentedSilly missing $configuration = [];
Comment #78
fabianx commentedComment #79
fabianx commentedComment #81
fabianx commentedComment #82
fabianx commented14.2 minutes for testbot, memory requirements have been good in my tests and with the last patch the InstallerKernel _could_ for non-testbot cases exchange FileCache with NullFilleCache, which could still cache in APC, but not in static memory.
Task list remaining here is:
- Figure out how to detect APC on testbot, but APCu on normal machines (user apc cache leads to very bad fragmentation in APC (not APCu) itself due to the double nature of the cache). For now can leave the general used version of function_exists('apc_fetch'). [Followup: @todo-insert-link]
- Review the patch / general architecture and then strip out all non-APCu for now, create follow-up for PhpStorage, which I think is still worth it from my benchmarks.
[Followup: @todo-insert-link]
- Create follow-up for static cache building [Follow-up: @todo-insert-link]
- Create follow-up for caching annotations in FileCache, too [Follow-up: @todo-insert-link]
- Create follow-up for possibly caching loaded classes in a FileCacheCollector - that gets simple to do now.
This issue now needs reviews and once we are happy with the big picture I will get the other parts stripped and moved to the follow-ups (to be created).
Comment #83
dawehnerJust some drive by review ...
Did we considered to write them later, using terminate for example? This would also make the fetch() code easier to grok
Nitpick: What about name them
$remaining_cids?Mh, this will be potential problematic in conjunction with #2304461: KernelTestBaseTNG™. In other words, aren't the filepaths coming in really limited in its possible values? (trying to be the devil advocate here)
Should we support a fluent interface? I can't think of many usecases where you want use multiple in a row.
Comment #84
fabianx commented1. Possible, but ChainedFileCacheBackend is only there as an example for now and writing at end of request has its own problems, but on a cache miss we would have written anyway (without the chained backend would have been a cache miss), so this should be fine for a simple implementation.
2. Will rename, sounds good.
3. Follow-up, I think we can resolve stream-wrappers to files, but currently files coming in are resolved paths.
4. This is the APCu interface, which works well enough for all our use cases - its also kinda what ircmaxwell proposed in his PSR-6 'cache interfaces need to be simple' rant.
Comment #85
larowlan@ should be inside the {}
nitpick: needs blank line before the final }
>80
wrong class name
optional goes on second line
https://www.drupal.org/node/1354#param
I think 'later cache' is better than 'cache later'
whitespace
needs namespace
an array
Are we doing this here? We could make $cid keyed by $cid with values of $realpath for simplicity
any reason why we're mixing statics with an instance? can't we do $this->cached?
needs a blank line before the }
missing public/protected scope modifier, given there is a static setConfiguration, this would be protected yeah?
optional should be on newline here too
these @ should be inside the {} too
we can't non-statically assign a property in a static method? shouldn't this fail?
do we have an issue for this?
Comment #86
dawehnerWell, otherwise we can't cache entries during multiple container rebuilds, like in a CLI installation / test run.
Comment #87
larowlanthanks, makes sense
Comment #88
larowlanWe should add a comment to that effect
Comment #89
wim leersWOW!
Comment #90
fabianx commented#85: Thanks for the review. All fixed.
- Still need to create issues, but d.org was down yesterday when I wanted to do that.
The interdiff should address all points.
Comment #91
larowlanIs this change unintended?
Comment #92
fabianx commented#91 Yes and no, this is for the follow-ups to be created, but yes that @todo chunk could be ignored for now.
Comment #93
dashaforbes commentedRemoved commented out code and added issue for todo.
#2447753: APCu detection erroneously succeeds if apc.enabled is "0"
Comment #94
larowlanThis needs to be split over two lines (sorry)
Comment #95
dashaforbes commentedComment #96
larowlanComment #97
kim.pepperCreated followups for #82
Comment #98
Anonymous (not verified) commented+1 to commit.
Comment #99
neclimdulFrom the summary, this seems like a very big solution for a very small problem. This may just be a problem with the summary though.
Does this have any impact on normal site operations?
Can we get an update of the summary with the performance impact of the patch? Reading the comments it wasn't clear to me though comments on an earlier patch did mention a memory time tradeoffs. Is that still relevant?
The file cache code is a neat idea though. Seems like introducing a system like this should have its own issue and own tests. Especially in Beta.
Sleeping in unit tests? *sigh* Is there a better way like maybe forcing an empty cache backend in the test?
Comment #100
fabianx commented#99:
1. Yes, will update the IS.
2. Yes, benchmarks are very favorable for cold cache performance, e.g. a 18% performance improvement for the installation with pre-seeded cache.
- Testbot is consistently near 14 min 20 secs. (from before 15 40 or so).
3. Yes, there is an impact when clearing caches, especially for everything container build related or discovery related.
4. There is a memory trade off for the installer, but there is a follow-up that I need to update (its titled wrong) to use a FileCacheNoStatic for the installer to help with memory requirements.
5. Added 'Needs tests' label, I assume unit tests are enough.
6. This is likely no longer needed, depending on if Unit tests go via the DrupalKernel or not. With the FileCache being static and the conifguration being injected by the Kernel - if its not initialized it will use just a static cache, so should probably no longer be triggered in unit tests.
Comment #101
berdirI'm not sure where you see anything about unit test, that test explicitly says that it is a UI test? I don't see another solution there, the test explicitly creates and changes the same file in the same second *and* does so in CLI and then uses it through apache, those two apcu caches are not and can not be in sync. That's a test-only scenario, we have a few similar existing cases elsewhere. There's nothing that can be reset, because the test runner has no way to access apcu in apache.
And yes, this is a big performance improvement for normal sites too, during router rebuilds and cache clears for example. So it will be faster for developers and your live site is faster available again when you do things like installing modules/running updates/clearing caches in general.
Comment #102
yesct commenteddo we need a change record also?
Comment #103
berdir@YesCT: I don't think so, nobody will have to change their code, it's a new feature they could use, but that is relevant to only very few people/modules, those that do their own custom parsing/processing based on data in files.
@Fabianx: Would be great to get that issue summary update to get this in, I can try to help and work on specific parts.
Comment #104
fabianx commented.
Comment #105
fabianx commentedUpdated the issue summary and here is the test coverage.
- 100% for anything not ignored.
Comment #106
fabianx commentedComment #108
fabianx commentedPlease disregard 105, this is a re-post with fixed phpunit.xml.dist:
Updated the issue summary and here is the test coverage.
- 100% for anything not ignored.
Comment #110
fabianx commentedThis needs to be $filename.
Will fix ...
Comment #111
fabianx commented- Fixed test failures by renaming from 42.php to 42.php.txt for the PhpStorage Mock Fixture.
- Fixed $filename
Comment #112
fabianx commentedComment #113
berdirDidn't really review yet, need to look at the complete patch again, just noticed this:
We've never added comments like this so far, and it violates coding standards...
Comment #114
fabianx commentedRemoved all @codeCoverageIgnore annotations for now. This is not the place to discuss this.
EDIT: Created #2462037: [policy, no patch] Discuss usage of @codeCoverageIgnore to discuss it.
Comment #116
fabianx commentedRe-rolled
Comment #117
fabianx commentedComment #118
yesct commentednoting what this is blocking in the issue summary.
Comment #119
berdirLooked through the patch a first time since you started refactoring it, looks pretty good.
What's confusing a bit is the naming (and I'm fully aware that a lot of that is the same since my patch, but a fresh look at something helps to rethink sometimes).. the namespace is Cache, but everything inside is about FileCache. On the other hand, we have the FileCacheBackendInterface, but the implementations of that really have nothing to do with File at all.
Suggestion: Rename Component\Cache to Component\FileCache. Then we avoid possible confusions by having two Cache components, and FileCache is a coherent thing that other projects might be in interested in as well, if we ever manage to split our components.
I think that also solves the naming problem with FileCacheBackendInterface, it's just the cache backend of the FileCache component then, that's fine.
Thoughts?
Will do a code review ASAP, noticed a few things...
Comment #120
dawehner+1 for thew new naming.
Comment #121
fabianx commented+1 to the new naming as well, that also had been my thoughts, but I did not wanted to re-factor the original more than needed.
Thanks!
Comment #122
neclimdulI really appreciate the effort you guys are putting into this. This is a really hard problem, some really smart code and I don't want the rest of this review to sound like I'm discounting any of that effort because I'm not.
The my first concern is my gut is telling me we're really not addressing the root problem here. Or not clearly explaining what the room problem is because I don't understand the specific problem or problematic code we're addressing.
Related to that, there is a thread through this issue that this issue is related to decreasing memory. We've marked it as blocking the decrease memory issue. But matching the benchmarks done this far locally drush si generally increasing the max memory usage several k. I dropped some hacky max_mem calls into the installer and index.php and ran through a couple times and the values where a lot more erratic but show a similar trend. Enough that if that's a metric for this issue we should clear it up.
Additionally, the summary talks about "standard install we call FileStorage::read 1772 times but we only have 1244 yml files in all of core". The DefaultPluginManager already caches all our parsed plugins so either those calls where purposeful cache rebuilds or there is a bug in the plugin manager caching code.
One of my tests I did was lopping over drush si. In that test, the wall time for each install decreased and average of 8 seconds with this patch. That's actually really impressive and was actually decreased by 20%. Poking around though just re-using the symfony parser instead of instantiating a new one every time in Yaml::decode() I shaved 3 seconds off the same loop or 6.5% off install times. Pecl Yaml without this patch decreases the same test 15 seconds or almost 38%.
Aside from that, the architecture makes me uncomfortable. We're adding a lot of complexity with an entirely new caching system.
Comment #123
berdirThanks for writing down your concern, that helps me to write a write a useful issue summary. And no offense taken :)
Here are some notes to start addressing your questions and concerns.
* The main problem that this is addressing is performance on cold caches and cache rebuilds.
* There is nothing wrong with the caching in DefaultPluginManager, but we invalidate those caches quite a lot. But 99% of the files that we are then parsing again haven't actually changed.
* The most obvious place is the installer. After every module, we invalidate all plugin manager caches and rebuild the container, because almost all modules add a few plugins, services, routes, links, .... And don't underestimate the impact of a faster installer. I've been working on a large D8 install profile for a year now with multiple other developers. We usually re-install multiple times a day, sometimes multiple times in a row, to get something working. That's a lot of wasted time, because you really can't do much while an installer is running. With this patch, the installation takes 160s currently. Without it, it's 230s.
* However, it also affects running sites. Quite a few administrative operations require router rebuilds for example, like saving views. It's pretty common on big D7 sites that saving a view results blocks the site for a minute.
* It also helps to mitigate existing performance issues. I've noticed in new relic that system_rebuild_module_data() (reparsing all .info.yml files) runs on user/people, taking 5% of the execution time there. When I did some profiling on a vanilla D8 site, it was actually 50%. That also has other factors, like much slower queries due to a few 10k users, but still.
* There is no way to flush caches because it's not needed. The FileCache looks at the file modification time, when that changes then we automatically invalidate our cache. I've included this patch in our installation profile since december and we're actively developing, changing routes, services and info files and we never had a single problem with it.
As you can see in the created follow-ups, there are multiple ideas for further improving the caching and using it in more places, like:
* Ship with pre-generated caches, so that not even the initial parsing is required for all the yaml files that core provides (that will never change unless you update core)
* Use it for annotation parsing
* Improve YamlFileLoader so that we can cache built service definition objects and not just the raw data.
Comment #124
fabianx commentedThanks berdir, even now static pre-generated caches can already be used to speed up installation even more (provided you have already installed once).
This works like this (could also be stored in /tmp or somewhere) using a persistent backing store, but APCu as the active backend.
Note, I also used this already successfully with read-only storage, e.g.:
If its outdated, it will just not be used anymore - thanks to the mtime :).
--
I also created two missing follow-ups:
Comment #125
neclimdulThanks,
That sums up almost all cache rebuilds. They are expensive by definition.
As a full time developer and someone actively reinstalling d8 multiple times a day I'm definitely not but I have clear priority to site performance. Also, developers can easily use PECL on the other patch if that's our audience and it has a bigger impact.
Can we quantify the effect on D8? I was under the impression cache_rebuild was designed to limit this.
That sounds like its own problem. 1) why is it running there? 2) Even with this isn't it going to be overly expensive? If its taking 50% of the time and best case we knock of 20% of that isn't that still unacceptable?
That's definitely re-assuring!
Comment #126
fabianx commentedHere are the patches with the rename.
FileCache being its own namespace under Component feels way better and actually I found two bugs, where I had made the confusion myself.
Comment #127
klausiJust for completeness here:
This is not true anymore since we are building new route information in the kernel terminate event, which is AFTER the response was sent to the user. So D8 sites do NOT block on router rebuilds anymore. This was implemented in #356399: Optimize the route rebuilding process to rebuild on write.
Comment #128
klausiBerdir pointed out in IRC that the Kernel terminate event does not actually work on Apache mod_php that way. The response is blocked until PHP is done with everything and exits.
However, if you use FastCGI then kernel termination works as advertised after the response has been sent to the user. The Symfony Response class invokes fastcgi_finish_request() for us, which closes the connection. Yay!
Lesson learned: don't use Apache with mod_php.
Sorry for the slightly unrelated distraction, carry on.
Comment #129
neclimdulOk, thats good to know. Looking at RouterRebuildSubscriber and RouteBuilder it looks like that is only true for the modifying request not the entire site though. Is that correct?
Comment #130
dawehnerExactly, that was the main idea of the optimize route rebuilding issue. At least for the case of views we even plan to rebuild way less often than before, see #941970: Only set router rebuild needed when something related to routing actually changes
Comment #131
berdirOk, I've been testing this a bit, the current patch is more or less as fast as my older versions.
Based on xhprof, #2447803: Use FileCache for caching discovered annotations should save another 12% (20s) or so.
Also, consistent with earlier numbers, without xhprof/xdebug, the installer is twice as fast, but this is still a big improvement 85s instead of 120).
This docblock is confusing, because there is nothing about files or modification dates in this class.
Maybe something as sinmple as "APCu backend for the file cache.", similar for the others?
If this is pluggable then we should have an interface for it.
I don't remember if $identifier was my name for it, but it is confusing. collection or prefix might be better?
Do we really need to allow so many variations? Why not enforce the array structure in the factory? And if you make it a class, why not set 'class'/'configuration' keys? Or actually, just pass in two arguments to the constructor?
Let's add the issue reference here.
Why not set this default up in FileCacheFactory when configuration is empty?
unrelated change.
Looks like this wasn't moved.
Wim is going to be happy ;)
Another comment, I assume this implementation is used for tests? Should we name it so then?
Comment #132
catch@klausi #127/#128 while the router rebuild will block the request that initiates it with apache/mod_php it will still not block any other requests to the site - whereas in Drupal 7 it does (end of request vs. flush and rebuild with a lock). So either way we should be better off with router rebuilds in terms of scaling/lock stampedes, but it will be worse for people triggering them with apache/mod_php (although for single user sites, the person triggering also gets the rebuild anyway next request, so that's less of an issue).
@neclimdul yes we're adding a new cache layer so that we can improve cold cache performance. We'd not be doing this if we'd chosen JSON as our file format, but given we're using YAML for all the things (and it'll be more in 9.x), I'm not sure this can be avoided. The PECL extension is good for people who understand the problem, but not for everyone else.
Comment #133
fabianx commented#132: That is an indirect nice idea :). We could add a JSON file storage, too, which might be faster than PHP Storage.
Comment #134
wim leers:D I indeed approve! MOAR LLAMAS!
#132/#133: hah, PHP to read JSON to make it faster for PHP to read YAML, all to read values PHP should use. Wow.
Comment #135
catchI'd had the indirect idea in #133 directly, although I worry a bit about JSON vs. PHP vs. YAML serialization. That might be a non-issue in practice.
The advantage of JSON over PHP would be that since it's not a PHP file, we'd not have any of the memory overhead that's inherent in loading PHP - especially since these files will (usually?) be opcode cache misses.
Comment #136
webchickTagging.
Comment #137
amateescu commentedRerolled and fixed everything from @Berdir's review in #131. The only thing to mention is that
PhpStorageFileCacheBackendwas not moved because it usesPhpStorageFactorywhich is inDrupal\Core.I also had the idea of doing a single
stat()call instead of onefile_exists()and twofilemtime(). Even iffile_exists()populates the stat cache with everything thatstat()would, we still potentially save a couple thousand stack calls this way.Comment #139
amateescu commentedOk, that doesn't work :(
Comment #141
fgmThe errors happens in
DrupalComponentTest::testNoCoreInComponent()becauseFileCacheFactoryusesDrupal\Core\Site\Settings, but Components may not use Core.Comment #142
amateescu commentedRight, that's why @Fabianx did the initialization in DrupalKernel in the first place. Just talked to @Berdir about it and we have no other choice than to revert that change.
Comment #144
amateescu commentedSilly me :/
Comment #145
berdirurl should be indented by two spaces, to make it clear that it belongs to the @todo.
Everything else in the interdiff's looks fine to me.
Comment #146
amateescu commentedAh, didn't know what was up with those extra two spaces.
Comment #147
fgmJust took the patch through xdebug for a few passes, and the logic seems sound. However, I think there might be a way to improve it on three counts:
DrupalKernel::createFromRequest()as is currently done means the FileCache is instantiated on all code paths, even cached pages, where it will not be used. Initializing it after the reverse proxy step, if possible, could reduce the work on page cache hitsrealpath(), notably inFileCache::getMultiple(), andFileCache::set()(from thefindAll()final loop), and that function can be costly in terms of IO (and waits) if therealpath_cacheis not set to a value appropriate for the amount of work D8 requires from it, which is often the case. For instance, during a "drush cr", if I set my realpath cache size to 1MB, I find that at the end of the flush, 290kB are used, while the PHP default is 16k, meaning thrashing in that overused cacheComment #148
berdirThanks for testing.
(you should use numbered bullet points :))
1. Sounds like a good idea. Having it in the factory would be even nicer because we would only have to initialize it when we actually need it, but we can't..
2. Yep, that's interesting. Is that the only place where we call realpath() or does that already happen somewhere else too for those files? (does the cache size change with and without this issue?) We should probably document this somewhere, maybe even check in the status page, but we could do that in a follow-up?
3. Yep, that's #2422815: Don't initialize the discovery object in plugin managers, unless needed and this issue doesn't really change that.
Comment #149
znerol commentedWe use a modified version of the DI
YamlFileLoader.phpof Symfony. It includes a static cache for parsed yml files. I think it is great if we can improve on that - it does not make that file less maintainable (should be updated anyway: #2375339: Update DependencyInjection YamlFileLoader for Symfony 2.6).Should be
@varWe should use 4 space indents here because that file uses Symfony coding standards.
I do not think that it is worthwhile to hack this into kernel only because of that. Please add a drupal specific factory instead, the rest can remain a component.
If the file path is known, use touch() with a date in the past instead of
sleep().Comment #150
amateescu commentedDiscussed #147.1 with @dawehner, @Berdir and @fgm and the conclusion is that there is no other better place for that code so we'll have to leave it where it is, in
DrupalKernel::createFromRequest().I also looked into #147.2 and, according to this blog post, every operation that involves a file on the disk will result in a PHP
realpath()call (which will be cached in the realpath cache), so this patch is not causing any more trouble in this regard than HEAD.The only thing we could micro-optimize is to not call
filemtime()twice inFileCache::getMultiple().Comment #151
amateescu commentedOh, and we already have a documentation page about this: https://www.drupal.org/node/2219781 which is referenced by https://www.drupal.org/node/2602
Maybe we should update it to say that Drupal 8 needs a value of at least 512K or 1M for the
realpath_cache_sizePHP setting.Comment #153
amateescu commented@znerol, thanks for the review :)
Fixed #149.1 and 2. The third point was discussed a lot yesterday evening and the conclusion was that we cannot have a core-specific factory because it's also used by other components, so we can just move the code to
DrupalKernel::boot()because it makes more sense to have it there than inDrupalKernel::createFromRequest().As for #149.4, that was already discussed in #55-#57 above and apparently @Berdir thinks that using
sleep()there is ok..Interdiff is to #146.
Comment #154
berdirboot() seems like an OK'ish tradeoff to me. It's still equally early, but at least it's out of createFromRequest() as it has nothing to do with the request, and it's still before the container loading/building where we might need it.
Comment #155
dawehner+1 for moving it to boot() away from
prepareFromRequest()Comment #156
znerol commentedOk, I'm less unhappy with the initialization code in
boot()instead ofcreateFromRequest():)Some nitpicks, mainly looked at the tests for the now:
Missing newline before closing brace.
This is still off by two characters.
This is extremely confusing. Once the class name is the array key, once the array value. If this is correct, can we at least have a comment here?
Not sure whether this is intended or not, store it into the outer backend and retrieve it from the inner?
This is strange for a test. Why do we need to update existing configuration - but only sometimes? Why can't we just set it to the configuration we actually want to test?
Comment #157
fgm+1 for placement in
boot()too instead ofprepareFromRequest(), as it does not depend on the request.Comment #158
amateescu commented#156.1, 2 and 3: FIxed.
4) that test method only tests
fetch()so mockingstore()was not needed at all.5) that's because
FileCacheFactoryTest::testGetConfiguration()depends ontestSetConfiguration()and we don't want to lose the configuration that was set there when we go throughsetUp()fortestGetConfiguration().Comment #159
fabianx commentedChiming in here:
- boot() is fine with me. Thanks @all.
#156:
3. I did not want to type so much!!! I am perfectly fine with the new pattern in #158 though. Whatever works best for us! :)
4. Yes this was intended in that way and tests the intended chain loading behavior.
5. I was mainly worried about loosing the default kernels setConfiguration() values, as interfering there I think would lead to bad results.
Alternatively we could also say that we want to run this Test class in an isolated process, then we can set all configuration - including the default, but I wanted to avoid that and don't know how reliable it is ...
Comment #160
amateescu commentedSmall self-review.
Comment #161
berdirOk, I think this is finally done.
I started this originally, but this has been reviewed, improved and tested by many different people, and nobody seems to feel bold enough to RTBC, so I'm going to do it. There isn't much of my original code left.
There were some reservations by @neclimdul a while back, but I hope I could address them a bit. As we noticed above, doctrine actually includes a very similar annotation parser cache that is using files and also relies on the filemtime() of the original and the cache files. We do it a bit differently, but the concept is the same. So I'm not the first person to come up with this idea.
Comment #162
fabianx commentedI hereby RTBC #126-#160, all changes look great to me!
As soon as someone RTBC's the whole patch, we can all enjoy a 1:30 - 1:45 minute faster testbot (on average) :).
X-POST: Yes, RTBC!
Comment #163
alexpottThis is mixed - not an array.
Not used.
Component
Comment #164
fabianx commented#163:
1. Good catch, even though FileCache only uses it as an array() with mtime, data properties, that is an implementation detail and not part of the interface.
2. Correct, good catch
3. Correct, good catch
4. There is a follow-up to discuss to make that the default when APCu is not available
(especially important if we distinguish between APC and APCU extensions - fragmentation for APC with using user cache is still a BIG problem on PHP 5.4 with just APC)
In my benchmarks PHP Storage was still way way way faster even without opcache than Yaml parsing (and with opcache about the same except for unserialize overhead, which is implicit in APCu rather than explicit).
There is also a follow-up to discuss to pre-generate things and load them in chain loaded fashion (e.g. read-only archive as 2nd level cache).
5. To be able to support a read-only archive in the future we have that, where reading imported data still makes a lot of sense.
Comment #165
amateescu commentedFixing #163.1-3. 4) and 5) need feedback from @Fabianx and/or @Berdir :/
4) is basically asking why are we introducing the PhpStorage file cache backend in this patch since we're not using it (yet).
5) is saying that we're putting a module's config install yaml files in apcu but we're only using that apcu cache when we install the module, which is a very rare operation in real-world cases, only the testbot would actually benefit from it.
Comment #166
berdirneeds review I think for the patch?
5) I think we're also using FileStorage when doing a config sync, where usually 95% of the files didn't change, so I think that's also useful there? We're not explicitly adding wrappers elsewhere, just happen to have them already, I think?
Comment #167
alexpott5) yes but if we static cache the reads - as another patch does then this is moot - plus putting files in a modules config/install or /optional folder seems unnecessary on 99.9% of sites this files are read once.
Comment #168
amateescu commentedAfter further discussion in IRC with @Fabianx, @alexpott and @catch we agreed to leave the PhpStorage backend and the config part to followups.
Comment #169
fabianx commentedThese files should be removed as well.
@amateescu: Can you create the follow-up issues for the removed code?
Comment #171
amateescu commentedUgh, the interdiff was good but I didn't upload the correct patch :/ Also removed 42.php.txt this time.
Re-purposed #2447797: Add a PHPStorage based file caching for non-APCU enabled sites to add back the storage and opened #2473179: Use FileCache for config storage for the config part.
Comment #172
fabianx commentedBack to RTBC, correct files have been removed now and moved to follow-up.
Comment #173
alexpottWhat about chroot? Is there an explicit reason why we're not using the salt?
Comment #174
berdirThis is in Drupal\Component, Settings and the concept of the salt is in Drupal\Core
Comment #175
alexpottActually @pwolanin mentioned that if we don't use a salt and an environment does have a shared APCu then a malicious user could poison a cache based on guessing filepaths.
Comment #176
fabianx commentedWe can set the prefix in the Factory via setConfiguration(). That should pose no problem.
Comment #177
berdirAs discussed, added a configurable prefix for FileCache instead of the hardcoded one.
Comment #179
berdirSo that broke unit tests :)
Enforcing that they use a null implementation, also had to make a small test to the FileCacheFactory test, because this unsets the value between the test runs, but that is IMHO a very weird way to test this anyway...
Comment #180
berdirSwitched to a separate method.
Comment #181
fabianx commentedThe separate method feels way better.
Thanks so much!
Back to RTBC!
Comment #182
effulgentsia commentedOverall, I like what's being done in this patch a lot. At first I couldn't tell from reading the issue summary why introduce FileCache as a separate thing than our normal cache system, and how adding another cache helps with our "cold cache" performance and memory usage, but I think the key innovation here is that FileCache never gets cleared (unlike normal caches which get cleared on every module enable) so long as the underlying file remains the same. Which is very cool.
I was tempted to commit in order to unblock issues that are blocked on this, but I found too many nits during review. Here they are, but I also have no objections if someone else wants to commit despite them, so leaving at RTBC.
What does "note the reversed order" mean?
This class is not used for anything real in this patch. Should we remove it (and its unit tests) from here and defer adding it until whatever follow-up makes use of it?
I don't understand what "backed by the same cache" means. Was this meant to say "same file" instead?
s/Fetch/Fetches/
s/Delete/Deletes/
s/File Cache/FileCache/
What's a FileCacheCollector?
Could be a follow-up, but I think this doc could benefit from some expansion explaining what "file data" means: i.e., that it can be the literal file contents, or some subset of that, possibly parsed, possibly processed in some way, but that the processing should not vary by conditions that can change during normal application operation (e.g., the data processing shouldn't include invoking event listeners or module hooks) since the cache doesn't get cleared by anything other than a change to the file itself.
What does "name of the cache" mean? Same question for the other methods in this interface that this was copy/pasted to.
s/Store/Stores/
s/Delete/Deletes/
Unused use statement.
Is it ok to use the hash of the global salt on its own? In other places, we usually hash the salt combined with some other string (i.e., in this case, maybe with 'FileCacheFactory'?), so that if the hash is leaked from one place, it can't be used to compromise something else.
Doesn't look to me like this patch uses FileCache for config files. So, which files is this referring to that can change during this test?
Same for here, but also, why 2 seconds here?
Comment #183
berdirThanks for the review.
1. I think I meant that $file and provider is switched there, there might be a better way to describe it or it might no be worth a comment at all.
9. Should be name of the file I think.
13. We are doing basically the same for the new default apc classloader in DrupalKernel::createFromRequest(), so I think it's ok. I did just notice that we have a Settings::getHashSalt() that checks if it's not empty, we could use that.
14. & 15. Yes, those are about config files, we can remove those hacks from the patch, nice catch.
Comment #184
berdirWorking on this
Comment #185
berdir1. Improved comment.
2. Removed.
3. Yeah, I think this comment is more confusing than not having one. It's also wrong, it has nothing to do with simpletest, those almost always run in separate processes anyway. I've just removed it, wasn't able to come up with something that is useful instead. It's static so that multiple calls to the same collection will access the same cache, otherwise it would not help during multiple cache clears.
7. I think "Collector" is a left-over.
8. Added some documentation.
13. As mentioned above, same as the classloader.
14. & 15. Removed.
Comment #186
amateescu commentedThe interdiff looks good to me, great review @effulgentsia :)
Comment #187
alexpottThis issue addresses a critical bug and is allowed per https://www.drupal.org/core/beta-changes. Committed 4192117 and pushed to 8.0.x. Thanks!
Comment #189
fabianx commentedThanks for the commit!
Unfortunately effulgentsia is right and we should indeed use:
instead of just hashing the hash salt. The APC class loader uses the 'drupal.' prefix for that.
There are some other things that I would have changed differently, but I will create follow-ups for those.
Comment #190
amateescu commentedThat sounds easy enough to do here with a small followup patch.
Comment #191
znerol commented+++ b/core/lib/Drupal/Core/DrupalKernel.php
@@ -431,7 +431,7 @@ public function boot() {
- FileCacheFactory::setPrefix(hash('sha256', Settings::get('hash_salt')));
+ FileCacheFactory::setPrefix(hash('sha256', 'FileCache' . Settings::get('hash_salt')));
I think @Fabianx meant
hash_hmac. Otherwise that change does not make sense.Comment #192
znerol commentedOh, I did misread the dot for a comma. But still
hash_hmacis the way to go when usinghash_salt.Comment #193
alexpott@znerol well...
from core/lib/Drupal/Core/DrupalKernel.php
imo we should proceed here and open a follow to change both of these to use
hash_hmac()Comment #194
znerol commentedFair enough.
Comment #195
alexpottCommitted bfcb41b and pushed to 8.0.x. Thanks!
Comment #197
znerol commentedFiled #2474043: hash_salt should be used in combination with hash_hmac() instead of hash()
Comment #198
znerol commentedThe test fails over in #2474043-1: hash_salt should be used in combination with hash_hmac() instead of hash() indicate that we currently setup the
FileCacheFactoryeven if thehash_saltsetting is empty. That will result in a predictableprefix.Comment #199
znerol commentedUse the same condition as the class loader bit in
DrupalKernel::createFromRequest().Comment #201
alexpottCan we open followups rather than re-open this issue?
Comment #202
znerol commentedFiled #2474077: FileCacheFactory::setPrefix() is called even when no hash_salt is present on the site
Comment #203
alexpottReverting this until we get testbot stability - see #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects
APC stats from a bot are attached to this in an image... we're at 100% fragmentation and inserting into the user cache at the rate of 5512.76 cache requests/second. Potentially we'll have to revert the classloader change too - but we didn't see these types of fails after committing that patch.
Perhaps we need to go forward with #2447753: APCu detection erroneously succeeds if apc.enabled is "0"
I'm sorry @Berdir :(
Comment #205
alexpottAPC screenshot
Comment #206
mpdonadioPer http://php.net/manual/en/function.apc-store.php, this can return FALSE when the $data can't be stored in the cache. This may be the source of the problem.
Comment #207
berdirI don't think it's the set() specifically, nothing should break if that fails, the problem is that due to the very recently introduced hash prefix for both the classloader *and* this, we have a huge amount of different caches, every test environment has it's own caches.
That fills up the cache, things constantly evicted from there and have to be written again, and most importantly, the shared memory becomes completely fragmented and very slow, to the point where the performance with apc enabled is much, much worse than when it's not.
Planned next steps:
* Make the prefix properly configurable again
* Set the prefix in the test environment to a stable value, so it is always the same, for all tests. Possibly both for the classloader and the file cache.
* Consider restarting apache after/before a completed test run, to drop out any cached information between test runs.
Comment #208
wim leersAlmost exactly a year ago, @rfay said this over at #1809248-27: Enable apc.enable_cli php.ini setting on testbots:
… so … did this change?
Comment #209
alexpottPostponing this on #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects. That patch will gives up the ability for the entire testsuite to use the same APC Filecache entries. This will massively reduce writes and make cache hit rates fantastic. Also it will reduce the thrashing on the APC cache.
Once that lands we'll need to reroll this patch and include its followups and change the code to use the new method to get the APC prefix. Once we've got the patch we'll then need to check the APC usage after a testbot run to confirm that we're not maxing the 300mb available out. And ensure that we're not too fragmented.
Comment #210
alexpottRerolled to use #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects. Also of interest is #2480541: Bump APC settings
Before #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects we were filling APC caches about 70 times per test suite run. With this patch alone that went up to 80 - with #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects and this patch it is 32 times. So hopefully committing this will not introduce random fails.
Comment #212
fabianx commentedInterdiff looks great, RTBC if tests still pass.
Comment #213
alexpottSo the patch in #210 reached a user cache full 37 times during the test run. Which is way less than was occurring before #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects landed.
Comment #214
webchickGiven that this patch caused testbot issues last time, alexpott is now offline for the day, and Mixologic is about to go home, I propose we wait until tomorrow morning Pacific time to commit this.
Comment #215
fabianx commentedAgree, lets have alexpott commit this.
Comment #216
berdirI would expect that this brings back the performance improvements that we've seen before we introduced the dynamic prefix, because that made it ~1m faster.
Not seeing that right now, testbot runtime for this was only a few seconds faster than it was for HEAD (both around 27min, did we switch to slower servers?)
Comment #217
wim leersYes :(
Comment #220
fabianx commentedThe performance improvement already was mostly gone after #2395143-177: YAML parsing is very slow, cache it with FileCache / #177 as testbot hugely did profit from the config storage caching. Maybe we can exchange the service with a caching one only for test runs as part of #2473179: Use FileCache for config storage?
Comment #221
alexpottScreenshots of apc.php...
Before
After
So APC is not blown by committing this anymore due to #2480541: Bump APC settings and #2474909: Allow Simpletest to use the same APC user cache prefix so that tests can share the classmap and other cache objects - see #2480541-5: Bump APC settings for what this looked liked before upping the APC memory :)
I ponder too about the performance improvement of this patch. Testbot times with the patch are 27 minutes and its 30 minutes without the patch - but can be within normal testbot variation (although I would say the 3 minutes is the extreme end of the variation).
I think I've only worked around the fringes of this patch and therefore I'm in a good place to commit it. Committed e361599 and pushed to 8.0.x. Thanks!