Suggested commit message: Issue #2363351 by chx, larowlan: CMI (or anything cached) can'\''t be used from a Service Provider.
Problem/Motivation
The cache_factory service needs %cache_default_bin_backends. This parameter is added by ListCacheBinsPass. This runs after service providers which, as a corollary can't use CMI. LanguageServiceProvider is an example which tries to use CMI and abuses the bootstrap config storage to do so because it can't use the normal service. There's a @todo to replace it with config.storage. This issue is about resolving that @todo.
Another example that if you would like to try to access a module's path in a service provider you get a You have requested a non-existent parameter "cache_default_bin_backends". :
final class MY_MODULE_ServiceProvider implements ServiceProviderInterface {
/**
* {@inheritdoc}
*/
public function register(ContainerBuilder $container): void {
/** @var \Drupal\Core\Extension\ModuleHandlerInterface $module_handler */
$module_handler = $container->get('module_handler');
try {
$path = $module_handler->getModule(MY_MODULE)->getPath();
} catch (\Exception $e) {
throw new LogicException('Unable to identify installation path of this module.');
}
}
Proposed resolution
Add cache_default_bin_backends: null by default and if NULL return NullBackend unconditionally from the CacheBackend.
Remaining tasks
User interface changes
API changes
| Comment | File | Size | Author |
|---|---|---|---|
| #87 | drupal-core-init-cache_default_bin_backends-2363351-87.patch | 3.46 KB | berdir |
| #61 | drupal-core-init-cache_default_bin_backends-2363351-test-only-poc.patch | 2.89 KB | mxr576 |
Issue fork drupal-2363351
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #1
dawehnerWe could just provide an empty
cache_default_bin_backendsinside our core.services.yml file?Comment #2
chx commentedEasy to fix.
Comment #3
fabianx commentedMakes sense to me.
Comment #4
alexpottWhat would happen if a cache service defines a default backend wouldn't this mean that potentially we switch backend during a request?
Comment #5
chx commentedYeah well but what else do you plan to ... do? Cos that thing is a compile pass running waaaaay later than the service providers and alters.
Comment #6
tim.plunkettComment #7
chx commentedNote that in particular the problem with CMI is circumvented by LanguageServiceProvider by using (abusing?) BootstrapConfigStorageFactory::get.
Also, alexpott is totally right. This is not a good approach. Quite probably the solution is to NullBackend it. Proposal in the issue summary.
Comment #8
berdirRelated, https://github.com/md-systems/file_entity/blob/8.x-2.x/src/FileEntitySer... is my workaround for checking if a module is enabled in a ServiceProvider.
Comment #10
chx commentedSo how do we proceed now?
Comment #11
chx commentedWe will add a comment and go with this patch.
Comment #12
jhedstromPatch no longer applies.
Comment #13
chx commentedComment #14
chx commentedComment #15
chx commentedComment.
Comment #16
jhedstromThis fix makes sense to me. I looked briefly at writing a test for this, but couldn't find really any existing service container level tests.
Comment #17
chx commentedComment #20
chx commentedComment #23
chx commentedComment #26
chx commentedTime for reviews! The remaining few fails, as far as I can see are just test failures and not actual functionality regressions, of course I will fix them but the changes will be test only changes. So , the architecture is what it is and time for reviews.
Comment #27
chx commentedComment #28
chx commentedComment #29
chx commentedOps.
Comment #30
chx commentedSuggesting crediting larowlan who pointed out where the kernel service gets defined and thus solved the last failures.
Comment #31
fabianx commentedArchitecture wise this makes sense to me.
Comment #32
catchThe only question I have - if there was a cache set or delete during the compiler pass, then it'll do that on the null backend and be a no-op. In most cases container rebuild = cache clear anyway, so it won't make any difference. Also doing deletes there would be strange.
We might just need some docs why this doesn't matter though.
Comment #33
alan evans commentedSo, I realise my opinion might not have great weight on this issue, but this patch makes sense to me ... I've been using it partly as an excuse to learn some of the moving parts involved. I have one question below, which I think largely stems from a lack of understanding on my part:
Manual testing done:
So here's the one question: I'm sure this is just me missing background on this but what's our guarantee that it's ok to remove all non-synthetic services in the compile method here? Are there no non-synthetic services that are valid at that point? Is it the case that not being synthetic, they'd get automatically reinstantiated later as needed?
I put in some debugging here too and discovered really quite a lot of definitions that are not flagged as synthetic... seems to make sense.
Comment #34
chx commented> So here's the one question: I'm sure this is just me missing background on this but what's our guarantee that it's ok to remove all non-synthetic services in the compile method here?
Yes it's OK, see a bit later why.
> Are there no non-synthetic services that are valid at that point?
Who knows and who cares? It's simpler to wipe them all.
> Is it the case that not being synthetic, they'd get automatically reinstantiated later as needed?
Indeed. Non-synthetic services were created from definition and they can be created again. And this is why we rather wipe them all rather than wonder which one stores a cache bin in a property.
Comment #35
fabianx commented#33, #34:
Already now only synthetic services are persisted from ContainerBuilder to Container, so this is fine.
The removal of persisted services is only needed for KernelTestBase*, where currently the ContainerBuilder is used for testing the system and not the Container for various reasons.
Comment #36
alan evans commentedMany thanks chx and fabianx - that's helpful.
Comment #37
chx commented#35 it's not about services persisted from one container to the next but about services created while compiling. For example, LanguageServiceProvider will create the config.storage service which in turn has @cache.config injected into it. If we didn't remove this service then every config get in this request would use the nullbackend which is very sub optimal.
Comment #38
fabianx commented#37: Yes, but because no services that are created during compiling are persisted from one container to the next, we can be very sure that removing those services from the compiled container that are created during compilation is safe as well :).
Comment #39
chx commentedLet's try this prettier version. I had this before but it failed or so I thought.
Comment #40
fabianx commentedLets rephrase to something like:
Remove all services constructed during compiler passes. Only keep those that have been explicitly set.
Comment #41
chx commentedWell, the code does that -- but why? If you want I can change the comment but I prefer the longer explanation on why.
Comment #42
fabianx commentedWhat I meant is to - lets rephrase to:
The current text is misleading as:
"Remove all non-synthetic services set by compiler passes."
set is wrong, because those are the ones we keep.
non-synthetic is wrong too, because we don't distinguish here between synthetic and non-synthetic.
Does that make more sense now?
Comment #51
larowlanFor #42, assuming this is still an issue
Comment #52
alexpottReading this issue makes me realise we are doing a tonne of unnecessary work. Now that config is stored in the db there's no need to duplicate everything in the cache_config table. We can lose all the complexity and have faster rebuilds and sites after rebuild.
Comment #53
mxr576I have just bumped into this issue, #39 does not apply on >= 8.9. How we should move forward? Should I try to reroll the patch or @alexpott's comment in #52 change anything? Based on my understanding that describes a different issue.
Comment #54
mxr576Okay, so many things have changed since #39 was born 5 years ago, today, it seems only adding this one-liner to core.services.yml fixes the issue. (Test coverage could be still needed to warrant this issue does not occur again.)
Comment #55
mxr576Comment #56
mxr576My fix in #54 seems to be working but I bumped into another interesting issue with Redis which could be slightly related to this one...
Comment #58
alexpottIt'd be great to add test coverage here.
Comment #59
mxr576This MR essentially contains my patch from #54 and also one important addition, a mitigation for a workaround like you can see here:
https://github.com/amazeeio/drupal-example-simple/blob/46d71e3885dc9fca7...
https://github.com/amazeeio/drupal-example-simple/blob/46d71e3885dc9fca7...
The null cache backend has to be registered because by default it is not available. I do not see a reason why it should not available by default, it does not mean it is in use. Although codes like you can see in the description of #3201355 may trigger calls that try to access to null cache backend as a fallback.
I agree although I cannot promise when I am going to have time for that. I could use some help on this.
(First I would like to make sure that the latest changes do not break anything. )
Comment #60
mxr576Comment #61
mxr576@alexpott Here is a minimal POC test that confirms the missing
cache_default_bin_backendsissue exists. This can be only reproduced if we remove the cache_factory override from the KernelTestBase because that change removes what is the key to reproduce this issue:core.services.yml
This test does not prove that
cache.backend.nullshould be always defined, although as I wrote in #59, I do not see why it should not be.Comment #62
mxr576ok, now I need some feedback
Comment #64
berdirThis is also blocking #2575105: Use cache collector for state now it seems, which makes state use a cache and as a result, ContainerRebuildTestServiceProvider, which does use state, fails.
Added some comments about the comments, didn't think too much about the tests but the state issue will add more functional test coverage then...
Comment #65
neclimdulIt probably doesn't make sense to have the null backend definition in sites/development.services.yml if its in core after this.
@alexpott I'm curious what complexity you where referring to. Currently we seem to only be adding things so not sure we really removing complexity. Happy to dig in though, removing complexity is my jam.
Comment #66
mxr576Comment #67
mxr576Comment #69
borisson_This issue is difficult to understand, as most things that have to do with the cache system. @neclimdul asked a great question in #65, that should probably be answered. Because as they said, reducing complexity is great but I don't see this issue already doing that.
I think we need to also add a test that the null backend is available without the development services being included?
Comment #71
smustgrave commentedThis issue is being reviewed by the kind folks in Slack, #needs-review-queue-initiative. We are working to keep the size of Needs Review queue [2700+ issues] to around 400 (1 month or less), following Review a patch or merge request as a guide.
As mentioned this is difficult. But what I can gather is that there is still discussion being done.
#65 needs an answer
and possible more tests from #69.
Comment #72
mxr576Rebased the MR, this is what I could do quickly. I am also pinging @alexpot on Slack to ask if he can answer #65.
Comment #73
alexpottThe complexity I'm referring to is the fact that in core we have both the config table and cache_config table. These tables are basically the same. The config cache really should just be fast cache backend - i.e APCu / memcache / redis. So maybe we're not losing complexity but we losing a layer. None of this is in scope here. I think I just realised this when looking at the code in the earlier patches.
Comment #74
mxr576@borisson_ Can you please explain what you meant by this? The proposed changed _removes_
cache.backend.nullfrom development.services.ymlComment #75
borisson_From #65:
I agree, It was removed in the latest MR from development.services so that is fixed now.
From #69:
We don't specifically test that services are available for any other services, so I actually don't think we need to do it here either.
I only have one more question about the comment next to the
cache.backend.nulldefinition in the services.yml, I'm not sure we need to specify the intended usecase and that comment probably is ok to be deleted. Not holding this issue up for that though.Comment #76
berdir> The complexity I'm referring to is the fact that in core we have both the config table and cache_config table. These tables are basically the same. The config cache really should just be fast cache backend - i.e APCu / memcache / redis.
We already use the ChainedFast backend by default for the config bin, it's only if APCu is _not_ available that it's kinda unnecessary, but we don't have a mechanism right now to automatically fall back to the null backend if apcu is not available, we just always fall back to the default bin then. At best we could document that _if_ you can't use APCu or redis/memcache then you could consider to manually the config bin to use the null backend.
We can't _only_ use APCu either, as that does not support multiple multiple servers and I think even on a single server, there's drush to consider?
Comment #77
mxr576I have removed the comment to eliminate all roadblocks :)
Comment #78
alexpottI think the change to \Drupal\Core\Cache\CacheFactory is unnecessary and should be removed. And if this results in a broken test run then we need to do the work to understand why.
I also don't understand why we need to move the null backend out of development.services to fix this issue. I think that this is a remnant of an earlier attempt at a fix.
Comment #79
mxr576Answered this in an MR comment. TL;DR it is needed, but no Drupal core tests catch the issue that I see on downstream.
TBH, I do not see an issue with having a null service/object registered, but you were right, it looks like it is not needed to make Drupal core tests pass and also everything worked fine on my guinea pig downstream project without it.
Comment #80
berdir> how the "field config import fails when the site is installed from configuration" thing can be covered with tests?
Is this really happening on a clean site with just core and and patches? My guess is that this only happens in combination with specific additional modules/and or patches.
Reminder that #2575105: Use cache collector for state is blocked on this issue and is currently failing. Combining those patches should give us a testbed to test the different fixes and which changes are required exactly.
Comment #81
andypostI looks like it needs another review
Comment #82
berdirThe new development compiler pass is now triggering this reliably on HEAD, will look into it in the next days.
Comment #83
berdirI'm proposing we remove the snippet from CacheFactory for now. It seems unrelated to the main change here, which has been blocking progress on issues like state caching for a very long time.
Lets open a follow-up issue to see if we can figure out when that's not needed. Could be a drush problem as well, try to reproduce with the drupal core UI installer?
Comment #84
berdirOk, I overlooked the part that this appears to be a regression introduced by this issue, but still, without a way to reproduce or understand why it's required, I'm not sure about keeping that change here.
Another thing to debug would be a backtrace when exactly that is needed/called. I did this in the same place as the removed part and got nothing during a regular install:
Not even with the condition removed, which kind of makes sense as the whole class is meant to be replaced during the installer. What do you get?
Comment #86
smustgrave commentedBrought this up in #needs-review-queue-initiative channel and lets try and remove the code in question from #83/#84 to a follow up.
Will admit cache is one of my worst components but want to help keep the issue moving where I can. Also can help test where needed.
Comment #87
berdirPatch against 11.x-dev from the MR. The mentioned code is already removed from the MR and it still applies, but can't change the target.
Best way to test this is combined with #2575105: Use cache collector for state. I'll upload a combined patch there next. This was already RTBC before and the reason it was pushed back has been removed, so I think it could go back to that.
I'm unsure about the follow and if that's really needed. I'd suggest @mxr576 or someone else who can reproduce that problem opens it *if* it can be reproduced, because without a way to reproduce, there's nothing we can do.
Comment #88
smustgrave commentedUnderstood on the follow up.
And seems the code in question was removed that I could tell so I'll remark this.
Comment #90
catchAfter nine years the functional change here is the same as the original patch from 2014. I think we should get this in to unblock the state caching issue, and if we do end up being able to reproduce the config import issues, we can fix in a follow-up or worst case roll back. 10.2 is still some time away.
Committed 28db985 and pushed to 11.x. Thanks!