TL;DR
If a module provides a subscriber for a kernel event, for example (\Symfony\Component\HttpKernel\KernelEvents::RESPONSE), and it gets uninstalled on the UI then the event dispatcher service still calls its event subscriber that could lead to errors.
Problem
Our module provides an MY_ENTITY_TYPE entity type and it has a KernelEvents::RESPONSE subscriber, my_module.event_subscriber.response. The event subscriber retrieves the entity storage of the MY_ENTITY_TYPE in its constructor. What happens when the module gets uninstalled on the UI? The \Drupal\Core\Extension\ModuleInstaller::uninstall() does a bunch of things, but most notably:
* uninstalls the MY_ENTITY_TYPE entity type
* calls the $this->updateKernel($module_filenames); - before that the my_module.event_subscriber.response service is still registered. If you run $container->get('event_dispatcher') at this point and check the registered listeners with xDebug you should see the my_module.event_subscriber.response is not registered anymore.
* after the ModuleInstaller::uninstall() finished the module is uninstalled.
* the reponse for the request is also ready so the HTTP Kernel triggers the KernelEvents::RESPONSE event.
* (surprise) the called instance of the event_dispatcher service in that context still contains the my_module.event_subscriber.response event subscriber as a registered listener so it gets called
* and the request silently fails with an exception in \Symfony\Component\HttpKernel\HttpKernel::handleException() because as I wrote the "The event subscriber retrieves the entity storage of the MY_ENTITY_TYPE in its constructor." but the entity type is not registered anymore.
Proposed solution
After a module gets uninstalled Drupal core should not call its event subscriber (including kernel events). At least, I do not see any use case why an uninstalled module's event subscriber should be called after it has been uninstalled within the same page request.
| Comment | File | Size | Author |
|---|---|---|---|
| #38 | uninstalled-module-kernel-response-listener-fix-3056604-37-8.8.x.patch | 14.34 KB | mxr576 |
Comments
Comment #2
mxr576Thanks for the event subscriber that I added in
drupal-event-dispatcher-bug-poc-test.patchthe\Drupal\Tests\system\Functional\Module\UninstallTest::testUninstallPage()fails because the new event subscriber that I added gets called after the "Module test" module is uninstalled and the thrown exception causes an HTTP 500.Comment #3
alexpottNice find. This totally makes sense. In Symfony land the codebase never changes from underneath you whereas in Drupal land we have the ability to install and uninstall module in a request. Which makes us vulnerable to bugs like this. I think this is another reason why we need to have our own event dispatcher rather then leverage Symfony's as our needs to be module installation status aware.
However one thing that's interesting is that the event dispatcher should be rebuilt on uninstall. So the question also becomes why
Comment #4
alexpottSo looking at the code it's because of
. The http_kernel service is constructed with the event dispatcher at the beginning of the request and there's no way to rebuild this because it's responsible for calling the module uninstall code itself.
Comment #5
mxr576Comment #6
mxr576Well, it is already late Friday here but I may have found a simple fix for the module. I am not asking a full review on this just yet, although anybody can provide feedback about the idea. I'll check this with a clear head on Monday :)
Comment #7
mxr576Well, it was too good to be true...
Comment #8
alexpottAh look at that... the fail shows what we need to do! We need ModuleInstall::uninstall() command to call
Drupal\Component\EventDispatcher\ContainerAwareEventDispatcher::removeListener()on the old event dispatcher prior to rebuilding the kernel.So maybe something like...
Get the old event dispatcher before uninstalling.
After uninstall get the list of listeners from the old service and new service and then the remove the ones that aren't there from the old service... or something like that.
Comment #9
mxr576Well, I already had the failing POC test case when I created this issue, I hoped that I also have a fix now :)
Yeah, probably this is the next best option that I could try because as I can see there is no API that I could use to collect all services defined by the uninstalled module before it gets uninstalled. Registered services does not store a reference to the module that defines time. Assuming that all service ids contain the provider modules' names would be a bold guess.
Comment #10
mxr576Maybe now...
This patch verifies all event listeners not just those that listens on the
KernelEvents::RESPONSE. I was not sure which event listeners should be validated and which not so I choose to verify all. This seemed the safest option.If we do not want to validate all then probably we should add the
KernelEvents::TERMINATEevent to the list as well. Not sure what else could be triggered after the given state of the request.Comment #11
mxr576Comment #12
mxr576Tbc
Comment #13
mxr576\o/
Let's handle when there is no event by type in the up to date event dispatcher anymore.
Originally I fixed the doc block of the
updateKernel()method. Not sure whether I can do that as part of this issue or not.Comment #14
alexpottI agree with private here because it will often end up being the wrong event dispatcher and because of this I think we need to denote this complexity in both the comments and property name. So how about $initialEventDispatcher?
$event_dispatcher :)
This makes me realise we have the problem the other way around. There's no way for a module to be subscriber to the KernelEvents::RESPONSE on the response they were installed on. Which is a bug because potentially they have code that is triggered on such a request that assumes it will be /shrug who knows?!?! Tricky. But logically I think doing the same (but adding listeners) for install makes sense.
$this->kernel->getContainer()->get('event_dispatcher') is mildly easier on the eye than \Drupal::service() because at least we're using an injected (albeit re-injected) object.
Using this as a test is fragile because our only test for this is the absence of the exception being thrown. What we would be better is if this set some state and the test checks that the state is set before uninstalling the module and then the test removes the state and uninstalls the module and checks it doesn't come back. You can leave the exception in to be super sure but having a correlating positive assertion means that even if the events change and we miss this nuance we'll know if we've affected this.
Comment #15
mxr576Thanks for the review!
1. make sense, fixed
2. ty, fixed
3. well, this also came into my mind earlier but first I wanted to solve the task at hand :)
4. fixed
5. Well, works for me, I rewrote the test to use indicator value. My usual problem with the State service that even it could be used as shared storage between
$this->containerand\Drupal::service()in browser tests, it has a static cache where values can stick and they can cause false-positive test failuresUpdated the test only patch.
Comment #16
mxr576Updated patch with the test and fix.
Additional comment regarding the 3, point. I almost submitted the patch when the
core/modules/system/tests/src/Functional/Module/InstallUninstallTest.phphad started fail. It was caused by this error:Took some time while I figured out what went wrong, but it easier if I show the diff between the submitted and the non-submitted patch.
What is the main difference? All those additional tasks that are running before new events get registered. The failure with my original changes was caused by this code because it run after events got registered:
But after I fixed this there was a new failure caused by the unregistered entitites and entity view modules.
So I think the best way is the add the event registration to the end of the install process. Also applied the same logic on the uninstall process, although that has not failed as it was implemented before. Open for ideas what should be the best place to call the event register/unregister methods if not the current ones.
Comment #17
alexpottYeah that is a super super interesting.
One thing I've tried to do is to make the install and uninstall methods kind of mirror opposites of each other. I think the location of the install event registration feels like it is in a good position. You're events are only triggered once your module's hook_install() has fired. So I think we need to do the reverse as well - ie remove the events prior to your modules hook_uninstall() being triggered.
Let's use the work remove rather than unregister. Using the event dispatcher interface as inspiration. Same with the docs and variable names in the method.
Comment #19
mxr576Yes, this totally makes sense. How about this?
Also replaced "register" with "add" based on the same logic.
Comment #20
mxr576Comment #21
mxr576Comment #23
mxr576Meh, of course, it failed. Either the test case needs to be adjusted or the event listener removal must run after kernel update. I rather prefer the second option which the updated patch implements because I guess until the kernel has not updated a removed event listener, which is a service, still can be called as a service.
One possible way to adjust the test is to set an indicator value in
module_test_uninstall()and use that in theModuleInstallUninstallResponseSubscriberto set the other indicator value.Comment #24
mxr576Additional info from our module's test suite. Tests with #18 passed but with #23 failed because the dynamic route subscriber tried to rebuild module related routes when some services provided by the module has been already uninstalled.
So probably the best place to do the event subscriber removal is somewhere in-between #18 and #23.
Comment #25
mxr576Well, this is getting worse and worse...
Managed to figure out why our test has failed. Even if this should ensure uninstalled entity types get _completely_ unregistered from the system but they do not. At least in this service, in the injected entity type manager's _static_ cache they still remain registered. The EntityDefinitionUpdateManager is unable to clear the static cache of a service which is running in a different "scope".
I tried out several ways to resolve this problem. Tried to inject the
EntityDefinitionUpdateManagerto theModuleInstallerand use that inuninstall()instead of\Drupal::entityDefinitionUpdateManager(), etc. I also tried to run\Drupal::entityTypeManager()->clearCachedDefinitions();before and after theupdateKernel()method call in theuninstall()but it only worked this place where I left in this patch.I am starting to feel like that it would be better to completely rewrite this service instead of trying to patch newly appearing issues. There are too many TODOs in this class already.
I guess this new issue with incomplete entity type deregistration would require a new POC test. Should it be handled in a new issue or not? It is also related to event dispatching in ModuleInstaler so it also makes sense to resolve this here.
What do you think about all of this @alexpott?
Comment #26
mxr576Comment #27
mxr576Tried to write a test case for the described issue in #24 #25, but I had no luck until this time. This made me think that maybe I was actually right in #23 and the event deregistration should happen in a different place in the uninstall() method. At least right now I do not see a better explanation for why our internal tests passed with #19 but failed with #23 without the change in #25.
Comment #28
mxr576Additional info to #23. I can move
$this->removeEventListenersAfterModuleUninstall();below in theuninstall()method until it is above the\Drupal::getContainer()->get('plugin.cache_clearer')->clearCachedDefinitions();call our functional test passes. If I move it after the\Drupal::getContainer()->get('plugin.cache_clearer')->clearCachedDefinitions();call then our test starts failing without #25 because theplugin.cache_cleanerclears some kind of caches, probably theentity_type.managerservice's cache which is tagged withplugin_manager_cache_clear.Comment #29
mxr576Next combination of change that seems to be working and somehow it still makes more sense than the fix in #25 although I am still trying to understand why this is working. Basically by moving the
plugin.cache_cleanercall below theupdateKernel()method call theEntityRouteProviderSubscriberstill calls the uninstalled entity type's dynamic route provider, but the cache service defined by uninstalled module is still registered therefore the test does not fail like with #23: https://www.drupal.org/files/issues/2019-05-27/screeshot-2019-05-27-16-5....Comment #30
catchIs it worth trying to consolidate getting the diff of event listeners into another helper method? No rule of three here, but the code is pretty much identical except for the addListener/removeListener calls.
This once again brings up #2237831: Allow module services to specify hooks, have updated that issue.
Comment #31
mxr576Well, I was thinking about this too and I am still not sure whether it makes sense to extract this diff creation to a method where the method that it gets called (addListener/removeListener) would be also a parameter or an anonymous function. Maybe with this refactoring, we would just replace a small copy-paste like-ish code to something that causes bigger complexity and it is not easier to maintain.
Comment #32
catchOn this diff, we should double check that the order hasn't changed before by looking at the order of the file. Also we should add a comment explaining that the plugin manager cache clear is explicitly after the kernel update, possibly with a @todo to a follow-up to investigate further why it's necessary.
Comment #33
mxr576Also sharing a conversation between me and @catch from Slack here that relates to #32
So I did some digging:
* The latest commit that changed the
\Drupal::getContainer()->get('plugin.cache_clearer')->clearCachedDefinitions();line was this commit: https://github.com/drupal/core/commit/c6fb130bd0b0d272cbf16e95579181f1dd... which is related to this still open issue: https://www.drupal.org/node/2380293* This is the previous commit https://github.com/drupal/core/commit/4646c059ea07ca013a25648e2e6048ffb1... but it is also related to the same issue
Also added the TODO item to the patch.
Comment #34
mxr576Comment #35
mxr576Bumped version, although this should be backported to 8.7.x.
Comment #36
catchI think we can probably just have the first line of this comment (also missing a period). i.e. drop 'why that causing trouble with contrib modules'.
On #33 I did some extra investigation with
git log -S " \Drupal::getContainer()->get('plugin.cache_clearer')->clearCachedDefinitions();"This was added in #2155635: Allow plugin managers to opt in to cache clear during module install and hasn't moved around since, so I think we're probably OK to move it now.
Comment #37
mxr576Thanks for the review @catch, I fixed what you suggested.
Comment #38
mxr576Comment #39
larowlanCan you provide some more detail about the module you're having this issue with? Is it on drupal.org - I'd be interested in the service that's causing the issue. I've had similar problems in custom code and have been able to get around it with some defensive coding. My preference would be to do that instead of introducing the complexity this requires. We can't be sure there aren't side-effects to this, as there may be some modules that rely on things occurring the way they do now (whether or not that is correct), i.e just because it creates issues for some modules may not be the case for others.
we need to provide a BC layer here - make the argument optional and if its not provided, use the \Drupal singleton to fetch it and trigger_error with a deprecation warning
this typehinting change is a BC break - yes its correct, but unfortunately we can't do it without risk of breaking sub-classes, so let's avoid doing it
we shouldn't be relying on this anti-pattern, the _serviceId is not part of the API
Comment #41
macherifI am still having the same issue. May be declared events should use plugin_manager instead of been stored on DB.
Comment #42
mxr576Added [PP-1] Fix leaky and brittle container serialization solution as it describes and tries to resolve the issue with
_serviceId.Comment #49
borisson_The issue that was linked in #42 is now resolved, we can pick this up again.
The remarks in #39 needs to be picked up and we need to reroll the patch. I'm actually not sure if #2531564: Fix leaky and brittle container serialization solution actually fixed this as well.
Comment #50
andypostThe issue still makes sense as when modules uninstalled via UI the remaining request processing still working with container which have services/listeners initialized. And overtime complexity of contrib is growing so it will hit early or later
OTOH agree with #39 about complexity and overhead it brings
A compromise could be to add some defensive code to core, at least to give a hand to contrib to deal with it
Comment #52
mstrelan commentedI think this belongs in the extension system component, since ModuleInstaller is in the Drupal\Core\Extension namespace.