Closed (fixed)
Project:
Automatic Updates
Version:
3.0.x-dev
Component:
Code
Priority:
Major
Category:
Task
Assigned:
Unassigned
Reporter:
Created:
2 Feb 2023 at 16:20 UTC
Updated:
12 May 2023 at 09:59 UTC
Jump to comment: Most recent
We don't have any test coverage which proves that exceptions thrown by subscribers to any Package Manager event will result in reasonable user experience (i.e., proper redirection and error reporting during batch processing).
Write a single test, based around a data provider listing every Package Manager stage life cycle event, that tries to go all the way through the attended core update process and confirms that, no matter which event throws an exception, the user is redirected to the right place and the exception message is reported.
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
tedbowComment #3
wim leersPer @tedbow: Package Manager does not provide a UI, so this cannot be a
package_manager.moduleissue.Comment #4
wim leersI now wonder if this overlaps with #3277034: Unhandled Composer Stager exceptions leave the update process in an indeterminate state — AFAICT this would provide the end-to-end test coverage that that issue would exercise. But perhaps that's overkill?
If 100% of
composerinteractions are happening viaphp-tuf/composer-stager, then that ought to be sufficient.We know for a fact that
ComposerInspectorwill call the installedcomposerbinary, but … we already haveComposerValidatorto ensure that foundational things work fine.So … I'm now thinking that this is perhaps overkill? OTOH, 100% of our current
Buildtests test only the happy path. That just seems wrong. The sole exception is\Drupal\Tests\automatic_updates\Build\CoreUpdateTest::assertReadOnlyFileSystemError(), but that tests an error triggered by one of our validators.Conclusion: I think this would still be valuable.
Comment #5
effulgentsia commentedDo we currently have tests for, or does #3277034: Unhandled Composer Stager exceptions leave the update process in an indeterminate state include exceptions for, a timeout (e.g., due to a short max-execution-time) that occurs during either copying files from active to stage, or from stage to active? That latter one would be especially concerning if it ends up only copying half of the updated files from stage to active.
Comment #6
wim leersI know of no such tests.
\Drupal\Tests\automatic_updates\Functional\DeleteExistingUpdateTest::testDeleteExistingUpdate()comes closest, but does definitely not simulate what happens in case of an execution timeout. (Or for that matter: a sudden power loss, which would behave equivalently.)Comment #7
omkar.podey commentedComment #8
omkar.podey commentedComment #9
effulgentsia commentedI remembered about #3291770: Inform the site admin if Composer Stager's committer failed, possibly leaving the site in a half-updated state, so I asked a similar question as #5 in #3291770-19: Inform the site admin if Composer Stager's committer failed, possibly leaving the site in a half-updated state as well.
Comment #11
omkar.podey commentedPostDestroyEventand thenPreDestroyEventbut the problem i'm stuck on is clearing the errors caused by the previous fired events , is there any way to clear them in a build testComment #12
omkar.podey commentedOkay so from last what I remember, we settled on writing functional tests to observe the hard composer failures through UI, and dropped the idea of having a build test for
patch failing to apply from composer-patchesas it doesn't seem possible as described in #3338667: Add build test to test cweagans/composer-patches end-to-endComment #13
omkar.podey commentedComment #16
omkar.podey commentedPreDestroyEvent,PostDestroyEvent,PostApplyas they are not subscribed by any validators with composer operations.PreCreateEvent, I tried to set the priorities to-999andPHP_INT_MAX - 1but either I run into aStageExceptionor theStageManipulator.StatusCheckEventalso needs to be testedComment #17
omkar.podey commentedComment #18
omkar.podey commentedComment #19
wim leersComment #20
omkar.podey commentedComment #21
omkar.podey commentedComment #22
phenaproximaComment #23
phenaproximaI like the general idea and approach here. I feel like it could be a little clearer in some places...
Comment #24
omkar.podey commentedComment #25
phenaproximaThis is shaping up, but there are still some outstanding things.
The other question is something I must discuss with @tedbow: adding a functional test proves...what, exactly? We already have a bunch of functional tests that prove we catch validation errors during batch jobs. A Composer hard failure is the same basic problem. If that's all we're trying to prove -- that a validation error is treated the same way as a Composer problem -- then this test will do the trick. But I want to confirm that's what we're aiming for here.
Comment #26
tedbow@phenaproxima was asking me about this current approach which I think is good but want to look at to make sure I understand
Comment #27
tedbowI liked the general approach. I left a couple of comments but didn't give it a full review
Comment #28
omkar.podey commentedComment #29
omkar.podey commentedComment #30
phenaproximaSelf-assigning for review.
Comment #31
phenaproximaOK. As I dove into this to try and understand it, my biggest question quickly emerged: why are we only testing some events? ANY event subscriber could run into a hard Composer failure, so why don't we test all events over the full update life cycle?
If it's just that we didn't ask for that when this issue was filed, then I'm asking for it now :) But if there's a specific why we don't test all events, then that will need an explanation and comment.
Sorry to kick this back, but I think this test coverage is useful and we should have it be as robust as we can make it.
Comment #32
phenaproxima@omkar.podey explained why we're not testing all events, way back in #16:
I do not agree with this reasoning. Although it's true that we don't really use PreDestroy, PostDestroy, and PostApply ourselves, external code may very well listen to those events, and interact with Composer. Therefore, we need to prove that we're correctly handling hard Composer failures for those events as well.
As for PreCreateEvent...we need to test that one as well. If it's hard to test, I think we should probably pair on it and do what's needed to ensure it's tested. One way suggested by @tedbow in Slack:
Comment #33
omkar.podey commentedComment #34
omkar.podey commentedI was able to cover all the events the
PostDestroyEventis the only one that doesn't lead to nicely formatted page.Comment #35
phenaproximaThanks for the expanded coverage! I think this still needs some cleaning up for clarity's sake, but this is definitely the right level of coverage I was hoping for.
Comment #36
omkar.podey commentedThe two Events that stand out are
PreCreateEventas it fails onComposerNotReadyExceptionwhere every other event fails onStageEventException.PostDestroyEventas it doesn't lead to a properly formatted page, just on another exception page with messageThis operation was already canceled.Comment #37
omkar.podey commentedComment #38
omkar.podey commentedI know it's still not super clear but it does follow the flow of the events from up to down PreCreate to PostDestroy. I was able to remove another else if but that's about it for now.
Comment #39
phenaproximaComment #40
phenaproximaOK, the current failure is due to a pre-existing bug that this test has turned up! 🎉
Here's the problem: when an exception is thrown by PostDestroyEvent, the batch processor tries to return us to the UpdateReady form. The first thing that form does is try to claim the stage. However, because the stage has already been destroyed, the stage can't be claimed -- and thus, we get a WSOD.
I'm not exactly sure how to fix this, but it's an existing bug and therefore I'm thinking we should simply not test PostDestroyEvent in this issue, and file a follow-up to add that coverage and fix the bug.
Comment #41
phenaproximaComment #42
omkar.podey commentedEverything else looks good also Comment#36
The failure during the
PreCreateEventis aComposerNotReadyExceptionwhile it'sStageEventExceptionfor everything else , it's because of an unhandled call in\Drupal\automatic_updates\UpdateStage::beginwhere it usescomposerInspectorComment #43
phenaproximaOver to @tedbow for final review.
Comment #44
tedbowSee MR comments
Comment #45
phenaproximaAddressed your feedback.
Comment #46
tedbowSee MR comments
I think there was misunderstanding because the issue summary just quoted another issue but did not give the full context. But to be fair I think before this issue was worked on at all the issue summary should have been updated and in general we should not work on issues unless the issue summary actually say what should be done
Leaving assigned to myself because I want to see how the current test coverage overlaps with what we already have.
Comment #47
tedbowBecause there has been so much work on this issue it doesn't cover the original problem I think we should rescope this issue to what it actually covers and create a new issue for the original intent of the issue
\Drupal\Tests\automatic_updates\Functional\UpdateErrorTest::testStageDestroyedOnError()Although that issue has
It is not actually simulating an error in require but PostRequire, So again covering Events not actual error cause we require via composer.
So I think we could probably remove that test and fold it's assumptions into the current test.
\Drupal\Tests\automatic_updates\Functional\UpdateErrorTest::testUpdateErrorsParticularly this section
\Drupal\automatic_updates_test\EventSubscriber\TestSubscriberComposerHardFailure::setEventactually that much different than using\Drupal\package_manager_test_validation\EventSubscriber\TestSubscriber::setExceptionbut allow you to trigger an error at particular event. `TestSubscriberComposerHardFailure` just means this error with because by `Composer validate` but since we catch `Throwable` inStageBase::dispatchI don't think this will cause much of UX difference.Comment #48
tedbowRetitling
also #3346644: If an error occurs during an attended process, delete the stage on behalf of the user is related because our test should check if the "Update" button is present once and "delete existing update" is not when you are redirected to the start after an error. If that is already the case we can close #3346644
Otherwise we can solve it in that issue
Comment #49
tedbowCreated #3354099: Add functional test that proves there is reasonable UX whenever Composer Stager operations have a hard failure for original intent of this issue and assigned to @omkar.podey
The description here still needs to be updated. Removing the old one because we know that is not what the issue does
Comment #50
wim leersI don't understand. The scope is crystal clear to me. The original title
Add functional test that proves there is reasonable UX whenever Composer has a hard failureis more precise than the one you specified in #48.The whole point of this issue was to prove that there was no way that a Composer error could occur without the user getting informed about it through the UI. At any point in the stage lifecycle/during any interaction with Composer. Because that was not yet tested.
This issue discovered that that was not yet the case. Wonderful, that was the exact intent!
The scope ballooned because we tried to fix some of what was not yet working in this issue. That was reasonable at first, but perhaps that became too complex. Fine, then we should limit the scope to not fixing anything and just committing the parts that are already working, and spinning out the fixes into a follow-up.
What am I missing here? 🤔
Comment #51
tedbowre #50
Will explain.
The issue discovered any exception thrown by a StatusCheckEvent subscriber or a PostDestoryEvent subscriber would cause a UX problem. I just happened to be that test forced a Composer related error error. The fact that test forced a Composer exception in particular made it very unclear that any exception would cause this same problem.
To me "whenever" is not clear. It was obviously interpreted by @omkar.podey and @phenaproxima as in any StageEvent subscriber.
But from the original quote from the issue on the issue summary
The test on this issue would not have caught the original problem we were looking to test "patch failing to apply from
composer-patches". This does not happen in a StageEvent subscriber. This happens when we pass control over to composer stager to do the update in\Drupal\package_manager\StageBase::requirespecifically$this->stager->stage($command, $active_dir, $stage_dir, NULL, $timeout);I don't think you can then say that "whenever" meant the StageEvent subscribers and
$this->stager->stage()because then you are not covering the direct calls to\Drupal\package_manager\ComposerInspector::getInstalledPackagesList()in\Drupal\automatic_updates\Form\UpdateReadyand\Drupal\automatic_updates\UpdateStage::beginplaces that also could have a hard Composer failure.I think "whenever" is broad and even if all of these places were intended it is too much for one test.
The fact nobody has mentioned the 2 direct calls to
\Drupal\package_manager\ComposerInspector::getInstalledPackagesList()that I mentioned in this comment(I had just now thought of them) also makes the point that original issue summary did not have enough context and was not clear.Basically "whenever" would have to include every call the public functions on
ComposerInspectorat any point in the production code. If that was the intent then it was not clear.Comment #52
phenaproximaComment #53
phenaproximaComment #54
phenaproximaComment #55
wim leersI want to forever avoid the chaos we currently find ourselves in. So I re-checked this entire issue to make sense of how we got to #46
PostDestroyEventdo not properly inform the user, and that we should fix that in a follow-up → EXACTLY WHAT I WROTE IN #50.The very point of this issue was to a have single test that triggers hard failures at every point in the stage life cycle, to verify generically that there's a reasonable UX in those cases. Everything in #47 points to specific tests for specific edge cases.
@tedbow in #47.4 is right that deleting
composer.jsonto trigger Composer hard failures is not exactly doing the original point. But that's because due to the omission ofcweagans/composer-patchesfrom the issue (forced on us due to DrupalCI limitations), this was the only way @omkar.podey and I could think of to trigger Composer hard failures at every point in the stage life cycle.@tedbow in #51:
composerbinary, we should AT EVERY POINT IN TIME be catching problems around our interactions withcomposer. That's where the title came from.ComposerInspector::getInstalledPackagesList()yet. 👍 Indeed, I cannot find any mention either. But I'd swear this was previously discussed during a pairing session. Mea culpa for not having posted that as a comment too! 😬Conclusion: the problem is that we evolve an issue in some direction. We decide that together, but we don't document our decisions in either code or on the issue. It's all verbal. That cannot happen anymore. It's fine in initial phases for a project, but not sustainable long-term, especially if it's either going to be maintained by others or if it's going into Drupal core.
Comment #56
wim leersPreDestroyEventcase, which is also not yet passing tests. I suspect we need a separate issue for that too?Comment #57
phenaproximaUnclear. I still gotta debug that. It was working before, dammit...
Comment #58
phenaproximaThanks to a brilliant insight from Wim during a Zoom, I figured out why PostApplyEvent and PreDestroyEvent are failing: it's because
StageBase::postApply()callsdrupal_flush_all_caches(), which callsdrupal_static_reset().Our batch processor is storing the error messages in
&$context, which gets reset bydrupal_static_reset()-- basically, because it's a reference, the reset makes it point to memory that's no longer used by Drupal! So when we store error messages in it, we're really just throwing them into the abyss.The solution, IMHO, is to store the error messages in the session, which is not affected by
drupal_static_reset(). The batch processor already stores certain things in the session, so there's precedent for that. This makes PostApplyEvent and PreDestroyEvent behave properly, and renders #3354326: [PLACEHOLDER] Exception messages raised during BatchProcessor::postApply() are not propagated to the error page unnecessary.Comment #59
tedbow@phenaproxima thanks! Looks good!
Comment #61
phenaproximaWhat a nightmare this was to get done. Finally! Now, to the follow-ups...
Comment #62
wim leersSee you in #3354003: If a PostDestroyEvent subscriber throws an exception, users are redirected to a WSOD & #3346644: If an error occurs during an attended process, delete the stage on behalf of the user!
Comment #63
wim leers