Problem/Motivation

This was discovered by Wim in #3338666: Add functional test that proves there is reasonable UX whenever a stage event subscriber has an exception. Copying here for posterity:

th[e] stack trace [of an exception raised during batch processing] should also have been logged, to make diagnosis after the fact possible. We should assert that this indeed happened. Otherwise post-mortems in the real-world (where updates will be installed automatically, and hence no user will be present to witness the error) will become impossible.

This was discovered during form updates but we want exceptions from the event subscribers like this to be logged however they occur.

Proposed resolution

  1. In \Drupal\package_manager\StageBase::dispatch() we should log any error with the backtrace.
  2. Add test coverage in StageBaseTest for the logging
  3. After 1) we should be able to remove the logging that would now be duplicated in StageBase

    For example

    We logging at the end of the catch after

    $stage_id = parent::begin(['drupal' => $target_version], $timeout);
          $this->stage();
          $this->apply();
    

    We would no longer need to log this if the exception was a StageEventException instance because this would already be logged.

    try {
          $this->postApply();
        }
        catch (\Throwable $e) {
          $this->logger->error($e->getMessage());
        }
    

    Would also not need to be logged if it is instance of StageEventException instance

      try {
          $this->destroy();
        }
        catch (StageEventException $e) {
          $this->logger->error($e->getMessage());
        }
    

    Looking at the comment before this I think we could remove the try/catch around this all together

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

phenaproxima created an issue. See original summary.

wim leers’s picture

Without this, figuring out what actually happened on real sites would be impossible. Bumping to Major.

wim leers’s picture

Issue tags: +beta target, +Usability
yash.rode’s picture

Assigned: Unassigned » yash.rode

wim leers’s picture

No commits yet in >36 hours. Everything going okay here, @yash.rode?

yash.rode’s picture

I was wrong yesterday that PreDestroyEvent exception is not logged but the case is both PreDestroyEvent and PostDestroyEvent Exceptions are not getting logged, I tried debugging it, as we don't know where exactly the error messages are getting logged I could not find it why it is happening.

tedbow’s picture

Status: Active » Needs work
Issue tags: +Needs issue summary update, +Needs tests

In the quote from @Wim Leers in the summary we have

Otherwise post-mortems in the real-world (where updates will be installed automatically, and hence no user will be present to witness the error)

but in the suggested solution it says we should change UpdateErrorTest::testExceptionFromEventSubscriber(), which doesn't exist anymore since #3354325: Consolidate UpdateErrorTest but was a functional that test of what happens when the user does the update manually so won't cover "no user will be present "

Also the current MR changes src/BatchProcessor.php which also won't affect cron updates.

This is another example of why we should not make issues that simply quote another issue as the "Problem/Motivation" because is unclear unless you really dig in the other issue what the scope of the problem to be addressed is.

phenaproxima’s picture

@tedbow, to me this was perfectly clear. If an exception occurs from an event subscriber during batch processing, we want to be sure the exception and its backtrace are logged. Although the issue summary refers to a non-existent method, the gist is still correct, and what we're trying to implement. What's the problem here?

tedbow’s picture

re #10

If an exception occurs from an event subscriber during batch processing

but the quote from the issue summary says

Otherwise post-mortems in the real-world (where updates will be installed automatically, and hence no user will be present to witness the error) will become impossible.

(emphasis mine)

If "no user will be present to witness the error" then we are talking about cron updates, cron updates do not use "batch processing"

wim leers’s picture

Assigned: yash.rode » Unassigned
Issue tags: +Needs followup

@tedbow's point in #11 is fair, since apparently cron vs UI (non-cron) updates have different execution flow.

The real point here though is that exceptions should ALWAYS be logged, regardless of cron vs UI updates! And #11 may be correct in quoting what I said literally but is sort of ignoring the spirit: we cannot expect end users to meticulously take notes of everything they see happen. No matter whether that's during cron updates (where they CANNOT see it) or UI updates (where they COULD see it).

However … it's shocking news to me that cron updates and non-cron updates apparently do not use the exact same logic! 😳 It never even occurred to me that they'd be implemented differently. The stage life cycle is the same, only the way things get executed is different.

We need to document why these are implemented so differently. Why is it for UI-based installing of updates not okay to assume the entire stage life cycle finishes within a single request, but for cron-based installing of updates it is okay? 🤔

Unassigning @yash.rode, because @tedbow and @phenaproxima need to clarify this. We also need a follow-up issue that only either of them can work on to document the architectural differences between cron vs non-cron updates.

tedbow’s picture

The real point here though is that exceptions should ALWAYS be logged, regardless of cron vs UI updates! And #11 may be correct in quoting what I said literally but is sort of ignoring the spirit:

I agree. This should have been state in the summary more clearly. I don't think we should be making issues and handing them off to others follow the "spirit" of the issue.

Why is it for UI-based installing of updates not okay to assume the entire stage life cycle finishes within a single request, but for cron-based installing of updates it is okay?

  1. Cron sets a long time limit to be able to do more operations. We could do this in the UI but then user has no feedback for the entire cycle.
  2. The batch system gives the user feedback that something is happening so they don't have to wait and see nothing for whole cycle.
  3. After the update is staged and before it is applied in UI we are able to show the user the form with the "Continue". Here we can show them more relevant information about what was staged. Currently I think all we are showing them is the message about possible database updates from StagedDBUpdateValidator but this is very important information and the user may decide not to run an update that has staged database updates(to run it at another time). Other contrib or custom code also follow this same pattern of listening to StatusCheckEvent and if there is staged update show some warning if it is not cron and an error if it is cron.

    In cron updates we don't allow updates to run if there are staged database updates. There wouldn't be anybody to show this information to anyways.

tedbow’s picture

tedbow’s picture

Issue tags: +core-post-mvp
omkar.podey’s picture

Assigned: Unassigned » omkar.podey
omkar.podey’s picture

Assigned: omkar.podey » tedbow

So is the conclusion that cron sets a longer time hence can be assumed that it happens in a single request so this should be only done for cron and not for the UI updates ?, can you update the issue summary so it's more clear what the next steps are. and could this issue affected by #3357969: For web server dependent unattended updates run the entire life cycle in a separate process that will not be affected by hosting time limits

tedbow’s picture

Assigned: tedbow » omkar.podey
Issue summary: View changes
Issue tags: -Needs issue summary update

@omkar.podey I have update the summary. I think this is actionable now

The issue will not be affect by #3357969: For web server dependent unattended updates run the entire life cycle in a separate process that will not be affected by hosting time limits

omkar.podey’s picture

Issue summary: View changes
omkar.podey’s picture

Assigned: omkar.podey » Unassigned
Status: Needs work » Needs review
omkar.podey’s picture

Assigned: Unassigned » omkar.podey
Status: Needs review » Needs work
omkar.podey’s picture

Assigned: omkar.podey » Unassigned
Status: Needs work » Needs review
wim leers’s picture

Assigned: Unassigned » omkar.podey
Status: Needs review » Needs work

A new quarter, time to increase the expectations! 🤓 I know you can do it! 😊

  1. #9 added Needs tests. Why is this marked Needs review if the tag is still present? Are the necessary tests present now?
  2. Looking good, just have a few questions!
omkar.podey’s picture

Issue tags: -Needs tests
omkar.podey’s picture

Assigned: omkar.podey » Unassigned
Status: Needs work » Needs review

@Wim.leers I was trying to think of a way to test the back trace as i'm currently logging it as context. do you think that's necessary ?

phenaproxima’s picture

Assigned: Unassigned » phenaproxima

Self-assigning for review.

phenaproxima’s picture

Assigned: phenaproxima » Unassigned
Status: Needs review » Needs work

This seems like a reasonable start, but there are some aspects that feel...porous to me. Maybe I'm missing something, though. In general I think we need better comments.

omkar.podey’s picture

Assigned: Unassigned » omkar.podey
omkar.podey’s picture

Assigned: omkar.podey » Unassigned
Status: Needs work » Needs review
tedbow’s picture

Assigned: Unassigned » tedbow

reviewing

tedbow’s picture

Assigned: tedbow » Unassigned
Status: Needs review » Needs work

Looking good. Just a few more things

tedbow’s picture

Assigned: Unassigned » omkar.podey
omkar.podey’s picture

Assigned: omkar.podey » Unassigned
Status: Needs work » Needs review
tedbow’s picture

Status: Needs review » Needs work

Needs to be merged/rebased with 3.0.x

phenaproxima’s picture

Assigned: Unassigned » tedbow
Status: Needs work » Needs review
tedbow’s picture

Assigned: tedbow » phenaproxima
Status: Needs review » Needs work
phenaproxima’s picture

Assigned: phenaproxima » tedbow
Status: Needs work » Needs review
tedbow’s picture

Assigned: tedbow » Unassigned
Status: Needs review » Reviewed & tested by the community

Looks good!

phenaproxima’s picture

Status: Reviewed & tested by the community » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.