Problem/Motivation
I discovered this while working on #3166602: Ensure media_filter → media_embed mapping does not cause fatal errors and subsequently #3166930: Migrated filters may have invalid/incomplete settings applied.
If a fatal PHP error occurs in the middle of a migration, it leaves the migration in a weird state:
- the message of the fatal error is not logged into the Migration system's messages tables, but only in the PHP error log and into watchdog
- the migration is left in the
\Drupal\migrate\Plugin\MigrationInterface::STATUS_IMPORTINGstate - therefore the migration is effectively "stuck", and you need to manually unstuck it
Steps to reproduce
See the STR in #3166930: Migrated filters may have invalid/incomplete settings applied.
Proposed resolution
Set a custom exception handler while in MigrateExecutable, because these are currently caught by _drupal_exception_handler(), so surely the migration system can do the same, but provide extra context?
- In PHP 7, fatal errors can be caught! So let's catch them, and treat them like exceptions. See https://www.php.net/manual/en/language.errors.php7.php → see #2 👀
- We currently only have
/** * Migration error. */ const MESSAGE_ERROR = 1; /** * Migration warning. */ const MESSAGE_WARNING = 2; /** * Migration notice. */ const MESSAGE_NOTICE = 3; /** * Migration info. */ const MESSAGE_INFORMATIONAL = 4;Also add
MESSAGE_FATAL_ERRORso that it's possible to communicate this is even worse thanMESSAGE_ERROR. → see #3 👀
Remaining tasks
The decision was made not to do this, see the complete comments, #16, #17, #22 and #33.
Discussed at a migrate meeting #3168931: [meeting] Migrate Meeting 2020-09-10, #3245469: [meeting] Migrate Meeting 2021-10-28 1400Z and #3247150: [meeting] Migrate Meeting 2021-11-04 2100Z
User interface changes
Fatal errors also show up.
API changes
Addition: \Drupal\migrate\Plugin\MigrationInterface:MESSAGE_FATAL_ERROR.
Data model changes
None.
Release notes snippet
Fatal errors while running a migration no longer cause the migration system to crash.
| Comment | File | Size | Author |
|---|---|---|---|
| #36 | drupal-3167267-36.patch | 5.01 KB | edurenye |
| #33 | before-log.png | 56.05 KB | quietone |
| #33 | After.png | 12.49 KB | quietone |
| #33 | Before.png | 39.6 KB | quietone |
| #25 | 3167267-26.patch | 4.97 KB | wim leers |
Comments
Comment #2
wim leersComment #3
wim leersComment #4
wim leersForgot the interdiff for #3.
Comment #5
wim leersComment #8
wim leers#2's failures:
Drupal\Tests\migrate\Unit\MigrateExecutableTest+Drupal\Tests\migrate\Unit\MigrateExecutableMemoryExceededTestfail because they're simply tightly coupled to an existing function signature… this seems like a genuine failure: I should move the new
catchstatement to after thecatch(MigrateSkipRowException $e)😃… I suspect this has the same root cause as the previous point.
Comment #10
wim leersYay, that works! Only the unit tests need to be updated now, plus explicit test coverage.
Comment #11
wim leersFixing the few remaining test failures are excellent for a novice contributor 😊
Comment #12
ridhimaabrol24 commentedFixing test cases!
Comment #13
benjifisherThe Novice task has been done, so I am removing that tag.
It is really annoying that
\Errordoes not correspond to'error', but I do not think we can do anything about that (unless we go through a deprecation process). If we want to do that, it can be in a follow-up issue.It is not too late to replace
'fatal'with'critical', to matchconst MESSAGE_CRITICAL. Let's do that.I am adding the tag for an issue summary update. Update the Proposed resolution section to match the current patch. (At least the name of the constant needs to be updated.) That seems like a Novice-level task, so maybe I will not remove that tag after all.
If we have explicit steps to reproduce, then we should be able to create a test. Probably it will have to be a functional test, not a real unit test.
Comment #14
Vidushi Mehta commentedAdded a patch which replace the 'fatal' as mentioned by #13.
Comment #15
wim leers#12 and #14 look great!
Add a custom migration that we run explicitly, that intentionally triggers a fatal PHP error.
Comment #16
mikelutzSo, I'm going to go on the record as being opposed to this change. A quick search through core for /catch\s?\(.*\\Error\s.*\)/ tells me that nowhere else in Drupal core do we attempt to catch php errors, and for pretty good reasons. Having to trap errors at this level suggests that there are significant flaws in the migration system that should be addressed. There shouldn't be a way to write a custom migration using only core code that is capable of intentionally triggering a fatal error, unless you have a test module with a custom plugin that is intentionally doing something bad.
If there is a way to write a custom migration that triggers a fatal error that isn't handled by the migration system before it happens, we should fix that, not try to hide it.
If we have to write a broken custom plugin to test this, then we aren't really testing core code.
There are plenty of processes in core that would be left in a funky state if a fatal error occurs during them, and core passes off to uncontrolled contrib plugins all the time, so why should this particular system be the one instance where we try to catch a Fatal error?
I understand the reasons for wanting to do this, and at the end of the day, I'm happy to be overruled, but to me, this is not good coding. We shouldn't be catching errors, we should be programming defensively to make sure they don't occur.
Comment #17
mikelutzAdding in a relevant thread from the recent migration meeting:
4️⃣ MigrateExecutable should catch not only exceptions, but also fatal errors
Participants:
benjifisher, mikelutz, xurizaemon
Comment #18
wim leers… because this only became possible in Drupal 9, which hasn't existed for very long, since it requires PHP >=7.
Any migration source/destination/process plugin can trigger a fatal PHP error that can crash an entire migration. I'm arguing it should not.
Can you point me to the places in the codebase and to issue queue discussions where this is an explicit rationale?
Core code is generally more robust than contrib/custom code, so IMHO this is not a convincing justification.
Agreed :) But we can't control all code that all people write. And I think this very measure would fall under the umbrella of "programming defensively"? 😅🤓
Not everyone is knowledgeable enough to debug PHP errors. Empowering people with less experience to get pretty far along the way in a migration without having to know xdebug to figure out the cause for a fatal PHP error would make running migrations more accessible.
Comment #19
mikelutzDrupal 9 has been branched off for over a year at this point. Plenty of time to trap one error somewhere if that was what we were going to do.
This isn't compelling to me. ANY plugin(/hook/controller/event handler/theme function) that executes code can crash an entire page load or any other process that drupal manages. This is not unique to migrate plugins, yet you propose a unique solution.
You want me to find precedent for the idea that we should fix errors rather than hide them?
This isn't a justification, it's simply a fact. You would be testing a reaction to code using the core apis that is intentionally broken, and not just broken in that it is using the core apis in an unintended way (which we SHOULD be programming defensively for) but for code that decides to write
$x = $y / ($z-$z);which has never been something we have previously attempted to save the user from.I do agree that we can't control all the code that all people write. Drupal's policy has always been to use contributed modules at your own risk. Our job is to provide a rock-solid core experience, and a robust api, and tools that contrib maintainers and contrib contributors can use to monitor their code and have issues reported. If core is going to be responsible for all catchable errors that contrib code might possibly cause, then I think that's a bigger discussion, not something we pioneer in the migration system. If this is the policy we are going to go with then we need to start wrapping any hook or event or plugin calls or anywhere else that can call custom or contrib code with try/catch(\Throwable). If we decide to go that route as a community, so be it, but I just don't think it's right to try tarbitrarilyly do it here.
Comment #20
wim leersThis is false. Core committers have intentionally kept
8.8.x,8.9.xand9.0.xas similar as possible, to not make development unnecessarily complicated.Migrations are long-running, HTML responses are not.
I strongly disagree, because the migration system allows through the use of YAML files to combine lots of migration plugins (PHP code) together, and those may be combined in novel ways, leading to fatal errors due to hitherto unseen edge cases.
Eh … this was true in the D7 and earlier days, but not anymore. We've specifically made it harder for people to shoot themselves in the foot, with stricter validation, stricter types, helpful exceptions, et cetera — rather than "oh PHP fatal, you just use a debugger and figure it out".
I'm a little bit shocked by your very strong reaction, to be honest.
How does this patch adversely affect you?
Your argument boils down to "developers should write perfect code". I agree! But we all make mistakes, and we all miss edge cases.
For those developers that do write perfect code, this patch will have zero effect. For those that work with imperfect code, this helps their migrations to not crash midway. I'm genuinely wondering what the harm is here?
Comment #21
larowlanCode review below - we need to be really sure that we want to do this before going ahead, particularly because there's a breaking API change for anyone that has sub-classed MigrateExecutable and handles exceptions differently. Yes, there's the 1:1 rule, but this would definitely be 9.1 only because of a breaking change - and we'd have to be reallllly sure we wanted to do it. Personally, I've been in the scenario where this has caught me out whilst I'm working on developing a new source/process plugin - but if someone is in that scenario - they are comfortable interpreting PHP errors, and resetting the migration status.
So, tl;dr I don't feel strongly either way (sorry).
sidenote: we could just use \Throwable here as it's a common ancestor of both of these?
this is an api change, and if we changed the typehint to \Throwable, it doesn't help - see https://3v4l.org/RlS0n and https://3v4l.org/rIOsE
Comment #22
heddnWhy does try/catch have to be added to the executable? Why couldn't it be added to the UI or batch form that is calling the executable? That's how we handled it in migrate_tools, where we've added a
--continue-on-failureflag to do sorta what is proposed here.Comment #24
wim leersThanks, @ridhimaabrol24 (#12) and @Vidushi Mehta (#14)!
#2969551: Migrate messages from caught exceptions need file and line details was committed and requires this patch to be rebased.
Comment #25
wim leers#24:
Obviously this didn't belong in here 😬
#21:
Attached is a rerolled #24 with #21 already addressed.
Comment #26
wim leers#22: very, very, very good question! 🤔
The reason I put it in
MigrateExecutableis because that's what is already catching\Exceptionand is logging that.I think that in your reasoning, that should also be moved out of
MigrateExecutable? Because why put one kind of exception handling in one place and another elsewhere?Furthermore, there is one enormous limitation in the way that
MigrateToolsCommandswraps theMigrateExecutablewith its--continue-on-failurelogic: it continues subsequent migration plugins, but it does not continue subsequent rows within the current migration plugin.So the crucial difference is that this core patch allows source rows after the one that triggered a fatal error to still be imported, rather than skipping forward to the next migration plugin.
Comment #27
heddnThis might seem like a small thing, but adding an additional level of logging to the migrate message could be a BC issue. We've had issues before in migrate tools; I assume in migrate run as well. Could we remove the logging level to a follow-up so we don't bikeshed on it?
The rest of my concerns are pretty well addressed. I'm still a little concerned about continuing to run the migration once it hits errors. What if someone has logic built-in to their execution to handle them. But let's see what other have to say.
Comment #28
wim leersHow could that cause a BC issue? It's not like we're inventing new log levels?
Happy to do that but I'd love to understand this first. I'd also love to understand what needs to happen to be able to introduce the sensible log level in a follow-up.
🥳
What exactly is the concern? Can you share an example of such built-in logic? That'd probably help me understand your concern better and perhaps others too 😊
Comment #29
heddnre #28.1: Drush uses different logging level and mapping. See https://gitlab.com/drupalspoons/migrate_tools/-/blob/8.x-5.x/src/Drush9L.... We also have similar logic added to migrate_upgrade and other migrate contrib modules.
re #28.2: If anything can be done, then it probably will be done in Drupal. I'm probably being overly cautious, but https://xkcd.com/1172/ seems appropriate.
Comment #30
quietone commentedAFAIKT this issue is not specific to Drupal 7 sources, removing tag.
Comment #32
wim leers#29:
RE #28.1: Is that a reason why we can't have it in the migration system itself? I'm not sure what point you're making here? 🤔
RE #28.2: I'm still not sure what that "built in to their execution" could be? Do you mean in a process plugin? Do you mean in an overridden
MigrateExecutable?@quietone I think @heddn asked for your thoughts in #27, where he wrote:
Comment #33
quietone commentedThe proposed resolution in the issue summary needs an update. It should emphasize that this is proposing a change in behaviour in MigrateExecutable and explain the pros and cons of it. Especially, what are the uses cases where continuing despite fatal errors is an improvement. It should also explain what happens to the state of the migration when a fatal error occurs.
What stands out for me are these comments, all of which I agree with.
heddn: "I'm still a little concerned about continuing to run the migration once it hits errors."
mikelutz: "If core is going to be responsible for all catchable errors that contrib code might possibly cause, then I think that's a bigger discussion"
larowlan: "Personally, I've been in the scenario where this has caught me out whilst I'm working on developing a new source/process plugin - but if someone is in that scenario - they are comfortable interpreting PHP errors, and resetting the migration status."
And I am not yet convinced that using the migrate message tables to log fatal errors is the right thing to do.
I then tested this when using the Migrate UI. I made a divide by zero error in the process plugin, FieldInstanceSettings, when the field was a term reference. Without the patch the Ajax error (screenshot below) error is displayed and the migration is halted. Looking in the log, the latest error in the divide by zero error so it is easy for the user to find. With the patch the migration continues on after the fatal errors and the success message is displayed. I looked at the log (/admin/reports) and, of course, the divide by zero error was not there. The user has no feedback that there was an error. So, at best, the patch needs work for that.
Now, we know that the error message is in migrate_message_d7_field_instance but how will the user of the UI find it? Migrate tables are not accessible via the UI and even if they use something like PhpMyadmin, they would have to search all the message tables for errors. And then even if they do find it only contains, 'divide by zero', which clearly lacks the information needed to track down the problem.
All of which means that I do not agree with this proposal.
Still, as chx often said, 'migrate is a special snowflake'. Perhaps the error could be caught so that state could be changed to indicate a fatal error has occurred instead of IMPORTING. However, I do not want a behaviour change, I would still expect the migration to be 'stuck' in this state until some action is taken to un-stick it. That would prevent an accidental actions on the migration, such as ''mim --update' or rollback. Even with that suggestion I am still concerned about setting a precedent of capturing errors and what others may think that means for other parts of core.
Without the patch
Watchdog message without the patch
With the patch
Logged message in message table with the patch. There is no watchdog message.
1 eef7ce1a543f5239383e7a7e256d5d4b78706b000306068229... 4 Division by zeroComment #34
anybodyAs of #33
Comment #35
quietone commented@Anybody, thanks. I left this at Needs Review because there is not an agreed upon path forward so working on the patch isn't needed now.
Let's see if I can make the clearer in the IS.
Comment #36
edurenye commentedThe patch does no longer apply in 9.3.x, so I re-rolled it.
Comment #37
quietone commentedThis was brought up at #3247150: [meeting] Migrate Meeting 2021-11-04 2100Z and the maintainer present, benjifisher, mikelutz, and quietone, agree to close this as won't fix after the meeting, assuming no further comments. Since there hasn't been I am closing this.
See the Remaining Tasks section in the Issue Summary for links to the relevant comments.
Comment #38
quietone commentedAdd link to an earlier migrate meeting where this was discusses.
Comment #39
wim leersPer #37.