The file_copy process plugin being added in #2695297: Refactor EntityFile and use process plugins instead throws MigrateException when a copy fails, aborting the entire row being processed. This makes sense for the core usages in d6_file and d7_file, where it makes no sense to create the file entity if the file itself does not exist in Drupal. However, in contrib we'd like to use this plugin in other contexts - in particular, to copy a file and create a file entity within a file/image field mapping in a node migration (#2635622: Process plugin for importing/creating files by URL). In this case, we don't want to throw out the node for the sake of a missing file, we just want to leave the field empty.
To support this, let's add a skip_process_on_failure option - when set to TRUE, we'll throw MigrateSkipProcessException instead of MigrateException.
Comment | File | Size | Author |
---|---|---|---|
#33 | 2766369-33.patch | 5.08 KB | tstoeckler |
#33 | 2766369-32-33-interdiff.txt | 7.88 KB | tstoeckler |
#32 | 2766369-30.patch | 4.23 KB | Ludo.R |
#25 | 2766369-25.patch | 5.34 KB | quietone |
#25 | interdiff-20-25.txt | 4.59 KB | quietone |
Comments
Comment #2
chx CreditAttribution: chx at Smartsheet commented> However, in contrib we'd like to use this plugin in other contexts - in particular, to copy a file and create a file entity within a file/image field mapping in a node migration
Even for contrib that's a bad, bad idea. Write a file migration and a node migration. Very easy. It's an architectural mess otherwise. There is one destination in a migration. While obviously you can save a file in a process plugin ,there's nothing to stop you, that's a mess. That's not how Drupal 8 migration works.
Comment #5
mikeryanComment #8
nachosalvador CreditAttribution: nachosalvador at Ymbra commentedI've experienced a similar problem when I was working in a migration to Drupal 8.4.5 from a custom mysql database. When a source file does not exist the migration process fails and this stops the migration group from processing.
First I tried throwing MigrateSkipProcessException when the origin file does not exist but transform process continues and destination EntityFile.php throws MigrateException('Destination property uri not provided').
Finally I used MigrateSkipRowException and works fine for me. Row is skipped, destination field leaves empty and group migration completes all configured migrations
Comment #9
maxocub CreditAttribution: maxocub as a volunteer commentedWe should throw a
MigrateSkipRowException
everytime a file is not found. As the issue summary proposes, we should have askip_process_on_failure
option. If this option is set to true, we throw aMigrateSkipRowException
exception, otherwise we should still throw aMigrateException
exception.Comment #11
robertom CreditAttribution: robertom at bmeme commentedHi, sorry for my bad english.
I have added the skip_process_on_failure option on FileCopy and Download process Plugin, but for now the tests are missing
Comment #12
Peacog CreditAttribution: Peacog as a volunteer commentedThanks @robertom. The patch works well for me.
Comment #13
vidorado CreditAttribution: vidorado at Biko2 commentedRe-rolled patch for Drupal 8.6.9
Comment #14
quietone CreditAttribution: quietone as a volunteer commentedComment #16
idflood CreditAttribution: idflood at Stimul.ch commentedPatch was not applying to 8.6.13 so I rerolled it against 8.8.x.
edit: the patch in #13 still probably apply cleanly on 8.6.13, my issue was somewhere else.
Comment #17
nachosalvador CreditAttribution: nachosalvador at Ymbra commentedComment #18
mikelutzThis is supposed to be a MigrateSkipProcessException, not a MigrateSkipRowException MigrateSkipRowException has the same effect as MigrateException
Also moving to NW because this still needs tests of the new functionality.
Comment #20
mallezieRerolled for 8.9.
Changed MigrateSkipRowException in MigrateSkipProcessException.
Needs to go back to NW for tests, but just letting testbot test first.
Comment #22
gigimaorTo solve the issue I just change the exception to skip row in the core migrate module migrate/process/fileCopy.php as additional to patch ver 15.
It solve my issue.
Comment #24
marvil07 CreditAttribution: marvil07 at Isovera commentedThanks for the work here!
Generalizing it would be probably a good idea, changing the title to reflect it.
Having an option to skip either via row or process, and not a raw exception would be great to be able to inspect the migrate messages table after the run, instead of needing to re-run manually the given migration.
Keeping NW for that, and related tests.
A note on how to do this without a patch is to manually verify.
This is not always possible, but should be OK for the simple cases.
E.g. Given an filepath already available on
source_full_path
destination property, we could do the followingComment #25
quietone CreditAttribution: quietone as a volunteer commented9. Add a MigrateSkipRowException when the source file is not found to FileCopy.
And then some cleanup and adding tests.
Comment #26
mikelutzThe doc doesn't match the actual exceptions thrown
Same here.
Not sure why we arbitratily change this from a MigrateException to a MigrateSkipRow exception, it seems like we should use the same logic we do below.
There's also one more MigrateException thrown in this plugin that we didn't change. was that intentional?
Comment #27
tstoecklerI had recently opened #3162959: Migrate should allow continuing the migration on a failed download for this. I only covered the GuzzleException's over there, as that was my use-case. I.e. that patch doesn't contain the exception change, when, for example, the
file_exists()
on the source file fails inFileCopy
. However, over there, I introduced a configurable way to either skip the process or skip the whole row which may be interesting for this.Comment #29
AnybodyAs it may help others: also see https://www.drupal.org/project/migrate_skip_on_404 (from #3113394: Add process plugin to skip files that don't exist)
Comment #32
Ludo.RRe-rolled the patch for 9.3.x
Comment #33
tstoecklerHit this once again, but like it was already reported in #8 skipping the process instead of the whole row is problematic for migrations with a file entity destination. So I wanted to bring in the patch from the issue mentioned in #27 to make this behavior configurable, so that either the process or the row is skipped depending on the configuration.
In doing so, however, I realized that in my opinion part of the current patch is incorrect, namely the changing of the MigrateException when a FileException is caught. The FileException is only thrown when there was an error in the actual copying process, not when the source file does not exist. As far as I can tell this was never explicitly discussed here, the discussion was always about the missing source files. Not sure if it makes sense to allow skipping such a FileException, as well - I personally think it doesn't - but either way I don't think it's in scope for this issue.
So removed that from this patch.
Sorry if this is a bit "railroad"-y, I don't mean to step on anyone's toes here by changing course a bit, but I went through all the patches above and there is quite a bit of "ping-pong" going on with different things being changed differently by the various patches without a lot of discussion of the actual details, so this is supposed to be me throwing my preference into the mix, not me invalidating any other suggestion.
Attaching an interdiff to #32 for completeness sake, but this is mostly a different patch.
Comment #35
mikelutzThis still needs tests, so back to NW.
In my opinion, the configurable exception is a bit much here. The skip process exception almost sounds reasonable, but maybe you want to do something else in the process on failure, like provide a default. What really should happen here is the plugin should simply return null, and then let the migration decide what to do, whether that's a skip_on_empty, method: row, or a default_value, or whatever. Of course, for BC reasons we can't do this in these plugins, and there's not an easy way to change that behavior in a backwards compatible way.
So.. no core migrations call download, everything calls file_copy. file copy uses download if it detects a non-local uri, but in practice for me, and in core, I never use download directly, as file_copy covers both the download and local copy use cases.
What I would like to see here is a new process plugin, call it file_transfer, or file_move, retrieve_file, get_file, something like that. The new plugin should pull all the logic in from file_copy and download, but just return NULL if it can't successfully copy the file. Core migrations that use file_copy should use the new plugin, and follow it with a skip_on_empty method: row, which isn't exactly the same behavior as the migrate exception thrown now, but it's generally better, the whole point of this is that it's buggy to fail the whole migration on a failure to download a file from one row. And file_copy and download should both be deprecated in favor of the new plugin. This leaves the responsibility of dealing with what to do on a failed copy to the migration itself, with a default of setting the result to NULL like a skip process would, but still allowing the migration to deal with that null in further plugins in the process chain. Yes, if you do nothing, and leave the NULL, then the file:entity destination will complain, but that's not really the process plugin's business. It has no idea what your destination is, or what you are doing with the file, or anything like that. Only the actual migration is qualified to know if it needs to add special handling for the NULL.
Comment #38
donquixote CreditAttribution: donquixote at European Commission and European Union Institutions, Agencies and Bodies commented@mikelutz (#18)
For the record.
The difference is that with MigrateSkipRowException, the migration still counts as "successful" for the sake of dependent migrations.
With MigrateException, the entire migration counts as failed, which blocks dependent translations.