Problem/Motivation
When the d6_file migration is configured to fetch the files from a remote URI (i.e., directly from the public directory of the legacy site), the files are not copied, although file_managed is populated as if they were. The only hint that something is wrong is filesize warnings (attempts to stat the files where they should have been copied): Warning: filesize(): stat failed for public://u3/NOLA 20080427 2.jpg in Drupal\file\Entity\File->preSave() (line 198 of /Users/mryan/Sites/d8/core/modules/file/src/Entity/File.php)
The problem is in EntityFile::import(), which uses drupal_realpath() to decide whether the source and destination files are identical and skips copying the file if so - as documented, drupal_realpath "does not work for remote URIs". This was introduced in #2405337: EntityFile destination doesn't handle temporary files when the source and destination are the same.
Proposed resolution
Possible solutions:
- Only apply the drupal_realpath() test for temporary files (the motivation for this change).
- Identify remote source files and skip the test in those cases.
- ...
Remaining tasks
- Submit a failing test case.
- Choose and implement a solution.
User interface changes
None.
API changes
None.
Original report by @mikeryan
File migration worked at one point, but no longer - although it completes without error and the file_managed table is populated with valid-looking uri values like public://u3/NOLA 20080427 2.jpg, the files are not actually copied (the u3 directory is not even created). Need to investigate whether there was a change in the core file migration we need to react to...
Comment | File | Size | Author |
---|---|---|---|
#13 | interdiff.txt | 3.81 KB | benjy |
#13 | 2414365-13.patch | 7.11 KB | benjy |
#11 | interdiff.txt | 2.5 KB | benjy |
#11 | 2414365-11.patch | 7.18 KB | benjy |
#8 | interdiff.txt | 1.49 KB | benjy |
Comments
Comment #1
mikeryanComment #2
mikeryanComment #3
benjy CreditAttribution: benjy commentedYes I think we should try detect remote files.
Also for testing, I think it's time to introduce some unit tests for the file destination. It's one of our more complicate destinations and it's hard to get different sets of test data to test using the file migration. I think need to test public, private files using relative, absolute and remote paths plus tmp files with relative and absolute paths.
I'll take a stab at this in the next day or so unless you beat me to it.
Comment #4
mikeryanStarted digging in and thought the same thing - actually, I'd like to see unit tests in general for the destinations in the migrate module and not rely so heavily on migrate_drupal to do all the testing, but this in particular needs much more coverage. Spun my wheels a bit, not having done that much unit testing (not recently anyway), so it'd be great if you get it started and I can help add test cases.
Thanks.
Comment #5
benjy CreditAttribution: benjy commentedOK, I started this off. Few things to note:
Added a couple of stub methods that should be filled in. Also, since we're fixing up the EntityFile destination, I wonder if we should remove the wrong assumption that temporary files are absolute at the same time?
Comment #6
benjy CreditAttribution: benjy commentedOK, reproduced the bug in this issue, the key part is having a folder that doesn't exist that needs to be created. Failing patch attached.
Comment #7
benjy CreditAttribution: benjy commentedOK, reproduced the bug in this issue, the key part is having a folder that doesn't exist that needs to be created. Failing patch attached.
Comment #8
benjy CreditAttribution: benjy commentedMoving the prepare directory code to the top fixes the issue here.
Comment #9
benjy CreditAttribution: benjy commentedOpened #2417975: EntityFile destination incorrectly assumes temporary files are absolute paths to handle the wrong assumptions with temporary files after this issue goes in with the new tests.
Comment #11
benjy CreditAttribution: benjy commentedOK some final clean-up.
I discussed this issue with @chx, he wanted to use PHPUnit. We can't do that right now because of the use of drupal_realpath, file_prepare_directory and drupal_dirname in the EntityFile destination.
The more I look, I don't even know why Drupal needs custom implementations of realpath anyway, it should just work if the stream wrapper is registered with PHP?Because PHP's implementation can't support remote stream wrappers with these methods.I opened #2417987: EntityFile needs better support for remote destinations which will take us closer to using PHPUnit by removing drupal_realpath.
Comment #12
chx CreditAttribution: chx commentedRealpath for remote/virtual makes not a lot of sense and PHP instead of a passthrough decided to not support at all and we decided (very , very long ago #517814: File API Stream Wrapper Conversion ) to add drupal_realpath and a realpath to the stream wrapper. It was meant to be a crutch for D6->D7 conversion but it lingered oh well. Anyways, this is good as it is.
Comment #13
benjy CreditAttribution: benjy commentedI changed the test so that $source_base_path is coming from the data provider similar to how it does from the FileUri process plugin in a real migration. This is going to mike life easier for #2417975: EntityFile destination incorrectly assumes temporary files are absolute paths which needs to test relative and absolute paths.
Comment #14
chx CreditAttribution: chx commentedEven better!
Comment #16
alexpottMigration is not subject to beta evaluation. Committed 06aff1a and pushed to 8.0.x. Thanks!
Minor fixes on commit.