Problem/Motivation
I am trying to migrate drupal 7.56 to 8.4.2 using migrate_drupal and migrate_drupal_ui web interface.
The instructions for file path seems to be misleading.
It is asking to enter "Public file directory" and "Private file directory". The variables seems to be interchanged i.e., the Public_file_path value is being used for importing private files. with this, Private files are getting imported.
Public file path is becoming "private_file_path/sites/default/files..." and public file import is being failed.
If both the values are entered, i.e., public_file_directory for public and private_file_directory for private, the values are becoming "file://public_file_path///private_file_path".
Proposed resolution
Change the ReviewForm so that both the source_base_path and the source_private_path are added to the batch configuration.
Then in MigrateUpgradeImportBatch, for all file migrations, set the source_base_path constant of the source plugin. Use the scheme property of the source plugin, $definition['source']['scheme'], to select the public or private base path entered by the user on the CredentialForm.
Add a new functional test, FilePathTest, to confirm the proposed resolution truly works.
Remaining tasks
- Change record
- Release notes snippet
User interface changes
N/A
API changes
Data model changes
Release notes snippet
When Drupal 7-to-Drupal migrations are run through the user interface, the 'Private file directory' is now used correctly as the base source path for the private files.
| Comment | File | Size | Author |
|---|---|---|---|
| #105 | 2925899-105.patch | 14.16 KB | quietone |
| #105 | interdiff-102-105.txt | 1.22 KB | quietone |
| #102 | 2925899-102.patch | 14.1 KB | quietone |
| #102 | diff-94-102.txt | 1.56 KB | quietone |
| #94 | 2925899-94.patch | 14.15 KB | quietone |
Comments
Comment #2
quietone commentedAdding tag
Comment #3
mpp commentedIn Drupal\migrate_drupal_ui\Form\CredentialForm you can see that there's a
source_base_pathand asource_private_file_path:In Drupal\migrate_drupal_ui\Batch\MigrateUpgradeImportBatch the source_base_path constant is set for the file migrations:
Note that in Drupal\file\Plugin\migrate\source\d7\File it gets the
file_private_pathvariable from the database.If both source_base_path & file_private_path are available they will indeed be concatenated...
Comment #4
wim leersMore than a year later and this code is still in Drupal core:
Funny enough, #2804611: Migrate sources and destinations need a way to get their requirements was closed, but the comment is still there. So, reopened it.
But this explanation in #3 is inaccurate:
They won't be concatenated. They'll be overwritten. This means you effectively can specify either
$config['source_base_path']or$config['source_private_file_path']. If you specify both (which\Drupal\migrate_drupal_ui\Form\CredentialFormtotally allows you to do!),$config['source_base_path']always wins, meaning that whatever private files directory you entered, it'll just be ignored/forgotten.The code is basically doing:
Comment #5
wim leersComment #6
quietone commentedAnd the ReviewForm only adds the source_base_path to the batch. It should add the source_private_file_path too. That way both paths are available to MigrateUpgradeImportBatch::run() and the configuration for the D7 source plugins will be set correctly.
Comment #7
quietone commentedLet's try that again.
Comment #8
wim leersAh yes, that too. I noticed it but forgot to report it.
That means that private files migration could never have worked for sites that stored the private files outside the Drupal root!
Comment #9
mikelutzSetting to NW for tests. Tests will help us make sure the fix is valid.
Comment #11
quietone commentedComment #12
quietone commentedStarting over here. A new patch with a fail patch. Unfortunately, the fail patch doesn't actually fail and show the error. Turns out that this particular arrangement of source directories and path names work. If the private source directory is moved it will fail. I work on a new patch later.
Comment #14
quietone commentedThose patches contain changes to Upgrade7Test that I was considering and should not have been in the patch. So, removing all those.
Comment #17
quietone commentedGot it wrong again, the path returned in getSourcePrivateFilePath() should be /files.
Comment #19
quietone commentedDidn't know there were any file migrations that didn't use the 'scheme' property in the source. And there is, d6_user_picture_file.yml. This adds a check that the theme property exists.
Comment #20
quietone commentedThis still needs a proper test.
Comment #21
quietone commentedI have started over and rewrote the test and so haven't included an interdiff. This was working locally with the smaller test fixture I created (that is in the earlier patch) but this now uses drupal7.php.
Edit: fix grammar
Comment #23
quietone commentedSome cleanup.
Comment #24
quietone commentedAccidentally, left in the smaller test fixture I used for local testing.
Comment #25
quietone commentedFrom the IS:
I wasn't able to replicate this particular problem during testing and I am not sure how it would happen. @raveendrab, do you recall the paths that you were using?
Comment #26
benjifisherIMO the underlying problem here is that these lines of code (after applying the patch) are doing two two things:
/.If we straighten out the code so that we do (1) and then do (2), then the code will be easier to follow and it will be less prone to mistakes in the future. Something like this:
You might even use a ternary operator instead of an
ifblock. (But not everyone likes ternary as much as I do.)While you are at it, can you also improve the first code comment? The phrase "avoid this" is not very explicit, and the code that follows it has changed over time. (At least, it is changing in this issue). IIUC, the idea is that the destination plugin needs the configuration
source_base_pathand we should have a better way to supply it.Finally, what goes into the
if (...)is very important, and is one of the things that this patch changes. Can you say a few words about why$definition['source']['scheme']is the right thing to check? Either a comment here or an edit to the Proposed Resolution will help both me and the next reviewer.Short version:
@todocomment.$definition['source']['scheme']is the right thing to check.While you do that, I will start to look at the tests.
P.S. The expression
rtrim($base_path, '/') . '/'is simple enough that you can consider dropping the related code comment.Comment #27
benjifisherOops, I forgot to update the issue status.
Comment #28
benjifisherSorry for the piecemeal review. I should really get some sleep. Maybe I am confused.
I searched the codebase for
getSourcePrivateFilePath. There are three matches, all introduced in this patch:Since
FilePathTestdoes not extendMigrateUpgradeExecuteTestBase, it does not need that empty function. Since the abstract base class uses the function, I think it should declareso that non-abstract child classes are required to implement it.
Comment #29
quietone commentedNo worries about the piecemeal review.
#26. All points addressed, code block improved, comment improved and IS updated.
#28. Yes, the tests are messy, and I am glad I have finally made a push on #2974445: [Meta] Refactor Migrate Drupal UI Functional tests and made several child issues, all ready for review. I have moved the abstract method as suggested and renamed it to getSourcePrivateBasePath to better match getSourceBasePath. It will eventually be moved to MigrateUpgradeTestBase, if the refactoring goes as planned but if it is moved there now it will introduce unnecessary noise here.
Oh, and added comments to the test.
Comment #30
quietone commentedSmall inmprovement to the @todo comment.
Comment #31
benjifisher@quietone:
Thanks for pointing out #2974445: [Meta] Refactor Migrate Drupal UI Functional tests. I look forward to working on that with you! Knowing about that plan, I will not comment here on things in MigrateUpgradeExecuteTestBase.
I apologize for the half-baked suggestion in #28. I am going to ask for a different approach in this review.
getSourcePrivateBasePath()abstract in the base class, let's provide a default implementation, returning NULL. You can do that with an empty function, but I would use an explicitreturn NULL;. For the one-line description, I would mention where this function is used, since that explains why NULL is a reasonable value: something like "Provides source_private_file_path for the review form." or "Provides the source base path for private files for the review form." Or is it the credential form?Why the extra spaces before "Find"?
These test classes extend BrowserTestBase, so I think they should use
$this->container. Here,$this->fs = $this->container->get('file_system'). There are a few other cases (but see next comment).This one should be removed completely, since
$this->fsis already defined insetUp().I always get nervous when I see
.*in a regular expression. In a test, I think we should be more explicit about our expectations. How about (untested)instead?
Since this class does not extend MigrateUpgradeExecuteTestBase, that method will never be used.
FilePathTest::testFilePath(), but most of them are copied fromMigrateUpgradeExecuteTestBase::testMigrateUpgradeExecute(). For example,If the plan is to remove that base class, does it make sense to add comments to FilePathTest or to keep the code as similar as possible for when we restructure?
Both of these can be simplified with
?:.I am still working on understanding how the updated tests are split between Upgrade7Test and FilePathTest. I may request further changes as I figure that out.
Comment #32
quietone commentedThx, it will be nice to work together on the refactoring. I am keen to have that done and save myself and reviewers/committers time in the future.
1. Fixed
2. The spacing follows the example at API documentation and comment standards#todo
3. I was about to do that but then I read #2066993: Use magic methods to sync container property to \Drupal::getContainer in functional tests which doesn't seem to have a definitive preference. And considering that \Drupal::service() is used in 13 places in the functional tests I am reluctant to change that. If this needs to be done let's do it in a separate issue and convert them all at once.
4. Fixed
5. Fixed
6. Right, fixed.
7. Yes, the throws aren't needed and are removed.
8. I am in the camp of keeping this as similar as possible for when we restructure.
9. Fixed
And removed unused getFilenames from Upgrade7Test
Comment #33
benjifisherSorry for the nit-picking, but if we fix these now then we can save a core committer the trouble of pointing them out.
I will try to keep the same numbering as Commennts #31 and #32.
Now that we have a default implementation of
getSourcePrivateBasePath()in the base class, we do not need any of these empty functions.I see: the extra spaces are to make it look better in PHPStorm. When I have nothing better to do, I will see about doing the same in Vim.
Do not wrap "with" to the next line.
I would say that #2066993: Use magic methods to sync container property to \Drupal::getContainer in functional tests needs an issue summary update. I just read the summary and the two comments by @alexpott at the end. And then a couple of others. I also see three instances, in the
migrate_drupal_uifunctional tests, of$this->container->get(). I agree: deciding what to do (the hard part) and implementing it (the easy part) are out of scope for this issue.Thanks.
I forgot to include "temporary". Thanks for remembering it. I suggested
assertTrue()for the result ofpreg_match(). Any reason not to? If the match fails, then$matchesis set to an empty array and PHP issues a notice on the next line.Thanks.
Since we have removed the explicit
throwstatements, do we still need this annotation?OK, I will not worry about things copied from
MigrateUpgradeExecuteTestBase::testMigrateUpgradeExecute()for now. Don’t worry, be hapy!Thanks
I checked the API docs for @return and did not see any exception for NULL, so I think it should be
@return string|null.This comment is not nit-picking: I think it is my most important note on this issue. We are changing an existing test, which is always a warning sign.
I have been doing migrations for years. Some involved private files. I read the docs, and maybe I decided that the code was too complicated to figure out what was going on. By trial and error, with a little help from
drush mmsg, I figured out that I had to put the private files in a particular place and I could skip that field on the credential form. In short, I learned to behave like the existing test.After this patch, the test has to change and so do I. If I want to keep everything else the same, all of a sudden I have to enter the same thing in both text fields on the credential form. It is great for someone else who has never done this before that the form makes sense, but it makes life hard for someone who has gotten it to work.
It seems to me that we have two choices. One choice is to publish a change record and add an item to the release notes telling people that they have to pay attention to the form field that hey have learned to ignore. The other choice is to use
source_base_pathas a fallback whensource_private_file_pathis not set, preserving the current behavior. The first choice is disruptive, and the second choice is technical debt.I do not like these descriptions. Maybe "The file system helper class" or "The file system service"? Also, shouldn’t we use the interface instead of the class for the type declaration? We should definitely not use the same description for both
$basePathand$variable. Can we add some information about keys and/or values?Naming things is famously hard. Can we come up with something instead of
$basePathand$variablethat suggests that one is for the files used on the source site and the other is where the files are located when we perform the migration … and that otherwise these arrays are analogous? Perhaps$sourceDirectoryand$localDirectory, or we could usePathorBasePathinstead ofDirectory.I think we can simplify a little. If you do not like the
array_flip()part, you can leave it out, but still eliminate the$tempstore_privateand$all_migrationsvariables.I believe that keys in the inner array of a data provider can be omitted. If you use string keys that match the variable names, does that override the order? Most data providers I see just use indexed arrays.
On the other hand, keys for the outer array are shown to the user when the test fails, unlike code comments. So how about something like this?
Maybe it would be clearer to use
$filepath = implode('/', [...]);, with a multi-line array. Then we would not need the intermediate variables$pathand$base_path.Comment #34
pavnish commentedWorking on it
Comment #35
benjifisher@pavnish:
Thanks for working in this issue. How is it going?
I think that @quietone is ready to update the patch based on my review in #33, but will wait because you assigned this issue to yourself. If you are not actively working on this issue, then please post a patch with the work you have done so far and un-assign the issue. If you have questions about the issue, then please ask them here or on the #migration channel in Drupal Slack.
Comment #36
pavnish commented@benjifisher Thanks for comment i have just started . if some one is also worked on then no issue i would be very thankful @quietone you have worked on.
So i am un-assign the issue i was busy in some other task sorry for delay in this.
Thanks
Pavnish
Comment #37
quietone commentedAddressing review from 33.
1. Yes, of course.
2. Line wrap fixed. I don't see how the spaces are related to PhpStorm.
3. And I don't think we will need a follow up because it will covered when #2066993: Use magic methods to sync container property to \Drupal::getContainer in functional tests is resolved.
4. Done
5. I just don't think it is needed even though it doesn't hurt. It is back, sort of, preg_match return 1 on success not TRUE.
6. Fixed.
7. Fixed.
8. OK.
9. OK.
10. Fixed.
11. Since this is a bug I think we should not preserve the existing broken behaviour. Let's add a change record.
12. TODO. I deliberately chose 'variable' because these are the values that are put in the source database variable table for the file schemas. I've added a comment but am still thinking about names.
13. Agreed. Another thing I meant to cleanup but didn't get back to.
14. The keys are there for clarity but I will try without them.
15. Changed.
Comment #38
benjifisher@quietone:
Thanks. This addresses almost all the points from #33.
Follow-up to #33.5:
Remove the commented-out line of code. While you are at it, maybe eliminate the
$filesvariable by putting$this->getManagedFiles()insidefor (). I am not sure that I like the blank line in the middle of the loop, but it is up to you.You listed #33.12 as TODO.
The main point is #33.11. In #33, I described the fallback as technical debt, but it takes only a few lines. I reverted the change to Upgrade7Test and use this in MigrateUpgradeImportBatch.php:
I think our BC policy requires more than just a change record and release notes. Since this is a UI module, I think it is enough to show a message (warning?) to the user when the form is submitted without a value for
source_private_file_path. Is there any way to tell when the credential form is submitted whether there are any private files? I guess so: we just have to query the source database, looking for private files. The other question is how to make sure we remove the fallback for Drupal 10. Finally, do we need to add test coverage for the fallback? If so, that might be the right place to add@trigger_error()to answer the previous question.Comment #39
quietone commented1. All fixed. Looks fine without the blank line in the middle.
2. Yes, I wanted to review the naming. I am finally with you on using $localDirectory but instead of $sourceDirectory it is now $sourceFileScheme. I really want to keep the clear connection to the scheme because it is frequently used in the code when manipulating the file names. And I expanded the doc block for makeFiles().
3. In general BC gives me a headache and since I woke up with a real headache today, this will have to wait.
Comment #40
heddnI suggest for 33.11 to add a CR and do the right thing.
Comment #41
benjifisherBased on #40 and the similar comments in #3146005: [meeting] Migrate Meeting 2020-06-11, we do not need the fallback I suggested in #33.11, #38.3.
I reviewed the patch in #39, and it looks great. It addresses the remaining points from #38.
I am setting the status to NW for a CR and a release-notes snippet. I am listing those two things in the IS under "Remaining tasks".
Thanks especially for this:
It is often helpful to have a little more explanation in a docblock.
Can you add a follow-up issue or two?
First, I think it would be helpful to have a shorter version of the explanation above on the credential form. The current form suggests (at least to me) that the files will be directly under
bar/, when they are actually underbar/sites/default/files/.Second, in MigrateUpgradeImportBatch.php we now have
When I run file migrations and there are missing files, the message often lists a path including
//, and I think one of those characters comes from this line. Do we need to add a trailing/here?Comment #42
quietone commented@benjifisher, I thought you would like that.
Two follow ups made, #3151360: Improve description for file paths on the CredentialFrom and #3151363: Double // in file paths. A change record and a release note snippet added.
Back to NR!
Comment #43
benjifisherThanks for the follow-ups and the CR. I commented on #3151363: Double // in file paths and edited the CR. I already approved the patch in #39, so this issue is RTBC.
If #3143719: Add a getCredentials helper method to migrate_drupal_ui functional tests is committed first, then this issue will need an update, and vice versa.
Comment #44
alexpottI've been discussing the CR with @benjifisher. I’m not sure about the CR and the BC concerns here. I think the private file source field being ignored is an important usability bug to fix and data consistency bug - as this can result in missing files after migration. I don’t think we should publish the CR and I think we should backport this to 8.9.x.
But I think what @benjifisher is pointing out is valuable. If you are mid-migration then this change is going to break your migration. In these casaes we need to provide an upgrade path so that the in progress migration gets updated to have source_private_file_path that makes the public path.
Comment #45
benjifisherTruism: an undocumented feature is a bug.
If you have an undocumented feature, then you can either change the behavior to match the documentation or else you can change the documentation to match the behavior. Either one counts as fixing the bug. In other words, document the feature and it will no longer be a bug.
Maybe we should make the change I suggested in #38.3: if the path for private files is not specified, then use the path for public files as a fallback.
sites/default/files/. Having gone to all that trouble, I typically recreate the docroot structure of the source site, so I would end up entering the same thing for "Private files directory" as for "Pubic files directory".Aside: I know that it is a bad idea to have the private files under the docroot, but at least one major Drupal host (Pantheon) does that, mounting a different partition at that path in order to prevent nginx from serving the the private files.
Comment #46
quietone commentedA collection of thoughts regarding #44 and #45.
The suggested change in #38.3 concerns me because is keeps the broken behavior for new migrations.
Please suggest text for the note to be added to the description of the "Private files directory" field. #45.
As I understand this we want to maintain the broken/legacy behavior only in the case where a migration has been run before this patch and incrementals are being run and the private file path field on the credential form is empty. If that is correct then;
Comment #47
benjifisherI do not think so. If
source_private_file_pathis set, then it is used. We only fall back tosource_base_pathwhensource_private_file_pathis not set.With the change I suggested in #38.3, the original Upgrade7Test passed. I think, at the time, I also tried the new tests and they also passed.
I do not think we need to have different behavior depending on whether migrations have been run before. Just use the legacy behavior as a fallback if
source_private_file_pathis not set.The current help text is
We could change the second one to
I wish I could think of something shorter. I am not happy with all the repeated text. I considered adding the new text before the exixting help text, but I think it is important to see that an external URL is not allowed here.
Comment #48
quietone commentedI agree that functionally it probably doesn't matter, at least I can't think of a case where it does. But it does matter if we are going to trigger an error, which I would think we only want to do when the use case is someone wanting to maintain the broken behavior.
Comment #49
quietone commented'Twas in need of a reroll.
Comment #51
quietone commentedYes, now that there is a getCredentials() method in MigrateUpgradeTestBase the new getSourcePrivateBasePath() needs to be in that file as well.
Comment #52
quietone commentedbenjifisher and I have been having a Slack discussion about whether this is a bug or a feature. We just had a video call and it clarified my thinking on this and confirms that my lack of experience in using these tools I help write can be a hindrance to building those same tools! The result is that I agree with a) keeping the current behavior where, if the private file path is not entered it is set to the public file path and b) update the help/description text for those fields. The fix for 'a' is already in the patch. This patch adds the new text to the form.
And this should address alexpott's concerns in #44
Comment #53
quietone commentedIt helps to upload the patch.
Comment #55
quietone commentedUnrelated test failure, Drupal\Tests\media\FunctionalJavascript\CKEditorIntegrationTest #3037436: [random test failure] Make QuickEditIntegrationTest more robust and fail proof
Retesting
Comment #56
benjifisherI thought what we agreed to was to add the snippet from #38.3 (or an equivalent using the ternary operator) here:
That is, fall back to using
source_base_pathifsource_private_file_pathis empty. Explicitly:Or we could do it in
ReviewForm::submitForm()As I said in #33.11, changing an existing test is a warning sign. Once we do (1), we can revert this change:
Comment #57
quietone commentedOh, heavens! I completely messed that up didn't I. (It has been like that for the past week).
56.1 Kept the change in MigrateUpgadeImportPatch. We already have a change to check the scheme property and a comment, just seems right to keep it all together.
56.2. Removed.
@benjifisher, sorry for the extra work I've caused.
Comment #59
quietone commentedEek, the silly mistakes continue. This should fix 56.2
Comment #61
quietone commentedLet's try that again.
Comment #62
quietone commentedFinally!
I'm uploading a diff against the patch in #51 which is the latest patch before benjifished and I had chatted. It is the one I should have fixed just once and messed up many times. I hope it helps.
Comment #63
alexpottI'd love to know how a different directory works here? Both private and public are asking for the same thing - it's just the private doesn't support the url downloading - cause private files.
Why do we have the separate settings?
Comment #64
benjifisher@quietone: We are still jinxed.
From the 51-56 interdiff:
This is the base class, so we cannot use
{@inheritdoc}here.@alexpott:
I think that is essentially the same question you asked in #3151360-12: Improve description for file paths on the CredentialFrom, which led me to add #3159217: Be more flexible in where files are staged for migration.
Comment #65
quietone commentedOh, not again! This needed a reroll anyway because #3151360: Improve description for file paths on the CredentialFrom was committed. This patch has both the reroll and the fix for #64. I did them both at the same time since the reroll was a one line change.
Comment #66
benjifisher@quietone:
Be careful of false matches when rerolling! This is not the strangest effect I have seen:
What is your process for rerolling patches? If you use
git mergeorgit rebase, do you use the-X patienceoption?Comment #67
quietone commented@benjifisher, I commend your patience.
Comment #68
benjifisher@quietone:
Thanks!
I like the patch in #67: RTBC. I hope the testbot agrees.
Comment #69
alexpottSorry getting stuck on this point but testing different values of the private and public paths is odd when they are asking for the same thing. I'd think we should consider hiding the private file thing from the UI until we fix and test it properly. And state in the other file that to migrate private files this has to be a path and not a URL.
Then in a follow-up add support for a variety of solutions. The private file path can be relative to web root ie. something like '../private' or absolute eg. '/home/kieran/subdomains/u2/u2-private-files' - see https://www.drupal.org/docs/8/modules/skilling/installation/set-up-a-pri... - these values were valid on D7 too.
At the moment the UI is suggesting things that are not supported and this fix makes it looks like it works but I'm not sure it does. I think we only support a private file path that's relative to your web root. So asking for your web root twice is?!!??! And if you provided http://mysite.com for the public path and then the real path to document root for the private path your public file migrations are going to go much slower than your private ones!
Comment #70
quietone commented@alexpott, what needs to be done to 'test it properly'? Is there something wrong/missing in the new FilePathTest?
There is no reason that having '/home ...' and '../private' as the Drupal 7 'file_private_path' would fail. If the 'file_private_path' was '/home/kieran/subdomains/u2/u2-private-files' and the user entered '/d7files' for the 'Document root for private files' then the files would be migrated from '/d7files/home/kieran/subdomains/u2/u2-private-files' to the destination private file directory. And it was '../private' and they entere '/d7files' then the files would be migrated from '/d7files/../private'.
Comment #71
alexpott/d7files/home/kieran/subdomains/u2/u2-private-fileswould be an error because the absolute path is/home/kieran/subdomains/u2/u2-private-filesComment #72
quietone commentedIt would not be an error. /d7files/home/kieran/subdomains/u2/u2-private-files would be the path where I copied the private files from from the source site to somewhere accessible to the destination site for the migration.
Comment #73
quietone commentedOn Slack I said I would show some examples, these are not intended to be a complete set (been a busy day). These use the private files paths used on the examples on Set up a private file path and show what is entered on the credential form and the directory where the files will be migrated from.
I agree with benjifisher' s comment on Slack that this patch correctly fixes the problem stated in the IS. Other improvements can be made incrementally in follow ups. Perhaps a follow up to add a handbook page explaining the use of those fields, with examples, of course.
While preparing this I discovered that \Drupal\Tests\file\Kernel\Plugin\migrate\source\d7\FileTest does not test the filepath value, which is should. I'm not sure it is worth making a new issue for that.
Edit: Replaced table to show path for a single file and fix errors.
Edit: add 'concat' detail to header
Comment #74
quietone commentedI did to add an issue to fix d7\FileTest, #3159849: Add missing tests of filepath to FileTest and added as a related issue.
Now, the table above has been replaced to fix an error and changed to show the full path that d7_private_file.yml will use to access an example file, cube.jpeg.
The first three examples, with the d7 source files are at '/home/kieran/subdomains/u2/u2-private-files', show the source path used based on different values entered on the form. The 1st and 3rd examples will work just fine, assuming the destination site has read access to '/home/kieran/subdomains/u2/u2-private-file'. It is the 2nd example that won't. This happens because the value entered on the form is replaced with NULL at one stage. And replacing all the '/' is problem, to say the least. This can easily be handled by avoiding the replacement with the entered value contains only '/' characters. That is suitable for a followup.
The 4th and 5th example show results when the file_private_path is relative. Again, those will work.
I hope that is enough to show that the UI will migrate files in a variety of cases, well specifically the ones alexpott referred to on Slack. What we are left with is a not so clear UI and a lack of documentation, particularly for the non -developer using /upgrade. I think the discussion on both these can continue in #3159217: Be more flexible in where files are staged for migration.
Comment #75
quietone commentedAdded follow up to not remove all slash characters #3160015: Don't remove all slashes from filepath in file.php.
The related issue #3159849: Add missing tests of filepath to FileTest didn't get added so trying again.
Comment #76
quietone commentedIn Slack, benjifisher said that if I think I have answered alexpott's concerns in #69 and #71 then I should set this back to RTBC. That seems like the right thing to do here. I do think I have answered his questions so setting back to RTBC.
Comment #78
benjifisherUnrelated failure: Drupal\BuildTests\Framework\Tests\BuildTestTest
Comment #79
alexpottGoing to try to explain the problem I'm seeing again. Using absolute file paths for private files is quite common - it's in the example documentation.
In #70 it says that if the user enters
/d7files<code> for the 'Document root for private files' and the D7 site has a private filepath of <code>/home/kieran/subdomains/u2/u2-private-filesthen the files would be migrated from/d7files/home/kieran/subdomains/u2/u2-private-filesHowever in #73 it says if the user enters
/home/kieran/subdomains/u2/u2-private-filesand the private file path is/home/kieran/subdomains/u2/u2-private-filesthen the files will be migrated from/home/kieran/subdomains/u2/u2-private-files//cube.jpegbut the d7file plugin has:So I think we're going to be looking in
/home/kieran/subdomains/u2/u2-private-files/home/kieran/subdomains/u2/u2-private-fileswhich is completely wrong.I think this is part of the problem here. We have two field on the migration form asking for the same thing - the base path to the site. In HEAD we only use one of them because - it is the same thing. Whereas actually we making everything way too difficult. For private files we should be asking for a path to the private files and not adding in any other directory. The user knows the path to the files - often they've had to copy them over to the server. Concatenating the paths makes this way more complex than it needs to be.
Furthermore the validation of these paths does:
So we're not even using the source path + the public or private path that we're going to use in the migration to validate the paths - again making it harder for the user.
Comment #80
quietone commented@alexpott, Have you looked at the file source plugins? They replace the scheme with the file path variable then strip out the value entered by the user and that the duplication you are referring to.
The two field are asking for two different values, putting aside any confusion the help text may be giving to the user. One is asking for a path, or url, to the documents (minus the path variable) to the public files and the other one is asking for the same but for the private files. HEAD only uses one of those because of a bug, it was not by design.
Can't say that I looked at validatePaths() before but it seems reasonable to test that migrate has access to the base directory for the files.
Comment #81
quietone commentedalexpott and I had a chat about this the other night. We agree that there are problems and improvements that should be made to the form, the file validation, and the file path processing. I think that work is for followups and this stays within scope fixing the bug that the private file path entered by the user is ignored. The existing followup issue that I couldn't find during the call is #3159217: Be more flexible in where files are staged for migration, now added as a related issue. The stumbling block here, if I recall correctly, is to be sure that this change is not making things worse for the user.
During our conversation I could not see how using a field value correctly could be worse than ignoring it. But then maybe something was missed. So, I took my time reviewing. I double checked the before and after paths and everything works as expected, see table below. The UI has more information for the user, almost always a good improvement. So, what could be making this worse for a user?
Anyone already entering a value in the private file field has probably figured out the value is ignored. Or perhaps someone is entering the same value for both. If they are, they now have a friendly message telling them that they no longer have to enter the value twice. That is another improvement.
I've yet to find a reason this patch makes the existing situation worse. Instead, it is a small step in the right direction.
Comment #82
wim leersHere's a port of #67 to
9.0.x. It differs inMigrateUpgradeExecuteTestBase::testMigrateUpgradeExecute()andMigrateUpgradeTestBase.Comment #83
mikelutzAfter reviewing, I'm kicking this back to rtbc in light of #81 sounding like @quietone and @alexpott have an understanding, and if not, at least it will get back in front of @alexpott who can say otherwise. It looks like the consensus is that this patch is an improvement and further improvements are coming in follow-ups, and it looks like those follow-ups are filed, so hopefully this can be committed.
Comment #84
mikelutzLooks like it doesn't apply to 9.1.
Comment #85
vsujeetkumar commentedRe-roll patch given for 9.1.x.
Comment #87
mikelutzComment #89
mikelutzComment #90
alexpottThe reroll in #85 doesn't look quite right - seems to be missing things.
Comment #91
ravi.shankar commentedWorking on this to add reroll.
Comment #92
ravi.shankar commentedHere I have tried to add reroll of patch #82.
Comment #94
quietone commentedA bit strange to port a patch to 9.1.x when that patch was a port of a 9.1.x patch to 9.0.x. This patch is based on #92. I've also added the missing diff for the patch in #92.
Comment #96
quietone commentedComment #97
damienmckennaRelated: #3183219: File migration plugin's "source_base_path" problems when using ddev
Comment #98
wim leersComment #99
wim leersComment #100
benjifisherFor some reason, the testbot has not been re-testing the patches in #94 even though this issue is RTBC. I triggered a test against 9.1, and the new failed with this message:
Comment #101
wim leersSounds like a PHPUnit infrastructure problem 😬
Comment #102
quietone commentedThis needs a reroll probably because of #3151363: Double // in file paths.
Comment #103
benjifisherThe reroll looks good. In addition to the update for #3151363: Double // in file paths, I see that you have also updated the patch in line with #3168375: Convert calls to drupalPostForm(NULL, ...) to submitForm. :+1:
RTBC based on#99.
Comment #104
larowlanThis looks good to me, only a couple of suggestions that could be taken/left.
We're in freeze for a security release window, so will revisit this again later in the week.
nit: We can type-hint these as
stringnownit: temporary files are in a sub-dir here too right?
suggestion: we could use assertRegExp hereEdit: no we can't because we're using $matches
Comment #105
quietone commented@larowlan, thanks for the review.
This patch fixes #104.1 and 2.
I thought I had already tagged this for bugsmash... Done now.
Comment #106
quietone commentedI asked in #bugsmash for a review and larowlan replied that I could self RTBC (I wasn't sure). So, back to RTBC
Comment #107
larowlanThis looks good to me, going to ping alexpott for one more look
Comment #108
alexpottCommitted 9aec686 and pushed to 9.2.x. Thanks!
We need to new patch for 9.1.x if we're going to backport it. But given there are string changes perhaps this should be a 9.2.x only fix.
Comment #109
benjifisherThanks for fixing this issue!
Now that it works as designed, we can think about making the system easier to use in #3159217: Be more flexible in where files are staged for migration.
Comment #111
damienmckennaFYI this appears to break file migrations handled by Migrate Upgrade. I'd open an issue for that module only its issue queue was moved off d.o.
Comment #112
benjifisher@DamienMcKenna, if you post some more details here then I will open an issue for Migrate Upgrade.
Thanks for the early testing. We should have time to fix this before Drupal 9.2.0 is released.
Comment #113
alexpott@DamienMcKenna / @benjifisher - there is an option to revert this and commit again after fixing the BC break.
Comment #114
benjifisher@alexpott:
I would rather not revert the commit for this issue.
I opened https://gitlab.com/drupalspoons/migrate_upgrade/-/issues/36. @DamienMcKenna, can you add some details there?
Comment #115
damienmckennaI'm rerunning my site migration with the patch to confirm whether it works or if the problem was PEBCAK..
Comment #116
damienmckennaYes, it seems the problem I ran into was PEBCAK, so there's nothing to see here, sorry for the noise.
Comment #117
benjifisherI would rather have early testing with some noise than no testing before the next release!
Comment #119
quietone commentedPublish CR