While trying to move the user.module's user picture into the managed file framework I came across an oversight in the Files API. When you use FILE_EXISTS_REPLACE to overwrite an existing, managed, file using file_copy(), file_move() or file_save_data() the target file's database record isn't affected.
The case of file_copy() is relatively straightforward to remedy, if replace was specified just look for an existing file with the resulting path:
$file = clone $source;
$file->fid = NULL;
$file->filename = basename($filepath);
$file->filepath = $filepath;
becomes:
$file = clone $source;
if ($replace == FILE_EXISTS_REPLACE && ($existing = file_load(array('filepath' => $filepath)))) {
$file->fid = $existing->fid;
}
else {
$file->fid = NULL;
}
$file->filepath = $filepath;
And logically this makes sense, you start with two files--the source and target--and end with two files by changing the contents of the later. hook_update get fired and modules know to update themselves accordingly.
file_save_data() is also very simple we just try to load the resulting file path and if a file is found use its file id.
file_move() is a bit trickier. You're starting with two files--the source and target--but should only have one at the end. The filesystem bit is straightforward--just overwrite the file. The question really boils down to which file is deleted and which updated? Modules need to be informed that one file has been updated and one deleted.
I think that the most common case for using file_move() with overwriting is to replace an existing file with a new upload. In this case the best course of action is to load the destination record and update that, then after hook_file_move() is invoked call a soft delete (file_delete($source, FALSE)
) so that the original is removed. That way any modules using the destination file won't be disturbed.
Comment | File | Size | Author |
---|---|---|---|
#42 | file_334303.patch | 53.26 KB | drewish |
#41 | file_334303.patch | 53.34 KB | drewish |
#39 | file.patch | 52.64 KB | bcn |
#38 | file_334303.patch | 53.21 KB | drewish |
#36 | file_334303.patch | 42.11 KB | drewish |
Comments
Comment #1
drewish CreditAttribution: drewish commentedComment #2
drewish CreditAttribution: drewish commentedre-rolled from CVS rather than git.
Comment #4
drewish CreditAttribution: drewish commentedi think this got caught by the testbot melt down.
Comment #5
drewish CreditAttribution: drewish commentedactually that last one had some of #330633: Temporary file cleanup needs some love (UnitTest included!) mixed in.
Comment #6
c960657 CreditAttribution: c960657 commenteddrewish encouraged me to review this. I am not an expert in the file API, but here are some comments:
The patch modifies the behaviour of file_move() and file_copy() wrt
{files}.filename
of the destination file. Previouslyfilename
was always set tobasename(filepath)
, i.e. it included the _nn suffix in case of overlapping filenames. This is no longer the case - and that is probably a good thing (afaik the reason to save bothfilename
andfilepath
is to allow multiple files to have the same "public" name). However, file_save() isn't updated to the new behaviour - I suggest changing that too.When file_move() is used to rename a file (i.e. $destination is a filename rather than a directory), filename still reflects the old name. Or should destination always be a directory? It says so in the Doxygen comment, but in the testNormal() unit test $desired_filepath is a filepath including filename.
I think $file->filename should be checked in the unit tests as well.
I suggest renaming $delete_original to $delete_source (to reflect that it refers to $source).
+ $this->assertEqual($a->fid, $b->fid, t("File's id is the same."), 'File Unchanged');
Is that proper English? The same as what? I suggest rephrasing to "The files' ids are identical" or "File ids are identical".
Comment #7
drewish CreditAttribution: drewish commentedc960657, i think you're being modest about your experience with the files api.
actually the filename has (at least since D5) been the name the file was uploaded with--rather than just the basename--for exactly the reasons you've suggested. it was really more of a shortcoming in the first versions of file_move() and file_copy() that it wasn't preserved during the operation. not sure what you mean about file_save() not being updated... it should save what ever filename is provided.
Yeah that was a docs bug, it's actually always let you specify the filename, it's just not been documented. I've fixed that for file_copy(), file_move(), file_unmanaged_copy(), and file_unmanaged_moved().
I'm not quite sure where you mean for that.
Good call, changed that and fixed the comment above the file_delete() to explain that it's not a forced delete.
Yeah, I can see your point... I thinking since it's comparing the same file it's one file's id... I've changed the variable names from $a and $b to $before and $after to make that clearer and removed the possessive bit so it's now:
Comment #8
c960657 CreditAttribution: c960657 commented:-)
Sorry, I meant file_save_data(). The following adds two rows with the filename column set to foo.txt and foo_0.txt, respectively. They should both be foo.txt.
file_save_data('lorem', 'foo.txt'); file_save_data('ipsum', 'foo.txt');
I suggest expanding the tests so that the contents of the filename column is also verified, e.g. making sure that the above example generates the expected result.
I'll take a closer look at this later.
Comment #9
c960657 CreditAttribution: c960657 commentedThe patch changes the behaviour of {files}.filename, but it still doesn't always match what I would expect. Here are some examples. Do you agree that these should be changed?
(In the following, "Expected/actual filenames" refer to the expected and actual values of the {files}.filename column for each of the mentioned files. All examples should be run with an empty files directory.)
When copying or moving to a complete file path, I think {files}.filename should reflect basename($destination):
Expected filenames: foo.txt bar.txt bar.txt
Actual filenames without patch: bar_0.txt bar.txt foo_0.txt
Actual filenames with patch: foo.txt bar.txt foo_0.txt
Expected filenames: bar.txt bar.txt
Actual filenames without patch: bar_0.txt bar.txt
Actual filenames with patch: foo.txt bar.txt
When saving a file with a specified name, I think this should be saved in {files}.filename, even though the file is renamed to something else in the files directory.
Expected filenames: foo.txt foo.txt
Actual filenames with and without patch: foo.txt foo_0.txt
It looks like these changes weren't included in the patch.
Title Case doesn't seem to be used much elsewhere in Drupal.
Use single quotes instead of double quotes.
What about renaming this to assertIdenticalFile()? This seems more in line with assertDifferentFile() and assertSameFile().
+ $this->assertEqual($a->fid, $b->fid, t("Files have the same ids %a-fid == %b-fid.", array('%a-fid' => $a->fid, '%b-fid' => $b->fid)), 'Same File');
This message seems very elaborate (includes the actual fids) compared to most others in the bug. A left-over from debugging?This kind of comments has been deprecated (#338403: Use {@inheritdoc} on all class methods (including tests)).
Description should start with a verb (see Test writers guidelines).
You forgot to call parent::setUp().
In the munging tests, how about asserting the actual munged name, e.g. like this:
Comment #10
drewish CreditAttribution: drewish commentedc960657, thanks, first, for such a through review, and second for so patiently explaining the issue. you're absolutely right that i'd over looked aspects of the filenames.
in the case of file_save_data() i think it'd make sense to use the basename of their original destination filename. i think we're in agreement on this point.
with file_move() and file_copy() i'd like to give it some thought before i commit to anything. i'll be on the plane all day tomorrow so i'll have some time to try to flush out a fix for this.
Comment #11
drewish CreditAttribution: drewish commentedhere's a re-roll that gets most of c960657's issues with the tests and other nitpicking points. doesn't hit the higherlevel filename issues.
Comment #12
drewish CreditAttribution: drewish commentedokay did some work on this to try to address the filenames in a sane, consistent manner. this looks at the replacement mode and if the destination was a a file (rather than a directory) and if it's a renaming around an existing file then it uses that for the filename.
i ran out of time before i could get the copy and move tests fully updated so this might need a bit of work but i'd say it's ready for a review in other respects.
Comment #13
quicksketchAdding to my review queue. Thanks drewish.
Comment #14
c960657 CreditAttribution: c960657 commentedThis comment appears twice in file_copy().
"it's" should be "its" (in several comments).
file_copy() contains the following (some lines omitted):
Instead of assigning a variable and the overwriting it immediately after, I'd prefer if the initial assignment was moved into an else block. I think this makes it easier to realize that filename is set to one of three different values based, and under which circumstances.
"exampine" -> "examine"
I didn't look too closely at the tests yet.
Comment #15
drewish CreditAttribution: drewish commentedaddressed all of c960657's issues
Comment #17
drewish CreditAttribution: drewish commentedi think the test was run on pgsql database and was failing for other reasons.
Comment #18
Dries CreditAttribution: Dries commentedWould be good to get another review from, c960657. The patch introduces some additional complexity, but it seems useful. It matches the behavior of 'cp' and 'mv' on Linux systems, which is what I think we should aim for.
Comment #19
c960657 CreditAttribution: c960657 commentedAFAICT none of the three examples in #9 yield the "expected" result - is that intended? It looks like $destination can be a path relative to either DRUPAL_ROOT or file_directory_path. The patch only handles the former (I think - I haven't looked to closely). I am not sure whether the latter is expected to work. The documentation isn't clear on how relative paths are resolved.
In the file_move() tests, you may want to verify that $source->filepath no longer exists.
These two can be replaced by assertDifferentFile().
+ $this->assertFalse($result, t("File move failed."));
Should use single quotes (I'm not sure whether this is actually required by the "official" coding conventions?). Several occurences.
+ // Verify that was was returned is what's in the database.
One was was one too many :-) Several occurences.
+ // Ensure that the existing file wasn't ovewritten.
"ovewritten"
Comment #20
drewish CreditAttribution: drewish commentedc960657, it would be helpful if you could rephrase what you think the expected vs actual are from #9, you list several files but it's not clear which is which. at this point i feel that the names are correctly handled. what don't you like?
in regards to the directories, IMHO we shouldn't be modifying any files outside drupal's files directory but we definitely don't test them. and due to the permissions issues i don't think it's realistic to try.
FileMoveTest should be checking that the files are removed. testExistingRename(), testExistingReplace() and testNormal() all call:
In testExistingError() and testExistingReplaceSelf() the file isn't removed.
Fixed the typos pointed out.
Comment #21
c960657 CreditAttribution: c960657 commentedI agree. My point is that both
file_save_data('lorem', 'asdf.txt')
andfile_save_data('lorem', file_directory_path() . '/asdf.txt')
are equivalent, i.e. they both create a file in the files directory.This example creates four files in the files:
Note how filename is modified with _0 only when $destination is specified without the file_directory_path() prefix. This seems inconsistent.
Hmm, don't know how I missed that.
Comment #22
drewish CreditAttribution: drewish commentedah, good point. i'll look into that. here's a clean re-roll in the mean time.
Comment #23
drewish CreditAttribution: drewish commenteddropped the unrelated unit tests that had been merged in from #276280: Tests needed: file.inc. also got all the tests passing now that file_load_multiple() is the correct way to load by filepath.
Comment #24
drewish CreditAttribution: drewish commentedc960657, got the renaming names fixed.
Comment #25
drewish CreditAttribution: drewish commentedre-rolling now that #353207: Remove FILE_STATUS_TEMPORARY was committed.
Comment #26
drewish CreditAttribution: drewish commentedComment #27
drewish CreditAttribution: drewish commentedre-rolling around some other commits.
Comment #28
c960657 CreditAttribution: c960657 commentedA few more nits:
Over in #348448: Always report E_STRICT errors, Damien Tournoud suggested writing this as:
I am not sure I like this pattern (I think it obscures the fact that you are working with objects), but if this is the recommended coding convention (is it?), we should follow it.
"was was"
For consistency with FileMoveTest and FileCopyTest, I suggest renaming FileSaveDataTest::testOverwriteError() to testExistingError().
Comment is a duplicate of that for testExistingReplace().
Comment #29
drewish CreditAttribution: drewish commentedc960657, I think that #348448 is more concerned with cases where
new stdClass
wasn't being called. We're not getting any warnings and there's other places in core where this style is followed.Corrected all the other points though.
Comment #30
drewish CreditAttribution: drewish commentedit occurs to me that since file_save_upload() also has a replace parameter we should have this code there as well.
Comment #31
drewish CreditAttribution: drewish commentedFixed some bugs when calling file_save_upload() with $replace = FILE_EXISTS_ERROR.
Comment #33
mfer CreditAttribution: mfer commentedThe patch applies cleanly. There was an issue with one of the testing servers. Also, all the file tests run cleanly with the installed patch.
On a quick review I noticed
There are multiple places we're are used. Conjunctions should be removed. This should be 'we are'.
I'll try and give it a better review tomorrow.
Comment #34
drewish CreditAttribution: drewish commentedremoved the contractions.
Comment #35
webchickWhile this sounds like an accurate description of what the code below does, I'm left wondering "Why is this so important that you spent 5 lines talking about it?" An extra sentence with a summary of the implications would be good to add here, I think.
Heh. ;)
I like that assertFileUnchanged() uses parameter names that are indicative of what they're doing. Could we do the same here? $file1, $file2 would be an improvement.
Same for assertSameFile()
let's
moving
That's it for an initial pass. Now that I know how big this patch is I'll set aside time accordingly. :\ Sorry, tried to do this and something else at the same time and it didn't really work well. :(
Comment #36
drewish CreditAttribution: drewish commentedaddressed webchick's comments.
Comment #37
webchickA more thorough review this time. These are some wonderful tests to have, and a great improvement overall. I can tell that lots of work went into this.
A few minor issues, from the bottom, just to make things saucy! Once these are resolved, I'll do another quick pass but then I think this is good to go.
Could we expand that comment a bit more, since the new code is less readable than the old version? Something like "// Validate the uploaded picture, using the type of replacement specified by the form."
unchanged
How come sometimes we do...
...and other times we do...
Should these not be consistent throughout the tests?
Although I realize that $expected_count in assertFileHookCalled() defaults to 1, I wonder if we should specify 1 here anyway. This kind of looks like a mistake. Just musing ... feel free to ignore this comment if you'd like. ;)
I'd love to see an inline comment or two break up the code in testWithFilename(), testExistingRename(), testExistingReplace() and testExistingError(). Roughly one line of comments for every 4-5 lines of code (where they don't currently exist). For example, instead of:
Maybe (only comment changes):
I realize that one can read the assertions to get a good idea of what's going on, but I find the code much more scannable when this information is left-aligned with the code in the form of an inline comment. I think it would also help a new developer to understand at a glance what the test is doing, even if they don't understand SimpleTest.
overwritten.. this looks like it might've been copy/pasted a couple times.
I do not grok this assertion. :) What does that mean? The code part just looks like "the file names were the same" but obviously there is some nuance here.
Why clone? Could we have a little one-liner here to explain?
And while we're at it, a more descriptive comment here? Applies to a few sections of the code. I'd love it if each of the "Look at the results" actually explained what the code below was testing. I know it's not quite as copy/paste friendly, but it's much easier to read through.
*very* nit-picky, but could we please have a blank line above each of these comments, for consistency with elsewhere in this file? Applies to a few sections of the code.
A couple in-liners to break up testExistingError() and testExistingReplaceSelf() would also be good.
Also, there are a couple of inline comments that are missing periods. Tsk, tsk. ;)
function assertSameFile() logs the file IDs/paths in the assertion (%file1-fid == %file2-fid). This function probably should as well. Then again, assertFileUnchanged() does NOT do this for all of the properties it's checking, so maybe assertSameFile() ought not to either.
Is there a reason you logged it there and not the others?
Other than that, I couldn't find anything. :)
Comment #38
drewish CreditAttribution: drewish commentedResponding to your comments in reverse order to just desaucy things a bit so they don't get out of hand...
Added id/filepath from assertDifferentFile() to assertSameFile() and assertFileUnchanged().
Got sick of trying to get all the hooks checked correctly so I went ahead and added assertFileHooksCalled() which takes an array of hooks and ensures that only those file hooks are called.
Well which ones were they?
Not really, we need to ensure they've got different contents to ensure that the file wasn't overwritten.
So on the whole the comments should be a lot better.
Comment #39
bcn CreditAttribution: bcn commentedWas the only missing period I found.
There were two places where the comments were longer than 80 characters, so I slit them to a new line.
Those are the only changes for the attached patch, and it should still apply.
Comment #41
drewish CreditAttribution: drewish commentedrerolled after making the changes listed in noahb's comments.
Comment #42
drewish CreditAttribution: drewish commentedfixed two comment issues webchick spotted.
Comment #43
webchickExcellent clean-up work!
Committed to HEAD. :)