Problem
The error message for file extension validation displays a bogus file path for browsers who do not return the real local file path for security reasons. This can cause confusion by leaving the user thinking that perhaps Drupal could not find the file because it messed up the file path.
Proposed resolution
Displaying the file path to the file is not necessary. Why not just display the file name, removing the possibility of showing a bogus file path? By extracting the file name from the file path this can be achieved.
By testing for a slash in the file path one can distinguish between a file path on file systems supporting forward slashes as path delimiters and a file path on file systems supporting backslashes as path delimiters. One assumption is that file systems supporting forward slashes as path delimiters can support backslashes in the file name (as is the case for many UNIX file systems). Another assumption is that file systems supporting backslashes as path delimiters do not support backslashes in the file name (e.g., Windows).
Known problems with patch
The file name will be cut short for fake paths containing backslashes set by the browser, when the file name contains backslashes (which is a possibility for many UNIX file systems). An example could be this file name: file\name.txt. If the browser returns a bogus file path with backslashes, the test will give the extraction job to the extractor that cuts everything before the last backslash. The displayed file name would be: name.txt. This can also cause confusion for users.
User interface changes
The error message will display the file name instead of the file path. No other changes are made to the error message. See the attached screenshot for how it will look.
Original report by roy smith
I am editing a story using Google Chrome 8.0.552.237 on a Macintosh. Under Image, I click the Choose File button and select a file of a dis-allowed type. The error message I get reads:
The selected file C:\fakepath\no-phrf.graffle cannot be uploaded. Only files with the following extensions are allowed: png, gif, jpg, jpeg.
The path is totally bogus, since this is not a windows machine. The file name and extension are indeed the file that I selected, but the stuff before that is all wrong. In this case, the full path to the file is /Users/roy/Desktop/no-phrf.graffle.
Comments
Comment #1
sweetchuckThe path is only depend on the used browser. Each browser return different path, but never return the real local path. Only the file name and the extension is real.
This is a security restriction.
Do you want allow to each website to take information from your file system?
Comment #2
damien tournoud commentedThis is a text displayed by the file.js client side validator. It feels like we should only use the base name of the file here.
Comment #3
Tor Arne Thune commentedCame across this today. It seems totally unnecessary to display the path of the file in this error message, and for many the path will be completely bogus.
This would be a sufficient and clean error message:
Also, the placeholder
%filenameinfile.jshas the right name, but displays the file path, not the file name.Comment #4
Tor Arne Thune commentedAlso, this can be backported, as it will not break strings, if the value of the placeholder %filename is changed to what it signifies (a file name, not a file path).
Comment #5
Tor Arne Thune commentedComment #6
Tor Arne Thune commentedAttaching a patch that removes the file path so that only the file name with its extension is displayed instead. See attached screenshot for the result.
Comment #7
Tor Arne Thune commentedJust adding some tags.
Comment #8
roy smith commentedI don't fully grok the solution (my JS experience is pretty shallow), but it looks like it is regex matching anything that's not a slash or backslash up to the end of the string. That's not really correct. On systems which use forward slashes for path delimiters, a backslash is a valid character in a file name (not sure the opposite is true on Windows). This will fail on those pathological cases.
Comment #9
Tor Arne Thune commentedYou are absolutely right. I can't find a solution to test for the file name when they can contain backslashes on UNIX file systems. The test in the patch basically takes what is after the last backslash or slash.
Comment #10
Tor Arne Thune commentedIt's possible to test if the value of the file field is a path with slashes (e.g., a UNIX path) or not (e.g., a Windows path), and then extract everything after the final slash from the "UNIX" path or everything after the final backslash from the "Windows" path.
The problem with this is the bogus
C:\fakepathsome times appearing for people on UNIX systems. This creates the possibility of them having a file name with backslashes in it, combined with the bogus Windows path. The test would pick this up in the same group as the Windows paths, therefore extracting everything after the final backslash. A path ofC:\fakepath\example_file_with_\_in_it.txtwould then be displayed as_in_it.txtin the error message.Giving this a rest, as I can't see a solution.
Comment #11
roy smith commentedI dislike half fixes, but in this case I'm tempted to say it's the right thing to do. The way it is now is totally broken. Your suggested fix is 99% correct, only fails on an extremely rare case, and when it does fail, is no worse than what we have now. So maybe just note the issue in a code comment (so future maintainers will be aware) and go for it.
Comment #12
Tor Arne Thune commentedIssue summary
Problem
The error message for file extension validation displays a bogus file path for browsers who do not return the real local file path for security reasons. This can cause confusion by leaving the user thinking that perhaps Drupal could not find the file because it messed up the file path.
Proposed resolution
Displaying the file path to the file is not necessary. Why not just display the file name, removing the possibility of showing a bogus file path? By extracting the file name from the file path this can be achieved.
By testing for a slash in the file path one can distinguish between a file path on file systems supporting forward slashes as path delimiters and a file path on file systems supporting backslashes as path delimiters. One assumption is that file systems supporting forward slashes as path delimiters can support backslashes in the file name (as is the case for many UNIX file systems). Another assumption is that file systems supporting backslashes as path delimiters do not support backslashes in the file name (e.g., Windows).
Known problems with patch
The file name will be cut short for fake paths containing backslashes set by the browser, when the file name contains backslashes (which is a possibility for many UNIX file systems). An example could be this file name:
file\name.txt. If the browser returns a bogus file path with backslashes, the test will give the extraction job to the extractor that cuts everything before the last backslash. The displayed file name would be:name.txt. This can also cause confusion for users.User interface changes
The error message will display the file name instead of the file path. No other changes are made to the error message. See the attached screenshot for how it will look.
Comment #13
sheldon rampton commentedI think the best solution here is to do a very specific search-and-replace for the string "C:\fakepath\" rather than a general search-and-replace that looks for forward slashes or back slashes. Web browsers already strip out the local path. Internet Explorer and Chrome then prepend "C:\fakepath\" to eliminate the possibility that Javascript might submit something to servers that they might mistake for a real file path. In our case, however, we're not submitting anything to a server. We're simply generating a message for display to human users, so the C:\fakepath\ serves no useful purpose and merely creates confusion. Here are a couple of discussions about the C:\fakepath\ issue:
http://acidmartin.wordpress.com/2009/06/09/the-mystery-of-cfakepath-unve...
http://stackoverflow.com/questions/4851595/how-to-resolve-the-c-fakepath
A more general attempt to strip out everything prior to a forward slash or back slash would create problems. If the user is on Windows, stripping out forward slashes could break legitimate filenames. If the user is on Unix/Linux/OSX, stripping out back slashes could break legitimate filenames. However, the problem can be fixed to everyone's satisfaction by simply stripping out the exact string "C:\fakepath\".
I'm attaching a revised patch with my proposed solution.
Comment #14
sheldon rampton commentedI've posted a bug report for this issue in Drupal 7, along with a D7 backport of my patch:
http://drupal.org/node/1347918
Comment #15
mototribe commentedI tested the patch on D7.12 and it worked fine. Thanks for getting this fixed!
Comment #16
Tor Arne Thune commentedThe patch in #13 looks ready to go and has been tested!
Comment #17
xjmThanks Sheldon, and thanks everyone for the testing.
Sorry to knock this back over coding standards, but #13 needs a bit of cleanup. There's some trailing whitespace and the comment lines are quite a bit too long. They should wrap at or before 80 characters. Reference: http://drupal.org/coding-standards
Adding a novice tag for someone to reroll this simple cleanup.
Comment #18
sheldon rampton commentedThanks for the coding standards lesson. It's actually great to have this level of oversight and attention to detail. I've rerolled the patch.
Comment #19
sheldon rampton commentedremoving Novice tag since I've rerolled the patch myself
Comment #20
xjmThat looks great, thanks. Back to RTBC.
Comment #21
webchickEhhh. I'm not sure about this. This puts in a very specific browser hack-around for what could very well be a generalized problem (hopefully, Chroms isn't unique in somehow munging or otherwise protecting the local file path from being exposed for security). Their "cute" string could always change in the future, as well.
I think I prefer something more along the lines of #12, but I'm willing to hear arguments as to why I'm on crack. :)
Comment #22
sheldon rampton commented@webchick, please read some of the background on this:
http://acidmartin.wordpress.com/2009/06/09/the-mystery-of-cfakepath-unve...
http://stackoverflow.com/questions/4851595/how-to-resolve-the-c-fakepath
Multiple web browsers, including Chrome, Safari and IE, use the "C:\fakepath\" string as a security measure. In fact, the specific "C:\fakepath\" string is expressly mandated in the HTML5 specification from w3.org. The W3 spec states, "On getting," the value IDL attribute "must return the string 'C:\fakepath\' followed by the filename of the first file in the list of selected files, if any, or the empty string if the list is empty." As far as I'm aware, there are no web browsers that use a string other than "C:\fakepath\" to implement this security feature. If this should ever happen (which is unlikely and would violate the HTML5 spec), it would not create a security issue. Prepending something other than "C:\fakepath\" might result in a string getting displayed to human users that someone might find confusing, but it would not breach security. In any case, it would be easy to fix then with a new patch.
A more "general" fix actually has the potential to create a new bug, as I noted in comment #13 above. We should just fix what's broken and stay consistent with the W3 specs for HTML5.
Comment #23
sheldon rampton commentedI posted a reply to webchick's concerns back in March and there has been no further reply, so I think we should assume that either my answer addressed her concerns fully, or else this isn't a strongly held concern for her. Can we please move this ticket forward to get it committed?
Comment #24
nod_Could you replace double quotes with single quotes in your patch please? I know i'm kind of nitpicking but it makes the code clearer and more coherent with the rest of the file.
You can reroll and set to rtbc, looks good to me too, thanks :)
Comment #25
sheldon rampton commentedOK, here's the rerolled patch with single quotes.
Comment #27
sheldon rampton commentedOops, I guess there were some other changes to file.js since the last time I submitted this. Here's a patch that should actually work.
Comment #28
sheldon rampton commentedComment #29
tim.plunkettPlease don't RTBC your own patches.
In the future, when making changes to a patch it helps to write an interdiff.
Comment #30
sheldon rampton commented@tim.plunkett: nod_ asked me in comment #24 to set this to RTBC, so I was simply doing what I was asked to do. Unfortunately, nod_ asked me to do that back in May, and I didn't notice his request until a few days ago. In the meantime, some other changes have been made to file.js, which means that the line numbering of the code addressed in my original patch has changed, so git now fails to apply the patch that nod_ was referring to (even though it passed testing when I originally submitted it). This also means that I can't write an interdiff. Step #2 of the instructions for writing an interdiff says to "Download the most recent version of the patch you wish to update and apply it to your local git repository." I can't do that, because git won't let me apply the previous version of the patch to the current version of file.js.
The only differences between the current patch and the patch I submitted in comment #18 (which was accepted at the time as RTBC by xjm before webchick kicked it back) is that the current patch changes a pair of double quotes to single quotes in keeping with the request from nod_, and the current patch now references the correct line numbers in the current version of file.js so git will accept it.
Since I can't create an interdiff, can someone please just evaluate the current patch on its own merits?
Comment #31
tim.plunkettRe #23, #25, and #28: "One should not set one's own patch to this status."
That said, I reviewed this and it is good to go.
Comment #32
sheldon rampton commented@tim.plunkett: Regarding not setting my own patches to RTBC, I understand your point and will try to be careful about this in the future.
Comment #33
catchSorry to knock this back again, but we don't need
that's handled fine by git blame so it can just be dropped from the patch.
Comment #34
sheldon rampton commentedOK, here's a new patch and an interdiff.
Comment #35
Tor Arne Thune commentedVerified and back to catch/webchick/Dries :)
Comment #36
catchThanks for the quick re-roll. Committed/pushed to 8.x, moving to 7.x for backport.
Comment #37
sheldon rampton commentedOK, here's the patch for D7.
Comment #38
pjcdawkins commentedThe D7 patch in #37 applies cleanly, seems to be clear, and works in manual testing, and it's a straightforward backport of the D8 patch in #34.
Comment #39
webchickAh, cool. Thanks for the further background in #22. In the future, go ahead and kick back to RTBC if you feel the comments/questions are addressed; I'm likely to lose track of "needs review" issues. :\
Committed and pushed to 7.x. Thanks!
Comment #40.0
(not verified) commentedAdding issue summary to original post.