Since we can monitor managed files in Drupal core, we should prevent uploading content that already exists by comparing the hash of the uploaded file with the hashes of existing files. This would not be a hard task, but it would apply only to file upload widgets and not to the managed inline images/files that CKEditor now supports, so the file would need another value, something like "duplicate", so that a new file's hash is compared only with original files and not with their duplicate(s).
If the uploaded file matched an existing one, it would be deleted right away and the ID of the original file with the matching hash would be returned as the value for the file upload widget, maybe with a message so the user knows why the file might be named differently and/or with an option to upload the duplicate anyway. Usage would be added for the original file.
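As a rough illustration of the flow described above, here is a minimal Python sketch. It is not Drupal code: `ManagedFile`, `FileRegistry`, and `upload` are hypothetical names, the registry is an in-memory stand-in for the managed file table, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ManagedFile:
    fid: int          # hypothetical file ID
    filename: str
    sha256: str
    usage: int = 0    # how many places reference this file


class FileRegistry:
    """In-memory stand-in for a managed file table (illustration only)."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, ManagedFile] = {}
        self._next_fid = 1
        self.blobs: Dict[int, bytes] = {}  # fid -> bytes, standing in for files on disk

    def upload(self, filename: str, data: bytes) -> Tuple[ManagedFile, bool]:
        """Return (record, was_duplicate) for an uploaded file."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_hash.get(digest)
        if existing is not None:
            # Match found: the uploaded bytes are discarded and the
            # original record is returned with its usage incremented.
            existing.usage += 1
            return existing, True
        # No match: store the bytes and register a new record.
        record = ManagedFile(self._next_fid, filename, digest, usage=1)
        self._next_fid += 1
        self._by_hash[digest] = record
        self.blobs[record.fid] = data
        return record, False
```

Uploading the same bytes twice under different names returns the same `fid` both times, and only one blob is kept. The `was_duplicate` flag is what a widget could use to decide whether to show a message to the user.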
Potentially, this could reveal whether some files already exist in the file system, so there is a small risk of information/privacy leakage, but it can be taken care of with proper permissions.
For example, a privileged user would see a notification after the file gets uploaded saying that it already exists and asking whether they want to create a duplicate (only in the DB) with the file name of the uploaded file, or whether using the file name of the original, already uploaded file is OK. An unprivileged user would not see any changes and, based on settings, the new file would either create a new duplicate entry or silently use the original file's name.
In any case, only one file with a given hash would be physically present on the server, with the possibility of multiple database entries pointing to the original one and acting as "duplicate" entries.
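The "duplicate" entries and the permission-dependent behaviour could be modelled roughly as follows. This is a sketch under stated assumptions, not a proposed Drupal API: `FileEntry`, `duplicate_of`, `find_original`, and `resolve_upload` are hypothetical names, and `keep_uploaded_name` stands for either the privileged user's choice or the site setting applied to unprivileged users.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FileEntry:
    fid: int
    filename: str
    sha256: str
    duplicate_of: Optional[int] = None  # fid of the original; None for originals


def find_original(entries: List[FileEntry], sha256: str) -> Optional[FileEntry]:
    """Compare the hash only against originals, never against duplicates."""
    for entry in entries:
        if entry.duplicate_of is None and entry.sha256 == sha256:
            return entry
    return None


def resolve_upload(
    entries: List[FileEntry],
    filename: str,
    sha256: str,
    privileged: bool,
    keep_uploaded_name: bool,
) -> Tuple[FileEntry, Optional[str]]:
    """Return the entry to use for the upload and an optional user message."""
    original = find_original(entries, sha256)
    next_fid = max((e.fid for e in entries), default=0) + 1
    if original is None:
        # New content: becomes an original entry with its own physical file.
        entry = FileEntry(next_fid, filename, sha256)
        entries.append(entry)
        return entry, None
    # Only privileged users are told that the content already exists.
    message = f"An identical file already exists as {original.filename}." if privileged else None
    if keep_uploaded_name:
        # DB-only duplicate: new name, same physical file as the original.
        entry = FileEntry(next_fid, filename, sha256, duplicate_of=original.fid)
        entries.append(entry)
        return entry, message
    # Otherwise reuse the original entry and its file name.
    return original, message
```

Because `find_original` skips entries with `duplicate_of` set, any number of duplicates can exist while exactly one entry per hash owns the physical file.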
Comments
Comment #1
dawehner
To be clear, knowing that a file already exists is also some form of information disclosure. We should be aware of that problem, in case anyone decides to work on it.
Comment #2
Anonymous (not verified) commented
Yes, that's what the last sentence mentions.
And that is also the reason I mentioned allowing duplicates (in the DB, not physically). They could still point to the original file but use a different filename, so the user who just uploaded it will not be surprised by the different file name.
This could also include a permission to see whether the file already exists (the mentioned message right after upload, plus the option to create a duplicate or just be fine with using the original file name), and so on.
On the other hand, we already append numbers to the names of uploads when a file with the same name exists, so we are already giving out some information about existing files.
I don't see much impact on small websites, but if somebody is running a website with a ton of files, this could save some space on their servers.
Comment #3
Anonymous (not verified) commented
Comment #19
smustgrave commented
Thank you for creating this issue to improve Drupal.
We are working to decide if this task is still relevant to a currently supported version of Drupal. There hasn't been any discussion here for over 8 years which suggests that this has either been implemented or is no longer relevant. Your thoughts on this will allow a decision to be made.
Since we need more information to move forward with this issue, the status is now Postponed (maintainer needs more info). If we don't receive additional information to help with the issue, it may be closed after three months.
Thanks!
Comment #20
cilefen commented