Hello there,

Hope you are doing well. First of all, thank you for this incredible module. I have been using it for my project, but deduplication is not working at all. I have been trying to resolve the issue but have been unable to, because there is no documentation or information about how it works internally. I have tested the module in the following ways.

First, I created two nodes with an image field attached to them. I uploaded the same image to both of them and expected deduplication to work, but it did not. For the second node it should reuse the existing file rather than creating a new one under a renamed filename.

Second Test

I tried to use the Drupal file management functions to load files through the API.

I used the function

system_retrieve_file($url, $destination = NULL, $managed = FALSE, $replace = FILE_EXISTS_RENAME)

For reference, see the drupal.org page https://api.drupal.org/api/drupal/modules!system!system.module/function/...

If I use the FILE_EXISTS_RENAME option and upload the same file again and again, a new file with a new name is created each time, so deduplication does not work. It should check the whirlpool hash: if the file name and whirlpool hash are both the same, it should reuse the existing file, and if the whirlpool hash differs, it should rename the file and create it.
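To make the behaviour I expected concrete, here is a minimal sketch in plain PHP. It is not Storage API's actual code, and the function name pick_destination() is made up for illustration. It only relies on PHP's built-in hash_file() with the 'whirlpool' algorithm.

```php
<?php
// Hypothetical helper, not part of Storage API or Drupal core.
// Decides what to do with an incoming file when deduplicating by content hash.
// Returns array('action' => 'create'|'reuse'|'rename', 'path' => string).
function pick_destination($source, $destination) {
  // Nothing at the destination yet: just create the file there.
  if (!file_exists($destination)) {
    return array('action' => 'create', 'path' => $destination);
  }
  // Same name and same whirlpool hash: identical content, so reuse it.
  if (hash_file('whirlpool', $source) === hash_file('whirlpool', $destination)) {
    return array('action' => 'reuse', 'path' => $destination);
  }
  // Same name but different content: find a free name with a numeric suffix.
  $info = pathinfo($destination);
  $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
  $counter = 0;
  do {
    $candidate = $info['dirname'] . '/' . $info['filename'] . '_' . $counter++ . $ext;
  } while (file_exists($candidate));
  return array('action' => 'rename', 'path' => $candidate);
}
```

With this logic, uploading the same image twice would give 'reuse' the second time, and only a different image with the same name would be renamed.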

Third Test

If I use the FILE_EXISTS_REPLACE option and try to upload the same file again and again, it generates a PDOException:

PDOException: SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'storage-field-images://sharma/DN.38849.jpg' for key 'uri': INSERT INTO {storage_core_bridge} (storage_id, uri) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1); Array ( [:db_insert_placeholder_0] => 93 [:db_insert_placeholder_1] => storage-field-images://sharma/DN.38849.jpg ) in DrupalStorageStreamWrapper->stream_close() (line 566 of C:\Users\gauravsharma\Sites\devdesktop\commerce_kickstart-7.x-2.19-core\sites\all\modules\storage_api\core_bridge\storage_core_bridge.module

Again, it should check the whirlpool hash: if the file name and whirlpool hash are the same, it should reuse the existing file; if not, it should replace the existing file and update the relevant tables rather than inserting a new record, which is what causes the PDOException.
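As a rough illustration of what I mean for the replace case, here is a small sketch in plain PHP. The array is only a stand-in for a table with a unique key on uri, the function name register_upload() and the example URI are hypothetical, and I have not checked how storage_core_bridge actually writes its rows.

```php
<?php
// Hypothetical stand-in for a table with a unique key on "uri".
// $registry maps uri => whirlpool hash of the stored content.
function register_upload(array &$registry, $uri, $contents) {
  $hash = hash('whirlpool', $contents);
  if (!isset($registry[$uri])) {
    // First upload for this URI: insert a new record.
    $registry[$uri] = $hash;
    return 'inserted';
  }
  if ($registry[$uri] === $hash) {
    // Same URI and same content: reuse the existing record untouched.
    return 'reused';
  }
  // Same URI, different content: update the existing record
  // instead of inserting a second one with the same key.
  $registry[$uri] = $hash;
  return 'updated';
}
```

In Drupal 7 terms the last step would presumably be a db_merge() or db_update() rather than a plain db_insert(), but that is an assumption on my part.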

Thanks

Gaurav Sharma
https://www.indiantrendz.com

Comments

graham.roberts

Hi,

I too have been looking at this module and the deduplication function would be very helpful for us.

Like the OP, uploaded images are getting renamed and duplicated for me. I have started to work through the code and suspect it might be because core has renamed the file before Storage gets a chance to look at it.

When uploading an image via an image field, the file is getting renamed in file_create_filename, which the file module ends up calling:

function file_create_filename($basename, $directory) {
  // Strip control characters (ASCII value < 32). Though these are allowed in
  // some filesystems, not many applications handle them well.
  $basename = preg_replace('/[\x00-\x1F]/u', '_', $basename);
  if (substr(PHP_OS, 0, 3) == 'WIN') {
    // These characters are not allowed in Windows filenames
    $basename = str_replace(array(':', '*', '?', '"', '<', '>', '|'), '_', $basename);
  }

  // A URI or path may already have a trailing slash or look like "public://".
  if (substr($directory, -1) == '/') {
    $separator = '';
  }
  else {
    $separator = '/';
  }

  $destination = $directory . $separator . $basename;

  if (file_exists($destination)) {
    // Destination file already exists, generate an alternative.
    $pos = strrpos($basename, '.');
    if ($pos !== FALSE) {
      $name = substr($basename, 0, $pos);
      $ext = substr($basename, $pos);
    }
    else {
      $name = $basename;
      $ext = '';
    }

    $counter = 0;
    do {
      $destination = $directory . $separator . $name . '_' . $counter++ . $ext;
    } while (file_exists($destination));
  }

  return $destination;
}

Working back through the call stack I don't see any hooks invoked or overridden method calls that would have given StorageAPI a chance to deduplicate before the filename got changed:

/dev/includes/file.inc.file_create_filename:1213
/dev/includes/file.inc.file_destination:995
/dev/includes/file.inc.file_save_upload:1554
/dev/modules/file/file.module.file_managed_file_save_upload:652
/dev/modules/file/file.module.file_managed_file_value:501
/dev/modules/file/file.field.inc.file_field_widget_value:601
/dev/includes/form.inc._form_builder_handle_input_element:2059
/dev/includes/form.inc.form_builder:1844
/dev/includes/form.inc.form_builder:1906
/dev/includes/form.inc.form_builder:1906
/dev/includes/form.inc.form_builder:1906
/dev/includes/form.inc.drupal_process_form:885
/dev/modules/file/file.module.file_ajax_upload:267
/dev/includes/menu.inc.call_user_func_array:517
/dev/includes/menu.inc.menu_execute_active_handler:517
/dev/index.php.{main}:21

Is anybody able to offer input on how and when deduplication is supposed to be triggered? I will be happy to continue investigating from there.

Thanks

Graham
