( Possible related to https://www.drupal.org/node/2810995 )
As of now, the program removes any special UTF-8 characters from the filename title ( such as German Umlauts, french accents, etc.). In that case, the generated filename becomes quite unreadable.
I traced it down to RenderBase->sanitizeFilename, where at the moment the module throws out anything but standard ascii letters and numbers.
Instead, the string should be properly tansliterated.
Drupal offers a special API for this
( there is another issue https://www.drupal.org/node/2810995 where a conversion array is given, but I think if there is an API for it, we should use that one? )
So the function would become:
protected function sanitizeFilename($filename) {
// See Drupal's doc on transliteration:
// https://api.drupal.org/api/drupal/core!lib!Drupal!Core!Language!language.api.php/group/transliteration/8.2.x
$langcode = \Drupal::languageManager()->getCurrentLanguage()->getId(); // Use the current default interface language.
$trans = \Drupal::transliteration(); // Instantiate the transliteration class.
$transformed = $trans->transliterate($filename, $langcode); // Use this to transliterate some text.
return preg_replace("/[^A-Za-z0-9 ]/", '', $transformed);
}
I attached a patch -- HOWEVER THIS PATCH IS OFF the 2.x-beta1 branch, NOT OF DEV.
You may have to adjust for the dev branch.
Comment | File | Size | Author |
---|---|---|---|
#12 | 2845323-12.patch | 13.01 KB | benjy |
#11 | 2845323-11.patch | 12.34 KB | benjy |
#10 | interdiff.txt | 4.56 KB | benjy |
#10 | 2845323-10.patch | 14.45 KB | benjy |
#8 | interdiff.txt | 962 bytes | benjy |
Comments
Comment #2
hoporr CreditAttribution: hoporr commentedComment #3
benjy CreditAttribution: benjy at PreviousNext commentedThanks for posting a patch, feedback below.
No need for links to the docs within the code. Feel free to add it to the interface documentation if you think it's required.
This is against the coding standards, inline comments should be on-top of the line and line lengths less than 80 characters.
Also, I think we should probably inject the transliteration service and not use \Drupal
Is the regex still need after this?
Also, would be good to see some new tests that show how this works. Setting to needs review for the test bot then it's needs work.
Comment #4
benjy CreditAttribution: benjy at PreviousNext commentedComment #6
benjy CreditAttribution: benjy at PreviousNext commentedOK I've implemented the transliteration as suggested but i've also pulled it out into a separate service and removed the getLabel abstract protected method from renderers. This makes testing filename generation easier and more self-contained.
Comment #8
benjy CreditAttribution: benjy at PreviousNext commentedTest fix.
Comment #9
Sam152 CreditAttribution: Sam152 at PreviousNext commentedThis looks good.
should be "filenames"
Does this really need the whole entity? Maybe this should just be a string. Also funky for the method to be "generateFilename" but it returns multiple filenames.
If all we need is the langcode, can we just inject the ID? Surely container param for this?
spaces are allowed in file names?
Comment #10
benjy CreditAttribution: benjy at PreviousNext commentedComment #11
benjy CreditAttribution: benjy at PreviousNext commentedFinal patch here, few things leaked into the last one.
Comment #12
benjy CreditAttribution: benjy at PreviousNext commentedComment #14
benjy CreditAttribution: benjy at PreviousNext commentedCommitted and pushed.
Comment #16
mstrelan CreditAttribution: mstrelan commentedSo no one answered whether the regex is still needed... I would like it removed. Should I raise a new issue?