( Possible related to https://www.drupal.org/node/2810995 )

As of now, the program removes any special UTF-8 characters from the filename title ( such as German Umlauts, french accents, etc.). In that case, the generated filename becomes quite unreadable.

I traced it down to RenderBase->sanitizeFilename, where at the moment the module throws out anything but standard ascii letters and numbers.

Instead, the string should be properly tansliterated.

Drupal offers a special API for this
( there is another issue https://www.drupal.org/node/2810995 where a conversion array is given, but I think if there is an API for it, we should use that one? )

So the function would become:

 protected function sanitizeFilename($filename) {
    // See Drupal's doc on transliteration:
    // https://api.drupal.org/api/drupal/core!lib!Drupal!Core!Language!language.api.php/group/transliteration/8.2.x
    
    $langcode = \Drupal::languageManager()->getCurrentLanguage()->getId();        // Use the current default interface language.       
    $trans = \Drupal::transliteration();                                          // Instantiate the transliteration class.                                          
    $transformed = $trans->transliterate($filename, $langcode);                   // Use this to transliterate some text.
        
    return preg_replace("/[^A-Za-z0-9 ]/", '', $transformed);
  }

I attached a patch -- HOWEVER THIS PATCH IS OFF the 2.x-beta1 branch, NOT OF DEV.
You may have to adjust for the dev branch.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

hoporr created an issue. See original summary.

hoporr’s picture

benjy’s picture

Status: Active » Needs review

Thanks for posting a patch, feedback below.

  1. +++ b/www/modules/contrib/entity_print/src/Renderer/RendererBase.php
    @@ -113,7 +113,14 @@ public function generateHtml(array $entities, array $render, $use_default_css, $
    +    // See Drupal's doc on transliteration:
    +    // https://api.drupal.org/api/drupal/core!lib!Drupal!Core!Language!language.api.php/group/transliteration/8.2.x
    

    No need for links to the docs within the code. Feel free to add it to the interface documentation if you think it's required.

  2. +++ b/www/modules/contrib/entity_print/src/Renderer/RendererBase.php
    @@ -113,7 +113,14 @@ public function generateHtml(array $entities, array $render, $use_default_css, $
    +    $langcode = \Drupal::languageManager()->getCurrentLanguage()->getId();        // Use the current default interface language.       ¶
    +    $trans = \Drupal::transliteration();                                          // Instantiate the transliteration class.                                          ¶
    +    $transformed = $trans->transliterate($filename, $langcode);                   // Use this to transliterate some text.
    

    This is against the coding standards, inline comments should be on-top of the line and line lengths less than 80 characters.

    Also, I think we should probably inject the transliteration service and not use \Drupal

  3. +++ b/www/modules/contrib/entity_print/src/Renderer/RendererBase.php
    @@ -113,7 +113,14 @@ public function generateHtml(array $entities, array $render, $use_default_css, $
    +    return preg_replace("/[^A-Za-z0-9 ]/", '', $transformed);
    

    Is the regex still need after this?

Also, would be good to see some new tests that show how this works. Setting to needs review for the test bot then it's needs work.

benjy’s picture

Title: Transliteration of pdf title » Transliteration of printed document filename

Status: Needs review » Needs work

The last submitted patch, 2: 2845323_1-entity_print-change-for-transliteration.patch, failed testing.

benjy’s picture

Status: Needs work » Needs review
FileSize
11.76 KB

OK I've implemented the transliteration as suggested but i've also pulled it out into a separate service and removed the getLabel abstract protected method from renderers. This makes testing filename generation easier and more self-contained.

Status: Needs review » Needs work

The last submitted patch, 6: 2845323-6.patch, failed testing.

benjy’s picture

Status: Needs work » Needs review
FileSize
12.7 KB
962 bytes

Test fix.

Sam152’s picture

This looks good.

  1. +++ b/src/FilenameGenerator.php
    @@ -0,0 +1,68 @@
    + * A service for generating filename's for printed documents.
    

    should be "filenames"

  2. +++ b/src/FilenameGenerator.php
    @@ -0,0 +1,68 @@
    +  public function generateFilename(array $entities, callable $entity_label_callback = NULL) {
    +    $filenames = [];
    +    foreach ($entities as $entity) {
    +      if ($label = trim($this->sanitizeFilename($entity_label_callback ? $entity_label_callback($entity) : $entity->label()))) {
    +        $filenames[] = $label;
    +      }
    +    }
    +    return $filenames ? implode('-', $filenames) : static::DEFAULT_FILENAME;
    +  }
    

    Does this really need the whole entity? Maybe this should just be a string. Also funky for the method to be "generateFilename" but it returns multiple filenames.

  3. +++ b/src/FilenameGenerator.php
    @@ -0,0 +1,68 @@
    +    $langcode = $this->languageManager->getCurrentLanguage()->getId();
    

    If all we need is the langcode, can we just inject the ID? Surely container param for this?

  4. +++ b/tests/src/Kernel/ContentRendererTest.php
    @@ -55,8 +55,9 @@ class ContentRendererTest extends KernelTestBase {
    +      ['Dußeldorf will be transliterated', 'Dusseldorf will be transliterated'],
    

    spaces are allowed in file names?

benjy’s picture

FileSize
14.45 KB
4.56 KB
  1. Fixed.
  2. It only returns 1 filename, it's just that it can generate a filename for multiple entities. Going to leave the entity param for now. I think it could help for more contextual filenames.
  3. Removed, using language from the entity.
  4. Yep!
benjy’s picture

FileSize
12.34 KB

Final patch here, few things leaked into the last one.

benjy’s picture

FileSize
13.01 KB

  • benjy committed fae6840 on 8.x-2.x
    Issue #2845323 by benjy, hoporr, Sam152: Transliteration of printed...
benjy’s picture

Status: Needs review » Fixed

Committed and pushed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

mstrelan’s picture

So no one answered whether the regex is still needed... I would like it removed. Should I raise a new issue?