Problem/Motivation
Non-ASCII characters are rendered as ????? or other weird characters.
Persian text is replaced by ???????????????? or shows dropping between letters in a word.
Example text:
Do not apply if air and surface temperatures are below 5ºC or above 35ºC.
Rendered as:

Steps to reproduce
Add a non-ASCII character to the content and generate a PDF.
Example: Do not apply if air and surface temperatures are below 5ºC or above 35ºC.
Proposed resolution
Add UTF-8 encoding
Remaining tasks
provide a patch
User interface changes
none
API changes
none
Data model changes
none
| Comment | File | Size | Author |
|---|---|---|---|
| #28 | 3028545-28.patch | 385 bytes | m.stenta |
| #26 | 20250218 PDF encoding error.png | 40.7 KB | jannakha |
| #20 | entity_print-fix-character-encoding-3028545-20.patch | 658 bytes | johanvdr |
| #19 | entity_print-fix-character-encoding-3028545-18.patch | 665 bytes | johanvdr |
| #16 | fix-character-encoding_3028545-16.patch | 614 bytes | peri22 |
Issue fork entity_print-3028545
Comments
Comment #2
matt b
Same issue here. Any updates or support for this?
Comment #3
matt b
I think the answer is here: https://github.com/dompdf/dompdf/wiki/About-Fonts-and-Character-Encoding
Comment #4
smustgrave commented
Could more steps be added, by chance? Thanks to @Matt B in #3, though: this seems like it might be an issue with dompdf, so maybe an issue should be logged there instead?
Comment #5
avpaderno
Reading About Fonts and Character Encoding, I gather it could be an issue with this module, which should reference the correct font in the CSS stylesheet it uses.
Comment #6
avpaderno
The attached screenshot does not show the described bug. It just shows part of a node where the PDF link is present.
Comment #7
smustgrave commented
Thanks for taking a look @apaderno
Comment #8
avpaderno
Also, what exactly does "dropping between letters" mean?
Comment #9
smustgrave commented
I'm not familiar with other languages, but I took that to mean the letter appears slightly lower than expected, not aligned with the rest of the word. That's just how I read it.
Comment #10
avpaderno
That would be dropping letters.
Given the screenshot does not show exactly what the bug is, and the description is not clear, this needs more information from the OP.
Comment #11
matt b
I cannot comment for the OP, but I'm still struggling to get output in Farsi / Persian.
I've set
in the CSS, and it now produces characters instead of lots of ??? (apart from one or two specific issues in my header, which I'll look at separately). However, when I copy the text and translate it back to English using Google Translate, it is clearly not the original text, and it differs from the original text in Drupal. Something is happening in the PDF production process.
This text (from both the node content and .../debug) looks correct (probably reverts to LTR here) :
But is displayed in the PDF as
Comment #12
matt b
I think the following is relevant, and this is probably a support request rather than a bug, because this is a feature not supported by DomPDF?
https://github.com/dompdf/dompdf/issues/2619
https://github.com/dompdf/dompdf/pull/2107
Also, from googling about, I got the hint that this may not be an issue with Wkhtmltopdf, so I might give that a try (something for another day!)
Comment #13
matt b
I've switched to using the Entity PDF module, which uses mpdf as the engine. It is rendering Arabic characters fine.
Comment #14
hdahoud commented
You can use the phpwkhtmltopdf library: https://github.com/mikehaertl/phpwkhtmltopdf
Comment #15
jurgenhaas
I'm having the same issue with simple German umlauts: they get printed as a 2-byte character combination, as if dompdf were unable to deal with UTF-8. Switching to the wkhtmltopdf engine solves the issue, though.
Comment #16
peri22 commented
Hello, I had the same problem with some characters. I have set the font-family and created a patch to fix the character encoding.
Comment #17
avpaderno
Comment #18
johanvdr commented
I confirm that the patch in #16 fixes the encoding issue, but it throws a deprecation notice in PHP 8.2.x:
Deprecated function: mb_convert_encoding(): Handling HTML entities via mbstring is deprecated;
Handling HTML entities via mbstring is deprecated in PHP 8.2.
To fix that deprecated error:
$this->html = mb_encode_numericentity($this->html, [0x80, 0x10ffff, 0, 0xffffff], 'UTF-8');
Comment #19
johanvdr commented
Here is an updated patch which resolves that deprecated error.
Comment #20
johanvdr commented
I have updated the patch from #19. There was a small issue with the conversion map array.
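For anyone unsure what the conversion map does, here is a minimal standalone sketch of the one-liner from #18, outside of Entity Print. The function name `encode_non_ascii` is hypothetical and not part of the module or of any patch here. The map values are copied from #18, so the patch in #20 may use slightly different ones.

```php
<?php

/**
 * Hypothetical helper (not from the module or the patches): replaces every
 * non-ASCII character with a decimal numeric HTML entity.
 */
function encode_non_ascii(string $html): string {
  // A conversion map is a flat list of groups of four integers:
  // range start, range end, offset, mask.
  // Here: code points 0x80..0x10FFFF are encoded, ASCII is left untouched.
  $map = [0x80, 0x10ffff, 0, 0xffffff];
  return mb_encode_numericentity($html, $map, 'UTF-8');
}

// The example text from the issue summary.
echo encode_non_ascii('below 5ºC or above 35ºC'), PHP_EOL;
```

With the issue summary's example text, º (U+00BA) should come out as `&#186;`, so the HTML handed to the PDF engine would be pure ASCII and no longer depend on the engine guessing the charset.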
Comment #21
johanvdr commented
Actually, there is an even simpler fix that worked for me without any patch: set the meta tag in entity-print.html.twig.
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
Comment #22
kufliievskyi commented
I confirm that the template fix from comment #21 works for me too. It looks like the simplest solution.
Comment #25
introfini commented
#21 provides a straightforward fix. In a fresh installation with only that fix applied, everything is working correctly. Additionally, the Dompdf documentation also uses that meta tag: https://github.com/dompdf/dompdf/wiki/About-Fonts-and-Character-Encoding
I've created a merge request with the simple fix from #21. Please review...
Comment #26
jannakha commented
Comment #27
jannakha commented
Both MR70 and the #20 patch work and fix the issue.
MR70 will require developers to update any custom twig templates.
Although the latest browsers treat <meta charset="utf-8"> the same as <meta http-equiv="content-type" content="text/html; charset=utf-8">, DomPDF needs the specific attributes in the meta tag.
Comment #28
m.stenta
I can confirm that the suggestion in #21 (which is now implemented in MR70) fixes the issue for me.
In my case, the string "Moo’s Dairy Farm" was appearing in the PDF as "Mooâ??s Dairy Farm".
With the change from #21, it now appears as "Moo’s Dairy Farm".
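The garbled string is consistent with the UTF-8 bytes being read as a single-byte encoding somewhere in the pipeline. The sketch below reproduces it in plain PHP. The two conversion steps are an assumption about what might be happening, not something traced through dompdf itself.

```php
<?php

$original = 'Moo’s Dairy Farm';

// Assumption: the UTF-8 bytes are misread as Windows-1252, so the three
// bytes of ’ (E2 80 99) turn into the three characters â, € and ™.
$misread = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');

// Assumption: the text is then squeezed into ISO-8859-1, which has no
// € or ™; mbstring substitutes "?" for unmappable characters by default.
$latin1 = mb_convert_encoding($misread, 'ISO-8859-1', 'UTF-8');

// Convert back to UTF-8 only so the result can be printed and compared.
$garbled = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo $garbled, PHP_EOL;
```

If that assumption holds, it would also explain why declaring the charset in the template is enough: the bytes were correct all along and only the interpretation was wrong.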
The easiest way for me to fix this in my production deployments (while we wait for this issue) is to apply a patch via cweagans/composer-patches. Attached is a patch for this purpose (which is safer to use than an MR patch, which may be changed by anyone with push access).
Thanks for finding this solution @johanvdr!
Comment #29
abelpzl
Patch #28 fixed my issue. But in my case, I preferred to override the entity-print.html.twig template in my custom theme and set the meta tag header there, so I wouldn't have to apply the patch.
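For sites that override the template, a quick check along these lines could confirm that the rendered HTML carries the long form of the tag rather than the short `<meta charset>` form, which #27 reports is not enough for DomPDF. This is only a sketch: `has_utf8_content_type_meta` is a hypothetical helper, not an Entity Print or dompdf API, and the regular expressions are deliberately simple.

```php
<?php

/**
 * Hypothetical helper: TRUE if the HTML contains a
 * <meta http-equiv="Content-Type" ... charset=utf-8> tag.
 */
function has_utf8_content_type_meta(string $html): bool {
  // Collect every <meta ...> tag, then require both attributes on one tag.
  preg_match_all('/<meta\b[^>]*>/i', $html, $tags);
  foreach ($tags[0] as $tag) {
    $is_content_type = preg_match('/http-equiv\s*=\s*["\']?content-type/i', $tag) === 1;
    $is_utf8 = preg_match('/charset\s*=\s*utf-?8/i', $tag) === 1;
    if ($is_content_type && $is_utf8) {
      return TRUE;
    }
  }
  return FALSE;
}

$long_form = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head>';
$short_form = '<head><meta charset="utf-8"></head>';

var_dump(has_utf8_content_type_meta($long_form));
var_dump(has_utf8_content_type_meta($short_form));
```

The short form is rejected on purpose, since only the http-equiv variant from #21 is the one reported as working in this thread.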
Comment #32
jsacksick commented
Comment #33
jsacksick commented