This module fails to process files with non-ASCII filenames. This happens because LC_ALL is set to 'C' in includes/bootstrap.inc, and escapeshellarg() then strips out non-ASCII characters. A quick-and-dirty workaround is to wrap the line in _imagemagick_convert() that builds the command in two setlocale() calls:

  setlocale(LC_ALL, 'en_US.UTF-8');
  $command = escapeshellarg($source) . ' ' . implode(' ', $args) . ' ' . escapeshellarg($dest);
  setlocale(LC_ALL, 'C');

Or you can modify bootstrap.inc directly; I've done that before and found no problems. Since core forces LC_ALL to 'C', I think it should provide its own implementation of escapeshellarg() and other locale-dependent functions.


Comments

sun’s picture

Status: Active » Closed (duplicate)
zoo33’s picture

Version: 7.x-1.x-dev » 7.x-1.0-alpha2
Component: Code » Miscellaneous
Status: Active » Closed (duplicate)

If you just need a quick solution, a setlocale() call in a hook_init() in a custom module does the trick. No need to hack core or this module. Edit: see my comment below.

If you want to do a change like the one above, I think you could restore whatever the locale was set to like this:

  $old_locale = setlocale(LC_ALL, 0);
  setlocale(LC_ALL, 'en_US.UTF-8');
  $command = escapeshellarg($source) . ' ' . implode(' ', $args) . ' ' . escapeshellarg($dest);
  setlocale(LC_ALL, $old_locale);
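
A slightly more defensive take on the save-and-restore idea above, as a minimal sketch only. The helper name example_with_ctype_locale() is hypothetical and not part of this module or core. It narrows the change to LC_CTYPE on the assumption that character classification is the only category escapeshellarg() depends on, and it restores the previous value even if the callback throws. The file paths and arguments are made up for illustration.

```php
<?php

/**
 * Runs $callback with LC_CTYPE switched to $locale, then restores it.
 *
 * Hypothetical helper for illustration; not part of the module or core.
 */
function example_with_ctype_locale(string $locale, callable $callback) {
  // Passing '0' only queries the current setting without changing it.
  $old_locale = setlocale(LC_CTYPE, '0');
  // setlocale() returns FALSE when the requested locale is not installed.
  $switched = setlocale(LC_CTYPE, $locale) !== FALSE;
  try {
    return $callback();
  }
  finally {
    // Restore only if the switch took effect and the old value is known.
    if ($switched && $old_locale !== FALSE) {
      setlocale(LC_CTYPE, $old_locale);
    }
  }
}

// Example values, assumed for illustration.
$source = '/tmp/Prés.plateau.jpg';
$dest = '/tmp/out.jpg';
$args = ['-resize', '100x100'];

$command = example_with_ctype_locale('en_US.UTF-8', function () use ($source, $args, $dest) {
  return escapeshellarg($source) . ' ' . implode(' ', $args) . ' ' . escapeshellarg($dest);
});
```

If en_US.UTF-8 is not installed, the callback still runs, just under the unchanged locale, so the filename may still lose characters.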
sun’s picture

Version: 7.x-1.0-alpha2 » 7.x-1.x-dev
Component: Miscellaneous » Code
Status: Closed (duplicate) » Active

The core issue #1561214: Bootstrap sets C locale, but does not set UTF-8 character encoding appears to be stuck.

I do not want to duplicate that issue here, but I think I would accept a temporary stop-gap fix until core has been fixed itself.

zoo33’s picture

Oops, note that doing setlocale(LC_ALL, x) in a hook_init() without setting it back to its default (like I suggested above) breaks search. Take a look here: http://drupal.org/node/1145964#comment-5888364

Status: Closed (duplicate) » Active
Frando’s picture

Status: Active » Needs work

I ran into this as well, and created a quick patch so that umlauts in filenames work properly on a German-language site. The patch is attached for reference, or in case someone else needs it. I'm not sure how to generalize it; en_US.UTF8 didn't work properly for filenames with umlauts (ä, ö, ü).

steven.wichers’s picture

+1 this issue. Prés.plateau.jpg gets turned into Prs.plateau.jpg. Having a custom escapeshellarg function is a good workaround, though #6's implementation is too specific.

guillaumev’s picture

Status: Needs work » Needs review

What about this (using LC_CTYPE with en_US.UTF-8)? It works for me with French special characters; I haven't tested it with umlauts.

steven.wichers’s picture

That would probably work in many scenarios, but based on the core bug mentioned it doesn't look like en_US.UTF-8 is defined on all systems. Your patch is probably a good workaround. I think the ultimate fix here (ignoring core fixing the issue) would be a user option to select the locale desired out of the ones available on the system.

OnkelTem’s picture

Or just reimplement escapeshellarg(). (It really sucks: for a cross-platform language like PHP, making functions depend on the system locale was a bad idea.)
We have already rewritten plenty of functions. Just add one more.
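
To make the reimplementation idea concrete, here is a minimal sketch of a locale-independent quoting function. The name example_escapeshellarg_posix() is hypothetical. It assumes a POSIX shell (it is not suitable for Windows, where quoting rules differ) and relies on the fact that such shells treat every byte inside single quotes literally.

```php
<?php

/**
 * Quotes a string for a POSIX shell without consulting the locale.
 *
 * Hypothetical stand-in for escapeshellarg(); POSIX shells only.
 */
function example_escapeshellarg_posix(string $arg): string {
  // A NUL byte cannot be passed as part of a command-line argument.
  if (strpos($arg, "\0") !== FALSE) {
    throw new InvalidArgumentException('Argument must not contain NUL bytes.');
  }
  // Inside single quotes only the single quote itself needs care: close
  // the quoted string, emit a backslash-escaped quote, and reopen.
  return "'" . str_replace("'", "'\\''", $arg) . "'";
}
```

Because the function works on bytes and never classifies characters, multibyte filenames pass through unchanged whatever LC_CTYPE is set to.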

badrange’s picture

Status: Needs review » Needs work

This issue bit me too. On my local Mac everything worked fine (nginx, php-fpm) with files that have Finnish special characters in them, but when we deployed to platform.sh things broke.

The patch in #8 didn't work, probably because the only locales available on platform.sh (according to the command locale -a) are C, C.UTF-8 and POSIX.

I changed it to use C.UTF-8 and voilà, image conversion works for us.

This patch must be a bit tricky to get right because it is hard to have a local environment match all the different hosting environments out there. Is it at all possible that some can be even more minimalistic than platform.sh?
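
One way to cope with differing hosts, sketched under assumptions: setlocale() accepts an array and applies the first entry the system recognizes, so a patch could try several UTF-8 locales in order. The helper name example_set_utf8_ctype() and the candidate list are hypothetical; a host that offers none of them would still fall through.

```php
<?php

/**
 * Switches LC_CTYPE to the first available UTF-8 locale from a list.
 *
 * Hypothetical helper; the default candidate list is an assumption.
 *
 * @return string|false
 *   The locale that was selected, or FALSE if none is installed.
 */
function example_set_utf8_ctype(array $candidates = ['C.UTF-8', 'en_US.UTF-8']) {
  // Given an array, setlocale() tries each entry in order and returns the
  // first one the system accepts, or FALSE if all of them fail.
  return setlocale(LC_CTYPE, $candidates);
}

// Remember the current setting ('0' queries without changing anything).
$old_locale = setlocale(LC_CTYPE, '0');
$selected = example_set_utf8_ctype();

// ... build the command with escapeshellarg() here ...

// Put the previous locale back if something was actually changed.
if ($selected !== FALSE && $old_locale !== FALSE) {
  setlocale(LC_CTYPE, $old_locale);
}
```

When $selected is FALSE the locale was left untouched, which would be a reasonable point to log a warning rather than silently dropping characters.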

juliencarnot’s picture

I'm also affected by this on the D8 version. For a website with open contributions, I can't rely on editors figuring out that they have to rename their images, as there is no error message indicating this. It seems the imagick module deals with accented characters better, but I need ImageMagick's advanced options and effects.

dman’s picture

I had an identical issue reported over in #2860085: Not generating images when PDF's file name contains UTF-8 charecters. That is a utility module that delegates its dirty work to imagemagick.module, and has copy-pasted a few lines from it where needed.

I've tried the patch in #6, and against my expectations, it WORKED - against an Arabic-named image!

The work-around in the patch *feels* messy, but I can't suggest a better method. And it really does work.
Per #11 - shifting to C.UTF-8 instead of a named language sounds like a good move.

I tried it, and yes, that works on Arabic for me today also.

FWIW, my environment was a one-hour-old vanilla Drupal-vm box running Ubuntu 16.04.2.

frederickjh’s picture

I re-rolled the patch in #8, but used C.UTF-8 instead of en_US.UTF-8.

frederickjh’s picture

The re-rolled patch I submitted in #14 worked for me on the development and test servers but not on production. For production I went back to the patch in #8, and that worked. I'm still not sure what the difference is. If someone can tell me what information from these servers would help troubleshoot this, please let me know.

frederickjh’s picture

After switching back to the patch in #8, the development and production servers work correctly, but the test server does not. The production and development servers have LC_CTYPE set to C. The test server either does not have it set or has it set to an empty value.

badrange’s picture

What is the content of this variable on the different environments?
$old_locale = setlocale(LC_CTYPE, 0);

And what is the result of setlocale(LC_CTYPE, $old_locale);? According to the PHP docs: If locale is NULL or the empty string "", the locale names will be set from the values of environment variables with the same names as the above categories, or from "LANG".

Not sure what the result of that will be in your environments.
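
For collecting these values in one go, a small diagnostic along the following lines might help. This is only a sketch: example_locale_report() is a hypothetical name, and the list of candidate locales is an assumption drawn from the ones mentioned in this thread.

```php
<?php

/**
 * Collects the locale details that seem relevant to escapeshellarg().
 *
 * Hypothetical diagnostic; the candidate list is an assumption.
 */
function example_locale_report(array $candidates = ['C.UTF-8', 'en_US.UTF-8', 'en_US.UTF8']): array {
  // '0' queries the current LC_CTYPE without changing it.
  $current = setlocale(LC_CTYPE, '0');
  $report = [
    'current LC_CTYPE' => $current,
    // getenv() returns FALSE for variables that are not set.
    'env LC_ALL' => getenv('LC_ALL'),
    'env LC_CTYPE' => getenv('LC_CTYPE'),
    'env LANG' => getenv('LANG'),
  ];
  foreach ($candidates as $candidate) {
    // setlocale() returns FALSE if the locale is not installed.
    $report["available: $candidate"] = setlocale(LC_CTYPE, $candidate) !== FALSE;
  }
  // Undo any switch made while probing.
  if ($current !== FALSE) {
    setlocale(LC_CTYPE, $current);
  }
  return $report;
}

print_r(example_locale_report());
```

Run it through the web server rather than the command line, since the environment seen by Apache or php-fpm can differ from a login shell.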

frederickjh’s picture

@badrange

On the development site:

$old_locale = C
LC_CTYPE=C
LANG=en_US.UTF-8

On the test site:

$old_locale was empty
LC_CTYPE is not set or listed when running "env"
LANG=en_US.UTF-8

I ran on the test site the following to check the locale settings:

$ locale -a
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_COLLATE to default locale: No such file or directory
C
C.UTF-8
POSIX

On the production site:

$old_locale = C
LC_CTYPE=C
LANG=en_US.UTF-8

After checking all that, I then decided to see what it would take to get Imagemagick working on the test server.

After installing the English language pack with apt install language-pack-en-base, running dpkg-reconfigure locales, setting the default LC_CTYPE on the server for users to en_US.UTF-8, and restarting Apache, Imagemagick worked correctly with filenames containing non-ASCII characters.