Russian characters are cut out of the script when selected JSMin compressor. JSqueeze and others are fine

Comments

AmiGator created an issue. See original summary.

mikeytown2’s picture

Assigned: AmiGator » Unassigned

Here's the issue in JSMin: https://github.com/sqmk/pecl-jsmin/issues/14

Does this work for detecting if Russian characters are in the affected js file? http://stackoverflow.com/questions/3212266/detecting-russian-characters-...

mikeytown2’s picture

Status: Active » Needs review
StatusFileSize
new1013 bytes

  • mikeytown2 committed f54cb73 on 7.x-2.x
    Issue #2566311 by mikeytown2: Do not use jsmin compression if the js...
mikeytown2’s picture

Status: Needs review » Fixed

Committed #3

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

osopolar’s picture

Status: Closed (fixed) » Needs work

This workaround actually seems not to be enough. Jsmin seems to have problems with multi-byte UTF-8 characters in general, as the issues title says.

To verify this, run following code on the command line:
php -r 'print jsmin("a,b,c,á,ä,ö,ü,ß,$,€");' to test all compressors run the following under /devel/php:

module_load_include("inc", "advagg_js_compress", "advagg_js_compress.advagg"); 
list(, , , $functions) = advagg_js_compress_configuration();

foreach ($functions as $compressor => $function) {
  if (function_exists($function)) {
    $contents='$var ="a,b,c,á,ä,ö,ü,ß,$,€";';
    $contents = str_replace(pack("CCC", 0xef, 0xbb, 0xbf), "", $contents);
    $function($contents);
    dpm($contents, $function);
  }
}

So jsmin is currently pretty useless for none ASCII characters.

das-peter’s picture

Let's hope this pull request for JS-Min goes in fast: https://github.com/sqmk/pecl-jsmin/pull/40

das-peter’s picture

A new version of jsmin was just released - as of 2.0.0 it has official UTF-8 support
Package Site (Changelog)

So I guess we should adjust the code to include a version check.

osopolar’s picture

Status: Needs work » Needs review
StatusFileSize
new2.24 KB

The pecl extension can be upgraded now to 2.0.0 for example by sudo pecl upgrade jsmin which may require a webserver restart.

I tested as in #7 with the string "a,b,c,á,ä,ö,ü,ß,$,€" which now works fine. I added a patch which uses the fall backs as in #3 for jsmin < 2.0.0 as suggested by Peter in #9.

Does it matter if it does a version check each time before jsmin would be invoked?

Status: Needs review » Needs work

The last submitted patch, 10: advagg-jsmin-multibyte-character-handling-2566311-10.patch, failed testing.

osopolar’s picture

Version: 7.x-2.15 » 7.x-2.x-dev
Status: Needs work » Needs review
StatusFileSize
new2.31 KB

Modified a the watchdog message.

osopolar’s picture

mikeytown2’s picture

Looks like the old version of jsmin is labeled 1.0 so this will cover both of these packages :)
http://www.ypass.net/software/php_jsmin/

mikeytown2’s picture

Status: Needs review » Reviewed & tested by the community

#12 looks good, will commit soon.

osopolar’s picture

BTW: On the research for this issue I stumbled across a really great and recommendable article about character encoding: http://kunststube.net/encoding/. The last third of the text is about "Encodings and PHP".

mikeytown2’s picture

Yeah I've my own fun with UTF-8. If you want to generate every possible UTF-8 char here's some code I wrote to do so: http://stackoverflow.com/questions/3466035/how-to-skip-invalid-character...

mikeytown2’s picture

#12 has been committed.

mikeytown2’s picture

Status: Reviewed & tested by the community » Fixed

  • mikeytown2 committed f54cb73 on 8.x-2.x
    Issue #2566311 by mikeytown2: Do not use jsmin compression if the js...
skin’s picture

Hello,
I've just installed jsmin 2.0.1 and restarted my vps, after that I tried to run following code on the command line:
php -r 'print jsmin("a,b,c,á,ä,ö,ü,ß,$,€");', but the test failed.

After flushing and regenerating all Drupal cache I see in the error log this message:

The jsmin function does not handle cyrillic characters. Using JSqueeze.

Should I downgrade to Jsmin 2.0?

P.S. I'm using AdvAgg 7.x-2.16

das-peter’s picture

@Skin I just played around with your cli command.
On an Ubuntu with jsmin 2.0.1 I got following results:

  • php -r 'print jsmin("a,b,c,á,ä,ö,ü,ß,$,?");': a,b,c,�,�,�,�,�,$,?
  • php -r 'print utf8_encode(jsmin("a,b,c,á,ä,ö,ü,ß,$,?"));': a,b,c,á,ä,ö,ü,ß,$,?

So it seams like jsmin() works as it should - even though the command line might show something odd.
Further - as far as I know there's no such message as The jsmin function does not handle cyrillic characters. Using JSqueeze. in the latest advagg codebase: http://cgit.drupalcode.org/advagg/tree/advagg_js_compress/advagg_js_comp...
I suggest to update advagg to the latest dev version.

skin’s picture

@das-peter
Hello,
I've just tried on Centos 7 with jsmin 2.0.1 this command:
php -r 'print utf8_encode(jsmin("a,b,c,á,ä,ö,ü,ß,$,?,€"));': a,b,c,á,ä,ö,ü,ß,$,?,¬ Except for the EUR (€) symbol, everithing is correct.

I'm going to test the dev version, is there a way to test if AdvAgg use jsmin?

osopolar’s picture

@Skin: If you have the devel module enabled, you may try the code from the second code-block from #7. This would call all available functions for js-minification, one of them would be advagg_js_compress_jsmin() which invokes jsmin(). The output of devel won't depend on your command line location settings.

@das-peter: utf8_encode just "Encodes an ISO-8859-1 string to UTF-8". It's strange that jsmin is returning an ISO-8859-1 encoded string. I am not exactly sure what jsmin should return for "a,b,c,á,ä,ö,ü,ß,$,€" as it is no valid javascript code, you may try the following code:

php -r "print jsmin('var x = \"a,b,c,á,ä,ö,ü,ß,$,€\";');"

BTW: I still have an issue with jsmin, see #2627468: JsMin: Browser console gives errors like SyntaxError: illegal character.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

nickdickinsonwilde’s picture

Version: 7.x-2.x-dev » 8.x-2.x-dev
Assigned: Unassigned » nickdickinsonwilde
Status: Closed (fixed) » Patch (to be ported)
nickdickinsonwilde’s picture

StatusFileSize
new1.99 KB

8.x-2.x version of the patch.

  • NickWilde committed d69fb9b on 8.x-2.x
    Issue #2566311 by osopolar, mikeytown2, NickWilde: JSMin Compression...
nickdickinsonwilde’s picture

Status: Patch (to be ported) » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.