I am trying to add a node whose name is Franța. The ț is not mandatory in the URL, but even transliteration would not fix this, because (I think) the site is in Romanian. What happens is that I get the ț in the URL but not the final a. So instead of www.site.com/franța, I get www.site.com/franț.
Why is there a problem with the a and not with the ț?


simanjan’s picture

I spent some time investigating this issue. I haven't found a solution, but I understand where it comes from.

File pathauto.inc, line 202:

$words_removed = $cache['ignore_words_callback']($cache['ignore_words_regex'], '', $output);

This is where the words that should be ignored in the path are removed. Depending on the condition on line 163, the replacement is done by either mb_eregi_replace or preg_replace. Both functions handle UTF-8 text incorrectly here.
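
The effect can be reproduced outside Drupal with a minimal sketch. It assumes the ignore-words pattern boils down to something like \ba\b applied case-insensitively and without the "u" modifier; the pattern pathauto actually builds may differ, and the variable names are only illustrative.

```php
<?php
// Hypothetical reproduction: the pattern is an assumption, not copied from pathauto.inc.
$output = "franța"; // must be UTF-8; "ț" is then the two bytes 0xC8 0x9B

// Without the "u" modifier PCRE matches byte by byte. In the default C locale
// the bytes of "ț" are not word characters, so \b matches between "ț" and the
// final "a", and that "a" looks like a standalone word.
$byte_regex = '/\ba\b/i';
$words_removed = preg_replace($byte_regex, '', $output);

// The "a" inside "fran" is surrounded by ASCII letters and is left alone.
echo $words_removed, "\n";
```

This would also explain why the ț survives while the a does not: the ț is never matched as a word, it only acts as a word boundary.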

This issue could be solved by rewriting the "ignore_words_regex" regular expression so that it supports UTF-8.
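
One possible direction for the preg_replace branch, sketched under assumptions: add the "u" modifier so that PCRE treats pattern and subject as UTF-8 and, in PHP builds where "u" also enables Unicode character properties, \b no longer sees a word boundary next to ț. The $ignore_words list and the way the pattern is assembled here are hypothetical, and the sketch does not cover the mb_eregi_replace branch.

```php
<?php
// Sketch only: $ignore_words and the pattern construction are assumptions,
// not the code from pathauto.inc.
$ignore_words = array('a', 'an', 'the');

// Escape each word so that characters with a special regex meaning are
// matched literally, then join the words as alternatives.
$alternatives = implode('|', array_map(function ($word) {
  return preg_quote($word, '/');
}, $ignore_words));

// "i" = case-insensitive, "u" = treat pattern and subject as UTF-8.
$ignore_words_regex = '/\b(' . $alternatives . ')\b/iu';

// "ț" now counts as a letter, so the final "a" is part of the word.
$kept = preg_replace($ignore_words_regex, '', "franța");

// A real standalone "a" is still removed.
$removed = preg_replace($ignore_words_regex, '', "a franța");

echo $kept, "\n", $removed, "\n";
```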

An additional workaround that does not require fixing the regular expression: remove "a" from the list of ignored words.

dureaghin’s picture

In my case it was "*"

Dave Reid’s picture

Status: Needs work » Active

No patch to review here.