I am trying to add a node whose name is Franța. Using ț in the URL is not mandatory, but even transliteration would not fix this, because (I think) the site is in Romanian. What happens is that I get the ț in the URL but I can't get the final a. So instead of www.site.com/franța, I get www.site.com/franț.
Why is there a problem with the a and not with the ț?

Comments

simanjan

I've spent some time investigating this issue. I haven't found a solution, but I understand where it comes from.

File pathauto.inc, line 202:

$words_removed = $cache['ignore_words_callback']($cache['ignore_words_regex'], '', $output);

This is where words that should be ignored in the path are replaced. Depending on the condition on line 163, the replacement is done by either mb_eregi_replace or preg_replace. Both functions make the replacement incorrectly on UTF-8 text.

This issue could be solved by rewriting the "ignore_words_regex" regexp so that it supports UTF-8.

Additional workaround without fixing the regexp: remove "a" from the list of ignored words.
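
For anyone who wants to reproduce this outside Drupal, here is a minimal sketch of what I believe is going on. It is not the Pathauto code: the function names and the word list are made up for illustration, and I am assuming the ignore regexp has the form \b(word1|word2|...)\b. Without UTF-8 mode, preg_replace works on bytes, so the two bytes of ț count as non-word characters and the final "a" looks like a standalone word.

```php
<?php
// Hypothetical helpers for illustration only; these are not Pathauto functions.

// Build a word-boundary pattern of the assumed form \b(a|an|the)\b.
// $unicode appends the "u" modifier, which makes PCRE treat pattern and
// subject as UTF-8 instead of as raw bytes.
function build_ignore_regex(array $words, bool $unicode): string {
  $quoted = array_map(function ($word) {
    return preg_quote($word, '/');
  }, $words);
  return '/\b(' . implode('|', $quoted) . ')\b/i' . ($unicode ? 'u' : '');
}

// Remove every ignored word from $text using the pattern above.
function strip_ignored_words(string $text, array $words, bool $unicode): string {
  return preg_replace(build_ignore_regex($words, $unicode), '', $text);
}

$ignore = ['a', 'an', 'the']; // assumed sample list, not the real default

// Byte-wise matching: the bytes of "ț" are non-word characters, so \b
// matches right before the final "a" and it is stripped: prints "franț".
echo strip_ignored_words('franța', $ignore, false), "\n";

// UTF-8 mode: on a PHP build whose PCRE has Unicode property support,
// "ț" is a letter, there is no boundary before "a", and the word is
// expected to be left intact.
echo strip_ignored_words('franța', $ignore, true), "\n";
```

If this is right, it also explains why removing "a" from the ignored words hides the problem: the regexp itself is still byte-wise, there is just no short word left for it to match after a multibyte letter.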

dureaghin

In my case it was "*"

Dave Reid

Status: Needs work » Active

No patch to review here.

ben.kyriakou

Status: Active » Closed (fixed)

Testing in Pathauto 7.x-1.3, this now appears to be fixed and works as specified above. If you're still experiencing this issue, I'd recommend updating to the latest version of Pathauto.

Marking as Closed (fixed). If you're still encountering this issue in Pathauto 7.x-1.3 or above, please set this to Active with additional information to reproduce.