If you check the checkbox titled "Reduce strings to letters and numbers", whose help text reads "Filters the new alias to only letters and numbers found in the ASCII-96 set.", then your Punctuation settings are ignored: every punctuation character is replaced by the separator, because the pathauto_cleanstring() function contains:

  // Reduce strings to letters and numbers
  if ($cache['reduce_ascii']) {
    $output = preg_replace('/[^a-zA-Z0-9\/]+/', $cache['separator'], $output);
  }
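
To make the effect concrete, here is a minimal standalone sketch that runs only this one step outside Drupal. The `$cache` array and the sample string are stand-ins I made up for illustration; in Pathauto the real values come from the module settings, and other cleaning steps run around this one.

```php
<?php
// Hypothetical stand-in for the settings that pathauto_cleanstring() reads.
$cache = array('reduce_ascii' => TRUE, 'separator' => '-');

// Sample input that still contains an ampersand.
$output = 'fish & chips';

// Same replacement as in the excerpt above: every run of characters that is
// not a letter, a digit or a slash is collapsed into one separator.
if ($cache['reduce_ascii']) {
  $output = preg_replace('/[^a-zA-Z0-9\/]+/', $cache['separator'], $output);
}

print $output . "\n"; // prints "fish-chips"
```

The ampersand never survives this step, regardless of what the Punctuation settings say.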

My requirement was to replace & with the word "and". I could do that by altering the generated alias in hook_pathauto_alias_alter(), but only if the & is left in the alias so that my hook can pick it up. I would prefer to keep "Reduce strings to letters and numbers" checked so that non-ASCII characters are still replaced.
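
For reference, a rough sketch of the kind of hook implementation I mean. The module name `mymodule` and the sample alias are hypothetical, and it assumes the ampersand is still present in the alias when the hook runs, which is exactly what the code above prevents.

```php
<?php
/**
 * Implements hook_pathauto_alias_alter().
 *
 * Sketch only: "mymodule" is a placeholder module name.
 */
function mymodule_pathauto_alias_alter(&$alias, array &$context) {
  // Swap any ampersand that survived cleaning for the word "and".
  $alias = str_replace('&', 'and', $alias);
}

// Standalone usage outside Drupal, with a made-up alias.
$alias = 'fish-&-chips';
$context = array();
mymodule_pathauto_alias_alter($alias, $context);
print $alias . "\n"; // prints "fish-and-chips"
```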

Comments

intrafusion commented:

Thanks to Stack Overflow, I think the description on the options pages should be amended and the code above should be changed to:

  // Reduce strings to ASCII-96 characters
  if ($cache['reduce_ascii']) {
    $output = preg_replace('/[^\x20-\x7E]/', '', $output);
  }
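
A quick side-by-side of the two patterns, run in isolation on a made-up string (this is only a sketch, not the full pathauto_cleanstring() pipeline). Note that the proposed pattern removes characters outside the printable ASCII range instead of replacing them with the separator, and that it leaves spaces and punctuation for the other settings to handle.

```php
<?php
// Sample input: ASCII text with an ampersand, followed by the UTF-8 bytes
// of the trademark sign.
$input = "fish & chips\xE2\x84\xA2";

// Current pattern: anything that is not a letter, digit or slash becomes "-".
$current = preg_replace('/[^a-zA-Z0-9\/]+/', '-', $input);

// Proposed pattern: only bytes outside printable ASCII (0x20-0x7E) are removed.
$proposed = preg_replace('/[^\x20-\x7E]/', '', $input);

print $current . "\n";  // prints "fish-chips-"
print $proposed . "\n"; // prints "fish & chips"
```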

xpersonas commented:

I finally dug through and figured out why my punctuation settings are not working, and this was the culprit. I think the help text, at the very least, should probably be a little more clear?

I will say, finding this hook helped me a lot...

/**
 * Alter the list of punctuation characters for Pathauto control.
 *
 * @param $punctuation
 *   An array of punctuation to be controlled by Pathauto during replacement
 *   keyed by punctuation name. Each punctuation record should be an array
 *   with the following key/value pairs:
 *   - value: The raw value of the punctuation mark.
 *   - name: The human-readable name of the punctuation mark. This must be
 *     translated using t() already.
 */
function helpmate_pathauto_punctuation_chars_alter(array &$punctuation) {
  // Add the trademark symbol.
  $punctuation['trademark'] = array('value' => '™', 'name' => t('Trademark symbol'));
}

Just adding that in case someone else finds this thread on the same trail.

temkin commented:

Status: Active » Postponed (maintainer needs more info)

It doesn't look like any additional work is needed here, unless the reporter thinks otherwise. Changing the status to "Postponed (maintainer needs more info)" to get more details before this can be addressed.

intrafusion commented:

Status: Postponed (maintainer needs more info) » Active

No, this is a valid bug. If you want to use "Reduce strings to letters and numbers", the regular expression is incorrect: it also matches valid ASCII-96 characters, such as punctuation, and therefore replaces them with the separator.