It'd be nice to have support for replacing &nbsp; (the non-breaking space) with the separator, perhaps in the punctuation replacement section. Currently, using &nbsp; in a content title produces an automatic URL alias containing %C2%A0. Also, adding "&nbsp;", the literal non-breaking space character (U+00A0), or "%C2%A0" to the "STRINGS TO REMOVE" field fails to remove the character, and that option wouldn't replace it with the separator anyway.

Alternatively, always treating &nbsp; exactly the same as the standard space character would be a logical and probably simpler solution.
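
To illustrate the behavior described above, here is a minimal sketch in plain PHP. It is not Pathauto's actual code; example_nbsp_to_space() is a hypothetical helper, and the hyphen separator is an assumption. It shows why the alias ends up with %C2%A0 (the UTF-8 bytes of U+00A0 get percent-encoded) and what treating the character as a normal space would look like.

```php
<?php
// Hypothetical helper, not part of Pathauto: treat U+00A0 like an ordinary space.
function example_nbsp_to_space($title) {
  // "\xC2\xA0" is the UTF-8 byte sequence for U+00A0 NO-BREAK SPACE.
  return str_replace("\xC2\xA0", ' ', $title);
}

$title = "Hello\xC2\xA0world";

// Without normalization, the two bytes are percent-encoded in the URL.
echo rawurlencode($title), "\n"; // Hello%C2%A0world

// With normalization, the space can be swapped for the separator as usual.
echo str_replace(' ', '-', example_nbsp_to_space($title)), "\n"; // Hello-world
```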

Comments

calvinjuarez created an issue.

markdc commented:

I'm surprised this hasn't been addressed yet. Some content authors use non-breaking spaces in title fields to control how text is displayed. I agree that they should be handled as normal spaces.

While we're at it, all other "visual" spaces should be included:

  •  
  •  
  •  
  •  
  • U+0009 CHARACTER TABULATION
  • U+2004 THREE-PER-EM SPACE
  • U+2005 FOUR-PER-EM SPACE
  • U+2006 SIX-PER-EM SPACE
  • U+2007 FIGURE SPACE
  • U+2008 PUNCTUATION SPACE
  • U+200A HAIR SPACE
  • U+202F NARROW NO-BREAK SPACE
  • U+205F MEDIUM MATHEMATICAL SPACE
  • U+3000 IDEOGRAPHIC SPACE



ZERO WIDTH SPACE (U+200B) has no visual representation and should simply be removed, with no substitution option.
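
A rough sketch of that handling in plain PHP, assuming UTF-8 input. The function name example_normalize_spaces() is hypothetical and this is not how Pathauto currently works. It relies on PCRE's \p{Zs} class, which matches Unicode space separators (including the no-break, figure, hair, and ideographic spaces), plus the tab character.

```php
<?php
// Hypothetical helper: map "visual" spaces to a plain space, drop U+200B.
function example_normalize_spaces($text) {
  // ZERO WIDTH SPACE is invisible, so remove it without substitution.
  $text = preg_replace('/\x{200B}/u', '', $text);
  // \p{Zs} matches Unicode space separators; \t covers CHARACTER TABULATION.
  // Note: with the /u modifier, preg_replace() returns NULL on invalid UTF-8.
  return preg_replace('/[\p{Zs}\t]/u', ' ', $text);
}

// HAIR SPACE, ZERO WIDTH SPACE, IDEOGRAPHIC SPACE, and a tab.
echo example_normalize_spaces("a\xE2\x80\x8Ab\xE2\x80\x8Bc\xE3\x80\x80d\te"), "\n"; // a bc d e
```

After this step, every remaining space is an ordinary U+0020, so the usual separator replacement would apply to all of them.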

@calvinjuarez
Using the "Strings to remove" settings field does work if you paste the rendered character itself rather than its encoded form. True, it doesn't produce the desired effect of hyphenated strings, but it's better than seeing %C2%A0 in the URL.

Opening a separate issue for D8.

cdmo commented:

Here's a workaround with hook_pathauto_punctuation_chars_alter(), for example with non-breaking spaces:

/**
 * Alter the list of punctuation characters for Pathauto control.
 *
 * @param $punctuation
 *   An array of punctuation to be controlled by Pathauto during replacement
 *   keyed by punctuation name. Each punctuation record should be an array
 *   with the following key/value pairs:
 *   - value: The raw value of the punctuation mark.
 *   - name: The human-readable name of the punctuation mark. This must be
 *     translated using t() already.
 */
function my_module_pathauto_punctuation_chars_alter(array &$punctuation) {
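  // Register the non-breaking space (U+00A0) so Pathauto can act on it.
  // "\xC2\xA0" is its UTF-8 byte sequence; a pasted literal character is
  // easily converted to a normal space when the code is copied.
  $punctuation['nbsp'] = array('value' => "\xC2\xA0", 'name' => t('Non-breaking space'));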
}

Then you'd have to select "Separator" for "Non-breaking space" under "Punctuation" at /admin/config/search/path/settings.
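
The same workaround could presumably be extended to some of the other spaces listed earlier in the thread. The following is an untested sketch under that assumption: the array keys are hypothetical names, the values are the UTF-8 byte sequences of each character, and the t() stub exists only so the snippet runs outside Drupal.

```php
<?php
// Stub so the sketch runs outside Drupal; Drupal provides the real t().
if (!function_exists('t')) {
  function t($string) {
    return $string;
  }
}

/**
 * Implements hook_pathauto_punctuation_chars_alter().
 */
function my_module_pathauto_punctuation_chars_alter(array &$punctuation) {
  // U+00A0 NO-BREAK SPACE.
  $punctuation['nbsp'] = array('value' => "\xC2\xA0", 'name' => t('Non-breaking space'));
  // U+202F NARROW NO-BREAK SPACE.
  $punctuation['narrow_nbsp'] = array('value' => "\xE2\x80\xAF", 'name' => t('Narrow no-break space'));
  // U+3000 IDEOGRAPHIC SPACE.
  $punctuation['ideographic_space'] = array('value' => "\xE3\x80\x80", 'name' => t('Ideographic space'));
}
```

Each added entry would then need its own "Separator" choice on the same settings page.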