I have an issue where our URL aliases are based on the node title, and our titles often contain dashes. Dashes are different from hyphens and come in two varieties: en dashes and em dashes.

Because there is currently no way to have Pathauto remove them through its settings, I was getting URLs like this:

Node Title:

Downloads – Install additional software

Pathauto URL alias:

http://www.root-of-site.com/software-requirements/downloads-–-install-additional-software

Notice that the En dash is surrounded by two hyphens here. Not a desirable result.

I have created a patch that adds both en and em dashes to the list of punctuation that can be removed from URLs in the settings. The resulting URL is this:

http://www.root-of-site.com/software-requirements/downloads-install-additional-software
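
To illustrate why the dash ends up between two hyphens, here is a minimal sketch of the general clean-up idea: strip the listed punctuation, then turn whitespace into the separator. It is a simplification under my own assumptions, not Pathauto's actual code, and sketch_clean_alias() and its parameters are hypothetical names.

```php
<?php
// Simplified illustration only; this is NOT Pathauto's real clean-up routine.
// sketch_clean_alias() and its parameters are hypothetical.
function sketch_clean_alias($title, array $remove, $separator = '-') {
  // Strip every character that is on the removal list.
  $text = str_replace($remove, '', $title);
  // Collapse each run of whitespace into a single separator.
  $text = preg_replace('/\s+/u', $separator, trim($text));
  // Lowercase the ASCII letters.
  return strtolower($text);
}

// "\xE2\x80\x93" is the UTF-8 byte sequence for the en dash (U+2013).
$title = "Downloads \xE2\x80\x93 Install additional software";

// Dash not on the removal list: it survives between two separators.
echo sketch_clean_alias($title, array()), "\n";
// downloads-–-install-additional-software

// En and em dash on the removal list: the spaces around the dash collapse.
echo sketch_clean_alias($title, array("\xE2\x80\x93", "\xE2\x80\x94")), "\n";
// downloads-install-additional-software
```

Removing the dash before replacing whitespace is what lets the two surrounding spaces collapse into a single hyphen.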

Please consider rolling this into dev.

File: pathauto-add-dashes-to-punctuation-list.patch (914 bytes, by jaydee1818)
Test result: PASSED: [SimpleTest]: [MySQL] 337 pass(es).

Comments

Toby Wild commented:

For those waiting for this to be included in the module, you can get the same result now by implementing Pathauto's alter hook in a custom module:

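/**
 * Implements hook_pathauto_punctuation_chars_alter().
 *
 * Replace MODULE with the machine name of your custom module.
 */
function MODULE_pathauto_punctuation_chars_alter(array &$punctuation) {
  // En dash (U+2013).
  $punctuation['ndash'] = array(
    'value' => '–',
    'name' => t('En Dash'),
  );
  // Em dash (U+2014).
  $punctuation['mdash'] = array(
    'value' => '—',
    'name' => t('Em Dash'),
  );
}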

Also, in case anyone runs into the same issue I had, make sure your text editor saves the file as UTF-8.
Notepad++ defaults to ANSI, which does not save these characters correctly.
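
One way to sidestep the editor encoding problem entirely is to write the dashes as byte escapes, so the source file stays pure ASCII. This is only a sketch of that idea; the variable names are my own, and the resulting strings could be used as the 'value' entries in the hook above.

```php
<?php
// Byte escapes keep the source file ASCII-only, so the editor's encoding
// setting can no longer corrupt the characters. Variable names are illustrative.
$en_dash = "\xE2\x80\x93"; // U+2013 EN DASH encoded as UTF-8
$em_dash = "\xE2\x80\x94"; // U+2014 EM DASH encoded as UTF-8

// Each dash occupies exactly three bytes when stored as UTF-8.
echo strlen($en_dash), ' ', bin2hex($en_dash), "\n"; // 3 e28093
echo strlen($em_dash), ' ', bin2hex($em_dash), "\n"; // 3 e28094
```

To check an existing file, run bin2hex() on the dash literal you typed. If the result is not e28093 (en dash) or e28094 (em dash), the file was probably not saved as UTF-8.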

dpovshed commented:

Status: Needs review » Reviewed & tested by the community

@jaydee1818, your patch is working fine for me, so I am changing the status of the issue.

However, for my own task I will use the hint from @Toby Wild to define even more characters, which end users in one of my projects love to use. My hook looks like this:

function MODULE_pathauto_punctuation_chars_alter(array &$punctuation) {
  $punctuation['ndash'] = array(
    'value' => '–',
    'name' => t('En Dash'),
  );
  $punctuation['mdash'] = array(
    'value' => '—',
    'name' => t('Em Dash'),
  );
  $punctuation['single_quota_open'] = array(
    'value' => '‘',
    'name' => t('Quotation Open'),
  );
  $punctuation['single_quota_close'] = array(
    'value' => '’',
    'name' => t('Quotation Close'),
  );
  $punctuation['double_quota_open'] = array(
    'value' => '“',
    'name' => t('Double Quotation Open'),
  );
  $punctuation['double_quota_close'] = array(
    'value' => '”',
    'name' => t('Double Quotation Close'),
  );
}

Thanks to both of you!

Toby Wild commented:

Fantastic, can't wait to see this released.

Content authors love their special characters in page titles, even though I keep telling them not to use them.

Dave Reid commented:

Status: Reviewed & tested by the community » Active

KeithC commented:

Hi,

This is causing issues (in particular with the Rate module) on a client's site.

Is this change likely to be included in a stable release any time soon?

Thanks

rdellis87 commented:

Thanks, jaydee1818. The patch appears to be working great for me.