Problem/Motivation

Characters with Māori macrons are stripped from ids, so the heading `Tā mātou` becomes `t-mtou`. It would be best to allow the macrons through, but convertStringToId() uses the regex class \w to strip every character outside the basic Latin (ASCII) range, so ToC API would need to offer a per-site preference for convertStringToId().

Steps to reproduce

Create content with the heading `<h2>Tā mātou</h2>`. Apply the ToC filter. Observe that the generated anchor id is `t-mtou`.
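The steps above can be imitated outside Drupal with a minimal sketch. It is written in Python for illustration only; `convert_string_to_id` is a hypothetical stand-in and not the module's actual convertStringToId() code. It assumes the function lowercases the heading, turns whitespace into hyphens, and then drops everything that an ASCII-only \w does not match.

```python
import re


def convert_string_to_id(text: str) -> str:
    """Hypothetical stand-in for convertStringToId(); not the module's code."""
    text = text.strip().lower()
    # Collapse runs of whitespace into a single hyphen.
    text = re.sub(r"\s+", "-", text)
    # re.ASCII restricts \w to [a-zA-Z0-9_], so characters such as
    # U+0101 (a with macron) are removed rather than kept or mapped.
    return re.sub(r"[^\w-]", "", text, flags=re.ASCII)


# "T\u0101 m\u0101tou" is the heading from the report, written with escapes
# so the precomposed macron characters are unambiguous.
print(convert_string_to_id("T\u0101 m\u0101tou"))
```

Under those assumptions the macron vowels vanish entirely, which matches the id reported in this issue.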

Proposed resolution

Refactor convertStringToId() to support a per-site preference that allows specific non-Latin characters to remain. This would likely entail revising the use of regex \w. A less desirable workaround is to map characters with macrons to their plain Latin equivalents, but this can change the meaning of a word, so it is not ideal. A patch for the workaround is offered below.
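Both options can be sketched side by side. This is an illustrative Python sketch, not the attached patch and not the module's API: the names `macrons_to_latin`, `to_anchor_id`, and the `allowed_extra` parameter are hypothetical. It assumes the workaround can be expressed by decomposing each character and removing the combining macron (U+0304), and that a per-site preference could be passed in as a string of characters to keep.

```python
import re
import unicodedata

COMBINING_MACRON = "\u0304"


def macrons_to_latin(text: str) -> str:
    """Workaround: replace macron vowels with their plain Latin base letter."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(COMBINING_MACRON, ""))


def to_anchor_id(text: str, allowed_extra: str = "") -> str:
    """Build an id; characters listed in allowed_extra (hypothetical site
    preference) are kept, other macron characters are mapped to Latin."""
    text = unicodedata.normalize("NFC", text.strip().lower())
    text = "".join(
        ch if ch in allowed_extra else macrons_to_latin(ch) for ch in text
    )
    text = re.sub(r"\s+", "-", text)
    # The hyphen goes last in the class so it is never read as a range.
    keep = "a-z0-9_" + re.escape(allowed_extra) + "-"
    return re.sub(f"[^{keep}]", "", text)


heading = "T\u0101 m\u0101tou"
maori_vowels = "\u0101\u0113\u012b\u014d\u016b"
print(to_anchor_id(heading))                # workaround: macrons mapped
print(to_anchor_id(heading, maori_vowels))  # preference: macrons kept
```

With an empty `allowed_extra` the sketch behaves like the mapping workaround; passing the macron vowels models a site that opts in to keeping them.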

Remaining tasks

Implement site preference described above.

User interface changes

Add settings form for per-site preference.

API changes

Unsure.

Data model changes

Would need to save site preference in config.

Comments

jonathan_hunt created an issue. See original summary.

jonathan_hunt commented:

Title: Map macron characters to latin instead of excluding » Per-site preference for non-Latin characters in anchor ids
jonathan_hunt commented:

Attached file: status new, size 2.38 KB

Patch to map macrons to Latin as a workaround until site preferences for non-Latin characters can be implemented.

rosk0 commented:

Status: Active » Reviewed & tested by the community

I believe the implementation currently suggested in the patch is the best way forward. Mozilla suggests using only ASCII characters in the id attribute value, and I completely support this recommendation:

Note: Technically, the value for an id attribute may contain any character, except whitespace characters. However, to avoid inadvertent errors, only ASCII letters, digits, '_', and '-' should be used and the value for an id attribute should start with a letter.
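The quoted recommendation can be expressed as a simple check. This is a minimal Python sketch for illustration; `is_conservative_id` is a hypothetical helper and not part of ToC API. It accepts only values that start with an ASCII letter and continue with ASCII letters, digits, '_', or '-'.

```python
import re

# Starts with an ASCII letter, then ASCII letters, digits, '_' or '-'.
CONSERVATIVE_ID = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_conservative_id(value: str) -> bool:
    """Return True if value follows the quoted ASCII-only recommendation."""
    return CONSERVATIVE_ID.fullmatch(value) is not None


print(is_conservative_id("ta-matou"))
print(is_conservative_id("t\u0101-m\u0101tou"))
```

An id produced by mapping macrons to Latin passes this check, while an id that keeps the macron vowels does not, even though the note itself says such an id is technically permitted.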

vladimiraus commented:

Thank you! Released and committed! 🥂

vladimiraus commented:

Status: Reviewed & tested by the community » Fixed

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.