Problem/Motivation

We are using this module to migrate some Taiwanese content. Paragraphs appear to be getting stripped in the stripCmsLegacyMarkup helper function, specifically by the code that removes empty p tags.

Steps to reproduce

This shows the regex failing:

https://regex101.com/r/b3TV62/1

As you can see, it matches a legitimate, non-empty paragraph.

Proposed resolution

If we match with Unicode enabled, the regex appears to behave as desired:

https://regex101.com/r/qVLxpV/1
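
The difference between byte-wise and Unicode-aware matching can be sketched in Python, as an analogy rather than the module's PHP code. The pattern below is hypothetical (it is not the regex used in stripCmsLegacyMarkup): it treats a paragraph as empty when it contains only non-word characters. Matched byte by byte, the UTF-8 bytes of Chinese text all count as non-word, so a real paragraph looks empty. Matched as Unicode text, the same paragraph is correctly seen as non-empty.

```python
import re

# Hypothetical "empty paragraph" patterns, for illustration only.
# They are NOT the patterns used by the module.
BYTE_PATTERN = re.compile(rb"<p>\W*</p>")  # bytes pattern: \W is ASCII-only
TEXT_PATTERN = re.compile(r"<p>\W*</p>")   # str pattern: \W is Unicode-aware


def is_empty_bytewise(html: str) -> bool:
    """Check for an 'empty' paragraph by matching against raw UTF-8 bytes."""
    return BYTE_PATTERN.search(html.encode("utf-8")) is not None


def is_empty_unicode(html: str) -> bool:
    """Check for an 'empty' paragraph by matching against decoded text."""
    return TEXT_PATTERN.search(html) is not None


paragraph = "<p>你好</p>"
print(is_empty_bytewise(paragraph))  # True: every UTF-8 byte is "non-word"
print(is_empty_unicode(paragraph))   # False: the characters are word characters
```

In PHP, the comparable switch is the u pattern modifier on the preg_* functions, which makes PCRE treat the pattern and subject as UTF-8.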

Remaining tasks

Adjust the empty p tag check (and the strong tag check) to use Unicode matching.

User interface changes

None

API changes

None

Data model changes

None

Comments

AndyThornton created an issue.

AndyThornton

Status: Active » Needs review
AndyThornton

Title: Unicode issue in strip cms legacy markup Primary tabs View(active tab) » Unicode issue in strip cms legacy markup Primary tabs

  • swirt committed 8ed1bcd on 8.x-2.x
    Issue #3310741 by AndyThornton: Unicode issue in strip cms legacy markup...
swirt

Status: Needs review » Fixed

Thank you for this contribution. It is part of 8.x-2.5.

swirt

Status: Fixed » Closed (fixed)