Support replacement for more advanced strings (e.g. "token-style" string patterns) [#2821645]

I'm currently looking for a way to support tokens in node titles. Unfortunately, it seems any non-fieldable entity is out of luck, so I'm exploring Wordfilter for this job. Plain string replacement works great, but a token-style string isn't well supported by preg_replace(), likely because of the square brackets (e.g. [my-custom:token]).

Would it be an option to improve how the 'words' to filter are parsed, so that special characters such as brackets are accounted for? E.g.

>>> $word = '[my-custom:token]';
=> "[my-custom:token]"
>>> echo $word = preg_replace('/\[[^]]+\]/', 'My custom token', $word);
My custom token

Comments

Comment #1

24 October 2016 at 10:07

anavarre created an issue. See original summary.

Comment #2

anavarre

French

🇪🇺

CreditAttribution: anavarre at Acquia commented 24 October 2016 at 11:38

Issue summary:

View changes

Comment #3

mxh

German

Offenburg

CreditAttribution: mxh commented 24 October 2016 at 21:54

Parent issue:

» #2820128: Improve default filtering process

The regex is currently implemented as follows (see WordfilterLibrary::filterWords()):

<?php
$filter_pattern = '/\b(' . implode('|', $filter_words) . ')\b/i';
$substitute = $wordfilter_config->getSubstitute();

$filtered_text = preg_replace($filter_pattern, $substitute, $text);
?>

The word boundary \b does its job pretty well for "commonly spoken" words, since it is defined in terms of \w, but it doesn't fit "enhanced words" like tokens. Tokens contain non-word characters like [ and :, and \b only matches between a \w character and a non-\w character, so a pattern that starts or ends with a bracket won't match where you would expect it to.
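A small sketch of the \b behaviour described above. It assumes the filter word is escaped with preg_quote() first, which the snippet above does not do, and the sample texts are made up for illustration.

```php
<?php
// Hypothetical filter word; escaped so the brackets are matched literally.
$word = '[my-custom:token]';
$pattern = '/\b(' . preg_quote($word, '/') . ')\b/i';

// Surrounded by spaces: there is no \w character next to "[" or "]",
// so \b cannot match on either side and the token is not found.
$in_sentence = preg_match($pattern, 'Title with [my-custom:token] here');

// Glued between word characters: \b matches on both sides.
$glued = preg_match($pattern, 'Title with x[my-custom:token]y here');

var_dump($in_sentence); // int(0)
var_dump($glued);       // int(1)
```

In other words, with \b the token is only found in the one situation where a user would least expect it to be.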

To be able to use strings like [my:custom-token] as well as *another-example*, the currently used regex must be changed. Maybe we could replace \b by a custom word boundary implementation?
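One possible shape for such a custom word boundary, as a rough sketch only: replace \b with the lookarounds (?<!\w) and (?!\w), and escape each word with preg_quote(). The function name build_filter_pattern() and the sample text are hypothetical and not part of Wordfilter.

```php
<?php
/**
 * Hypothetical helper: builds one pattern for all filter words,
 * using lookarounds instead of \b as the "word boundary".
 */
function build_filter_pattern(array $filter_words): string {
  // Escape regex metacharacters such as [ ] * : - in every word.
  $quoted = array_map(function ($word) {
    return preg_quote($word, '/');
  }, $filter_words);
  // (?<!\w): not preceded by a word character.
  // (?!\w): not followed by a word character.
  return '/(?<!\w)(' . implode('|', $quoted) . ')(?!\w)/i';
}

$pattern = build_filter_pattern(['[my:custom-token]', '*another-example*', 'foo']);
$text = 'A [my:custom-token], an *another-example* and foo, but not foobar.';
$filtered = preg_replace($pattern, '****', $text);

echo $filtered, PHP_EOL;
// A ****, an **** and ****, but not foobar.
```

The trade-off of this variant is that a token glued directly to a word character (e.g. x[my:custom-token]) is left alone, which may or may not be what is wanted.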

Comment #4

mxh

CreditAttribution: mxh commented 25 October 2016 at 07:35

For clarification: The Wordfilter module won't do anything with Drupal Tokens. You can only set one substitution text per configuration by design, which itself is a static, user-defined string. You won't be able to embed dynamically generated tokens like [node:nid].

Comment #5

mxh

CreditAttribution: mxh commented 25 October 2016 at 07:35

Title:

Support replacement for more advanced strings (e.g. tokens)

» Support replacement for more advanced strings (e.g. "token-style" string patterns)

A little renaming to "token-style" string patterns to reduce possible confusion on this issue.

Comment #6

anavarre

CreditAttribution: anavarre at Acquia commented 25 October 2016 at 09:57

> Maybe we could replace \b by a custom word boundary implementation?

So, let me ask: could that be a user-facing configuration form field per Wordfilter configuration? Users would thus have access to an advanced form setting to really account for more complex strings.

It could also be only accessible to advanced users, for instance with a new permission.

Comment #7

mxh

CreditAttribution: mxh commented 25 October 2016 at 12:46

> So, let me ask: could that be a user-facing configuration form field per Wordfilter configuration? Users would thus have access to an advanced form setting to really account for more complex strings.

Issue #2820131: Allow the usage of different algorithms and services for the filtering process contains the plan for letting the user choose between different filtering algorithms.

> It could also be only accessible to advanced users, for instance with a new permission.

At the beginning, I'd like to keep the complexity of the module at a lower level. Adding further permissions would be useful e.g. when Wordfilter itself becomes a "big" suite for filtering methods and options.

I have changed the filtering process in the current dev version so that word boundaries are now only used for 'naturally and commonly spoken' words. A token-style word like [my:custom] should now be replaceable as well. Please test the current dev version for yourself and let me know whether this is more useful to you. But as previously said, Wordfilter won't process real tokens by design*.
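As a rough illustration of that idea (not the code committed to the dev version, which may differ), a pattern builder could add \b only on the sides where a filter word starts or ends with a \w character. The function name filter_pattern_for() and the sample words are hypothetical.

```php
<?php
/**
 * Hypothetical helper: one pattern per filter word, with \b applied
 * only where the word itself begins or ends with a word character.
 */
function filter_pattern_for(string $word): string {
  $quoted = preg_quote($word, '/');
  // "Natural" words get boundaries; token-style edges like [ or ] do not.
  $prefix = preg_match('/^\w/', $word) ? '\b' : '';
  $suffix = preg_match('/\w$/', $word) ? '\b' : '';
  return '/' . $prefix . $quoted . $suffix . '/i';
}

$text = 'Title with [my:custom] and some Plain Words here';
foreach (['[my:custom]', 'plain words'] as $word) {
  $text = preg_replace(filter_pattern_for($word), '****', $text);
}

echo $text, PHP_EOL;
// Title with **** and some **** here
```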

* When #2820131: Allow the usage of different algorithms and services for the filtering process is fixed, it should be possible to add a new filtering algorithm which will use token_replace() though. The user may then select this algorithm at the Wordfilter configuration.

Comment #8

anavarre

CreditAttribution: anavarre at Acquia commented 25 October 2016 at 13:01

> I have changed the filtering process in the current dev version so that word boundaries are now only used for 'naturally and commonly spoken' words. A token-style word like [my:custom] should now be replaceable as well. Please test the current dev version for yourself and let me know whether this is more useful to you.

Hey, thanks for jumping on this so quickly! This worked great in my testing. Although it seems you cannot enter more than one 'word' to filter when a token-style string is involved (e.g. with [my-custom:token], My Text only the token-style string is replaced, not 'My Text'), it does exactly what I had hoped for.

> But as previously said, Wordfilter won't process real tokens by design

Totally understood, and that was not my expectation anyway. I really only wanted Wordfilter to parse a string (albeit a non-standard one) and render it with an arbitrary text replacement. As far as I'm concerned, your latest commits make it work as anticipated and I'm thrilled. Thank you.

Comment #9

mxh

CreditAttribution: mxh commented 25 October 2016 at 13:19

> Although it seems you cannot enter more than one 'word' to filter when a token-style string is involved (e.g. with [my-custom:token], My Text only the token-style string is replaced, not 'My Text')

That's a little strange, because it works on my installation. For example, with my filter words defined as [my-test:string], Other Test, a node title containing the text [my-test:string] and some other test ! is rendered, as expected, into **** and some **** !.

I'm glad I could help. This feature addition seems to make sense and will be included in the upcoming alpha4 release.