
This project is not covered by Drupal’s security advisory policy.

UText module

Editors, copywriters, and other users differ in their HTML and Unicode skills. Some type text in a plain notepad, others use word processors or advanced publishing platforms, and anyone can copy and paste from outside sources.

As a result, pieces of simple UTF-8 text (in a site's database, for example) can vary widely in formatting and technical quality.

This module provides services, widgets, and formatters with plain-text filtering, based on the infoxy/utext library.

Installation

Use Composer to install the module with all its dependencies:

composer require drupal/utext

utext.plain service

This service normalizes strings in several ways: it can strip tags while inserting spaces, replace specific characters and patterns, normalize and trim spaces, and convert the text to a specified Unicode normalization form.
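
The kind of normalization described above can be sketched in plain PHP. This is only an illustration of the idea, not the module's or the infoxy/utext library's implementation; the function name `utext_sketch_plain()` is hypothetical, and the choice of NFC as the target form is an assumption.

```php
<?php
// Illustrative sketch only: NOT the utext.plain implementation.
// utext_sketch_plain() is a hypothetical name used for this example.
function utext_sketch_plain(string $text): string {
  // Insert a space before every "<" so neighbouring blocks do not fuse
  // together once the tags are removed.
  $text = str_replace('<', ' <', $text);
  // Remove HTML tags.
  $text = strip_tags($text);
  // Decode HTML entities to characters.
  $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
  // Replace no-break spaces with ordinary spaces.
  $text = str_replace("\u{00A0}", ' ', $text);
  // Collapse runs of whitespace into a single space; keep the input
  // unchanged if preg_replace() fails (for example on invalid UTF-8).
  $text = preg_replace('/\s+/u', ' ', $text) ?? $text;
  // Trim leading and trailing whitespace.
  $text = trim($text);
  // Convert to Unicode normalization form C when the intl extension exists.
  if (class_exists('Normalizer')) {
    $normalized = Normalizer::normalize($text, Normalizer::FORM_C);
    if ($normalized !== false) {
      $text = $normalized;
    }
  }
  return $text;
}

// Prints "Hello café world".
echo utext_sketch_plain('<p>Hello</p><p>caf&eacute;   world</p>'), "\n";
```

Without the space insertion in the first step, the two paragraphs would be joined as "Hellocafé world" after the tags are stripped.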

The following options show what the utext.plain service can do:

$options = [
    'filter_utf8' => t('Bypass only correct utf8 chars'),
    'newline_tags' => t('Insert "\r\n" before every "<" (Useful with strip tags)'),
    'strip_tags' => t('Strip tags'),
    'decode_entities' => t('Decode html entities to chars'),
    'lang_quotes' => t('Replace double quotes with language-specific ones'),
    'simplify_dashes' => t('Simplify hyphens and dashes to hyphen-minus, endash or emdash'),
    'shy_pattern' => t('Use soft-hyphen pattern \-'),
    'dash_patterns' => t('Use dash patterns -- and ---'),
    'replace_triple_dots' => t('Replace triple dots with ellipsis'),
    'replace_quotes' => t('Replace single and double quotes with curly quotes'),
    'replace_specials' => t('Replace special chars with safe ones'),
    'simplify_spaces' => t('Simplify spaces (Replace nbsp and spations)'),
    'collapse_spaces' => t('Replace sequence of whitespaces with single space'),
    'trim' => t('Trim leading and trailing whitespaces'),
    'trim_dots' => t('Trim leading and trailing dots'),
    'normalize' => t('Normalize'),
];

A more detailed description is available on the infoxy/utext/PlainFilter page.
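
To give a feel for how a set of enabled option keys could select filtering steps, here is a minimal plain-PHP sketch. It reuses four option names from the list above, but the helper `utext_sketch_apply()`, the fixed step order, and the exact behaviour of each step are assumptions made for illustration; the real service may work quite differently.

```php
<?php
// Hypothetical helper for illustration only; not part of drupal/utext.
function utext_sketch_apply(string $text, array $enabled): string {
  // Each option key maps to one small transformation. The order of this
  // array is the (assumed) order in which the steps run.
  $steps = [
    'replace_triple_dots' => fn(string $s): string => str_replace('...', "\u{2026}", $s),
    'collapse_spaces' => fn(string $s): string => preg_replace('/\s+/u', ' ', $s) ?? $s,
    'trim' => fn(string $s): string => trim($s),
    'trim_dots' => fn(string $s): string => trim($s, '.'),
  ];
  foreach ($steps as $name => $step) {
    // Run a step only when its option key is enabled.
    if (in_array($name, $enabled, TRUE)) {
      $text = $step($text);
    }
  }
  return $text;
}

// Prints "Wait… what".
echo utext_sketch_apply('  Wait...   what  ', ['replace_triple_dots', 'collapse_spaces', 'trim']), "\n";
```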

Dash patterns and entities

The patterns feature can be used in two main ways:

  • as a formatter option: patterns and entities are kept in the source text
  • as a widget option: the text is saved as a normalized string without patterns, and is converted to patterns and back on the fly.

If patterns are enabled, the sequences `\-`, `--`, and `---` are reserved for pattern use in the text.
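
The round trip between patterns and characters can be sketched as follows. The mapping used here (`\-` to soft hyphen, `--` to en dash, `---` to em dash) is an assumption inferred from the option labels, and the constant and function names are hypothetical; this is not the module's code.

```php
<?php
// Illustrative only. Assumed mapping, not confirmed by the module:
//   \-  => U+00AD soft hyphen
//   --  => U+2013 en dash
//   --- => U+2014 em dash
const UTEXT_SKETCH_PATTERNS = [
  '---' => "\u{2014}",
  '--' => "\u{2013}",
  '\\-' => "\u{00AD}",
];

// Turn patterns typed by the user into the real characters.
function utext_sketch_patterns_to_chars(string $text): string {
  // With an array argument, strtr() tries the longest keys first,
  // so "---" is matched before "--".
  return strtr($text, UTEXT_SKETCH_PATTERNS);
}

// Turn the stored characters back into editable patterns.
function utext_sketch_chars_to_patterns(string $text): string {
  return strtr($text, array_flip(UTEXT_SKETCH_PATTERNS));
}

$typed = '1990--1995 --- co\\-operation';
$stored = utext_sketch_patterns_to_chars($typed);
// The round trip gives back the text as typed.
var_dump(utext_sketch_chars_to_patterns($stored) === $typed); // bool(true)
```

Because the three sequences are consumed by the conversion, a literal `--` typed by the user cannot survive it, which is why they are reserved once patterns are enabled.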

Widgets

Filtering in widgets polishes input strings before they are saved, so saved strings always stay clean and normalized in some respects.

UtextPlainWidget also provides:

  • Optional lettercase (uppercase, lowercase, titlecase)
  • Optional Regex pattern to check final result (on server side, mb_strings)
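
A letter case conversion and a server-side regex check of this kind can be sketched with PHP's mbstring functions and a UTF-8 aware regular expression. The helper names are hypothetical, and using `preg_match()` with the `u` modifier is an assumption; the widget's actual validation code may differ.

```php
<?php
// Hypothetical helpers for illustration; requires the mbstring extension.
function utext_sketch_lettercase(string $text, string $mode): string {
  switch ($mode) {
    case 'uppercase':
      return mb_strtoupper($text, 'UTF-8');
    case 'lowercase':
      return mb_strtolower($text, 'UTF-8');
    case 'titlecase':
      return mb_convert_case($text, MB_CASE_TITLE, 'UTF-8');
    default:
      // Unknown mode: leave the text unchanged.
      return $text;
  }
}

// Check the final string against a pattern. The pattern should carry the
// "u" modifier so that PCRE treats pattern and subject as UTF-8.
function utext_sketch_matches(string $text, string $pattern): bool {
  return preg_match($pattern, $text) === 1;
}

$name = utext_sketch_lettercase('émile zola', 'titlecase');
// Words made of one uppercase letter followed by lowercase letters.
var_dump(utext_sketch_matches($name, '/^\p{Lu}\p{Ll}+( \p{Lu}\p{Ll}+)*$/u')); // bool(true)
```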

The widget supports special escaping that restores patterns before the data is passed to the widget and re-filters the strings, with closely similar results, every time the data comes back.

This enables on-the-fly use of patterns and HTML entities: patterns and entities are not saved in the resulting data (if the options are configured that way).

Formatters

Supported field types:

  • `string`, `string_long`: full string formatter
  • `text`, `text_long`, `text_with_summary`: trimmed or summary formatter

Entity decoding and pattern usage in formatters differ substantially from those in widgets: in formatters, the patterns must be present in the string data itself.
Widgets, by contrast, regenerate patterns on the fly from the characters in the string data.
