Overview

The canvas module overrides the property definitions for the string_long field type in \Drupal\canvas\Plugin\Field\FieldTypeOverride\StringLongItemOverride.

It currently adds a RegexConstraint to the value property with the pattern /(.|\r?\n)*/ to match any string, including newlines.

When saving entities with large amounts of text (e.g., 10,000+ characters) in a string_long field, this specific regex pattern causes PHP's PCRE engine to hit the JIT stack limit (PREG_JIT_STACKLIMIT_ERROR, code 6). This happens because the capturing alternation group (.|\r?\n) combined with the * quantifier forces the engine to store backtracking state for every repetition of the group, so stack usage grows with the length of the input until the JIT stack is exhausted.

When the JIT limit is hit, the regex execution fails, which causes the field validation to fail. This results in the user seeing a generic "This value is not valid" validation error when trying to save the entity, with no clear indication in the logs as to why.
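To illustrate why the failure is silent, here is a minimal sketch in plain PHP of how a caller can tell a genuine non-match apart from an engine failure. The helper name describe_regex_result() is hypothetical and is not part of canvas, Drupal, or Symfony; it assumes PHP 8.0+ for preg_last_error_msg(). A caller that only checks whether preg_match() returned a truthy value cannot see this difference.

```php
<?php

/**
 * Hypothetical helper: classify the outcome of a single preg_match() call.
 */
function describe_regex_result(string $pattern, string $subject): string {
  $result = preg_match($pattern, $subject);
  if ($result === 1) {
    return 'match';
  }
  if ($result === 0) {
    return 'no match';
  }
  // preg_match() returned false: the engine gave up before reaching a verdict.
  return sprintf('engine error %d: %s', preg_last_error(), preg_last_error_msg());
}
```

Calling describe_regex_result('/(.|\r?\n)*/', $long_string) on an affected setup should land in the last branch rather than in 'no match', which is the detail the generic validation message hides.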

Steps to reproduce

Install and enable the canvas module.
Create a node type with a string_long field (e.g., a plain long text field).
Attempt to save a node with a very large string in this field (e.g., 15,000+ characters).
The entity save fails with a validation error on the field: "This value is not valid."
(Developer side) Manually running preg_match('/(.|\r?\n)*/', $long_string) returns false and preg_last_error() returns 6 (PREG_JIT_STACKLIMIT_ERROR).
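The developer-side check can be run as a standalone script along these lines. The input (20,000 "a" characters with a newline every 80 characters) is an assumption chosen for illustration, and the outcome depends on the PHP build, the pcre.jit setting, and the JIT stack size, so the comments describe what to look for rather than guaranteed output. preg_last_error_msg() requires PHP 8.0+.

```php
<?php

// Assumed test input: 20,000 characters, one newline after every 80 characters.
$long_string = rtrim(chunk_split(str_repeat('a', 20000), 80, "\n"));

$result = preg_match('/(.|\r?\n)*/', $long_string);

// false here means the engine aborted; 1 means the pattern matched.
var_dump($result);
// Compare against PREG_JIT_STACKLIMIT_ERROR (6) if the match aborted.
var_dump(preg_last_error());
var_dump(preg_last_error_msg());
// Shows whether the PCRE JIT compiler is enabled for this run.
var_dump(ini_get('pcre.jit'));
```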

Proposed resolution

Change the regex pattern to be more efficient. Since the goal of the regex seems to be matching any string including newlines, the alternation group can be replaced with a plain .* and the /s modifier (PCRE_DOTALL), which makes the dot . match all characters including newlines.
Change:

$properties['value']->addConstraint('Regex', [
  'pattern' => '/(.|\r?\n)*/',
]);

To:

$properties['value']->addConstraint('Regex', [
  'pattern' => '/.*/s',
]);

This removes the repeated group, so the engine no longer accumulates per-repetition state and long strings validate without exhausting the JIT stack.

Comments

barakgalili created an issue. See original summary.

wim leers commented:

Component: … to be triaged » Shape matching
Priority: Normal » Critical
Issue tags: +Needs tests

Wow, nice find! Could you add a test that reproduces this failure? 🙏

The intent of the regex is to convey that it's a string that is allowed to contain newlines. I don't think .* conveys that.

We may need to convey this in an altogether different way. 🤔

barakgalili commented:

Attached file (status: new, size: 3.79 KB)

Thanks for taking a look! I completely agree that the intent of the regex should be clear to anyone reading the code that newlines are explicitly expected and allowed.

The problem with /(.|\r?\n)*/ is that putting an alternation group | (with an optional \r inside it) inside a greedy * quantifier makes PHP's PCRE engine keep backtracking state for every repetition of the group. When evaluating strings of 15,000+ characters, that state outgrows the JIT stack and the match aborts with "JIT stack limit reached" (PREG_JIT_STACKLIMIT_ERROR, code 6).

To preserve the explicit intent while eliminating the stack limit crash, I have updated the patch to use /[\s\S]*/. This explicitly matches any whitespace character (which inherently includes \n and \r) or non-whitespace character. It conveys the exact same meaning but processes linearly, avoiding JIT stack accumulation.
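As a rough way to compare the two candidate patterns, the sketch below checks how much of a multi-line subject each one actually consumes. The helper matched_length() and the sample text are hypothetical and are not taken from the patch. Because none of these patterns is anchored, a successful preg_match() alone does not show that the whole string was matched, so the sketch inspects the matched text instead.

```php
<?php

/**
 * Hypothetical helper: length in bytes of the text matched by $pattern,
 * or NULL if there was no match or the engine reported an error.
 */
function matched_length(string $pattern, string $subject): ?int {
  if (preg_match($pattern, $subject, $matches) !== 1) {
    return NULL;
  }
  return strlen($matches[0]);
}

// Assumed input: 22,400 bytes mixing \r\n and \n line endings.
$subject = str_repeat("lorem ipsum\r\ndolor sit amet\n", 800);

foreach (['/.*/s', '/[\s\S]*/'] as $pattern) {
  $length = matched_length($pattern, $subject);
  printf("%-12s matched %s of %d bytes\n", $pattern, $length ?? 'nothing', strlen($subject));
}
```

On a short subject such as "a\nb\r\nc", both patterns consume all 6 bytes, whereas /.*/ without the s modifier stops at the first newline.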

Per your request, I have also written a kernel test (StringLongItemOverrideTest.php) that reproduces the failure by attempting to validate a 20,000-character string containing newlines.

Without the patch: The test fails exactly as described in the issue because the PCRE match aborts with the JIT stack limit error.
With the patch: The test immediately succeeds.

I've attached the updated patch file including both the regex change and the new test. Let me know what you think!