Since PHP 7 the preg engine uses a just in time compiler. This one can cause strange issues:
https://bugs.php.net/bug.php?id=70110

When a YAML file contains a value that consists of a lot of newlines, we're affected.

We've been affected by the PHP bug until Drupal 8.4, where the symfony components have been updated to a newer version in #2712647: Update Symfony components to ~3.2. As we want to ensure we'll not encounter the same problem again this issue is only about adding the test that used to fail in Drupal 8.3.x. The problem itself has been fixed in Symfony.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

mkalkbrenner created an issue. See original summary.

mkalkbrenner’s picture

mkalkbrenner’s picture

mkalkbrenner’s picture

Status: Needs review » Active
mkalkbrenner’s picture

mkalkbrenner’s picture

Status: Active » Needs review
mkalkbrenner’s picture

Hooray, the expected fail on PHP 7 when the amount of newlines gets too much.

neclimdul’s picture

can't say I'm excited about turning of jit. is there any related upstream symfony discussion?

mkalkbrenner’s picture

I considered creating a pull request for symfony as well. But I found that one which seems to be a better solution: https://github.com/symfony/symfony/pull/19750

Nevertheless the search_api_solr_multilingual requires a solution. We have language and domain specific word lists which break the jit limit. These lists are coming from libre office. Not using configuration management for that will be a major drawback.

neclimdul’s picture

I don't know how you're configuring things but it seems like it might work better for you to store them as a yaml sequence(array) instead. you should get native datastructures saving you a step of exploding the list on \n and should avoid this problem.

mkalkbrenner’s picture

I don't know how you're configuring things but it seems like it might work better for you to store them as a yaml sequence(array) instead. you should get native datastructures saving you a step of exploding the list on \n and should avoid this problem.

That should be discussed in the Search API Solr Multilingual issue queue. But we had that discussion already and selected the string with newlines for good reasons, for example copy and paste stuff from libre office or Solr configs.
BTW we don't have trouble with the initial import as we use the literal style, indicated by the pipe "|".

cspitzlay’s picture

BTW we don't have trouble with the initial import as we use the literal style, indicated by the pipe "|".

... which does not seem to be supported by the Symfony YAML dumper. Which is sad as it would not only circumvent the jit stack limit issue (#2792843: Drupal crashes after exporting and re-importing Solr field configs) but also be more readable in our use case (noun lists in language-specific Solr field configuration).

neclimdul’s picture

UI shouldn't dictate storage

cspitzlay’s picture

What UI are you talking about? I was talking about the file on disk when viewed in a text editor or when diffed by a VCS system after a roundtrip through Dupal (config import and then config export).

The files for newly added languages are typically compiled manually from external sources that are plain text.
Currently there isn't even a GUI to edit them in the D8 version of search_api_solr_multilingual.

But I agree with mkalkbrenner that a discussion, if any, should better take place in Search API Solr Multilingual.

The point of this issue here is that valid Yaml cannot be parsed.

neclimdul’s picture

I'm not trying to discount your problem only trying to help offer workarounds since this is a pretty far edge case of the YAML parser. I'm also interested in getting it fixed which is why I tagged it a contrib blocker to get more eyes on it. I'm sorry.

cspitzlay’s picture

Don't be. Thanks for your effort.
I just wanted to give some background why the original files
contain large lists of newline-separated data
(which the Yaml parser can read just fine).

andypost’s picture

andypost’s picture

andypost’s picture

alexpott’s picture

The problem here is that pcre.jit has been implemented to improve regular expression performance and Symfony's yaml parser makes heavy use of regular expressions.

tstoeckler’s picture

Maybe we can disable JIT only conditionally? I'm not sure if there's a reliable way to check for this error, but looking at https://bugs.php.net/bug.php?id=70110 it seems that it only affects long strings. Maybe we can do something like strlen($string) > 1000 so we can still get the performance benefit for most files?

hchonov’s picture

@tstoeckler this would not work as with JIT you might first encounter a string which length is under 1000 characters but later after the pattern is compiled you might encounter a bigger string. And also it is not only the length of the string that matters but also how complicated it is, which means it might pass with a string which length is above 1000 characters but fail with a string which length is under 1000 characters.

mkalkbrenner’s picture

Maybe we can disable JIT only conditionally? I'm not sure if there's a reliable way to check for this error, but looking at https://bugs.php.net/bug.php?id=70110 it seems that it only affects long strings. Maybe we can do something like strlen($string) > 1000 so we can still get the performance benefit for most files?

Maybe that's a good approach. But as demonstrated in the tests we should not use the length of a string but the amount of newlines:

+++ b/core/tests/Drupal/Tests/Component/Serialization/YamlSymfonyTest.php
@@ -62,4 +62,20 @@ public function testError() {
+  public function testNewlines() {
+    $string = str_repeat('\n', 2700);
+    $yaml = 'string: "' . $string . '"';
+    $config = YamlSymfony::decode($yaml);
+    $this->assertEquals(str_repeat("\n", 2700), $config['string']);
+
+    $string = str_repeat('\n', 2800);
+    $yaml = 'string: "' . $string . '"';
+    $config = YamlSymfony::decode($yaml);
+    $this->assertEquals(str_repeat("\n", 2800), $config['string']);
+  }
mkalkbrenner’s picture

OK, there're also other issues with "complicated" long strings. So maybe a simple strlen() is a good trade-off.

mkalkbrenner’s picture

@cspitzlay just reminded me that we can't have such a condition at all.
If the YAML parser passes some short strings using the jit compiler, the pattern is already "cached". Once it is cached turning of jit for long strings has no effect anymore.

tstoeckler’s picture

Issue tags: +Needs benchmarks

So I guess this also means that theoretically someone could use preg_match() with the same pattern that Symfony's Yaml so that it will be compiled already and then even your patch would not work. I guess that's pretty far-fetched but still worth considering IMO.

I thought whether instead of the current patch - or perhaps in addition to it - we could add a hook_requirements() warning about this. We could check if people are using the Symfony implementation and if so check if they have pecl.jit on. And if so, recommend either turning it off or installing PECL YAML.

Regarding the perfomance impact of turning off JIT: Since we now actually support using PECL YAML for decoding, perhaps it's acceptable do slightly slow down Symfony's YAML decoding as people who are performance conscious will use the PECL implementation anyway? Tentatively tagging "Needs benchmarks" as I think it would be good to know what kind of performance tradeoffs we are talking about here.

cilefen’s picture

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

jhedstrom’s picture

Status: Needs review » Needs work
Issue tags: +Needs issue summary update

Patch no longer applying, and the IS could use updating to describe what's left to be done here.

jofitz’s picture

Status: Needs work » Needs review
FileSize
2.59 KB

Re-rolled.

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

Anonymous’s picture

Cann't reproduce this bug anymore. Perhaps after https://github.com/symfony/symfony/pull/19750.

hchonov’s picture

@vaplas, the referenced GitHub pull request has been closed without merging it.

Anonymous’s picture

@hchonov, thanks for correction, indeed. But test passes without fix now.

hchonov’s picture

Issue summary: View changes
Status: Needs review » Reviewed & tested by the community
Issue tags: -Contributed project blocker, -Needs benchmarks, -Needs issue summary update

The test that used to fail on 8.3.x indeed doesn't fail anymore on 8.4.x and later. It is not because of a PHP bug fix either as we see it passes on all the current versions - 7, 7.1 and 7.2.
The symfony components have been updated in 8.4.x to a newer version in #2712647: Update Symfony components to ~3.2, so most probably this is the reason for the test passing now.

Let's still get in the the test from #38, as the PHP bug will not be fixed soon. RTBCing it for that.

@vaplas++

hchonov’s picture

Priority: Major » Normal

Oh and this is not major anymore.

mkalkbrenner’s picture

I did a quick test and can confirm that exported Solr Field Type configs can be re-imported now.

alexpott’s picture

How about filing a PR to add this to Symfony's Yaml test suite? I'm not sure we should be added tests in our code for bugs in our dependencies.

neclimdul’s picture

That is a good point. Technically we're testing our wrapper but since its not testing any code we've actually implemented it does feel like something for their test suite.

alexpott’s picture

Status: Reviewed & tested by the community » Closed (outdated)
hchonov’s picture

@alexpott++, thank you for the findings.