Problem/Motivation

\Drupal\Core\Plugin\Discovery\YamlDiscovery fails to handle empty files.

The result of the YAML-parsing is NULL for empty files but YamlDiscovery performs a foreach of the parsing result. (See line 50.)

Proposed resolution

Add a check for NULL and skip the file in this case.

Remaining tasks

User interface changes

API changes

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tstoeckler’s picture

Status: Active » Needs review
Issue tags: +Needs tests
FileSize
824 bytes
dawehner’s picture

+++ b/core/lib/Drupal/Core/Plugin/Discovery/YamlDiscovery.php
@@ -47,6 +47,12 @@ class YamlDiscovery implements DiscoveryInterface {
+      // If the file was empty, the parsed result will be NULL. An empty file
+      // is equivalent to providing no definitions at all.
+      if ($list === NULL) {
+        continue;
+      }
+

Should we mention that this can happen also in case you have just comments in your file? On addition we could fix this also in the component itself, so check in findAll() that the result is actually an array.

damiankloip’s picture

Yes, I was going to say the very same. Is this something we generally want. However, this will just make the whole key for x module disappear. Is that what we really want? Rather than just an empty array for x module that has an empty file?

tstoeckler’s picture

FileSize
758 bytes

I'm not 100% sure I understood #3 correctly, but I think this patch is what #2 and #3 are referring to.

I like the suggestion a bit better than #1 myself, so thanks for that! I tested that this also fixes the mentioned use-case.

Still needs tests, though.

damiankloip’s picture

FileSize
4.08 KB
2.35 KB

damiankloip queued 5: 2331407-5.patch for re-testing.

dawehner’s picture

Issue tags: -Needs tests
  1. +++ b/core/lib/Drupal/Component/Discovery/YamlDiscovery.php
    @@ -48,7 +48,9 @@ public function __construct($name, array $directories) {
    -      $all[$provider] = Yaml::decode(file_get_contents($file));
    +      // If a file is empty or its contents are commented out, return an empty
    +      // array instead of NULL for type consistency.
    +      $all[$provider] = Yaml::decode(file_get_contents($file)) ?: array();
    

    Mh, so according to Yaml::decode this could throw an exception, should we maybe catch here and convert it to an array? (just curious)

  2. +++ b/core/tests/Drupal/Tests/Component/Discovery/YamlDiscoveryTest.php
    @@ -9,6 +9,9 @@
    +use org\bovigo\vfs\vfsStream;
    +use org\bovigo\vfs\vfsStreamWrapper;
    +use org\bovigo\vfs\vfsStreamDirectory;
    

    <3

tstoeckler’s picture

Re #7: Actually the exceptions that are thrown are if either the file is not readable or if the YAML is invalid, (i.e. if you use tabs, or if you mix maps and sequences in a single array, etc.) which I think we should not silence, but which should surface, just as they do now.

damiankloip’s picture

Agreed, that sounds like the behaviour we want.

jhedstrom queued 5: 2331407-5.patch for re-testing.

Xano’s picture

Status: Needs review » Reviewed & tested by the community
+++ b/core/lib/Drupal/Component/Discovery/YamlDiscovery.php
@@ -48,7 +48,9 @@ public function __construct($name, array $directories) {
+      $all[$provider] = Yaml::decode(file_get_contents($file)) ?: array();

Dude, a ternary operator without short array syntax? ;-)

+++ b/core/lib/Drupal/Component/Discovery/YamlDiscovery.php
@@ -48,7 +48,9 @@ public function __construct($name, array $directories) {
diff --git a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_1/test_1.test.yml b/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_1/test_1.test.yml

diff --git a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_1/test_1.test.yml b/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_1/test_1.test.yml
deleted file mode 100644

deleted file mode 100644
index b327536..0000000

index b327536..0000000
--- a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_1/test_1.test.yml

--- a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_1/test_1.test.yml
+++ /dev/null

+++ /dev/null
+++ /dev/null
@@ -1 +0,0 @@

@@ -1 +0,0 @@
-name: test
diff --git a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_2.test.yml b/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_2.test.yml

diff --git a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_2.test.yml b/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_2.test.yml
deleted file mode 100644

deleted file mode 100644
index b327536..0000000

index b327536..0000000
--- a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_2.test.yml

--- a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_2.test.yml
+++ /dev/null

+++ /dev/null
+++ /dev/null
@@ -1 +0,0 @@

@@ -1 +0,0 @@
-name: test
diff --git a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_3.test.yml b/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_3.test.yml

diff --git a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_3.test.yml b/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_3.test.yml
deleted file mode 100644

deleted file mode 100644
index b327536..0000000

index b327536..0000000
--- a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_3.test.yml

--- a/core/tests/Drupal/Tests/Component/Discovery/Fixtures/test_2/test_3.test.yml
+++ /dev/null

+++ /dev/null
+++ /dev/null
@@ -1 +0,0 @@

@@ -1 +0,0 @@
-name: test

Nice one!

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/lib/Drupal/Component/Discovery/YamlDiscovery.php
@@ -48,7 +48,9 @@ public function __construct($name, array $directories) {
+      $all[$provider] = Yaml::decode(file_get_contents($file)) ?: array();

I think this falls into the trap that http://fabien.potencier.org/article/48/the-php-ternary-operator-fast-or-not details. Potentially the array could be very large.

Xano’s picture

Status: Needs work » Needs review
FileSize
1.05 KB
4.38 KB

Thanks for the link! I removed the ternary operator and explained why we thought it was a bad idea to use it.

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

Good point!

tstoeckler’s picture

Re #12: Because we are not saving the file contents into a variable, the ternary should not be slower, as there is no copying involved.

Can still stay this way, of course, regardless.

Xano’s picture

Status: Reviewed & tested by the community » Needs work

If there are no actual reasons to go with one approach or the other, we shouldn't introduce code comments that say there are. I tested by rebuilding caches using the following code and the differences are usually no more than several tens or hundreds of seconds, with the occasional >1s difference, but nothing that significantly indicates one approach is faster than the other.

  public function findAll() {
    $all = array();
    foreach ($this->findFiles() as $provider => $file) {
      $start_ternary = microtime(TRUE);
      for ($i = 0; $i < 999; $i++) {
        $all[$provider] = Yaml::decode(file_get_contents($file)) ?: NULL;
      }
      $stop_ternary = microtime(TRUE);
      $start_old = microtime(TRUE);
      for ($i = 0; $i < 999; $i++) {
        // If a file is empty or its contents are commented out, return an empty
        // array instead of NULL for type consistency. Do not use a ternary
        // operator, as those can become very slow with large amounts of data. See
        // http://fabien.potencier.org/article/48/the-php-ternary-operator-fast-or-not.
        $contents = Yaml::decode(file_get_contents($file));
        if (!is_null($contents)) {
          $all[$provider] = $contents;
        }
        else {
          $all[$provider] = [];
        }
      }
      $stop_old = microtime(TRUE);
      var_dump('STATS');
      var_dump('Ternary: ' . ($stop_ternary - $start_ternary));
      var_dump('Old: ' . ($stop_old - $start_old));
    }

    return $all;
  }
Xano’s picture

Status: Needs work » Needs review
FileSize
685 bytes
4.07 KB

Here's #5, but with short array syntax.

Xano queued 17: drupal_2331407_17.patch for re-testing.

damiankloip’s picture

Status: Needs review » Reviewed & tested by the community

So in light of tstoeckler's comment in #15 and Xano's comparisions, I think this is good.

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

It appears this problem has been solved in php5.4 - see http://3v4l.org/QuDvY/perf#tabs.

This issue is a normal bug fix, and doesn't include any disruptive changes, so it is allowed per https://www.drupal.org/core/beta-changes. Committed e8e84d7 and pushed to 8.0.x. Thanks!

  • alexpott committed e8e84d7 on 8.0.x
    Issue #2331407 by Xano, tstoeckler, damiankloip: YamlDiscovery does not...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.