Objective

  1. #2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage discovered that all key/value store implementations in core are hard-coding PHP serialize() as serialization format right now.

  2. PHP serialize() is definitely not always the most appropriate serialization format. unserialize() has security issues.

  3. A key/value store MUST NOT care for the serialization format being used to begin with — its sole responsibility is to store a (string) value and retrieve it. The serialization format only needs to be consistent for each instance of a key/value store.

  4. As a concrete use-case, the (file-based) configuration system currently expects data to be encoded and decoded in YAML.

Proposed solution

  1. Establish Drupal\Component\Serialization as a core component that provides default implementations for serialization formats used in Drupal.

  2. Inject a serialization format into each key/value store instance.

Notes

  1. This issue only adds the PhpSerialize serialization format, which is currently used by key/value store implementations.

  2. The set of formats is completed by

    #2208609: Move Json utility class into Drupal\Component\Serialization
    #2208633: Add a Yaml class to Drupal\Component\Serialization

Files: 
CommentFileSizeAuthor
#33 kv.format.33.patch19.7 KBsun
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 67,983 pass(es). View
#23 kv.format.23.patch19.74 KBsun
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 67,994 pass(es). View
#23 interdiff.txt12.29 KBsun

Comments

sun’s picture

Status: Active » Needs review
FileSize
12.08 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 65,287 pass(es). View
sun’s picture

FileSize
12.09 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 65,357 pass(es). View
3.09 KB
+  public function __construct($collection, SerializationInterface $format, Connection $connection, $table = 'key_value') {
↑
+  function __construct(Connection $connection, SerializationInterface $format) {

Hm. This difference in argument order is unfortunate and bad DX. The format should always come first (or after $collection).

Let's make that consistent.

sun’s picture

FileSize
12.74 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 66,426 pass(es). View
1.25 KB

Make MemoryStorage override StorageBase constructor instead of using a confusing $format default of NULL.

I think this is ready to fly.

sun’s picture

Feedback, anyone? :-)

sun’s picture

3: kv.format.3.patch queued for re-testing.

dawehner’s picture

One thing which I really try to understand is the following. Why do we define all this methods as static? Can we assume that all typical serialization bits don't really have dependencies?
In additional to that I wonder why it is not possible to reuse the Serializer interface from symfony, just one example:

interface EncoderInterface
{
    /**
     * Encodes data into the given format
     *
     * @param mixed  $data   Data to encode
     * @param string $format Format name
     * @param array  $context options that normalizers/encoders have access to.
     *
     * @return scalar
     */
    public function encode($data, $format, array $context = array());

    /**
     * Checks whether the serializer can encode to given format
     *
     * @param string $format format name
     *
     * @return Boolean
     */
    public function supportsEncoding($format);
}
sun’s picture

The underlying technical premise is that these serialization classes are just some utility-alike classes, which should be usable in a standalone fashion by simply calling the encode() and decode() methods. The latest patch for Json clarifies that we're using that class that way throughout core already.

All of the planned serialization classes (Json, PhpSerialize, Yaml, Xml) are just trivial wrappers around either native PHP core functions or utility classes (Yaml). None of them needs additional configuration beyond the Drupal-specific flags that we're hard-coding. As these are all of the dominant serialization formats today, I do not foresee that we'd be adding any further classes than those four.

Symfony's Serialization component is a very complex piece of code, which has a different intention: Instead of simply encoding and decoding data into and from a specific serialization format, it allows serializers to interact with every single element in the process, in order to validate and normalize each value, to resolve references and linked resources, and to allow other code to participate in the process. Such a very controlled and granular serialization is the use-case of the REST module in core, in which the serialized/encoded data is to be treated like untrusted external user input (similar to user input in forms). That's very different to these simple + standardized serialization format classes.

Note that I was able to perform two interesting serialization format switches in the parent issue already: (1) As an experiment only, I switched the State service to store data in Json instead of serialized PHP. (2) The primary objective: I was able to switch the active configuration from Yaml to Json (and even PhpSerialize) — but of course, the latter is only possible with the additional k/v FileStorage implementation of the parent issue.

sun’s picture

FileSize
12.68 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 66,859 pass(es). View

Merged 8.x.

pwolanin’s picture

Looks straight forward, though I think the variable naming is poor - can you rename this> SerializationInterface $format. I'd use something liek $serializer

sun’s picture

FileSize
11.61 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 67,031 pass(es). View
7.88 KB
  1. Renamed $format variable to $serializer.
  2. Adjusted phpDoc accordingly: s/serialization format/serialization class/g.
  3. Removed serialization class injection from StorageBase on @msonnabaum's request.
pwolanin’s picture

Status: Needs review » Reviewed & tested by the community

Much more readable with those changes.

xjm’s picture

Assigned: sun » alexpott

Assigning to @alexpott for review.

tim.plunkett’s picture

Title: Inject a serialization format into key/value storages » Inject a serialization format into database key/value storage

Clarifying scope

alexpott’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/lib/Drupal/Core/KeyValueStore/DatabaseStorageExpirable.php
@@ -53,7 +47,7 @@ class DatabaseStorageExpirable extends DatabaseStorage implements KeyValueStoreE
-    parent::__construct($collection, $connection, $table);
+    parent::__construct($collection, new PhpSerialize(), $connection, $table);

This looks wrong - if we're going to the trouble of injecting the serializer into DatabaseStorage then I think we should go to the trouble of injecting it into KeyValueDatabaseExpirableFactory too

sun’s picture

Assigned: alexpott » sun
Status: Needs work » Needs review
FileSize
21.49 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] Unable to apply patch kv.format.15.patch. Unable to apply patch. See the log in the details link for more information. View
5.29 KB

Inject serializer into DatabaseStorageExpirable.

Status: Needs review » Needs work

The last submitted patch, 15: kv.format.15.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
15.48 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] Drupal installation failed. View

Sorry, last patch was bogus. (Only interdiff was correct.)

Status: Needs review » Needs work

The last submitted patch, 17: kv.format.17.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
17.79 KB
FAILED: [[SimpleTest]]: [PHP 5.4 MySQL] 66,986 pass(es), 2 fail(s), and 19 exception(s). View
2.46 KB

Inject serializer into user TempStoreFactory, too.

Status: Needs review » Needs work

The last submitted patch, 19: kv.format.19.patch, failed testing.

sun’s picture

Status: Needs work » Needs review
FileSize
19.76 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 67,564 pass(es). View
1.97 KB
  1. Fixed GarbageCollectionTest unit test.
  2. Fixed TempStoreDatabaseTest unit test.
sun’s picture

Status: Needs review » Reviewed & tested by the community

I hope it is OK if I move this back to RTBC — the adjustment to also inject a serializer into KeyValueExpirable instead of hard-coding PhpSerialize is really minor and was the only remark in #14.

I actually copied the code from the non-expirable implementation (and only forgot to update tests accordingly, hence the hiccups ;-)).

sun’s picture

FileSize
19.74 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 67,994 pass(es). View
12.29 KB

After further discussion on #2208609: Move Json utility class into Drupal\Component\Serialization + #2208633: Add a Yaml class to Drupal\Component\Serialization + IRC...:

Moved Drupal\Core\Serialization into Drupal\Component\Serialization.

sun’s picture

Issue summary: View changes

Updated issue summary accordingly.

tstoeckler’s picture

+++ b/core/lib/Drupal/Component/Serialization/PhpSerialize.php
@@ -0,0 +1,36 @@
+  /**
+   * {@inheritdoc}
+   */
+  public static function getFileExtension() {
+    return 'serialized';
+  }

Not marking back to "needs work" as this is very minor and should probably discussed in a follow-up, but this hints at the fact that getFileExtesion() shouldn't really be on the interface.

I.e. in a follow-up we could split that into a FileSerializationInterface extends SerializationInterface which is then used by Config's FileStorage.

sun’s picture

Can we move forward here?

catch’s picture

Are we sure that all the serializers that can be injected preserve type OK?

I'm thinking of calling code that stores and int, expects and int back, gets back a string etc. It would not be fun if this worked with one serializer and not with another.

sun’s picture

@catch: Both yes and no... out of the default serializer implementations, all of them do support primitive data types, but only PhpSerialize supports e.g. serialization of class instances/objects. Whether it is legit and makes sense to use a particular serializer depends on the use-case.

pwolanin’s picture

So, is KV supposed to support arbitrary data like classed object or only scalars and arrays like Config does?

sun’s picture

@pwolanin: I've been trying to avoid that question, because it's an implementation detail of user-space code. It's a user-space decision what exactly gets stored in a particular key/value service instance. The backends do not care, because it's a hashtable that's compound of strings/bytes only; keys and values.

The question is not what the common denominator of all serializers + backends is — but instead, how many custom, one-off key/value use-cases needlessly have to re-invent the wheel from scratch, because this stuff is hard-coded right now.

The KeyValueStore component is an abstraction that provides access to a concept, offering various implementations. You should be able to leverage the concept; you don't necessarily have to futz with "the" default key_value service.


Forgot one aspect in #29 regarding data types:

Similar to the Memory backend in core, I learned that some alternative backends do not need a serializer, as they're capable of storing native PHP data types via their corresponding PHP extensions. (can't remember which one, but might have been redis)

alexpott’s picture

23: kv.format.23.patch queued for re-testing.

sun’s picture

FileSize
19.7 KB
PASSED: [[SimpleTest]]: [PHP 5.4 MySQL] 67,983 pass(es). View

Merged 8.x + Adjusted for committed sister issues.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

Looks like pretty straight-forward refactoring. The SerializationInterface is nice.

Committed and pushed to 8.x. Thanks!

  • Commit 13c930f on 8.x by webchick:
    Issue #2216527 by sun: Inject a serialization format into database key/...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.