Objective
-
#2190723: Add a KeyValueStore\FileStorage to replace e.g. ConfigStorage discovered that all key/value store implementations in core are hard-coding PHP
serialize()
as serialization format right now. -
PHP
serialize()
is definitely not always the most appropriate serialization format.unserialize()
has security issues. -
A key/value store MUST NOT care for the serialization format being used to begin with — its sole responsibility is to store a (string) value and retrieve it. The serialization format only needs to be consistent for each instance of a key/value store.
-
As a concrete use-case, the (file-based) configuration system currently expects data to be encoded and decoded in YAML.
Proposed solution
-
Establish
Drupal\Component\Serialization
as a core component that provides default implementations for serialization formats used in Drupal. -
Inject a serialization format into each key/value store instance.
Notes
-
This issue only adds the
PhpSerialize
serialization format, which is currently used by key/value store implementations. -
The set of formats is completed by
#2208609: Move Json utility class into Drupal\Component\Serialization
#2208633: Add a Yaml class to Drupal\Component\Serialization
Comment | File | Size | Author |
---|---|---|---|
#33 | kv.format.33.patch | 19.7 KB | sun |
#23 | kv.format.23.patch | 19.74 KB | sun |
#23 | interdiff.txt | 12.29 KB | sun |
Comments
Comment #1
sunComment #2
sunHm. This difference in argument order is unfortunate and bad DX. The format should always come first (or after
$collection
).Let's make that consistent.
Comment #3
sunMake MemoryStorage override StorageBase constructor instead of using a confusing $format default of NULL.
I think this is ready to fly.
Comment #4
sunFeedback, anyone? :-)
Comment #5
sun3: kv.format.3.patch queued for re-testing.
Comment #6
dawehnerOne thing which I really try to understand is the following. Why do we define all this methods as static? Can we assume that all typical serialization bits don't really have dependencies?
In additional to that I wonder why it is not possible to reuse the Serializer interface from symfony, just one example:
Comment #7
sunThe underlying technical premise is that these serialization classes are just some utility-alike classes, which should be usable in a standalone fashion by simply calling the
encode()
anddecode()
methods. The latest patch for Json clarifies that we're using that class that way throughout core already.All of the planned serialization classes (Json, PhpSerialize, Yaml, Xml) are just trivial wrappers around either native PHP core functions or utility classes (Yaml). None of them needs additional configuration beyond the Drupal-specific flags that we're hard-coding. As these are all of the dominant serialization formats today, I do not foresee that we'd be adding any further classes than those four.
Symfony's Serialization component is a very complex piece of code, which has a different intention: Instead of simply encoding and decoding data into and from a specific serialization format, it allows serializers to interact with every single element in the process, in order to validate and normalize each value, to resolve references and linked resources, and to allow other code to participate in the process. Such a very controlled and granular serialization is the use-case of the REST module in core, in which the serialized/encoded data is to be treated like untrusted external user input (similar to user input in forms). That's very different to these simple + standardized serialization format classes.
Note that I was able to perform two interesting serialization format switches in the parent issue already: (1) As an experiment only, I switched the State service to store data in Json instead of serialized PHP. (2) The primary objective: I was able to switch the active configuration from Yaml to Json (and even PhpSerialize) — but of course, the latter is only possible with the additional k/v FileStorage implementation of the parent issue.
Comment #8
sunMerged 8.x.
Comment #9
pwolanin CreditAttribution: pwolanin commentedLooks straight forward, though I think the variable naming is poor - can you rename this>
SerializationInterface $format
. I'd use something liek$serializer
Comment #10
sunComment #11
pwolanin CreditAttribution: pwolanin commentedMuch more readable with those changes.
Comment #12
xjmAssigning to @alexpott for review.
Comment #13
tim.plunkettClarifying scope
Comment #14
alexpottThis looks wrong - if we're going to the trouble of injecting the serializer into
DatabaseStorage
then I think we should go to the trouble of injecting it intoKeyValueDatabaseExpirableFactory
tooComment #15
sunInject serializer into DatabaseStorageExpirable.
Comment #17
sunSorry, last patch was bogus. (Only interdiff was correct.)
Comment #19
sunInject serializer into user TempStoreFactory, too.
Comment #21
sunComment #22
sunI hope it is OK if I move this back to RTBC — the adjustment to also inject a serializer into
KeyValueExpirable
instead of hard-codingPhpSerialize
is really minor and was the only remark in #14.I actually copied the code from the non-expirable implementation (and only forgot to update tests accordingly, hence the hiccups ;-)).
Comment #23
sunAfter further discussion on #2208609: Move Json utility class into Drupal\Component\Serialization + #2208633: Add a Yaml class to Drupal\Component\Serialization + IRC...:
Moved
Drupal\Core\Serialization
intoDrupal\Component\Serialization
.Comment #24
sunUpdated issue summary accordingly.
Comment #25
chx CreditAttribution: chx commentedComment #26
tstoecklerNot marking back to "needs work" as this is very minor and should probably discussed in a follow-up, but this hints at the fact that getFileExtesion() shouldn't really be on the interface.
I.e. in a follow-up we could split that into a
FileSerializationInterface extends SerializationInterface
which is then used by Config's FileStorage.Comment #27
sunCan we move forward here?
Comment #28
catchAre we sure that all the serializers that can be injected preserve type OK?
I'm thinking of calling code that stores and int, expects and int back, gets back a string etc. It would not be fun if this worked with one serializer and not with another.
Comment #29
sun@catch: Both yes and no... out of the default serializer implementations, all of them do support primitive data types, but only
PhpSerialize
supports e.g. serialization of class instances/objects. Whether it is legit and makes sense to use a particular serializer depends on the use-case.Comment #30
pwolanin CreditAttribution: pwolanin commentedSo, is KV supposed to support arbitrary data like classed object or only scalars and arrays like Config does?
Comment #31
sun@pwolanin: I've been trying to avoid that question, because it's an implementation detail of user-space code. It's a user-space decision what exactly gets stored in a particular key/value service instance. The backends do not care, because it's a hashtable that's compound of strings/bytes only; keys and values.
The question is not what the common denominator of all serializers + backends is — but instead, how many custom, one-off key/value use-cases needlessly have to re-invent the wheel from scratch, because this stuff is hard-coded right now.
The
KeyValueStore
component is an abstraction that provides access to a concept, offering various implementations. You should be able to leverage the concept; you don't necessarily have to futz with "the" default key_value service.Forgot one aspect in #29 regarding data types:
Similar to the Memory backend in core, I learned that some alternative backends do not need a serializer, as they're capable of storing native PHP data types via their corresponding PHP extensions. (can't remember which one, but might have been redis)
Comment #32
alexpott23: kv.format.23.patch queued for re-testing.
Comment #33
sunMerged 8.x + Adjusted for committed sister issues.
Comment #34
webchickLooks like pretty straight-forward refactoring. The SerializationInterface is nice.
Committed and pushed to 8.x. Thanks!