Problem/Motivation
In Drupal Commerce, we created a field that storage a plugin's ID and configuration. The latter is saved as a serialized blob (https://github.com/drupalcommerce/commerce/blob/8.x-2.x/src/Plugin/Field...).
When the field is attached to an entity type and has a dedicated table the target_plugin_configuration
property is unserialized just fine. However, if the field is attached as a base field item and part of the shared table, it does not.
You can see that \Drupal\Core\Entity\Sql\SqlContentEntityStorage::loadFromDedicatedTables
takes into consideration column schema being marked as serialized. And then \Drupal\Core\Entity\Sql\SqlContentEntityStorage::loadFromSharedTables
does not, at all.
Proposed resolution
Improve \Drupal\Core\Entity\Sql\SqlContentEntityStorage::loadFromSharedTables
to handle serialized column data.
Remaining tasks
User interface changes
None.
API changes
Serialized field properties of base fields are now deserialized automatically before being passed to field types and they no longer need to support them as a string.
That means all field types must support that their values are passed as a a deserialized data structure. If a field type was only used as a base field then it might not be supporting that properly.
Data model changes
None.
Release notes snippet
Serialized properties of base fields are now automatically unserialized to be consistent with configurable fields. Existing workarounds for this bug might need to be adjusted if they relied the old behavior of passing a string to those fields.
Comment | File | Size | Author |
---|---|---|---|
#62 | values_in_shared_table-2788637-62-interdiff.txt | 1.29 KB | Berdir |
#62 | values_in_shared_table-2788637-62.patch | 7.83 KB | Berdir |
#57 | values_in_shared_table-2788637-57-interdiff.txt | 4.17 KB | Berdir |
#57 | values_in_shared_table-2788637-57.patch | 7.82 KB | Berdir |
#53 | values_in_shared_table-2788637-53-interdiff.txt | 2.01 KB | Berdir |
Comments
Comment #2
mglamanFor instance, hacking the code for a quick test, the following fixes our tests
Comment #3
mglamanHere is the patch. Might be simplified if we could access
$fieldStorageDefinitions
inside of\Drupal\Core\Entity\Sql\DefaultTableMapping
Adds same logic as found in
\Drupal\Core\Entity\Sql\SqlContentEntityStorage::loadFromDedicatedTables
for checked schema marked as serialized.Comment #4
mglamanComment #6
mglamanMade patch against outdated local checkout. Re-roll.
Comment #7
borisson_I think this is related with #2788023: Serialized single-value base field columns are not unserialized when loading entities. Even though it's for different types of storage.
Do you think it's possible to fix both issues at the same time?
Comment #8
mglamanThanks borisson_, it's the same issue. Linking the two, we need a maintainer to decide proper approach. This one follows what dedicated table does.
Comment #9
mglamanMarking for subsystem maintainer review so we know which approach to take, this or linked issue in #8
Comment #10
mglamanFWIW went to write a core test, and the MapItem FieldType works fine because it handles unserialization itself, and all that. Experiencing this with a field which has multiple columns, one or more of which is serialized. Which core does not have a field like this to test against.
Comment #11
mglamanDiscussed with berdir in IRC. Core does has a test case: http://cgit.drupalcode.org/drupal/tree/core/modules/link/src/Plugin/Fiel...
I need to upload a patch with the unserialize workaround removed to prove failure. Then with removal and my fix.
Comment #12
mglamanOk. Here's a patch which should fail by removing unserialize check in LinkItem. Then a passing patch with unserialize fix.
Comment #19
mglamanFix link item test that assumed data had to be serialized.
Comment #20
Berdirthe comment above this should be updated too, not visible here but it actually explicitly talks about serialized.
Comment #21
mglamanUpdated the comment for the test, per #20
Comment #22
BerdirWe plan to add a test field type that we can use (we need something there that actually has an *object* as the value to be serialized) in #2609252: Boolean field with #access FALSE cause EntityStorageException. That will add another @todo to this issue and once it is in, we can update that and should have sufficient/explicit/dedicated/documented test coverage to get this in.
The implementation makes sense to me and is consistent with loadFromDedicatedTables(), the only question I guess is if we could share the code between the two, but the rule of three states that you should only try to share code when you have the third implementation of something :)
Comment #23
mglamanI don't see much of a gain by sharing the code between load shared or load dedicated. I'm pretty sure I had tried working on that. Or the method was so minimal (like an is serialized/serializable check.)
Comment #24
BerdirYeah, agreed.
The other issue landed, so lets remove the @todo in \Drupal\field_test\Plugin\Field\FieldType\TestObjectItem::setValue() and I think we're good to go.
Comment #25
mglamanWorking on this
Comment #26
mglamanUpdate patch which removes the
@todo
and tests the original patch!Comment #27
BerdirOne could argue that removing support for the string input is an API change for existing code.
I think it never was an API and hope people didn't do stuff like this, this was kind of a test for that code that we no longer have.
If core committers disagree, we could keep it since we have a test field type now that provides sufficient test coverage.
Marking as RTBC. A test-fail patch wouldn't hurt and maybe we should do a CR to notify people that this is now working and they can rely on it/remove their hacks.
We've marked major issues as duplicate, so I think this should be major as well.
Comment #28
alexpottI think this is an important bug fix since the system is very much not behaving as one would expect. But the fix introduces a nasty BC issue where code that was previously unserialising will now not need to. So at the very least the we need a CR to announce this and we probably need to only do this in 8.3.x and the release managers need to have a think about BC too.
The API section of the issue summary needs updating because I think we have two API changes here - one on setting values and one on retrieving values.
Comment #29
BerdirI'm fine with a change record but I think the change is not as problematic as you think:
This issue fixes the missing unserialize for base fields/shared tables, it already works fine for configurable fields/dedicated tables. That means field types already need to handle both cases. And even if it is a special field type only used for a certain base field in a custom entity type, it already needs to be dynamic to support setting the value through the API/REST (normalize/denormalize) and so on.
I'm reasonably sure that this will not break any existing field types, we can just tell them that they can remove this hack now.
Also, serializing of both base and configurable fields already works fine and it only affects calling setValue() (directly or indirectly). Accessing it with getValue() will not change here unless a field type wouldn't have the unserialize logic but then he would be broken right now.
The biggest BC problem that I see is actually custom code that would have set a serialized string on the LinkItem, like one of our tests did. And as commented, we could just keep that logic to avoid that BC break. But I really hope nobody did that except us to test the code.
Comment #30
alexpottActually I don't get why we're changing this here - isn't it kinda out-of-scope. It's not necessary for the changes in SqlContentEntityStorage which are the real meat of this change right? If we don't do this then there are no API changes that field types don't already need to deal with right? So then this change could be eligible for 8.2.x.
So lets keep the code to avoid the BC break and open a new issue to address that specific issue.
Comment #31
BerdirYes, it is not required for that change to work, but it is also dead code now that we no longer need (in core at least). Before we had the test field type, it was test coverage that it is no longer required.
We don't need it anymore for test coverage, it's just dead code now. If we don't remove it here, we need another follow-up issue and update the @todo. IMHO, it would be fine to remove it but as I wrote in #27, if committers prefer to keep it, fine with me.
Comment #33
pwolanin CreditAttribution: pwolanin as a volunteer and at SciShield commentedAre all these serialized columns simple data like associative arrays? Would we be better off using JSON as a serialization format?
Comment #34
pwolanin CreditAttribution: pwolanin as a volunteer and at SciShield commentedWhat remains here to get back to RTBC? This has been waiting a while now.
Comment #35
mglamanI don't see benefit here versus normal serialization. Especially if it's more than just simple data (never assume)
No idea. Someone to say "Nah, this isn't BC, or internal enough to be three sheets to the wind about it." But I don't dive into that level of bikeshed in core. I don't see it as BC, never did. Saw it as a bug, and a bugfix this deep shouldn't be BC.
Comment #38
Wim LeersSee #2563843-52: MapItem is broken for configurable fields.
Comment #39
effulgentsia CreditAttribution: effulgentsia at Acquia commented#26 no longer applies. This is a straight reroll, containing only changes to context lines, not to any code being added or removed.
Comment #40
bradjones1I'm not sure if this is in scope on this issue or belongs elsewhere, but I would add that this logic is also totally unreachable if
$this->dataTable
is NULL on the storage. In my case, that's a result of the entity type being neither revisionable nor translatable, and as such the property is nullified and then never set in::initTableLayout()
.This particular code is pretty old, dating to the earliest ports in D8, so I wonder if this is an oversight or purposeful and I'm just not familiar enough with the subsystem to understand the motivation.
https://github.com/drupal/drupal/blame/8.6.x/core/lib/Drupal/Core/Entity...
Comment #41
bradjones1It actually occurred to me after looking at this more closely that my edge case requires a grafting of this new logic in a third place in the class when values are processed directly off the table; see the attached patch and interdiff.
Comment #42
bradjones1Added an extra conditional onto the single-property unserialization rule, as apparently some columns in the DB do not map to a storage definition. Interdiff is still against #39.
Comment #44
borisson_I'm not sure if the changes in #41/#42 should go in a new issue and discussed there, or if we can implement them here. I added a start for a change record, by using a lot of words from @Berdir in #29.
Comment #45
BerdirThe interdiff in the last two patches are very confusing, any change you could provide on between your latest patch and the one before yours?
Lets keep this, but put a @deprecated in there, saying that passing in a serialized string value property is deprecated in Drupal 8.6.0 and will no longer be supported 9.0.
That still gives us test coverage for not doing it in core as without adding the skipped deprecation message, tests will fail, but we don't break contrib/custom.
It's fine to keep it removed in the test field type.
Comment #46
BerdirComment #47
bradjones1@Berdir - The interdiff is against the patch in #39, so I think I've provided what you asked for? My addition is just to include the same logic as the original patch but also for base fields stored directly in the main entity data table. Let me know what else you need.
Comment #48
BerdirI'm not sure what went wrong with yours exactly, but both your interdiffs are pretty large and are shifting around a ton of code.
I applied #39 and your latest patch two different branches and diffed them, and the result was the attached interdiff, which does actually make a lot more sense. It's 1.8kb instead of your 4.6kb files.
About your question in #40, it certainly is the same problem, the thing is that we don't have any field type like that in core (that has a single serialized property), so we would need to add that, an entity type that uses it and a test that saves/loads such an entity.
Comment #50
bradjones1Re-rolled
Comment #51
BerdirRebased, keeping the options-as-string support but adding a @trigger_error() to mark it as deprecated.
Comment #52
Wim LeersThis was RTBC a long time ago. What do you think still needs to happen before this can be RTBC again, @Berdir?
Comment #53
BerdirDealing with the Needs tags I suppose, they were added in #28, and I think release manager review is no longer necessary now that we are properly deprecating passing it in as a string. I've also updated the API changes section of the issue summary
There is already a change record, but I guess the wording needs to be updated a bit: https://www.drupal.org/node/2961643. I'll give that a go.
Updating the patch to reference that.
Comment #54
BerdirCR is now updated.
Comment #55
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedThis looks good to me overall. Here's a review:
Can we skip the assignment of
$storage_definition
and just do$columns = $storage_definitions[$field_name]->getColumns();
.This would also match the code from the else block below.
This
if
is not needed :)Same here, let's skip the first assignment.
'properly' fits into the first line now.
How about
assertEquals()
? :)Comment #56
andypostAs already discussed in #28-29 it introduces BC break for all contrib and custom code, moreover it makes it mostly impossible to use custom serializers which were allowed at field level.
If we going to make it so - it should allow configuration of serializer or allow prevent this unserialize but storage level.
As d8 fields supposed to maintain their schema why mapping made at level upper?
it hardcodes php-serialization the level up (to storage) so custom or contrib code no longer can use igbinary or msgpack and no ability to make it swappable like #839444: Make serializer customizable for Cache\DatabaseBackend
Comment #57
Berdir#55:
1. Yes, fixed.
2. Ah yes, it used to be necessary, but isDefaultRevision has apparently been moved into a separate loop now.
3-5. Fixed.
#56: As I said in #29, configurable/dedicated table fields already do call unserialize automatically. Only custom field types that were only used as a base field were able to rely on it being a string and even then it was a bad idea because you also forced any direct call to setValue() to pass in a serialized string. Once serialize/deserialize behavior is consistent, we can think about using something else than serialize() or making it more flexible, but setValue() is *not* the place to deal with that. Yes, there is a possible BC break and I in fact know that payment will need to be updated, and I just prepared it for that: #3035265: Make the unserialize() call in Payment::__construct() conditional. I do believe that this change makes sense and the benefits clearly outweigh the problems.
Comment #58
mglamanAs I have said before, this isn't a BC break. It's fixing a bug that everyone else skated around and refused to fix.
Comment #59
bojanz CreditAttribution: bojanz at Centarro commentedBack to RTBC.
Fix looks good, amateescu's feedback was addressed, proper deprecation added + a test.
Comment #60
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedI agree with #57 / #58, the fact that base fields do not behave like configurable fields is a bug and it shouldn't be regarded as a "flexible extension point" at all, IMO.
As @Berdir said, this patch brings sanity in this area and allows us to start thinking about how exactly do we want to make the serialization process customizable.
RTBC +1.
Comment #62
Berdir1800 test fails due to deprecation, nice, BC layer is working ;)
Comment #63
BerdirTested that #57 still applies and passes on 8.6, but considering that this will likely cause some disruptions in contrib (for example with payment.module as mentioned above, not aware of any other cases that will break) I think that it is better to not backport this, even though it would fix some issues that people have with the security hardening we did (I think the exploit is now public enough to mention the connection publicly). Those affected by that will have to use the patch in 8.6.
Comment #64
alexpottGiven where we are at in the release cycle shall we commit this to 8.8.x so custom / contrib have a whole cycle to learn about and fix for this bug fix?
Comment #65
bojanz CreditAttribution: bojanz at Centarro commentedContrib has learned about this problem in the 8.2.x cycle.
The fix has been stuck and postponed ever since.
A number of Commerce installs have been affected by the recent security release (#3034976: Cannot retrieve saved objects properly from Order data field), and this issue is the actual fix.
Without it, we'll need to tell people to spend the next 6 months patching core.
All of that feels way too much for what is essentially a one-line fix.
Comment #66
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedAgreed with @Berdir and @bojanz, I think it's much more useful to have this fix in 8.7.0-alpha1 because contrib module authors and production sites are more likely to test their codebase against an alpha release rather than the next dev branch.
Comment #67
alexpottHappy to go with what the entity API maintainers decide.
Can someone please update the release notes section of the issue summary? Thanks
Committed and pushed 54ff737a18 to 8.8.x and e866ce6c13 to 8.7.x. Thanks!
Fixed on commit.
Comment #68
Berdir