Problem/Motivation
EntityReferenceItem::getValue() will return a target_id property of -1 when the entity has not been saved, however the -1 (NEW_ENTITY_MARKER) is an internal implementation detail. And attempting to set the value with the value returned from getValue will throw an \InvalidArgumentException.
This manifests after serializing and unserializing - so entities with unsaved references cannot be used with things like TempStore or caches., any attempt to get the field values before preSave is called results in an exception.
Steps to reproduce:
// The term entity is unsaved here.
$term = entity_create('taxonomy_term', array(
'name' => $this->randomMachineName(),
'vid' => $this->term->bundle(),
'langcode' => LanguageInterface::LANGCODE_NOT_SPECIFIED,
));
$entity = entity_create('entity_test');
// Now assign the unsaved term to the field.
$entity->field_test_taxonomy_term->entity = $term;
$entity->name->value = $this->randomMachineName();
// Now save the term.
$term->save();
// Get the value.
$value = $entity->get('field_test_taxonomy_term');
// Now attempt to set the entity value.
$entity->field_test_taxonomy_term = $value // Throws an \InvalidArgumentException;
The issue is that EntityReferenceItem::getValue should remove the -1 for target_id
Proposed resolution
Make EntityReferenceItem::getValue handle this more gracefully by removing the static and using a validation constraint to ensure a value is set.
Remaining tasks
Review
User interface changes
None
API changes
EntityReferenceItem::$NEW_MARKER (-1) is removed in favour of a validation constraint
Beta phase evaluation
Issue category | Bug because broken |
---|---|
Issue priority | Major because you should be able to serialize an entity and then unserialize it. |
Disruption | Removes the protected static so only impacts DER which would be the only thing that sub-classes EntityReferenceItem - both maintainers are active in this issue and are proposing this change to unblock issues there. |
Comment | File | Size | Author |
---|---|---|---|
#125 | er-set-value-2404331-3.125.patch | 15.69 KB | larowlan |
#125 | interdiff.txt | 2.26 KB | larowlan |
Comments
Comment #1
larowlanComment #4
larowlanSo that test is wrong, back to the drawing board
Comment #5
larowlanComment #6
larowlanHere's a failing test
Comment #8
larowlanComment #10
larowlanmaybe this?
Comment #12
larowlanComment #13
larowlanComment #14
jibranThe fix look good to me. I'd suggest to add a comment to the condition.
The fix is in core folder and tests are in ER. Which component should we choose? :D
Comment #15
larowlanActually the issue is that getValue returns the -1. Removing that fixes it.
Comment #17
larowlanComment #18
larowlanComment #19
yched CreditAttribution: yched commentedThis needs @fago's approval
Comment #20
amateescu CreditAttribution: amateescu commentedThis is the kind of problems I feared would show up when we were discussing the change to ER's autocreate architecture in Ghent :(
One of the problems with the patch in #15 is what happens if a validation is triggered between "get the field value" and "then set it"? I expect the validation to fail because target_id will be NULL.
I think we need to revisit that decision and try to see if we can bring back the "old way" (e.g. 'target_id' => NULL, 'entity' => $entity_object) for autocreate somehow.
Maybe my proposal to skip validation at some point and bring it back later was not so good, but the real case here is that the 'target_id' property might not pass the NotNull constraint on its own, but it should pass it if combined with the 'entity' property. So we need a way to validate a property with the context of its related properties. Does that make sense?
Comment #21
larowlanI don't think we need to 'bring back the old way' - this patch demonstrates the fix.
Note this bug also prevents things like caching of entities with unsaved entity reference fields (as they are serialized which invokes getValue()) and potentially things like the temp-store used for previewing content which I assume also serializes the entity. The internal target_id of -1 is still retained and when setValue is called with just an unsaved entity, the ERItem takes care to ensure the internal target_id is set back to -1 again, so there is no need for the -1 to be exposed outside the ERItem. i.e. internally the -1 remains intact, it's just not leaked outside the ER item
Comment #22
RavindraSingh CreditAttribution: RavindraSingh commentedAdded the check for if target id values['target_id'] is defined.
Comment #23
larowlanPlease combine these into one if, with an && - thanks :)
Thanks for the fix
Comment #24
RavindraSingh CreditAttribution: RavindraSingh commentedComment #25
RavindraSingh CreditAttribution: RavindraSingh commentedComment #26
RavindraSingh CreditAttribution: RavindraSingh commentedAdded inline condition.
Comment #30
larowlanpossibly manually edited patch?
re-roll with #22 and #23
Comment #31
fagohm, that patch meanas $entity->reference->first()->target_id is not the same as $entity->reference->first()->getValue()['target_id'] then - that's unexpected behaviour imo.
Maybe better, we should allow writing the value back in combination with the unsaved entity?
Comment #32
larowlanSo in that case would $entity->reference->first()->target_id return -1?
Again, that seems like the internal behaviour being leaked.
The root cause of my issue here is TypedDataManager::getPropertyInstance calling setValue with stale field values when unserializing entities from cache.
We do allow writing the values back with the unsaved entity. But as per the test/demo code in issue summary - the issue only happens when the entity is no longer unsaved.
I guess we could make the setValue more permissive, but attempts to do so in #6, #9 and #10 were unsuccessful.
Comment #33
jibranReroll after #2404021: entity_reference formatters should be in Core
Comment #34
amateescu CreditAttribution: amateescu commented@fago, did you see #20? IMO, the solution here is really simple, we just need a custom NotNull constraint for the entity reference item which takes the context into account (i.e. the 'entity' property) when it determines if 'target_id' is NULL or not.
Comment #35
jibranAdded another test. Also created a similar issue for DER #2407123: Can't access autocreated entity after serialization
Comment #37
larowlanSo this makes it so that $entity->reference->first()->target_id is the same as $entity->reference->first()->getValue()['target_id'] - ensuring we don't leak the -1 implementation detail.
Internally, EntityReferenceItem needs to use $this->getTargetId() as $this->target_id no longer returns -1 when that is the value.
Bumping this to major given the issues we're seeing in DER with node preview of unsaved entities.
Comment #38
jibranI think we should add comment for this.
Comment #39
larowlansure
Comment #40
larowlanRe-roll
Comment #41
RavindraSingh CreditAttribution: RavindraSingh commentedAwesome, perfactly implemented. moving to RTBC
Comment #42
amateescu CreditAttribution: amateescu commentedI would still like to hear a definite NO from @fago and/or @yched for the proposal in the last paragraph of #20 / #34, because only @larowlan replied so maybe no one else saw it :)
Comment #43
larowlanI agree, would like fago and yched's input and sign off first, left a tell for fago on irc
Comment #44
fagoThanks. The patch looks solid to me, however I must say I'm not so fond of making the -1 an implementation detail because it requires us to massage incoming/outgoing values all the time. Usually, that's a sign of bad architeture, i.e. if it's not suppoed to be in target_id, don't put it there but e.g. add another internal object property instead.
But as of now, we have to put it there because of validation. I'd agree with amateescu (#20/#34) that fixing validation instead seems to be better option to me, but I'm not sure how realistic or simple that would be. Given that, I'd be fine with moving on with this patch unless someone else can come up with a patch for #20? ;)
But yeah, I'd love to hear yched's and berdir's thoughts on this also. Thus, assigning to the next one.
Comment #48
larowlanreroll
Comment #49
larowlanre-roll
Comment #50
larowlanbump?
Comment #53
larowlanreroll
Comment #54
larowlanAs per #44
Comment #55
larowlanre-roll
Willing to trade for reviews on this.
Comment #56
larowlanre-roll
willing to trade for reviews on this
Comment #57
larowlan<dejavu>
re-roll
willing to trade for reviews on this
</dejavu>
:)
Comment #58
jibranPlease have a look at these two points.
$this->getTargetId()
and$this->target_id
in IS?And maybe explain the difference here as well.
From #20
Can we at least explore that? We can create ERNotNull constraint?
Comment #59
larowlanFixes 58.1, 58.2 - it looks like the test failed in #35?
Looking at constraint stuff now.
Updated IS
Comment #60
jibranThe failed test was
testEntitySaveOrder
nottestEntityAutoCreate
which was added in #35.Comment #61
larowlantrying validation constraint approach here #2334503: Ignore: patch testing issue
Comment #62
larowlanRe-roll
Comment #64
larowlanMerge use collision
Comment #65
larowlanAnyone bold enough to Rtbc this?
Comment #66
jibranJust uploading test only patch and testEntityAutoCreate test only patch to make sure fails exists. Other then that this looks RTBC to me.
Comment #68
jibranComment #69
alexpottWhich patch is rtbc #64? It's best when the last patch on the issue is the rtbc patch since the rtbc retest uses this one.
Also @Berdir or @yched are yet comment despite people asking for it.
Comment #70
BerdirI can't speak for @yched but I don't feel like being able to provide a useful review for this, I haven't been involved in the recent refactorings that happened around this. @fago, @amateescu and @yched were.. and if they're OK with this then I'm fine with it too ;)
Comment #71
larowlanreroll
Comment #72
larowlanOver to @amateescu as per #70
Comment #74
larowlanuse statement merge collision
Comment #75
jibran@larowlan we talked about updating this test last time around in IRC.
Comment #77
amateescu CreditAttribution: amateescu commentedSorry for the delay, I'm traveling without much internet access until tomorrow evening :/ I'm going to try and poke a bit at the approach of using a custom validation during this long train ride.
Comment #78
daffie CreditAttribution: daffie commentedJust two little nitpicks:
This change is not necessary.
Can we change this to:
Comment #79
larowlanHere is the custom validation approach, no interdiff - fresh patch
Comment #81
larowlanSome fixes
Comment #82
jibranNew patch doesn't include old changes of
testEntitySaveOrder
Comment #83
larowlanNice catch
Comment #85
larowlanMore fixes
Comment #87
jibranWe should also test this.
Comment #89
larowlanAdds new test as per #88 and fixes some more fails
Comment #90
larowlanUpdated issue summary
Comment #91
jibranChanging the title as well perhaps now @yched will review it. :P
Comment #92
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedNIce, this looks very similar to my wip patch from London :) It had a few failures locally and it seems that I lost the last commit on my branch, but I'll post it anyway to see if I can spot any major differences.
Comment #93
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedOk, the only important difference is that we need to bring back the optimization from EntityReferenceFieldItemList::referencedEntities(), which used to check the 'target_id' property first in order to delay loading the computed 'entity' property as much as possible.
Added that optimization and a fixed a few nitpicks in @larowlan's patch (interdiff is to #89), this looks ready to me now.
Comment #94
yched CreditAttribution: yched commentedReally, really sorry for keeping everyone waiting so long :-(
I'll try to have a closer look in the next few days.
However :
That was the reason why we introduced the NEW_ENTITY_MARKER in #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities - being able to set the property as "required", so that its SQL column gets a 'NOT NULL' = TRUE and keep the indexes optimized.
If we remove that and abandon the NOT NULL for target_id, then it seems we're basically reverting the e_r part of #2232477: Fatal when adding new fields with NOT NULL constraints in a base table that contains existing entities, and then I'm not sure why we need to add new dedicated validators, they weren't needed back then ?
Comment #95
jibranThank you very much @yched for your insightful reply. :)
We should add explicit tests to make sure that target_id get 'NOT NULL' true. TBH we are not aware of this problem at all and I agree target_id field should be required.
AFAIK we are introducing validator to remove the dirty workaround to this problem (EntityReferenceItem::$NEW_MARKER). Maybe we are addressing this problem on a wrong layer? There should have to be some other way around this.
Comment #96
larowlanAdding back the required doesn't seem to break anything, ran a few tests locally - let's see what bot says
Comment #98
amateescu CreditAttribution: amateescu for Pfizer, Inc. commentedHeh, it's been so long that even I forgot what my proposal for a fix here was about :/
Looking back through the comments, what we would have to do in the patch is add the custom NotNull constraint to the 'target_id' property, not to the field item, and change
SqlContentEntityStorageSchema::getDedicatedTableSchema()
to also account for properties that have a 'not null' constraint (or an extension to it) instead of just checkingisRequired()
:Comment #99
larowlanOr we could do this - leave the setRequired but subclass DataDefinition so that we remove the NotNull constraint.
We want the required flag to return true for the column, just not the constraint that operates solely on target_id before preSave has run. This way our custom constraint makes sure that either a new entity or target_id is set and the lack of a target_id doesn't trigger the NotNull constraint from throwing an error. The value is written by the time it hits the DB because of the preSave, but we can bypass validation without the kludgy -1.
We could combine this constraint with ValidReference and remove the new one - but I'm happy with it separate for now - unless someone objects.
Comment #100
larowlanActually, this should be at the EntityReferenceItem level, not the list.
Comment #101
jibranOk so this means ERItem property target_id can be NULL but actual target_id will never be NULL in schema or in theory . If this is correct then we need some documentation for this on ERItem class.
Aren't both the conditions(new one and old one) same? Change seems unwanted.
I like that we have two constraint but NULL ref is a valid ref so perhaps they should belong together.
Class docs missing.
Comment #102
amateescu CreditAttribution: amateescu for Pfizer, Inc. commented#101.1: See #93.
Comment #103
jibranThanks @amateescu for clearing that out.
@yched & @fago any thoughts on the current patch.
@larowlan can we use
EntityReferenceNotNull
in DER if we keep it separate? Or do we need aDynamicEntityReferenceNotNull
validator anyway?Comment #104
larowlan@jibran we already have our own ValidReference plugin in DER which we can merge this logic into.
Fixed 101.2, 3 - 101.1 answered by @amateescu
Comment #105
jibranMinor fixes.
@larowlan I have uploaded the DER patch in #2407123: Can't access autocreated entity after serialization .
DynamicEntityReferenceItemTest::testEntityAutoCreate()
andDynamicEntityReferenceItemTest::testEntitySaveOrder()
still fails with this patch.Comment #106
larowlanLeft a comment over there @jibran.
Comment #109
larowlanComment #110
yched CreditAttribution: yched commentedHm, smart :-) Haven't done a line-by-line review, but I think I like the approach.
I hate to say that, but... might be good to have @fago vet that too :-/
Comment #111
larowlanthanks @yched, over to @fago
Comment #112
larowlanAnother re-roll
Comment #113
jibranLet's add a change notice as well or update Field schema no longer uses 'not null' entries.
Comment #114
larowlanre-roll
Comment #115
larowlanAnother reroll
Comment #116
benjy CreditAttribution: benjy at PreviousNext commentedIf we've just called $value->isEmpty() it doesn't make sense to be checking isset($value) right after?
Comment #117
larowlanFixes #116
Thanks @benjy
Comment #118
larowlanAdding issue to the title, as well as the solution
Comment #119
larowlanComment #120
larowlanComment #121
yched CreditAttribution: yched commentedOK, fago isn't responding, so I'm biting the bullet :-)
Comment #122
fagoI really like this approach - good work! :) The patch looks pretty good already also, here some small remarks related to documentation:
Let'S add a pointer here why this is not handled by a regular NotNull constraint. E.g. point to the explanation in the definition class?
This seems to change nothing?
Comment #123
fagonot so "fast" :P
Comment #124
yched CreditAttribution: yched commented@fago : HAH ! ;-)
Comment #125
larowlanFixed #122
Thank you @yched and @fago for the reviews.
As promised - I owe you some in return - please ping me on irc/twitter or comment here with any pet issues you need reviewed.
Comment #126
yched CreditAttribution: yched commented#125 adresses @fago's remarks. Back to RTBC then ?
Comment #127
fagoyeah, changes look good - agree on RTBC. Thanks!
Comment #131
larowlanRandom migrate fails
Comment #132
alexpottThis issue addresses a major bug and is allowed per https://www.drupal.org/core/beta-changes. Committed ab1885c and pushed to 8.0.x. Thanks!
Comment #134
jibranThanks @yched, @fago and @amateescu for all the help. @larowlan it is finally fixed great work.