Problem/Motivation

Currently in ContentEntityBase::__sleep we remove the initialized fields which means that if an entity reference is contained in some of these fields and it was changed after serializing/unserializing the parent entity the entity reference will be loaded in the original version with the changes lost.

The core currently does not need this so it should not be the default behaviour, but we could make it possible so that if contrib/custom for some reason have to serialize the whole structure then they should be able to. I am currently working on an autosave solution so this is the first approach I am currently evaluating and I do need to serialize the referenced entities for the case when we have nested entity forms.

The issues which are blocked by this feature:
#2844229: Serializing content entity form objects should deeply serialize the entity in case of nested/inline entity forms
#2824293: Inconsistent form build between initial form generation and the first ajax request

When the form is cached for multi step forms then the form entity is cached as part of it and on each Ajax request for e.g. adding new fields the entity with the new field on it is cached as part of the cached form. Now imagine you do the same, but on the third level of a nested entity form - in this case the added field on the entity on the third level is lost because we would only cache the ID of the referenced entities and not the whole entity like the parent one. Here we have the first inconsistency between the parent and the child entities in a nested inline form structure.

From the performance point of view: consider that on each form rebuild you might have to load a lot of entities separately currently but if you use deep serialization you will not need to load not a single entity anymore. So in the case of nested entity inline forms we have PHP computation time to serialize 100 entities at once and on form rebuild do 0 entity loads versus serialize only the parent and on form rebuild do 99 separate entity loads. Not having measured the impact I would say the second is slower. It is better to retrieve e.g. 500 KB with a single query instead 500 KB with 99 queries.

Caching the parent entity prevents that during concurrent editing the changed made from another user and saved will suddenly be mixed into the entity that another user is still editing and on form rebuild when he adds a fields suddenly the values might get reordered or some be deleted or new ones present. As a summary a lot of unexpected things happen when the entity is not being serialized no matter if parent or reference. A similar case where we got into troubles is #2675010: Cloned entity will point to the same field objects if the clone was created after an entity translation has been initialized. In that issue you might see what happens if suddenly you have an modified entity in the form instead the original used one to create the form.

Deep serilization will not be active by default. It is there only for widgets that will require it. For this see #2844229: Serializing content entity form objects should deeply serialize the entity in case of nested/inline entity forms

Proposed resolution

  1. Set a flag "deepSerialization" through the method ContentEntityInterface::setDeepSerialization() in order to perform deep serialization when calling serialize on the entity object.
  2. Introduce FieldItemListInterface::getSerializationValue($deep_serialization) and let ContentEntityBase::__sleep call this method instead of the ::getValue method, which by default will just call ::getValue, but will have the knowledge if deep serialization is running or not.
  3. EntityReferenceFieldItemList::getSerializationValue($deep_serialization) should flag the referenced entities for deep serialization if deep serialization is currently running in order to deeply serialize the whole entity structure.

Remaining tasks

Review & Commit.

User interface changes

None.

API changes

New methods:

  1. ContentEntityInterface::setDeepSerialization()/li>
  2. FieldItemListInterface::getSerializationValue($deep_serialization)/li>

Data model changes

None.

CommentFileSizeAuthor
#130 interdiff-127-130.txt1.86 KBhchonov
#130 2824097-130.patch24.42 KBhchonov
#127 interdiff-125-127.txt4.58 KBhchonov
#127 2824097-127.patch24.07 KBhchonov
#125 2824097-125.patch23.56 KBhchonov
#122 interdiff-120-122.txt5.97 KBhchonov
#122 2824097-122.patch23.43 KBhchonov
#120 interdiff-116-120.txt3.57 KBhchonov
#120 2824097-120.patch19.11 KBhchonov
#116 interdiff-105-116.txt1000 byteshchonov
#116 2824097-116.patch16.13 KBhchonov
#105 interdiff-103-105.txt1.41 KBhchonov
#105 2824097-105.patch16.13 KBhchonov
#103 interdiff-101-103.txt4.57 KBhchonov
#103 2824097-103.patch16.18 KBhchonov
#101 interdiff-100-101.txt1.2 KBhchonov
#101 2824097-101.patch16.17 KBhchonov
#100 interdiff-99-100.txt4.77 KBhchonov
#100 2824097-100.patch17.37 KBhchonov
#99 interdiff-98-99.txt4.42 KBhchonov
#99 2824097-99.patch14.99 KBhchonov
#98 interdiff-97-98.txt2.64 KBhchonov
#98 2824097-98.patch14.82 KBhchonov
#97 interdiff-93-97.txt5 KBhchonov
#97 2824097-97.patch13.13 KBhchonov
#93 interdiff-91-93.txt804 byteshchonov
#93 2824097-93.patch11.79 KBhchonov
#91 2824097-91.patch11.59 KBhchonov
#90 interdiff-86-90.txt2.8 KBhchonov
#90 2824097-90.patch12.29 KBhchonov
#86 interdiff-84-86.txt1.11 KBhchonov
#86 2824097-86.patch12.45 KBhchonov
#84 interdiff-82-84.txt685 byteshchonov
#84 2824097-84.patch12.21 KBhchonov
#82 interdiff-81-82.txt654 byteshchonov
#82 2824097-82.patch12.19 KBhchonov
#81 interdiff-79-81.txt7.53 KBhchonov
#81 2824097-81.patch12.12 KBhchonov
#79 interdiff-77-79.txt487 byteshchonov
#79 2824097-79.patch18.95 KBhchonov
#77 interdiff-74-77.txt1.86 KBhchonov
#77 2824097-77.patch18.31 KBhchonov
#74 interdiff-73-74.txt637 byteshchonov
#74 2824097-74.patch16.71 KBhchonov
#73 interdiff-61-73.txt8.71 KBhchonov
#73 2824097-73.patch16.71 KBhchonov
#61 interdiff-60-61.txt381 byteshchonov
#61 2824097-61.patch16.89 KBhchonov
#60 interdiff-54-60.txt12.97 KBhchonov
#60 2824097-60.patch16.88 KBhchonov
#54 interdiff-53-54.txt866 byteshchonov
#54 2824097-54.patch19.57 KBhchonov
#53 interdiff-50-53.txt9.1 KBhchonov
#53 2824097-53.patch20.42 KBhchonov
#50 interdiff-48-50.txt1.18 KBhchonov
#50 2824097-50.patch22.92 KBhchonov
#48 2824097-48.patch21.88 KBhchonov
#39 interdiff-37-39.txt6.23 KBhchonov
#39 2824097-39.patch25.44 KBhchonov
#37 interdiff-33-37.txt6.01 KBhchonov
#37 2824097-37.patch19.3 KBhchonov
#33 interdiff-31-33.txt710 byteshchonov
#33 2824097-33.patch14.54 KBhchonov
#31 2824097-31.patch14.48 KBhchonov
#23 2824097-experiment-3.patch2.67 KBtstoeckler
#23 interdiff.txt508 byteststoeckler
#20 2824097-experiment-2.patch2.17 KBtstoeckler
#18 2824097-experiment.patch1.51 KBtstoeckler
#10 interdiff-8-10.txt1.86 KBhchonov
#10 2824097-10.patch12.65 KBhchonov
#8 interdiff-7-8.txt1.39 KBhchonov
#8 2824097-8.patch10.84 KBhchonov
#7 interdiff-6-7.txt6.32 KBhchonov
#7 2824097-7.patch9.36 KBhchonov
#6 interdiff-1-6.txt1.19 KBhchonov
#6 2824097-6.patch8.75 KBhchonov
entity_serialization-FAIL.patch6.7 KBhchonov
entity_serialization-PASS.patch8.4 KBhchonov
Members fund testing for the Drupal project. Drupal Association Learn more

Comments

hchonov created an issue. See original summary.

hchonov’s picture

Status: Active » Needs review

The last submitted patch, entity_serialization-PASS.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, entity_serialization-FAIL.patch, failed testing.

The last submitted patch, entity_serialization-PASS.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
8.75 KB
1.19 KB

We have to define the "serializeFields" property on the object, so that the field definitions does not have to be checked when retrieving the property value.

hchonov’s picture

A little changing the approach -
ContentEntityBase::__sleep will self find the referenced entities and set on them the flag for serializing computed field values and EntityReferenceItem::getValue should have the $include_computed parameter as well.

hchonov’s picture

Ouh and I've just realised that on ajax requests the form state is cached and it contains the form object and in case of content entity forms the entity will be included as well and will be cached within the form state. So the idea here is to always rebuild the form with the entity that has been used the first time the form has been generated, but if we do not serialize computed values (entity references) then on wakeup the entity might differ from the initially used entity in case e.g. referenced entities have changed - this is for the case when dealing with nested entity forms really important. This means we have to cache the whole entity structure within the form state as well.

Status: Needs review » Needs work

The last submitted patch, 8: 2824097-8.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
12.65 KB
1.86 KB
tstoeckler’s picture

Chatted with @hchonov about this a bit yesterday. Some notes...

  • Since ContentEntityBase::__sleep() "backs up" whatever is returned from FieldItem(List)::getValue() I looked into EntityReferenceItem::getValue() as I thought if that returned the entity property this problem would be solved. It actually does, but only if the entity is new, so this doesn't help as of now
  • I think I remember some issue trying to change EntityReferenceItem::getValue() so that it returns the entity property under some other condition (or always?). Will try to find that
  • It is probably too much of a BC break to always return the entity property unconditionally in EntityReferenceItem::getValue()
  • @hchonov brought up that something like a $has_computed flag in getValue() would help here. Funnily, FieldItemList::getValue() already declares such a parameter, even though TypedDataInterface::getValue() does not. So it should perhaps be possible to add such a parameter to EntityReferenceItem::getValue() if it helps
  • @hchonov mentioned that even if we had a $include_computed flag, we would have to explicitly call that from ContentEntityBase::__sleep() which could be considered a behavior change
hchonov’s picture

@tstoeckler in my last patch EntityReferenceItem::getValue has the parameter $include_computed and if true it will return the entity as well. The patch is green :). And $field->getValue(TRUE) is called from within ContentEntityBase::__sleep only if the $serializeComputedFieldValues flag is set on the entity, so currently I am not causing any behavior changes.

tstoeckler’s picture

Ahh, somehow missed the last patch. That looks pretty sweet!

hchonov’s picture

About the changes that I have done in the FormCache .. There is another issue related to those changes - #2824293: Inconsistent form build between initial form generation and the first ajax request where we need a way to ensure on rebuild always the same entity object is loaded into the form object, so probably I should remove the changes in the FormCache here and later the other issue might use the feature from here to correctly store the entity with its whole structure and references somewhere else instead in the FormCache and on retrieval get that entity and inject it again in the form object. Any thoughts about this?

hchonov’s picture

Title: Make it possible to serialize entity references » Make it possible to serialize computed field values in order to be able to make an exact snapshot of a loaded entity object
tstoeckler’s picture

+++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
@@ -189,7 +189,7 @@ public function setValue($values, $notify = TRUE) {
-        if (!$this->entity->isNew() && $values['target_id'] !== NULL && ($entity_id !== $values['target_id'])) {
+        if (!$this->entity->isNew() && $values['target_id'] !== NULL && ($entity_id != $values['target_id'])) {

Why this change?

I think that the form changes should be moved into #2824293: Inconsistent form build between initial form generation and the first ajax request. I think it's unfortunate that the patch introduces knowledge about the entity system. Maybe we can try setting the $serializeComputedFieldValues flag somewhere in ContentEntityForm. But let's figure that out independently of the base issue that is being solved here.

Edit: Fix issue link

tstoeckler’s picture

So I was thinking about whether we could make the $include_computed = TRUE case the default and what implications that would have. I couldn't come up with a concrete case of anything breaking, but that's obviously not saying much in the diverse wonderland that the Entity Field API is in core and contrib.

In any case, sending a patch that does that to the bot to see if anything breaks. If it's red, we'll know for sure that things break. If it's green that obviously doesn't mean nothing breaks, but that if anything breaks it's either not tested or in contrib.

hchonov’s picture

+++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
@@ -189,7 +189,7 @@ public function setValue($values, $notify = TRUE) {
-        if (!$this->entity->isNew() && $values['target_id'] !== NULL && ($entity_id !== $values['target_id'])) {
+        if (!$this->entity->isNew() && $values['target_id'] !== NULL && ($entity_id != $values['target_id'])) {

Why this change?

This is needed because there will be an integer and after unserialize it will be a string...

This is actually what I expect to fail in your experimental patch as it caused failures in mine :). The failures from your patch should be identical to the ones in #8.

tstoeckler’s picture

Meh, that's annoying. But OK. Let's see.

The last submitted patch, 18: 2824097-experiment.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 20: 2824097-experiment-2.patch, failed testing.

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
508 bytes
2.67 KB

So I guess what is happening with the patch is that the node type object ends up in the cached response instead of just its ID and on a page cache hit, the response gets unserialized, but system.module is not loaded. So when the NodeType instance gets constructed it fails because NodeType::$preview_mode defaults to DRUPAL_OPTIONAL.

That means two things:
1. I think it is a big problem that node type objects (or any objects whose classes rely on *.module files) cannot be part of response objects due to page cache. That has nothing to do with this issue, though.
2. This is not a viable approach, so we should go back to something like #10 but probably document some of the problems described above.

Uploading a patch just to test/prove my theory.

tstoeckler’s picture

So I had opened #2825801: Node type objects (and other objects that depend on *.module files) cannot be part of response objects due to Page Cache: Document or fix!, but closed that as duplicate in the meantime, mainly on #2807501: ResourceResponse::$responseData is serialized, but is never used: unserializing it on every Page Cache hit is a huge performance penalty. Thinking about that more, I think the test failure does not actually point to a super valid issue, it's just a very unfortunate combination of edge cases and code debt. So if we wanted to go with this approach we could postpone this on (either) #2807501: ResourceResponse::$responseData is serialized, but is never used: unserializing it on every Page Cache hit is a huge performance penalty (or #2807961: Replace usages of deprecated global constants with the new interface constants).

What do others think? Is this something we can get away with in Drupal 8?

My motivation for pursuing a non-flag approach is that it simply removes one more (albeit tiny) layer of complexity. Although the patch itself looks great and I can't think of any concrete problems, the combination of entity references, translations, entity reference revisions, ... and this flag getting lost (or turning up unexpectedly) somewhere in the mix if only due to a bug in contrib, kind of scares me a bit. I think it is absolutely necessary to fix this for modules like Inline Entity Form/Paragraphs and others to retain/reclaim some of their sanity, so if we can't get away with the behavior change, then I'm all for a flag, but I'd like to make sure that we actually need it.

hchonov’s picture

I think we should serialize the whole structure only when this is required and for later some kind of conflict management is present.

If we always serialize the whole structure it means we will break the current behavior of the entity cache with entity references. Think of entity A referencing entity B (normal core references only by id). Now we cache entity A, but inside it we cache also entity B. This means that wherever entity B changes on loading A we will be still loading a previous version of B. If we cache the entities like this then we have to merge the cache tags of A with the ones of B. Sure, this is possible and is not hard to extend.

tstoeckler’s picture

Re #25:
1. Yes, that is a very good point and we need to update ContentEntityStorageBase::setPersistentCache() to account for that.
2. That also is a pretty strong case for not making this the default behavior, but somehow optional, like with the flag in the previous approaches. If you are using entity references for "embedding" entities - like the use-case of the Paragraphs module - then this is not a problem at all, you will just get a lot of cache tags if you heavily nest your entities. But in non-embedded use-cases of entity reference this could be a problem. If you have a setup of Music Genre -> Artist -> Album -> Song and you have a "listen_count" field on the Song, this would end up invalidating the entire Music Genre every time someone listens to a song in that genre. That's definitely not what you want in that case.

hchonov’s picture

So what about if the entity type gets a function indicating if the entity has to be persisted together with its references or not. If the entity type indicates this then we have to make sure the entity is persisted with the cache tags of all the referenced entities - which actually might be a performance boost when dealing with nested entity forms based on entity references as then instead of loading e.g. 100 entities separately we will be loading only a single entity containing already all the referenced entities, this in case of everything being in the persistent cache will lead to one db query instead of 100.

RajabNatshah’s picture

+1 Testing

tstoeckler’s picture

Talked a lot with @hchonov about this last week. Some thoughts

  • After thinking and talking about #26 a fair bit I realized that a single entity might actually have both "types" of references. I.e. the "Music genre" entity in #26 might also have a Paragraphs-based "Description" field, with embedded entities. So semantically the differentiation of whether or not to embed entities is not correct on the entity-level. It is actually a property of the reference itself. That doesn't necessarily mean that the entity-level flag is completely out of the question, in my opinion, but we should keep that in mind
  • Advancing on the previous point we considered making the "embed or not" flag a setting on the entity reference field. That way InlineParagraphsWidget and InlineEntityFormWidget in contrib could check whether a field definition has that flag and only expose themselves to "embedabble" entities. That would allow #2824293: Inconsistent form build between initial form generation and the first ajax request to be solved fairly cleanly, I guess. The challenge would be to somehow expose this flag in the UI without completely freaking people out, but I'm leaving that aside for now.
  • #27 brings up a good point, that serializing entities together with their "embedded" entities would mean that we have to invalidate the cache of entities when their embedded entities change. Thus, ContentEntityStorageBase::setPersistentCache() would have to add cache tags for all embedded entities. I was thinking something like the following pseudo-code:
    interface EmbeddableEntityReferenceFieldItemListInterface extends EntityReferenceFieldItemListInterface {
    
       public function embeddedEntities();
    
    }
    
    // In EntityReferenceFieldItemList
    class EntityReferenceFieldItemList extends FieldItemList implements EmbeddableEntityReferenceFieldItemListInterface {
    
     ...
    
      public function embeddedEntities() {
        if ($this->getSetting('embed')) {
          return $this->referencedEntities();
        }
        return [];
      }
    
    }
    
    // In ContentEntityBase::__sleep()
            if ($field instanceof EmbeddedEntityReferenceFieldItemListInterface) {
              foreach ($field->embeddedEntities() as $delta => $embeded_entity) {
                $this->values[$name][$langcode][$delta]['entity'] = $embedded_entity;
              }
            }
    
    // In ContentEntityStorageBase::setPersistentCache()
          foreach ($entity as $field_name => $items) {
            if ($items instanceof EmbeddedEntityReferenceFieldItemListInterface) {
              foreach ($items->getEmbeddedEntities() as $embedded_entity) {
                $cache_tags += $this->buildCacheTags($embedded_entity);
              }
            }
          }
    
    
  • While the performance benefit mentioned in #27 is nice, that's not the reason for this issue. We really need this to solve #2824293: Inconsistent form build between initial form generation and the first ajax request. And that's not even talking about #2820235: Add inline entity form to core as experimental. The issue title is precisely correct: We need a way to store the state of an entity across different requests. That is currently utterly broken for embedded entities.
hchonov’s picture

The only problem is that even if we make the setting on the field itself, we could have two form modes - one with inline entity reference and one with autocomplete, but I think there is no perfect solution here.

About the solution with the embeded intities check from the storage this will not work as it is not recursive, I am currently working on another solution where in sleep the cache tags of the entity are extended with the ones of its embedded entities, which should be working fine - e.g. Entity A references Entity B references Entity C, so first will be called __sleep on Entity C, then on Entity B, which will get the cache tags of Entity C added and then __sleep will be called on Entity A, which will get the cache tags of Entity B and C.

hchonov’s picture

I had to make getFromStaticCache and setStaticCache public in order to put the entity references in wake up into the static cache if not already there otherwise exchange them with the ones from the static cache.

Status: Needs review » Needs work

The last submitted patch, 31: 2824097-31.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
14.54 KB
710 bytes
hchonov’s picture

There are couple of things that have to be discussed about the current approach but no matter how we decide to solve this exactly there will be always one specific question that arises now :

How do we handle referenced entities on wake up?
a) If the referenced entity is not in the static cache then put it into the static cache - this is pretty trivial.
b) if the referenced entity is already in the static cache things get complicated because
b.1 we might be currently retrieving the parent entity from the persistent storage cache.
b.2 we might be currently retrieving the parent entity from the form cache.

In the case of b.1 we could just throw away the entity initialized in wake up and use the one from the static cache - this would replicate the current behavior at 100 %.

The most complex case is b.2 where we want to recover the initial structure so it would not be correct to start using the referenced entity from the static cache instead of the entity initialized in wake up. Here we could exchange the entity and put the one from the wake up into the static cache but we might break something by exchanging the entity. Not doing this is also problematic as it would not cover the case where the referenced entities could be retrieved from the storage just by $storage->load. We could not have a perfect solution here so the less evil has to be chosen.

However we are currently not able to differentiate between being in wake up because of b.1 or b.2 but we have to so that we have different behaviors - as it looks like the best option at least to my opinion would be to start using the entity from the static cache in case of b.1 and in case of b.2 directly put the referenced entity into the static cache no matter what.

Status: Needs review » Needs work

The last submitted patch, 33: 2824097-33.patch, failed testing.

hchonov’s picture

I was also thinking about the following use case -
we are loading entity A and it references entity B, where entity B is already loaded for some time in the persistent and in the static cache, but is A is assembled from the entity DB tables. Now B will be retrieved from the static cache and cached into the persistent cache as part of A, but B might already have changes as it was in the static cache already.

hchonov’s picture

Status: Needs work » Needs review
FileSize
19.3 KB
6.01 KB

This new patch does not have any changes regarding #34 or #36.

In this new version I extend the schema for the file field and make the file widget extend from the new EnttiyReferenceInlineWidgetBase and also set the new setting on file and image fields "serialize_embedded_entities" to TRUE as those fields and their widgets actually represent an Inline Entity Form - this was suggested by tstoeckler.

Status: Needs review » Needs work

The last submitted patch, 37: 2824097-37.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
25.44 KB
6.23 KB

I think now there should be no test failures left and we could discuss about the approach of the patch and think about #34 and #36.

hchonov’s picture

Thinking over #34 I start asking myself if it is correct to store an entity together with its referenced entities in the persistent entity cache as so we would cause inconsistencies when first loading an referenced entity and then the parent - this could happen outside entity forms. But if we are dealing with entity forms this use case most probably will not occur and we are on the safe side.
So probably it only makes sense of serializing the whole entity structure only in the case of form cache or auto save storage. No I am not trying talking against caching the whole entity structure in the persistent entity cache I actually would really like see this happen as we have some pretty big entity structures and it would be a performance boost when we retrieve everything with a single db statement but outside forms I have concerns that this could cause problems. The question is if my concerns are correct regarding the inconsistency or not? Any thoughts on this?

hchonov’s picture

If we cache the referenced entities as part of the parent entity in the persistent entity cache we would have problems that for the referenced entities the post load would have been already executed as the referenced entities are cached as part of the parent entity instead on loading them as similary described as well in #2824293-14: Inconsistent form build between initial form generation and the first ajax request.

tstoeckler’s picture

hchonov’s picture

If we decide us to use the setting on a field to filter out which referenced entities are serialized together with the parent then we do not exactly need the method or we have to provide a filter parameter, based on which we will collect only entities from the relevant fields.

hchonov’s picture

This is a blocker for a correct implementation of #2824293: Inconsistent form build between initial form generation and the first ajax request covering nested entity forms, which is major and I think this issue has to be a major as well, right?

Fabianx’s picture

Priority: Normal » Major

Yes, this can be major in light of that.

fago’s picture

So the idea here is to always rebuild the form with the entity that has been used the first time the form has been generated, but if we do not serialize computed values (entity references) then on wakeup the entity might differ from the initially used entity in case e.g. referenced entities have changed - this is for the case when dealing with nested entity forms really important. This means we have to cache the whole entity structure within the form state as well.

Right, we need to be able to preserve nested entities containing changes. Right now, we also loose those changes during rendering as I described in #2831727: Prepare-view of entity references silently drops changes contained in referenced entities.

hchonov’s picture

@fago could you probably take a look at the different solutions me and tstoeckler were talking about here and tell us your opinion?

hchonov’s picture

So after I've mentioned that there are a lot of issues if we are going to use the new functionality also for the persistent cache I am dropping this as a feature or as a part of the scope of this issue and only focus on the ability to serialize a whole entity structure.

I am posting a new patch without an interdiff as it deserves a whole review. Here basically I still use the field setting to determine if the referenced entites have to be cached together with the parent. This setting is checked by a new abstract widget class for nested entity forms - EntityReferenceInlineWidgetBase and only if the setting is active then would be the widget as well. Furthermore for this new deep serailization there is the new method ContentEntityInterface::serializeWithCompleteStructure. I've added also a test so I hope of a full review :).

hchonov’s picture

Issue summary: View changes
hchonov’s picture

As we are adding a new alterable property in ContentEntityBase we have to break the reference in ContentEntityBase::__clone and in ContentEntityBase::initializeTranslation create the reference between the translation objects.

dawehner’s picture

Issue summary: View changes
dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -407,9 +425,74 @@ protected function clearTranslationCache() {
    +  protected function getFieldDefinitionsFilteredBySetting($setting, $value) {
    

    Nice abstraction!

  2. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -407,9 +425,74 @@ protected function clearTranslationCache() {
    +    return $this->fieldDefinitionsFilteredBySetting[$setting][$value];
    

    I'm wondering whether we really have to cache the result. Serializing an entity at least is a rare and just one-time operation, isn't it?

  3. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -407,9 +425,74 @@ protected function clearTranslationCache() {
    +      foreach ($this->getFieldDefinitionsFilteredBySetting('serialize_embedded_entities', TRUE) as $name => $definition) {
    

    This flag makes much more sense ... better than what was discussed in IRC before, sorry. I just read the issue summary.

  4. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -407,9 +425,74 @@ protected function clearTranslationCache() {
    +          // Get the values together with the referenced entities and flag the
    +          // referenced entities for deep serialization as well.
    +          $this->values[$name][$langcode] = $field->getValue(TRUE);
    +          foreach ($this->values[$name][$langcode] as $delta => $values) {
    +            // We can assign protected properties of other objects only if they
    +            // have the same class as ancestor.
    +            $referenced_entities = array_filter($values, function ($val) {return is_object($val) && is_subclass_of($val, ContentEntityBase::class);});
    +            foreach ($referenced_entities as $referenced_entity) {
    +              $referenced_entity->serializeCompleteEntityStructure = TRUE;
    +            }
    +          }
    

    I'm wondering whether we could move some of this logic onto the field item list level etc., so its overridable on a field type level? This would allow us to probably get rid of this specific check for entity reference fields.

  5. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -427,6 +510,24 @@ public function __sleep() {
    +    $this->serializeCompleteEntityStructure = FALSE;
    

    Can we document this line? At least for me this is not entirely obvious of why this has to happen here.

  6. +++ b/core/modules/field/field.install
    @@ -104,3 +105,39 @@ function field_update_8003() {
    +        $settings['serialize_embedded_entities'] = is_subclass_of($class, FileItem::class) ? TRUE : FALSE;
    
    +++ b/core/modules/file/src/Plugin/Field/FieldType/FileItem.php
    @@ -49,6 +49,7 @@ public static function defaultFieldSettings() {
    +      'serialize_embedded_entities' => TRUE,
    

    Can we document why we need a different behaviour for file items?

hchonov’s picture

@dawehner:
#52.4 Awesome idea - how haven't we thought of it? :) I've moved the logic from ContentEntityBase::__sleep to EntityReferenceFieldItemList::getValue. However I had to make the flag public so that it could be set by EntityReferenceFieldItemList::getValue. I am not really sure if we need a setter method for it?

This removes the need for addressing #52.1 #52.2 and #52.3 as the code is now removed.

#52.5 I've documented why we have to set back the flag to FALSE.

#52.6 This is where @tstoeckler has to comment as well as it has been his idea to adjust the FileField and the FileWIdget to the new feature.

hchonov’s picture

I've forgot to remove the new method TranslatableInterface::getTranslationLangcodes, which is not needed anymore.

hchonov’s picture

dawehner’s picture

  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -546,8 +474,14 @@ public function __wakeup() {
        */
       public function serializeWithCompleteStructure() {
    +    // When running a deep serialization the flag
    +    // "serializeCompleteEntityStructure" has to be set in order to serialize
    +    // referenced entities as well, which are referenced by field having the
    +    // setting "serialize_embedded_entities" set to TRUE.
         $this->serializeCompleteEntityStructure = TRUE;
         $serialized = serialize($this);
    +    // After the deep serialization is ready we have to unset the flag in order
    +    // to allow for normal serialization afterwards.
         $this->serializeCompleteEntityStructure = FALSE;
         return $serialized;
       }
    

    Here is a question, do we need some recursive reference protection? I know there is a module for that: https://www.drupal.org/project/entity_reference_validators but I'm wondering whether we need that kind of production for serialize as well

  2. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -109,7 +109,8 @@ protected function doTestSerialization($deep_serialization, $field_name) {
    -    $entity_level_zero = unserialize($entity_level_zero->serializeWithCompleteStructure());
    +    $serialized_entity = $entity_level_zero->serializeWithCompleteStructure();
    +    $entity_level_zero = unserialize($serialized_entity);
    

    This change seems unnecessary :)

hchonov’s picture

#56.1
Hmm this is a good question, but I would say that we do not need such a protection as it actually depends on the developer who is building the system to set the field setting to true, so she has actually to think about this if there are circular references and do not serialize the parent if somehow it is referenced somewhere.

#56.2
I've made the change because it was easier for me to use breakpoints while debugging, but if desired on the next patch version I could revert the change.

The last submitted patch, 53: 2824097-53.patch, failed testing.

tstoeckler’s picture

I have one architectural issue with the patch:

+++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
@@ -156,6 +156,13 @@
   /**
+   * Whether to serialize the complete entity structure.
+   *
+   * @var bool
+   */
+  public $serializeCompleteEntityStructure = FALSE;

@@ -454,6 +461,34 @@ public function __sleep() {
+  /**
+   * {@inheritdoc}
+   */
+  public function serializeWithCompleteStructure() {
+    // When running a deep serialization the flag
+    // "serializeCompleteEntityStructure" has to be set in order to serialize
+    // referenced entities as well, which are referenced by field having the
+    // setting "serialize_embedded_entities" set to TRUE.
+    $this->serializeCompleteEntityStructure = TRUE;
+    $serialized = serialize($this);
+    // After the deep serialization is ready we have to unset the flag in order
+    // to allow for normal serialization afterwards.
+    $this->serializeCompleteEntityStructure = FALSE;
+    return $serialized;
+  }

I don't like the fact that for the top-level entity we have a dedicated method and then for nested entities we just set the flag ourselves and rely on the recursive nature of serialize(). I was thinking of how to avoid this, and I thought that maybe we can provide a dedicated method on the field item list. See this example code for what I mean:

class ContentEntityBase {

...

  public function __sleep() {
    ...
    if ($this->serializeCompleteEntityStructure) {
      $this->values[$name][$langcode] = $field->getSerializationValue();
    }
    else {
      $this->values[$name][$langcode] = $field->getValue();
    }
    ...
  }

}

class EntityReferenceFieldItemList {

...

  public function getValue($include_computed) { ... }

  public function getSerializationValue() {
    $values = $this->getValue($this->getFieldDefinition()->getSetting('serialize_embedded_entities'));
    foreach ($values as $delta => $item_values) {
      $values[$delta]['entity']->serializeCompleteEntityStructure = TRUE;
  }

}

This way the property value would be read exclusively by ContentEntityBase::__sleep().

We could then consider making it protected again by providing a setter for it. No getter would be needed.

I have a couple more notes, but I think this would be the most important issue to discuss. Here the rest of the notes, though for reference:

  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -156,6 +156,13 @@
    +  public $serializeCompleteEntityStructure = FALSE;
    
    @@ -454,6 +461,34 @@ public function __sleep() {
    +  public function serializeWithCompleteStructure() {
    

    I think the name of the property and method is not ideal. It implies that not using this method yields something that has an incomplete structure, or something. I think serializeWithEmbeddedReferences would be more explicit, although that's also very long, if not even longer.

  2. +++ b/core/lib/Drupal/Core/Field/EntityReferenceInlineWidgetBase.php
    @@ -0,0 +1,19 @@
    +abstract class EntityReferenceInlineWidgetBase extends WidgetBase {
    
    +++ b/core/modules/file/src/Plugin/Field/FieldWidget/FileWidget.php
    @@ -27,7 +27,7 @@
    -class FileWidget extends WidgetBase implements ContainerFactoryPluginInterface {
    +class FileWidget extends EntityReferenceInlineWidgetBase implements ContainerFactoryPluginInterface {
    

    I absolutely think this makes sense. The file widget is a form of an inline entity form. When we talked about all this we realized depending on how this embedded serialization will be used in form_state / entity cache / ... the one thing that could become problematic is if "embedded" entities are re-used. This is not possible e.g. with Paragraphs, but it can be possible with the more generic Inline Entity Form. With the FileWidget in core, though, this is also not possible, so it makes it a perfect candidate in my view.

    However, I think this change significantly increases the complexity of the patch. Especially, since the method is unused as of now, it really doesn't provide any value yet to do this change. So I would propose moving these changes to a follow-up.

hchonov’s picture

After talking with @tstoeckler about #59 we agreed on the following:

1. The entity property name: $deepSerialization.
2. The method name for retrieving a deeply serialized entity: ContentEntityInterface::deepSerialize();
3. A method for flagging the entity for deep serialization the next time it is serialized: ContentEntityInterface::setDeepSerialization();
4. As even in the previous patch we had the problem that we set the flag on referenced entities but do not remove it we considered that in such a complex case the only possible place of unsetting it for the parent and the children would be in __sleep just after we have prepared the entity for deep serialization.
5. Introduce a new method FieldItemListInterface::getSerializationValue($deep_serialization).
6. Have a follow-up to adjust file and image fields in order to keep the current patch as simple as possible for entity references only.

@dawehner and @tstoeckler thank you both for the reviews and the ideas. I think this is getting better and better :). I would even say it looks fine now according to my opinion, but there might be some unit tests that could probably fail by adding a new method to the field item list interface.. let's see...

hchonov’s picture

As we have a setter for the flag now the property should be protected, not public anymore.

jibran’s picture

Issue tags: +DER issue
dawehner’s picture

I really like the new name. Just like a deep copy, you do a deep serialization here.

  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -156,6 +156,13 @@
     
       /**
    +   * Whether to serialize the complete entity structure.
    +   *
    +   * @var bool
    

    ... I think it would be nice to mention that this also contains referenced entities.

  2. +++ b/core/lib/Drupal/Core/Entity/ContentEntityInterface.php
    @@ -74,4 +74,23 @@ public function getLoadedRevisionId();
    +  /**
    +   * Returns a deeply serialized entity.
    +   *
    +   * @return string
    +   *   The deeply serialized entity.
    +   */
    +  public function deepSerialize();
    +
    +  /**
    +   * Flags the entity for deep serialization.
    +   *
    +   * After calling this method and serializing the entity e.g.
    +   * "serialize($entity);" a deep serialization will be performed and the flag
    +   * will be automatically reset.
    +   *
    +   * @return $this
    +   */
    +  public function setDeepSerialization();
    

    I'm wondering whether these methods actually belong onto the FieldableEntityInterface

  3. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -340,6 +341,14 @@ public function fieldSettingsForm(array $form, FormStateInterface $form_state) {
    +    $form['serialize_embedded_entities'] = array(
    +      '#type' => 'checkbox',
    +      '#title' => $this->t('Serialize entity references together with the parent entity when performing deep serialization of the parent entity.'),
    +      '#description' => $this->t('If the field widget to be used for this field is an "entity reference inline widget" then the setting must be set.'),
    +      '#default_value' => $field->getSetting('serialize_embedded_entities'),
    +    );
    

    I'm really wondering whether its a good idea to expose this as a UI element. For me this reads like something you should be able to just handle in YML files directly

tstoeckler’s picture

EDIT: Crosspost. Not much duplication though, except that I agree with #63.3

Thanks @hchonov, I agree with you, this is looking quite nice now. I like it a lot! Thank you very much for your efforts on this!

I only have some minor points now, nothing architectural:

  1. +++ b/core/lib/Drupal/Core/Field/EntityReferenceInlineWidgetBase.php
    @@ -0,0 +1,19 @@
    +abstract class EntityReferenceInlineWidgetBase extends WidgetBase {
    

    I guess this is no longer needed either, right?

  2. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -340,6 +341,14 @@ public function fieldSettingsForm(array $form, FormStateInterface $form_state) {
    +    $form['serialize_embedded_entities'] = array(
    +      '#type' => 'checkbox',
    +      '#title' => $this->t('Serialize entity references together with the parent entity when performing deep serialization of the parent entity.'),
    +      '#description' => $this->t('If the field widget to be used for this field is an "entity reference inline widget" then the setting must be set.'),
    +      '#default_value' => $field->getSetting('serialize_embedded_entities'),
    +    );
    

    Forgot to mention this earlier, but what do you think about making this a "hidden" setting for now? and introducing a UI setting in a follow-up? I think it would make sense to introduce this together with an actual use-case (i.e. FileWidget or something). Also, it's kind of "unfair" ;-) right now to potentially confuse people with these details if it currently doesn't actually do anything.

    As far as I know contrib/custom module could alter this in, as well, and it should be properly saved to the field settings, so not doing this (for now) shouldn't block anything.

  3. +++ b/core/modules/field/field.install
    @@ -104,3 +104,39 @@ function field_update_8003() {
    +  $config = \Drupal::configFactory();
    

    Let's call this $config_factory to avoid confusion.

  4. +++ b/core/modules/field/field.install
    @@ -104,3 +104,39 @@ function field_update_8003() {
    +    $field = $config->getEditable($field_id);
    

    For clarity I think this should be called $field_config.

  5. +++ b/core/modules/field/field.install
    @@ -104,3 +104,39 @@ function field_update_8003() {
    +    if ($class == EntityReferenceItem::class || is_subclass_of($class, EntityReferenceItem::class)) {
    

    Can we not simply check $field['field_type'] === 'entityreference'?

  6. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -0,0 +1,119 @@
    +  protected $entityTypeId;
    ...
    +    $this->entityTypeId = 'entity_test_mul';
    

    Minor, could be initialized as the default value directly.

  7. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -0,0 +1,119 @@
    +    $this->entityTestMulStorage = $this->entityManager->getStorage($this->entityTypeId);
    

    Since we abstract away the entity type ID in a variable, I think can just be called entityStorage

  8. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -0,0 +1,119 @@
    +  public function testSerialization() {
    +    $field_name = 'field_test_entity_reference';
    +    // Create the test entity reference field.
    +    FieldStorageConfig::create([
    +      'field_name' => $field_name,
    +      'entity_type' => $this->entityTypeId,
    +      'type' => 'entity_reference',
    +      'settings' => [
    +        'target_type' => $this->entityTypeId,
    +      ],
    +    ])->save();
    +    FieldConfig::create([
    +      'entity_type' => $this->entityTypeId,
    +      'field_name' => 'field_test_entity_reference',
    +      'bundle' => $this->entityTypeId,
    +      'translatable' => TRUE,
    +    ])->save();
    +
    +    $this->doTestSerialization(FALSE, $field_name);
    +    $this->doTestSerialization(TRUE, $field_name);
    +  }
    ...
    +    $this->assertEquals($deep_serialization ? 'entity level one' : $initial_entity_name, $entity_level_zero->$field_name->entity->label());
    +    $this->assertEquals($deep_serialization ? 'entity level two' : $initial_entity_name, $entity_level_zero->$field_name->entity->$field_name->entity->label());
    

    Also pretty minor, and also arguable (so feel free to leave as is), but this feels a bit like the creation of the field config + storage should be in setUp() and the two doTestSerialization() should in fact be two separate test methods, i.e. testSerialization() and testDeepSerialization(). ($field_name could be another property on the test, next to entityTypeId then.) You could then put the entity setup in a separate createNestedEntities() method (or similar) to avoid duplication. In my personal opinion that would make the test a bit more readable, but again, feel free to ignore, it's not important.

  9. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -0,0 +1,119 @@
    +    $entity_level_zero->name = 'entity level zero';
    +    $entity_level_zero->$field_name->entity->name = 'entity level one';
    +    $entity_level_zero->$field_name->entity->$field_name->entity->name = 'entity level two';
    

    As this seems to be "the heart" of the test, I think it could use a comment. Generally it's best to describe why something is happening, not what is happening, but this might be an exception. Because it's very easy to miss that there is no ->save() call below I think simply saying "Change the entity values in memory but do not persist them to storage." or something like that would be helpful.

  10. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -0,0 +1,119 @@
    +    $this->entityTestMulStorage->resetCache();
    

    Very interesting. Can you explain why this is needed? (Preferably with a code comment instead of in this issue ;-))

hchonov’s picture

#63.3 This is not possible, as if a site builder is configuring the entity reference fields for a module like paragraphs or inline entity form she should be able of using this feature as those are just regular entity reference fields and we could not put the setting into there. We also have defined an abstract Widget for inline entity forms checking for this setting of the field.

hchonov’s picture

#64.1 & .2 why would the new widget not be needed anymore - it should check against the setting and be used by inline reference widget. The use of the widget is for entity reference inline modules/widgets .

#64.5 What about fields of type EntityReferenceRevisions and others extending EntityReference?

tstoeckler’s picture

#64.5 What about fields of type EntityReferenceRevisions and others extending EntityReference?

Hmm... that's a good point, I didn't think about this. So this would in go update all image and file fields as well. But it would also update e.g. entity reference revisions fields. I'm not sure that's a good thing, though. Maybe so, because otherwise it would kind of be unfair, that Entity Reference Revisions has to create its own update path just because it is calling parent::defaultSettings(). On the other hand it does seem strange to automatically update contrib field types. Not sure what others think, but I guess let's leave it as it is for now.

dawehner’s picture

#63.3 This is not possible, as if a site builder is configuring the entity reference fields for a module like paragraphs or inline entity form she should be able of using this feature as those are just regular entity reference fields and we could not put the setting into there. We also have defined an abstract Widget for inline entity forms checking for this setting of the field.

Well, that's the thing. Keep in only in YAML for now, because otherwise you would have to deal with userexperience in this issue. I think conceptually it would be better to have enough time to discuss the UX of this particular flag, so do the UI in 8.4.x but add the functionality for itself in 8.3.x

hchonov’s picture

I think if the setting is shown might be dependent on that if a specific module such as paragraphs or inline entity form is installed. Any idea how we could make those modules register into the system as if there is such a module installed?

hchonov’s picture

Ok, I agree. I'll address the other remarks later and provide a new patch and make a follow-up to decide how we could make the setting available. I would propose to leave the abstract EntityReferenceInlineWidget inside, what do you think? It should be used later so...

The last submitted patch, 60: 2824097-60.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 61: 2824097-61.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
16.71 KB
8.71 KB

#63.1 & .2 done.

#63.3 & #64.2 : For now I removed the setting from the field form completely. This really deserves its own issue.

#64.1 I think that it makes sense of keeping the abstract widget so that contrib modules extend from it when they define widgets for inline entity reference editing.

#64.3, .4, .6, .7 & .9 done.

#64.5 We have to cover also fields extending from EntityReference, but I think that the tests are failing now because we set the setting on image and file fields, so doing this now only for field types starting with "entity_reference" in order to cover also entity_reference_revisions and related ones.

#64.8 For now I only moved the field creation into the setUp, bet left the test as it is as currently it clearly shows the difference between normal and deep serialization.

#64.10 Actually we do not have to clear the cache, so removing the line.

hchonov’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
@@ -0,0 +1,126 @@
+      'field_name' => 'field_test_entity_reference',

Here we have to retrieve the field name from the property in the test class instead of hard coding it.

The last submitted patch, 73: 2824097-73.patch, failed testing.

Status: Needs review » Needs work

The last submitted patch, 74: 2824097-74.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
18.31 KB
1.86 KB

The FileItem::defaultFieldSettings is retrieving the settings from the parent (EntityReferenceItem) and thus also having the setting "serialize_embedded_entities", so we have two options:

1. FileItem::defaultFieldSettings should remove the setting after retrieving it from the parent.
2. The schema of "base_file_field_field_settings" should contain the new setting as well.

I think 2. is better so the new patch has this change.

Status: Needs review » Needs work

The last submitted patch, 77: 2824097-77.patch, failed testing.

hchonov’s picture

Status: Needs work » Needs review
FileSize
18.95 KB
487 bytes

Fixing the last failing test.

There is only one disadvantage I could think of about the current approach - parent entity with two form displays, where the one form display is using an entity reference inline widget and the other not to display the entities of a specific entity reference field. In this case the deep serialization will contain the referenced entities for both form displays, even if it is not necessary for the form display not using an entity reference inline widget .. I was thinking that the form cache might somehow check if entity reference widgets are being used by the form, but this is going to work only on the first level and not cover the underlying levels. Probably we could make it like this that if there are entity references using an entity reference inline widget on the first level then we perform deep serialization also for the children no matter if their nested entity forms use entity reference inline widgets or not. I think this would be a good trade-off. But all this is for the other issue - #2844229: Serializing content entity form objects should deeply serialize the entity in case of nested/inline entity forms. Having said that I think the patch should be ready now.

Version: 8.3.x-dev » 8.4.x-dev

Drupal 8.3.0-alpha1 will be released the week of January 30, 2017, which means new developments and disruptive changes should now be targeted against the 8.4.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

hchonov’s picture

I've just talked with @tstoeckler about my last comment here and based on it we are not really sure if we want the the field setting or not as it does not seem possible or if it would be not that trivial to combine with different form displays. And also we have to decide based on the form display components if we want a normal or deep serialization - if the first level has an entity reference inline widget or not. So based on all this we throw away the setting for now and make an universal deep serialization possible, but later in some of the issues, the field setting might be revived again, for which we are not sure yet.

By doing this we make the approach currently much simpler and easier to understand and use without any restrictions.

hchonov’s picture

And we've also forgot to check if the entity is implementing the FieldableEntityInterface before calling the ::setDeepSerialization() method on it when running deep serialization, which is needed because referenced config entities will be deeply serialized as well and they do not implement that interface.

tstoeckler’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -205,12 +205,12 @@ public function setValue($values, $notify = TRUE) {
    +    if ($include_computed || $this->hasNewEntity()) {
           $values['entity'] = $this->entity;
         }
    

    I think the $include_computed part should be !$this->isEmpty() && $include_computed otherwise $this->entity can be NULL, as far as I can tell and this can lead to errors down the road.

hchonov’s picture

Title: Make it possible to serialize computed field values in order to be able to make an exact snapshot of a loaded entity object » Deep serialization for contnet entities to make an exact snapshot of an entity object's structure based on its current state
Issue summary: View changes
Status: Needs work » Needs review
FileSize
12.21 KB
685 bytes

Sure, you are right @tstoeckler :).

I've also reworked the IS and the title.

hchonov’s picture

Title: Deep serialization for contnet entities to make an exact snapshot of an entity object's structure based on its current state » Deep serialization for content entities to make an exact snapshot of an entity object's structure based on its current state
hchonov’s picture

FileSize
12.45 KB
1.11 KB

I think that the setDeepSerialization method needs a parameter in order to be a real setter and to be possible to turn on and off deep serialization before the entity is being serialized.

hchonov’s picture

I've posted a PoC patch on #2844229-3: Serializing content entity form objects should deeply serialize the entity in case of nested/inline entity forms to show how the current patch here might be used to solve the other issue.

dawehner’s picture

I like that the patch got simpler. On the other hand, I'm not deep into the entity system to really be able to judge this change properly.

hchonov’s picture

Going over again through the patch I think that we don't really need the new method FieldableEntityInterface::deepSerialize() as it is using directly the PHP serialize function, but the direction of core is to allow for different serializers to be used.
See #839444: Make serializer customizable for Cache\DatabaseBackend about this.

Because of this I will remove the method in the next patch.

hchonov’s picture

Issue summary: View changes
FileSize
12.29 KB
2.8 KB

Removing FieldableEntityInterface::deepSerialize() because of #89.

hchonov’s picture

Re-roll only.

dawehner’s picture

+++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
@@ -108,8 +108,10 @@ protected function doTestSerialization($deep_serialization) {
+    $entity_level_zero = unserialize(serialize($entity_level_zero));

Can you put some documentation why this is needed?

hchonov’s picture

@dawehner, I've documented it. Hope that the comment is sufficient.

dawehner’s picture

Thank you @hchonov
I really like how simple the solution now looks like. It would be great if some additional entity subsystem maintainer could have a look at it though.

amateescu’s picture

Status: Needs review » Needs work
  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityBase.php
    @@ -157,6 +157,16 @@
    +   * Whether to serialize the complete entity structure.
    +   *
    +   * If set to TRUE when the entity is being serialized a deep serialization
    +   * will be performed and the flag will be automatically set back to FALSE.
    

    Like with the setDeepSerialization() method, we need to better explain what a "deep serialization" is.

  2. +++ b/core/lib/Drupal/Core/Entity/FieldableEntityInterface.php
    @@ -236,4 +236,19 @@ public function isValidationRequired();
    +   * After calling this method and serializing the entity e.g.
    +   * "serialize($entity);" a deep serialization will be performed and the flag
    +   * will be automatically reset.
    

    I think we need to explain what a "deep serialization" is, e.g. it includes computed properties or something like that.

    And maybe some usage examples as well :)

  3. +++ b/core/lib/Drupal/Core/Entity/FieldableEntityInterface.php
    @@ -236,4 +236,19 @@ public function isValidationRequired();
    +   * @param bool $deep_serialization
    +   *   TRUE if deep serialization should be performed when the entity is being
    +   *   serialized, FALSE otherwise. Defaults to TRUE.
    

    This parameter needs to marked as (optional) at the beginning of the description, but I actually think that we should drop the default TRUE value in order to make the calling code more explicit.

  4. +++ b/core/lib/Drupal/Core/Field/FieldItemListInterface.php
    @@ -271,4 +271,14 @@ public static function processDefaultValue($default_value, FieldableEntityInterf
    +   * @return mixed
    +   */
    +  public function getSerializationValue($deep_serialization);
    

    Missing description for @return.

  5. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -208,12 +208,12 @@ public function setValue($values, $notify = TRUE) {
    +    if (($include_computed && !$this->isEmpty()) || $this->hasNewEntity()) {
    

    In order to increase performance and not try to compute the 'entity' property of the field item, it's better to put the hasNewEntity() check in front.

    Actually, hasNewEntity() has the same problem and it can be optimized a bit, but that's not in the scope of this patch :)

amateescu’s picture

  1. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -208,12 +208,12 @@ public function setValue($values, $notify = TRUE) {
    +  public function getValue($include_computed = FALSE) {
    

    The $include_computed param is not coming from the interface so it is not actually documented anywhere :)

  2. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -208,12 +208,12 @@ public function setValue($values, $notify = TRUE) {
    -    if ($this->hasNewEntity()) {
    +    if (($include_computed && !$this->isEmpty()) || $this->hasNewEntity()) {
    

    This needs to be turned into a more generic approach. For example, LanguageItem also has a computed property ('language') which doesn't receive this special treatment and therefore will not be part of the "deep serialized" structure.

hchonov’s picture

Status: Needs work » Needs review
FileSize
13.13 KB
5 KB

@amateescu, thank you for the review!

#95;
1. & 2.:
I've put a better documentation on the method and referenced it from the entity property.

3.:
The parameter is not optional anymore.

4.:
I haven't added any return doc as TypedDataInterface::getValue does not have any return doc as well, but now I've tried to summarize what would be returned.

5.:
I've reordered the conditions.

#96;
1.:
I don't think we could do anything here about this and if so it would be out of scope of this issue. \Drupal\Core\Field\FieldItemList::getValue is defined the same way and when calling getValue on the $items always provides the $include_computed parameter.

2.:
Good catch! :). I am working now on a general solution and will afterwards provide another patch.

hchonov’s picture

#96.2

Introducing a more general approach for including computed properties during deep serialization.

hchonov’s picture

We have to put the new method on the ContentEntityInterface, as otherwise it would be a BC break. Having the new method on ContentEntityInterface is however allowed because there we have a 1:1 interace:class relationship and the Drupal policy is that in this case custom/contrib have to extend from the base class instead implementing the interface on their own. This has been just discussed at the DevDays Seville for another issue - #2862574-11: Add ability to track an entity object's dirty fields.

hchonov’s picture

I've added extended test coverage to assure that the language property will be included during deep serialization and discovered that this wasn't the case and fixed it as well. Here an updated patch.

hchonov’s picture

Oups a little mistake in #100 as I've had added a code to debug .. Removing it now.

tstoeckler’s picture

Nice, that the patch is now a lot simpler. I am not super happy that we are unconditionally including all entities (so e.g. type, uid and revision_uid of each referenced entity...) but I think the trade-off of the reduced complexity is worth it. And I don't see any reason why that would break anything, it's just that a bit more data is serialized than strictly speaking would be needed.

Some thoughts on the patch:

  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityInterface.php
    @@ -71,4 +71,34 @@ public function getLoadedRevisionId();
    +   * Example usage:
    +   * -Alter the title of a referenced entity and serialize the parent entity
    +   * through deep serialization in order to serialize the change made on the
    +   * referenced entity.
    +   * @code
    +   *   $entity->entity_reference_field->entity->setTitle('new entity reference title');
    +   *   $entity->setDeepSerialization(TRUE);
    +   *   serialize($entity);
    +   * @endcode
    

    Yay, thank you for this! Minor nitpick: Add a space after the "-"

  2. +++ b/core/lib/Drupal/Core/Field/EntityReferenceFieldItemList.php
    @@ -13,6 +14,25 @@ class EntityReferenceFieldItemList extends FieldItemList implements EntityRefere
    +    if ($deep_serialization) {
    +      $values = $this->getValue(TRUE);
    ...
    +    }
    +    else {
    +      $values = parent::getSerializationValue($deep_serialization);
    +    }
    

    As far as I can tell, the $this->getValue(TRUE) call is equivalent to the parent call. So (unless I'm mistaken) I think it would be clearer if we do $values = parent::getSerializationValue($deep_serialization) before the if-else.

  3. +++ b/core/lib/Drupal/Core/Field/FieldItemBase.php
    @@ -101,6 +101,43 @@ protected function getSetting($setting_name) {
    +    // Include computed properties if the condition of
    +    // ::includeComputedProperties() is fulfilled.
    +    if ($this->includeComputedProperties($include_computed)) {
    

    I think the interesting part of this is that we already pass in $include_computed but we allow field items to (optionally) override that. So maybe the comment could be something like "Allow field types to opt-in or opt-out of including computed properties in certain cases."

  4. +++ b/core/lib/Drupal/Core/Field/FieldItemBase.php
    @@ -101,6 +101,43 @@ protected function getSetting($setting_name) {
    +   * Helper method for ::getValue() to check if included properties have to be
    +   * included.
    

    Maybe s/have to be/should be ?

    Super minor, but would also be nice to add a @see to ::getValue() to this method. When looking at the code it's not necessary, because it's just above it, but on api.drupal.org it will be helpful for people.

  5. +++ b/core/lib/Drupal/Core/Field/FieldItemBase.php
    @@ -101,6 +101,43 @@ protected function getSetting($setting_name) {
    +    return $include_computed && !$this->isEmpty();
    

    Would be nice to add a comment here. Something like: "Even if computed fields were requested, do not attempt to compute anything if ther are no actual values." ?

  6. +++ b/core/lib/Drupal/Core/Field/FieldItemListInterface.php
    @@ -271,4 +271,18 @@ public static function processDefaultValue($default_value, FieldableEntityInterf
    +   *   Whether a deep serialization is running.
    

    Instead of "is running" I think "should be performed" would be clearer.

  7. +++ b/core/lib/Drupal/Core/Field/FieldItemListInterface.php
    @@ -271,4 +271,18 @@ public static function processDefaultValue($default_value, FieldableEntityInterf
    +   * @return mixed
    

    This should be array right? (Or even array[] I guess?)

  8. +++ b/core/lib/Drupal/Core/Field/FieldItemListInterface.php
    @@ -271,4 +271,18 @@ public static function processDefaultValue($default_value, FieldableEntityInterf
    +   *   Depending on whether deep serialization is running or not the returned
    ...
    +   *   serialization is not running or additionally contain computed properties
    +   *   next to the data values if deep serialization is running.
    

    Here, as well, I think "being performed" would be nicer than "is running".

hchonov’s picture

amateescu’s picture

Status: Needs review » Needs work

The patch looks much better now, thanks for addressing the reviews so quickly!

  1. +++ b/core/lib/Drupal/Core/Entity/ContentEntityInterface.php
    @@ -71,4 +71,34 @@ public function getLoadedRevisionId();
    +   *   serialized, FALSE otherwise. Defaults to TRUE.
    

    Since the parameter is not optional anymore, there is no default :)

  2. +++ b/core/lib/Drupal/Core/Field/FieldItemBase.php
    @@ -101,6 +101,47 @@ protected function getSetting($setting_name) {
    +        $definition = $property->getDataDefinition();
    +        if ($definition->isComputed()) {
    

    We are not using $definition again so it doesn't look very useful to assign a variable for it in the first place.

  3. +++ b/core/lib/Drupal/Core/Field/FieldItemBase.php
    @@ -101,6 +101,47 @@ protected function getSetting($setting_name) {
    +   *   TRUE if computed properties have to be included. FALSE otherwise.
    

    "FALSE otherwise" is not a sentence on its own, so we should replace the period before FALSE with a comma ;)

  4. +++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
    @@ -208,15 +208,10 @@ public function setValue($values, $notify = TRUE) {
    -  public function getValue() {
    -    $values = parent::getValue();
    -
    +  protected function includeComputedProperties($include_computed) {
         // If there is an unsaved entity, return it as part of the field item values
         // to ensure idempotency of getValue() / setValue().
    -    if ($this->hasNewEntity()) {
    -      $values['entity'] = $this->entity;
    -    }
    -    return $values;
    +    return $this->hasNewEntity() ?: parent::includeComputedProperties($include_computed);
       }
    

    I think this change is wrong because FieldItemBase has $include_computed as FALSE by default, so we are losing the "idempotency of getValue() / setValue()" part?

hchonov’s picture

Status: Needs work » Needs review
FileSize
16.13 KB
1.41 KB

@amateescu, thank you for the review again!

I've covered 1 - 3. About 4 - no we do not loose the idempotency of setValue/getValue, because even if we have the parameter set to FALSE by default we return TRUE if the entity is new - we have exactly the same condition - first check if the entity is new and then simply return it without taking the parameter into account.

return $this->hasNewEntity() ?: parent::includeComputedProperties($include_computed);

In this case the parameter will be taken into account if there is no new entity :).

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

Awesome! I think this is finally ready now :)

hchonov’s picture

@amateescu, @tstoeckler: thank you both for the reviews and all the suggestions!

amateescu’s picture

Issue tags: +DevDaysSeville
Wim Leers’s picture

Two things are not clear here to me:

  1. The performance implications.
  2. Why this is necessary. The code comment says this:
    Alter the title of a referenced entity and serialize the parent entity through deep serialization in order to serialize the change made on the referenced entity. But when would one do that? When would that be valid? Because in REST for example, we must modify entities one-by-one anyway.
alexpott’s picture

Status: Reviewed & tested by the community » Needs review

From the issue summary:

first approach I am currently evaluating

I think before we make this type of change to core we need something strong than that. Also theoretically couldn't said module write a class that wraps entities and uses reflection to perform to do the full state serialisation. Also tbh the full state serialisation feels like a recipe for some very complex bugs.

Wim Leers’s picture

Also tbh the full state serialisation feels like a recipe for some very complex bugs.

Exactly. Hence the concerns I raised in #109. Also, it can result in very complex performance bugs — because there is not necessarily a limit on the amount of recursion (when following entity references). Nor do I see a guarantee that this works correctly for cyclical references.

tstoeckler’s picture

Status: Needs review » Needs work
Issue tags: +Needs issue summary update, +Needs tests

Thanks @alexpott and @Wim Leers!
You are right that the issue summary doesn't currently make a good case for this. Will try to find some time to get it in shape. Also we should add tests for cyclical references (and I guess fix them, because I fear that indeed, they do not work with the current patch). Very nice catch!!

hchonov’s picture

Issue summary: View changes
Status: Needs work » Needs review
Issue tags: -Needs issue summary update, -Needs tests

When the form is cached for multi step forms then the form entity is cached as part of it and on each Ajax request for e.g. adding new fields the entity with the new field on it is cached as part of the cached form. Now imagine you do the same, but on the third level of a nested entity form - in this case the added field on the entity on the third level is lost because we would only cache the ID of the referenced entities and not the whole entity like the parent one. Here we have the first inconsistency between the parent and the child entities in a nested inline form structure.

From the performance point of view: consider that on each form rebuild you might have to load a lot of entities separately currently but if you use deep serialization you will not need to load not a single entity anymore. So in the case of nested entity inline forms we have PHP computation time to serialize 100 entities at once and on form rebuild do 0 entity loads versus serialize only the parent and on form rebuild do 99 separate entity loads. Not having measured the impact I would say the second is slower. It is better to retrieve e.g. 500 KB with a single query instead 500 KB with 99 queries.

Caching the parent entity prevents that during concurrent editing the changes made from another user and saved will suddenly be mixed into the entity that another user is still editing and on form rebuild when he adds a fields suddenly the values might get reordered or some be deleted or new ones present. As a summary a lot of unexpected things happen when the entity is not being serialized no matter if parent or reference. A similar case where we got into troubles is #2675010: Cloned entity will point to the same field objects if the clone was created after an entity translation has been initialized. In that issue you might see what happens if suddenly you have a modified entity in the form instead the original used one to create the form.

Deep serilization will not be active by default. It is there only for widgets that will require it. For this see #2844229: Serializing content entity form objects should deeply serialize the entity in case of nested/inline entity forms.

If you need the deep serialization because of nested inline entity forms and limit how much is going to be deeply serilized then your inline form will break, so limits are not really a good thing.
About circular references - well currently you could not open the edit form or render an entity that contains circular references, right? This problem exists in core and the new feature of deep serialization should not be the one that is fixing it, right? Therefore I am removing the tag "Needs tests".

I've updated the issue summary with the information from this comment and therefore removing the "needs issue summary update" tag.

Wim Leers’s picture

But this is changing innards of Entity API, not of entity forms. If you'd only be changing form logic, I'd agree. But now, I'm not entirely convinced by #113.

hchonov’s picture

Entity Forms are part of the Entity Sub-System and we could not introduce entity deep serialization without doing it in the Entity API. And I am still saying that is going to be used only by inline entity forms. So this is only a feature and not a default behavior.

hchonov’s picture

Re #110:

I think before we make this type of change to core we need something strong than that. Also theoretically couldn't said module write a class that wraps entities and uses reflection to perform to do the full state serialisation. Also tbh the full state serialisation feels like a recipe for some very complex bugs.

About deep serialization through the current patch approach:

@alexpott, the current way of serialization (not deep one) will always contain new entities, so the only thing changing is that if deep serialization is requested not only new but also existing entities will be contained. So there is no difference between the following code regarding only entities:

// Using normal serialization.
$entity_level_zero = EntityTest::create();
$entity_level_one = EntityTest::create();
$entity_level_two = EntityTest::create();
$entity_level_zero->field_entity_reference->appendItem(['entity' => $entity_level_one]);
$entity_level_one->field_entity_reference->appendItem(['entity' => $entity_level_two]);
serialize($entity_level_zero);

and

// Using deep serialization.
$entity_level_zero = EntityTest::create();
$entity_level_one = EntityTest::create();
$entity_level_two = EntityTest::create();
$entity_level_zero->field_entity_reference->appendItem(['entity' => $entity_level_one]);
$entity_level_one->field_entity_reference->appendItem(['entity' => $entity_level_two]);
$entity_level_zero->setDeepSerialization(TRUE);
serialize($entity_level_zero);

So the only thing that I could think of that could cause problems are only circular references, but they could be caused even now - e.g. consider appending the entity zero to entity two - both normal and deep serialization could not detect this.

About reflection:
Even if it would be possible somehow to implement it with reflection, then it would be much more complex than the current approach and the performance will suffer. Also I honestly tried to think of an approach how to do it with reflection after you've mentioned it, but I have no idea how could we attach ourselves to ContentEntityBase::__sleep and there put the entities into the $values property, just before the fields are destroyed. And we need the entities to be contained in $values, because that way on wake up the entity references will be provided to the fields automatically next to the target IDs.

I've found some nitpicks in the last patch, so fixing them now.

tstoeckler’s picture

Issue tags: +Needs tests

So the only thing that I could think of that could cause problems are only circular references, but they could be caused even now - e.g. consider appending the entity zero to entity two - both normal and deep serialization could not detect this.

Your right that that will currently cause problems. I will file a separate bug report for that. In combination with deep serialization it becomes a much more severe issue, though. In HEAD it is only an issue if you serialize an entity that you have created with looped references in memory - as your example shows. With deep serialization this will become an issue even for entities that have been persisted in the DB. And the likelihood of serializing persisted entities vs. entities in memory is much higher, as well as the likelihood of persisted entities having circular references vs. in memory entities. So while we need to fix the bug in HEAD, I don't think this issue can go in if it has a problem with circular references.

hchonov’s picture

How do we prevent infinite rendering in core currently? Is it possible to create an infinite loop of entity reference currently? What about my example from my previous comment, but instead save the entities? I think this is going to create the infinite loop when rendering the main entity? If this is the case then this problem already exists in core and is a problem on its own.

hchonov’s picture

I've talked with @tstoeckler and he showed me that there is a recursion limit for the rendering. So my mistake, I am sorry.

Afterwards we've discussed how we could track the currently serialized rereferences and @tstoeckler proposed a really nice solution, but... I was unable of writing a test that is going to fail, because how currently the serilization works is that in CEB::__sleep $this->fields will be emptied so that even with circular references it is impossible that we create an infinite loop with deep serialization. This means we have this already covered. In the worst case what could happen is the following with existing entities:

$entity_zero->field_reference->appendItem(['entity' => $entity_one]);
$entity_zero_clone = clone $entity_zero; // Ensure different object reference.
$entity_one->field_reference->appendItem(['entity' => $entity_zero_clone]);
$entity_one->setDeepSerialization(TRUE);
serialize($entity_one);

In this case because and only because the parent entity is appended with different reference because it is cloned the serialized object will contain the entity zero twice - once from the beginning and once inside entity one. So in this case CEB__sleep is called only 3 times - first for entity zero, then for entity one and then for the cloned entity zero. I was expecting that it will be called once more for the entity one referenced by the cloned entity zero but it wasn't called even if the reference field was populated with the entity one, so I guess somehow it is PHP that has a built in mechanism against serializing circular references OR actually serializing again an object that has been already serialized.

hchonov’s picture

Here is a test for deep serialization with circular references. The best explanation is to look at the test and at the four assertions to understand what is happening.

If it does not look sufficient then I'll add tracking of which entities are being currently serialized and prevent that a parent one is returned back again, but there is the question - is this additional complexity really worth it after looking at what currently happens with circular references?

tstoeckler’s picture

Issue tags: -Needs tests
  1. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -141,4 +141,67 @@ protected function doTestSerialization($deep_serialization) {
    +    // Case 1: new_entity->existing_entity->parent_new_entity
    

    This description doesn't seem to be true, since all entities are saved?

    That said would be interesting to have a test case where both entities are not yet saved. Still tentatively removing the "Needs tests" tag.

  2. +++ b/core/tests/Drupal/KernelTests/Core/Entity/ContentEntitySerializationTest.php
    @@ -141,4 +141,67 @@ protected function doTestSerialization($deep_serialization) {
    +    // Make sure all the entities are initialized in the structure, otherwise
    +    // they will not be contained in the serialization.
    +    $entity_level_zero->$field_name->entity->$field_name->entity;
    

    Ahh, so I guess if you do

    $entity_level_zero->$field_name->entity->$field_name->entity->$field_name->entity->$field_name->entity
    

    or something you would get a deeper level of serialization? If so, I think we are good because it would be broken code to manually instantiate nested entities to a nesting level where it could become problematic for e.g. memory reasons.

    I also think the test is quite clear.

hchonov’s picture

That said would be interesting to have a test case where both entities are not yet saved. Still tentatively removing the "Needs tests" tag.

I've just added such test case.

Ahh, so I guess if you do
$entity_level_zero->$field_name->entity->$field_name->entity->$field_name->entity->$field_name->entity
or something you would get a deeper level of serialization? If so, I think we are good because it would be broken code to manually instantiate nested entities to a nesting level where it could become problematic for e.g. memory reasons.

I've just extended the test to show what is happening if we initialize the structure further down - nothing, the serialization result remains the same :).

tstoeckler’s picture

Thanks, the test looks great to me!

@Wim Leers does that alleviate your concerns or do you also have concerns regarding deeply nested, non-cyclical entities? I personally don't see that as a concern, because both in forms (with Inline Entity Form / Paragraphs) and in rendering you end up loading all entities anyway so this will actually greatly improve performance once we start implementing it there. So this would be a hit, e.g. if we use deep serialization for forms (in a followup) and a very deeply nested entity is referenced per autocomplete. While this is not a far-fetched example (i.e. a "Related articles" field with "paragraphed" articles) I don't think the "normal" examples involve that many levels of references that it's going to be performance-relevant. But would love to hear your thoughts or if you have any specific examples in mind.

hchonov’s picture

Please remember that this full structure snapshot is needed not only for nested entity forms, but as well to solve #2824293: Inconsistent form build between initial form generation and the first ajax request.

hchonov’s picture

Re-roll.

Version: 8.4.x-dev » 8.5.x-dev

Drupal 8.4.0-alpha1 will be released the week of July 31, 2017, which means new developments and disruptive changes should now be targeted against the 8.5.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

hchonov’s picture

I've talked with @Wim Leers about this issue at the DrupalCon Vienna 2017 and he pointed me to an issue doing something similar - #2871591: Allow ComplexData in TypedData to specify computed properties that should be exposed in normalization and other contexts. While both issues need a way of exposing computed properties both issues are for different concepts and still need slightly different approaches.

We've talked about the concerns raised by @alexpott in #110 and especially about the danger of full state serialization:

Also tbh the full state serialisation feels like a recipe for some very complex bugs.

Therefore @Wim Leers suggested that instead of doing full state serialization we could narrow down the deep serialization only to entity references.

This means that we will not be introducing such a big change as the only thing that we are changing is to allow for serializing entities together instead of doing a full state serialization.

The new approach that I've implemented will by default not return computed properties for deep serialization. The only one exception in core will be the EntityReferenceFieldItemList which will return computed properties when a deep serialization is running. In order to keep the change as simple and small as possible I haven't implemented a solution similar to the one introduced by #2871591: Allow ComplexData in TypedData to specify computed properties that should be exposed in normalization and other contexts, but if anyone thinks that the current approach is still too general then I am fine with going further and providing a new method \Drupal\Core\TypedData\DataReferenceInterface::isExposedForDeepSerialization() which will return TRUE only in the implementation of \Drupal\Core\Entity\Plugin\DataType\EntityReference.

Wim Leers’s picture

Wim Leers’s picture

[…] but if anyone thinks that the current approach is still too general then […]

I'd go much further: I'd just hardcode the special-casing of \Drupal\Core\Entity\Plugin\DataType\EntityReference (and perhaps subclasses of that). IMHO there is no need for an API for this. If the need arises at some point in the future, we could add an API. But if this conceptually only makes sense for entity references, then we should only do it for entity references, we shouldn't add an entire API.

hchonov’s picture

Well that works fine for me. \Drupal\Core\Field\EntityReferenceFieldItemList::getSerializationValue() will now include during deep serialization only properties that are instance of Drupal\Core\Entity\Plugin\DataType\EntityReference.

amateescu’s picture

I remembered about this issue when @tstoeckler referenced it from #2923015-6: [PHP 7.2] Incompatible method declarations and I have a question.

The main use case for all this "deep serialization" stuff is that some code needs to get the entity property of an entity reference item when calling getValue() and when the the entity has been "changed" (i.e. it is not the same object that would be returned by loadUnchanged($target_id)), right?

If so, wouldn't it be easier to track the "changedness"/touched/dirty state of an entity in an internal property, with a setter and a getter for it? If we would have that, this patch would be just a one-line (code) change to EntityReferenceItem:

diff --git a/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
index f9922b9..2bf3cb6 100644
--- a/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
+++ b/core/lib/Drupal/Core/Field/Plugin/Field/FieldType/EntityReferenceItem.php
@@ -211,9 +211,9 @@ public function setValue($values, $notify = TRUE) {
   public function getValue() {
     $values = parent::getValue();
 
-    // If there is an unsaved entity, return it as part of the field item values
-    // to ensure idempotency of getValue() / setValue().
-    if ($this->hasNewEntity()) {
+    // If there is an unsaved entity or a changed entity, return it as part of
+    // the field item values to ensure idempotency of getValue() / setValue().
+    if ($this->hasNewEntity() || (isset($this->entity) && $this->entity->isChanged())) {
       $values['entity'] = $this->entity;
     }
     return $values;
tstoeckler’s picture

Yes, with something like #2862574: Add ability to track an entity object's dirty fields this would become much easier, as far as I can tell #131 is spot on (although maybe @hchonov has more info).

Historically, this predates the whole movement on #2862574: Add ability to track an entity object's dirty fields, that's why we went this route. Furthermore, even though that has seen some activity due to it touching the API-First initiative, I am still not convinced we will be able to solve that issue properly simply due to the complexity of the Entity API and all the different things you can do with entities and all the ways you can do things (i.e. set field values) in different ways. And as soon as we rely on the "dirty-fields" API for anything meaningful, bugs in it will inevitably lead to access bypass at best and data loss at worst. That's why I would be wary marking this (or anything, to be honest) postponed on that. That said, it would not be the first time that smart people (including all the smart people in this issue) have proven me wrong, where I thought we couldn't fix a complex problem properly.

hchonov’s picture

We need the state of the whole structure even if an entity hasn't been changed. The reason is that we always have to be able to rebuild a form in the state it has been retrieved.

There are two use cases that are causing the problem:

  • inline entity reference not based on a revision and
  • inline entity reference based on a revision with concurrent saving of changes in an existing entity revision without creating a new one

So the problem is what could happen with the inline referenced entities by concurrent editing. In such cases we are not allowed to rebuild the form with a different version of the entities.

Annd the conflict module allows for concurrent editing, but it would need that the form is still using the previous version of the entity :).

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.