Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Follow-up from #1778178: Convert comments to the new Entity Field API.
Let's use this issue to work on avoiding instantiating EntityNG field value objects by default as this should bring some memory-usage and speed improvements.
The idea
- lazy-instantiate field value object when accessed via get()
- store field item values in a $this->value array
- point the magic getter to read from $this->value[$name]
- fallback to the $this->get($name)->getValue() if there is no value - that way computed properties stay working
API changes
There are some changes in the TypedData API, the EntityNG API does not change.
- ContextAwareTypedData gets merged into TypedData as all our TypedData is context-aware now.
- onChange() is introduced for TypedData lists and complex data, setValue() and set() get a new optional parameter.
Summary of changes
Taken from #64:
The patch is quite big but there's a repeating pattern of changes in various classes:
- Field values, the lowest level in the EntityNG object are no longer instantiated into TypedData (e.g. Integer, String, .. classes) by default, only when explicitly requested as such, e.g. to validate them. This saves some function calls and memory.
- the onChange() method is introduced, which allows parent classes to react on changes in their children. This will for example allow to to recalculate the height/width of an image field if the referenced image changes. Or allows to keep entity reference id's and loaded entity objects in sync so that we can preload them without running into problem with stale references.
- To avoid calling onChange() on the parent if the call comes from the parent already, the $notify argument is introduced for setValue() and set() methods. This needs to be passed through and set to FALSE if calling your own child properties.
- Various setValue() methods are refactored/removed due to better default handling which makes them unecessary or simpler.
- The ContextAwareInterface is merged into the TypedDataInterface as all typed data is now context aware. That results in some additional methods in entity/typed data classes that will partially only be fully implemented in follow-up issues.
Comment | File | Size | Author |
---|---|---|---|
#62 | d8_instance.interdiff.txt | 5.52 KB | fago |
#62 | d8_instance.patch | 93.58 KB | fago |
#60 | d8_instance.patch | 93.27 KB | fago |
#60 | d8_instance.interdiff.txt | 874 bytes | fago |
#58 | d8_instance..patch | 93.42 KB | fago |
Comments
Comment #1
fagoComment #2
fagoComment #3
fagoComment #4
dawehnerJust to summarize, the goal is to not use a typed data object when accessing via $node->nid->value but use another object for that,
to save both performance and memory? If the "value" does not exist use the typed data as it might act different.
In order to support $node->nid->value the magic getter should return a stdClass or some other kind of object?
For ->get() you want to have the typed data all the time.
Comment #5
fagoRight now, the value is kept in a typed data "value" object, on which the magic getter calls __get() when you access $node->nid->value. The idea is to keep the data in the field-item object such that the magic getter can just return the data from an internal array instead of going via the typed data "value" object. But if there is no data, we fall back to instantiating this object such that a computed property can be calculated.
That way, we can save instantiating this object for regular read/write operations via the magic getter, while like using validation will still instantiate the object.
Comment #6
dawehnerSo something like that patch?
Comment #8
dawehnerThe current code produces the following error message:
, which is described on http://pkp.sfu.ca/bugzilla/show_bug.cgi?id=5028#c4 as well.
Maybe we could replace the values array by something with numeric IDs
Comment #9
fagoWe'll have to do this change in the FieldItemBase class, but this should build upon the improvements from #1869250: Various EntityNG and TypedData API improvements. Thus, marking this as postponed on this issue for now.
Comment #10
fagounpostponing since comment conversion went in.
Comment #11
moshe weitzman CreditAttribution: moshe weitzman commentedBump. Some speed/memory improvements are always welcome.
Comment #12
fagoTotally - I'll start working on that after feature-freeze.
Comment #13
fagoSo coming back to this from #1868004: Improve the TypedData API usage of EntityNG. We'll need a way to handle changes to an item's properties over there such that we can make sure the properties of an entity reference item stay in sync ($entity->ref->id && $entity->ref->entity). We'll need the same functionality here for being able to ensure any updates via the field-value objects are reflected in the plain $values object - so I think it's best to work on introducing that here.
Another use-cases for being able to handle typed-data property changes is the entity itself: By doing so we get informed when a field changes, what we could use to track changes and a) remove the current entity_load_unchanged() call before saving and b) track changes and only save what changed. Obviously, implementing this use-case would deserve it's own issue.
Comment #14
fagohm, for adding the possibility to react on changes we'll need to make every typed data object context aware. Right now, pretty much object is context aware already except of the simple, primitive value objects. For them to be able to notify the parent the obviously need context now also.
Given that it does not make much sense any more to differentiate between context-aware typed data and non-context aware typed data. Thus, let's do away with that and merge the ContextAware interface and base-classed into their typed data equivalences.
Comment #15
fagook, here a patch merging the ContextAware* stuff into TypedData*. Let's see whether tests are green.
Next steps:
1) add something like the following to the TypedDataInterface:
2) Revamp FieldItemBase to let __get() directly go with $values and make handle changes to its properties accordingly.
Comment #16
fagoand here the promised patch.
Comment #18
fagoThis should have the tests fixed. Next step see #15.
Comment #19
fagoComment #20
BerdirLooks like you forgot to remove those..
Comment #21
fagothanks, removed that.
Also, started implementing the actual change. I'm not done yet fixing all classes - running out of time now. But this should show where it goes. -> First reviews welcome!
Comment #22
andypost@fago please provide interdiff to make it easy follow a changes
Comment #23
fagoI'll for further updates, sometimes Git merges make that hard. But you can find the whole history at the entity sandbox in the branch "instance"
Comment #24
fagook, I completed the patch:
- realized we do not need to pass on anything more than $property_name, thus renamed handleChanges() to "onChange()" .
- magic getters and setters of field items can now work with the plain value, i.e. without a value object
- slightly improved the EntityWrapper such that it uses the new magic getter also. Cleaning it up more should be covered in #1868004: Improve the TypedData API usage of EntityNG, which covers splitting the typed entity-reference-object from the typed entity object.
I've done a short performance comparison on a page rendering 300comments, including fields (BC-mode) with xdebug disabled:
BEFORE
Page execution time was 2596.94 ms. Memory used at: devel_boot()=5.84 MB, devel_shutdown()=44.38 MB, PHP peak=56.25 MB.
AFTER
Page execution time was 2483.42 ms. Memory used at: devel_boot()=5.88 MB, devel_shutdown()=43.24 MB, PHP peak=54.5 MB.
I'd have expected more difference memory-wise, but at least it's a bit better. We should re-iterate on performance more once we have done away with the BC-mode, which adds lots of weight right now and hinders us from optimizing EntityNG.
But at least this settles the base for cleaning up references and handling changes in the entity (e.g. for tracking changes there and doing away with entity_load_unchanged()) thanks to the introduced onChange() method.
Comment #26
fagoops, looks like I got to update a few classes. Attached patch should fix that - as seen in the first interdiff. Also, I profiled the code - so I was able to do away with a few performance regressions (computed properties are cloned again now). So, this should be slightly faster than the last one.
Also, I noticed we've some repeated calls to language_load() now. Improving the reference to statically cache that we'll help us to improve that. That's something that #1868004: Improve the TypedData API usage of EntityNG would cover already.
Comment #28
fagoAnother change to the entityinterface got in, which resulted in conflicts in the BC-decorator. Fixed conflicts and re-rolled patch, no other changes.
Comment #30
moshe weitzman CreditAttribution: moshe weitzman commentedTest Failed :(
Comment #31
fagoYep, I've broken isEmpty() :/ - fixed that. Also fixed comments-new indicators which were not working due to a previously missing 'computed' flag. Let's see what the bot says now.
Comment #33
fagoThere was an issue with NULL value handling as those have been filtered out. Fixed that such that NULL values are not filtered out unless the whole property is NULL. So a map or field item is either unset (NULL) or set to an array which then keeps NULL values for contained properties - it's set to an empty data structure.
Comment #35
fagoComment #36
BerdirSome tests with my helper module, without xhprof enabled, rendering 300 comments.
8.x
Calling perftest_comment_view took 1362.301 ms and used 22.97 MB memory.
Calling perftest_comment_view took 1332.072 ms and used 22.97 MB memory.
Calling perftest_comment_view took 1361.004 ms and used 22.97 MB memory.
this:
Calling perftest_comment_view took 1315.48 ms and used 22.24 MB memory.
Calling perftest_comment_view took 1344.17 ms and used 22.24 MB memory.
Calling perftest_comment_view took 1329.661 ms and used 22.24 MB memory.
So it's a tiny bit faster and uses a tiny bit less memory but nothing remarkable. When looking at the profile output, most of the time (60%) is spent in theme(), 30% within template_preprocess_comment(), 10% is spent rendering the same user 300 times :) 30% is viewMultiple(), which means that 90% is view + theme. So we really need to get render caching working :)
But that's only partially related to this issue. Will review the code tomorrow. Quick question: I'm not sure I understand why all that refactoring is necessary, seems like the relevant parts where we actually create and return values/properties only makes up a small part of this patch?
Comment #37
fagoThanks for testing!
So here a summary of the changes:
- Field item values are now accessed directly via the protected $item->values array. Previously accessing those values was going via the value object, e.g. $item->properties['name']->getValue(). Property objects get only instantiated when they are directly used or needed, e.g. via get() or for computed properties.
- For making this possibly we need to make sure the $item object and $item->values is in sync with any possibly created property object. This is why the patch introduces the onChange() for complex data items and lists - so those get informed when some of their properties or list items change.
- For doing that we need the data-context (i.e. $this->parent) in the value objects, which previously we had only for so called ContextAwareTypedData. Previously, everything on an entity was context-aware except for the item value objects. With this change we need them to be context aware also, what makes every typed data object we use context ware. Consequently, keeping the differentation between non-context aware and context aware typed data is pointless and just creates unnecessary code complexity due to the checks whether something is context-aware and mental overhead. Thus, this patch merges the Context-Aware interface into the TypedDataInterface and respectively the base-classes.
Note, that the ability to know when something changed will enable more things:
- Clean up potentially stale reference properties. We already have $this->newEntity which might become stale. With this improvement we can improve reference to locally cache the loaded entity in $this->entity and speed up repetitive calls a bit. It's the same for loading the language object. Improving references is needed for #1868004: Improve the TypedData API usage of EntityNG, so I'll tackle that over there.
- This change makes an entity aware when some of its fields changes, so it enables us to track any changes to an entity, so we could do away with entity_load_unchanged() or optimize updates to only save what's really changed.
- If we decide that whole entity fields objects would be too much overhead for config entities, we could use the same approach to make $config_entity->property work for a primitive typed data property, while keeping the metadata in place and accessible via the usual get().
Comment #37.0
fagoUpdated issue summary.
Comment #38
BerdirTagging, will review asap.
Comment #39
BerdirGenerally, this looks very nice. Some questions below...
Hm, this means that we'll have entityType() and getType(), both will return the same thing after that other issue. We might want to kill entityType() afterwards. Going to be a big patch I guess but should be easy to do.
Not sure if I really understand this part.
What are we exactly doing here and in EntityReferenceItem? We do we exlude entity/language (note: comment says entity here).
This essentially also is a LanguageWrapper, right, just like EntityWrapper? Wondering if we should rename that, might be confusing to have two different Language classes that are quite different but still connected.
Not here, of course.
That's a pretty repetetive pattern :p
It really is. So a typed data implementation *must* call this, right? Do we need to document this somewhere or is it already?
Ah, here it is. Given that this is the place where people will (hopefully) go when implementing a typed data thingy, should we maybe even include example code here or something like that? Thought about making it helper function but that's really pointless, we'd have to pass notify anyway, so the only thing it could do is check if parent is set.
Not sure if the second sentence is is really necessary though. This is not something the client needs to check or decide, if the parent sets a value then he is responsible for setting $notify to FALSE. So that sentence should, if kept, address the parent and say, if you are calling this on your children, set to FALSE so that you are not notified again.
I understand this is just moved but I don't quite understand under which circumstances would be called with setContext(NULL, NULL), why bother to support that? Could something be moved to the root of a tree?
Comment #40
fagoThey don't necessarily return the same. With #1868004: Improve the TypedData API usage of EntityNG we'd implement types like "entity.node.article" - so getType() would return that while entityType() just returns the type of the entity. Thus, I think keeping both makes sense.
I'll improve the in-code docs here.
Agreed -> #1868004: Improve the TypedData API usage of EntityNG.
Well, that's the interface documentation which will be mostly read by API consumers I think. For implementers we've a base class anyway?
True. I'll rephrase it that way.
Good question. No it cannot be moved away from the tree but an object could be instantiated as root or as child item. However, we could skip calling setContext() in the former case but passing NULL would be the same and more explicit.
Comment #41
fagoOk, updated the patch. While I wanted to improve the code comment for the reference item's setValue() I realized it's much easier to follow if we improve the code, thus I've done so.
Patch covers the new ImageItem also now.
Comment #42
tstoeckler#41: d8_instance.patch queued for re-testing.
EDIT: The recent ViewUI changes might have broken this...
Comment #44
BerdirHere's a re-roll.
Tried to profile this a bit.
One thing that i noticed is that we're currently getting 1800 onChange() calls, which is removing some of the gain that we have with this. Tried to track it down a bit, 1200 calls or so to getTranslatedField(), some through __set() in field_attach_load(), most through rdf_comment_load(), that function makes up almost 20% of the whole wall time. Is it possible that it's just the first access that calls __get()?
Also don't quite understand why the __get() calls lead to an set/setValue() and therefore onChange()?
Not sure if the configurable entity reference change works.
Comment #46
fago>Is it possible that it's just the first access that calls __get()?
Yep, I think so as well. It's the first one accessing fields thus leading to field object instantiation.
>Also don't quite understand why the __get() calls lead to an set/setValue() and therefore onChange()?
Indeed they shouldn't as values should get set via getPropertyInstance() and thus skip notifications. AFAIR that worked though, but let's check again.
Comment #47
fagoI had a look at the test fail and figured it's a problem with the HAL deserialization. I do not know why this passed tests previously, but it tries to use translated values with a language neutral entity. Fixing HAL to create the entity with the usual language default, fixes the problem.
Comment #48
fagoCreated #1966436: Default *content* entity languages are not set for entities created with the API.
Comment #49
BerdirFound a few things to improve.
- Field.php was missing notify = FALSE
- Not sure about the uuid change. Doing the assignment like that avoids the onChange() but I'm not sure if this is actually faster/desired in such a case?
- The log property on Nodes was not defined but unset. This was the reason for some setValue() calls and while the onChange() was fixed by the first point, this call shouldn't have been made in the first place :)
- Language didn't respect the $notify when doing that language source call thing. Not sure that's correct.
100 requests on a page that renders 300 comments
8.x
Requests per second: 1.33 [#/sec] (mean)
Time per request: 750.249 [ms] (mean)
With patch
Requests per second: 1.39 [#/sec] (mean)
Time per request: 719.986 [ms] (mean)
Single requests vary quite a bit, but it is faster with the patch.
Comment #50
fagoGood catch!
It won't, it would still notify the $entity of the UUID being changed. That's correct though. On whether assigning $entity->uuid->value directly or $entity->uuid is better: I'd prefer the former as we'd read it from $entity->uuid->value also, but I do not have a strong opinion here.
hm, we have the same for entity references. Setting the whole field item makes it set the entity+id property, but the entity property does not respect the notify boolean right now. If notify is not set it still needs to update the source property, but it should be respected when we do so. We can fix it to do that by using setValue on the object, what obviously would lead to instantiating it. To avoid that during load phase I think we should ensure to set the ID during load phase only and do not set entity object - analogously for language.
Comment #51
Berdirthe UUID thing will go away in the default value issue, how will that work together with this, when will the default value be initialized. when it's accessed?
Not sure I follow the Language part yet, will have to re-read more slowly later :)
Comment #52
fagohm, re-iterating on the language/entity refernces.
>If notify is not set it still needs to update the source property, but it should be respected when we do so.
Generally yes, but if we are called from the parent it's not necessary. As we've documented that notify should be used when calling from the parent, I think that's reasonable. I guess this could need a comment though.
Comment #53
fagoDuring entity-create also. I don't think it changes much here though.
Comment #54
fagook, re-rolled the patch with an added comment and applied the same solution to entity references.
Comment #56
fagoI had a look at the test fail and discovered that we did default to setting the computed entity property previously - as this gives us maximum flexibility in setting the entity reference by entity or ID.
Thus, I re-added this. However for that to work out we need to update the entity ID property silently, which right now was not possible. I figured that we should add the $notify option to ComplexData::set() also so it's possible to issue a silent update to complex data's property only also. I think that's reasonable as it goes inline with setValue() and makes the previous solution cleaner as the assumption of #52 isn't necessary any more.
Attached an updated patch and interdiff.
Comment #58
fagoRe-rolled patch against latest 8.x.
Comment #60
fagoA recently added test as highlighted a bug in my fix from #47 - we need to check for language module to be enabled! Sry for missing that before :/
Comment #61
dawehnerIt would be great to get the big picture: when is get()/__get//getValue() to be used etc.
Maybe we could also (just) use "{@inheritdocs}"
We don't pass $notify. Does this make sense?
So, we don't call $value->getValue() on entities, because they don't have getValue() enabled properly?
So the logic on get() is: If values is set, it wins. Why then, the logic in set() is reverted: if $this->properties is set, it wins.
I guess we should just "use" the class?
This snippet is used in a lot of places, so wouldn't it be worth to have a helper method on TypedData (where parent is defined)? Additional this onChange() could maybe be moved to the TypedData class.
Great to see, how much this logic can be also simplified by this patch.
No notification here?
Shouldn't we also check that actual thing which is in the clone?
Let's use direct calls to the storage (this changed relative recently).
Comment #62
fagoThanks for the review!
The public API is not really changed by this patch - usually you go with the magic access while you use get() + getValue() when you want to work based on typed data interfaces only. What the patch changes though is that now it makes sense to use the magic access methods or set() also internally as that's faster than going via TypedData. For example, previously the magic get was doing $item->get('value')->getValue() so that was the fastest path, but now it's faster to use $item->value as this avoids instantiating the "value" object.
Makes sense, but the patch just adds a few more methods that are decorated, while there are quite some existing methods that use the same comment. Thus, I think it's better to change that in a separate issue and keep it consistent with others now.
I'm not sure, as this check is pursued quite some times during the setValue() operation on each object. So we'd have to make sure that's reasonable fast. However, berdir and I discussed another variant which would be faster and imho better: Have a totally different method such as setSilventValue() so we can skip the checks. Howsoever, this issue blocks entity validation and needs to move on asap, so I think we should figuring out further optimizations here to a follow-up.
Nope, as only ComplexData and Lists have onChange(), other TypedData not.
It does not really matter who "wins" as it's ensured that there is either a value or a property object. It's like that as having plain values is the 90% case so checking that first make sense, while for set() the logic is just a bit simpler that way (checking for a value to be there and update it first is quite unnecessary if it's the fallback (else) behaviour anyway). Howsoever I added some comments there to help with understanding the logic.
Nope, we skip that as in case of the EntityWrapper the value itself is an $entity which is now an instance of typed data also. But we want to keep this instance, so we do not call getValue() here. More improvements on entity references are on the way via #1868004: Improve the TypedData API usage of EntityNG, which right now is blocked by this issue.
Nope, the notification happened already by the method a few lines above. When we are just propagating changes down, we do not need or want to issue repeated notifications.
This happened already a few lines above. This loc is actually just re-using the $cloned object and not testing cloning at all. I fixed it to go with a $typed_data variable only what is less confusing.
Comment #63
fago#62: d8_instance.patch queued for re-testing.
Comment #64
BerdirOk, I think this is ready.
Why this patch is important/major: It's a required step towards entity validation which is for example necessary so that rest.module can validate incoming entities outside of the context of a form.
The patch is quite big but there's a repeating pattern of changes in various classes:
- Field values, the lowest level in the EntityNG object are no longer instantiated into TypedData (e.g. Integer, String, .. classes) by default, only when explicitly requested as such, e.g. to validate them. This saves some function calls and memory.
- the onChange() method is introduced, which allows parent classes to react on changes in their children. This will for example allow to to recalculate the height/width of an image field if the referenced image changes. Or allows to keep entity reference id's and loaded entity objects in sync so that we can preload them without running into problem with stale references.
- To avoid calling onChange() on the parent if the call comes from the parent already, the $notify argument is introduced for setValue() and set() methods. This needs to be passed through and set to TRUE if calling your own child properties.
- Various setValue() methods are refactored/removed due to better default handling which makes them unecessary or simpler.
- The ContextAwareInterface is merged into the TypedDataInterface as all typed data is now context aware. That results in some additional methods in entity/typed data classes that will partially only be fully implemented in follow-up issues.
This has been profiled, the last time in comment #49.
@fago: Did I miss something?
Comment #65
dawehnerThis would be an excellent part of the issue summary.
Comment #65.0
dawehnerUpdated issue summary.
Comment #65.1
fagoadded in new summary
Comment #66
fagoThat's a great summary - I added it to the issue summary and shortly described the API changes.
Comment #67
fago>This needs to be passed through and set to TRUE if calling your own child properties.
Minor: You need to pass FALSE then. ;-)
Comment #67.0
fagoadded API changes
Comment #68
catchMost of tihs looks sane and instantiating only when requested should help with performance a fair bit - there's usually plenty of entity properties on requests which are just ignored.
Two things:
This is a lot of @todos - how far off is that one from RTBC? What happens if it doesn't get done?
This comment reads quite confusing, then the code is just returning whatever's in $this->values - is it really necessary? It looks like it's referring to what happens now vs. before as opposed to just now a bit.
I heard you like magic so I put some magic in your magic so you could magic with your magic.
Already in here though.
Comment #69
fagoI think it really needs to get done to make the typed data usage of EntityNG totally sane. Well else, it just won't become sane :D I was planning to make entities implementing the TypedDataInterface in the other issue, but it was already required to implement the notifications here. Those notifications are again needed for the other issue, so we've a chicken-egg problem here. We could have rolled everything in one patch, but I think it's better to split things up to ease reviewing and discussions - it's already big enough.
Anyway, I don't think there is much left for the other issue (see my comment there) and I'd love to move on with it asap. As it needs to build upon this I've been postponing any further work on it on this issue (I really had enough re-roll pain :/).
Comment #70
fagoWell, there are really three possible cases here (non-defined -> value, defined + object, defined + value), whereas two cases get handled by this simple check. dawehner was wondering about the ordering vs. the setter in #61, so I've been adding more explanation here. I'm open to change that to whatever makes most sense to you? I guess I miss the out-side POV here.
Comment #71
fago#62: d8_instance.patch queued for re-testing.
Comment #72
catchOK that's all reasonable. We're trying to get stricter on patches with @todos, but this resolves a lot more than it adds. Committed/pushed to 8.x.
Comment #73
BerdirRemoving sprint tag.
Comment #74.0
(not verified) CreditAttribution: commentedBullets!
Comment #75
yched CreditAttribution: yched commentedFeedback welcome in #2368807: Remove special support for NULL values in FieldItemList :
This patch made FieldItemList::setValue() call $item->setValue($value, **FALSE**), while ItemList::setValue() just calls $item->setValue($value). Why ?