Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Updated: Comment #0
Problem/Motivation
Currently the most of cotent entities should implement id()
method when their primary key is not ID
Proposed resolution
Let ContentEntityBase::id()
to get id name from entity annotation entity_keys
Comment | File | Size | Author |
---|---|---|---|
#80 | 2182239-id-key-80-interdiff.txt | 1.98 KB | Berdir |
#80 | 2182239-id-key-80.patch | 17.01 KB | Berdir |
#78 | 2182239-id-key-78-interdiff.txt | 2.79 KB | Berdir |
#78 | 2182239-id-key-78.patch | 17.07 KB | Berdir |
#76 | 2182239-id-key-76.patch | 14.59 KB | Berdir |
Comments
Comment #1
andyceo CreditAttribution: andyceo commentedComment #2
BerdirThis was done for performance, to avoid the additional calls every time it's called.
However, that was before Entity Field API happened with the objects and everything, now that's actually the overhead.
What we should do is keep a computed id property around, something like $this->computedId and then only fetch the id once and after that, return it from that.
We basically do that already for the bundle and the language, the problem there is that they have very unfortunate names right now (bundle and language), which means that you can't have a content entity where the bundle key is named bundle.
Maybe have entityKeys or computedKeys or so for this, that we can easily check? Then we can use more or less the same logic for id, revision_id, bundle, uuid and langcode/language?
Comment #3
andypostMakes a lot of sense, will try to address suggestion
Media entity module does this now,
Comment #4
andypostA bit of optimization
Comment #5
andypostname should be null by default
Comment #9
andypostMessage entity actually have no ID
Comment #10
andypostNow config entities, some of them are using compound keys but annotation wrongly provides 'id' as key.
For example
RdfMapping.php
Comment #11
BerdirWell the easy solution is that those simply can't use a default implementation.
Entities must have a ID property/field. Those compound keys are backed by special code like this in ConfigStorageController::save():
Comment #13
andypostShould fix broken tests
I think we should statically cache
getEntityType()
result OTOH this will increase memory usageComment #15
BerdirThis is what I was talking about above.
This doesn't gain that much performance yet, but we could further optimize it to already preset the values in the constructor, so that we don't have the initialize the field objects as long as we don't change anything.
Comment #16
BerdirComment #18
BerdirOouch @ those unit test mock definitions... Seriously considering to limit this issue to content entities.
Comment #20
andypost18: 2182239-id-key-18.patch queued for re-testing.
Comment #22
BerdirCrazy, no idea how there weren't given my bogus onChange() implementation.
The createDuplicate() tests probably all use EntityTest, where getKey('id') == 'id'.
Comment #24
BerdirThe bundle must not be reset.
Comment #25
andypost+1 to RTBC, and profiling only hold this
Comment #26
andypostComment #27
dawehnerit would be great to document why we don't want to unset the bundle.
Is the entity key always a string?
Is there an API to fetch the main property value?
In general I wonder whether we really need to call the id() method in the constructor, is there no way around that?
Comment #28
Berdir1. Because bad things happen :) The bundle is very very important for content entities, as it defines what fields an entity has, if it's not set, then it can start to recurse as it needs to know the bundle to acess the bundle field ;)
2. The value? string or int.
3. That's the API :) There are only a few use cases, so not sure if it would save much to have a method there. There are also use cases to know just the name, not the value (validation), that's why it was added, actually.
I still think it might be better to ignore config entities here and just deal with content entities, as the implementation is quite different and config entities might also need further optimization?
Comment #29
Berdir24: 2182239-id-key-24.patch queued for re-testing.
Comment #31
BerdirOk, as suggested, going back to content entities, config entities are quite a different beast.
A bunch of changes:
- Introduced the pre-fill of the values in the constructor, this means that for entities loaded from the database, we never need to instantiate field objects
- Updated the getMainProperty() stuff for the new API.
- Avoid creating the EntityDataDefinition over and over again, this is somewhat unrelated but when profiling, I noticed that this is triggering *by far* the most calls to bundle()
Note that the removal of all the config stuff from the patch is not part of the interdiff, as that's just noise, the interdiff just shows the interesting parts.
Profiling is fairly hard as the method calls are moving to different places, but with the pre-filling and the persisted dataDefinition, this is definitely an improvement.
Comment #34
BerdirOk, we still need that loop for the isDefaultRevision flag, although we could argue to either hardcode that specific thing or require to set it directly?
Comment #35
fagoI'm not sure this is still required for content entities, there shouldn't be any defined properties being non-fields left?
People could pass in the value without making use of the whole field structure. So we'd have to instantiate the field to get the value from there, what sucks as it potentially instantiates uneeded fields. Maybe the cache could be filled on-demand?
Unnecessary new line.
Comment #36
fagoberdir clarified on IRC that the cache is already filled on demand, that's great. I guess 2 is fine as long as it keeps working with non-field-structured values as well, while the best performance of totally bypassing fields could be only achieved with the field structure.
As discussed on IRC we should benchmark to see what this actually brings.
Also, I must say the $entityCacheKeys property looks a bit ugly to me. Thinking about it, I'm wondering whether we shouldn't add a property as static cache per entity key as we already have $bundle ? Imo this would be less ugly, and with PHP >5.4 it should result in better memory usage also. (Declared properties use C structs then. I checked memory usage of defined properties vs 1 array property holding keys in the class definition issue and it was better with php>5.4.)
Comment #37
BerdirSee #2168431: Warn developers not to name entity bundle "bundle" or to unset it if they do. Right now, you can not really use bundle as the bundle field, weird things happen.
Also, I need to be able to separate between not having a certain entity key and not yet having it loaded. Still thinking about how to optimize that.
Comment #38
BerdirRe-roll, the namespace was fixed as I had a conflict there anyway.
I tried a few things to get meaningful numbers and finally ended up with this:
I'm mostly interested in avoiding to create field objects at all, so not doing repeated calls on the same object. And creating the object manually is the only way I can ensure that, load() will for example already call id() once.
This gives me 1-1.2ms with the patch, and 16-19ms on HEAD.
My initial script used load() and a loop on id() and the patch was still ~3 faster, even with the field object already created before.
As commented above, I don't think we should use properties directly for those key values, as we either have to use weird names or we make it impossible to use fields named like those keys, it is currently already impossible to have a bundle field, as written above. I want to fix that as part of this issue.
I'd also be happy to add unit test coverage when #2134857: PHPUnit test the entity base classes lands first, but I don't think we should try before that.
Also using the method for he label now.
Comment #40
BerdirFixed that test.
Comment #41
andypostThere's not so much
set()
(onChange
) actually suppose only migrate will be affected.But
getDataDefinition()
static cache on each entity could cause serious memory footprint, for example entity listings.Very helpful, but needs measurement for memory
this needs measurement, for cpu
Comment #42
BerdirRenamed entityKeysCache to entityKeys, with that rename, @fago confirmed in IRC that he is OK with this.
1. No it doesn't use more memory, adding a "print memory_get_peak_usage()" at the end of my example script above results in exactly 8784 bytes less peak memory usage ;) Not creating that object hundreds of times per request is saving much more than it is to keep one instance around.
2. As you said you said yourself, we should avoid going through the onChange() chain as much as possible anyway, this will be a bit slower, but we can live with that I think.
Comment #44
BerdirRe-roll of the re-roll after the entity base test patch was reverted.
Comment #46
BerdirFixed the unset, that apparently wasn't renamed...
Comment #47
jibranNeither it is a static method nor it is using caching so why are we using name getFromEntityKeyCache. Yes entityKey is using a static cache but imo it is simple getter function.
Comment #48
BerdirHm, the name is a bit tricky, yes. It is a cache, though, and #597236: Add entity caching to core follows a very similar pattern with getFromStaticCache() and getFromPersistentCache(), the former "just" accessing a property ($this->entities), exactly like this here. On the other side, it also returns it from the actual field if they cache isn't existent, so in this case, you could argue that it's an implementation detail.
Not sure what else to use, though. I'm not sure if just getEntityKey() or getKey() is clear enough that it returns the value for an entity key, unlike $entity_type->getKey(). It's a protected method, so it doesn't matter that much, it should only be used in rare cases outside of ContentEntityBase anyway.
Comment #49
jibranFair enough so I think it is RTBC.
Comment #51
BerdirRe-roll automerged by git rebase, so keeping at RTBC.
Comment #53
andypost51: 2182239-id-key-51.patch queued for re-testing.
Comment #54
jibranBack to RTBC.
Comment #56
jibranNeeds reroll.
Comment #57
BerdirTrivial conflicting in Term.php, patch diff only shows context changes, so back to RTBC.
Comment #58
fagoThanks for the rename, that's way better! Patch looks good to go to me as well.
True, but that means we should throw an exception then I guess? We've got the 'read only' flag in typed data which does not throw any exceptions or so right now (as we did not had onChange() initially), so what about throwing exceptions for changes on read-only properties in general?
-> should be its own issue.
Comment #59
BerdirOpened #2221955: Prevent changes for read-only fields (like bundle).
Not sure about change record?
Comment #60
alexpottThis should be NULLed in ContentEntityBase::__sleep()
Comment #61
BerdirSure, can do that, just want to clarify that we're talking about this:
O:49:"Drupal\Core\Entity\TypedData\EntityDataDefinition":1:{s:13:"*definition";a:1:{s:11:"constraints";a:2:{s:10:"EntityType";s:4:"node";s:6:"Bundle";a:1:{i:0;s:7:"article";}}}}
That's exactly 179 characters. a few more for longer entity types/bundles ;)
Comment #62
BerdirAdded that.
Comment #63
alexpottSorry :( I should have realised that we should NULL this in __sleep() too. I think serialised entities should be as small as possible since if their storage is not on the same machine as the running php process then this will minimise network traffic.
Comment #64
BerdirSure ;)
Comment #66
Berdir64: 2182239-id-key-64.patch queued for re-testing.
Comment #68
BerdirComment #70
BerdirAs discussed, we can't unset the entity keys because at least the bundle is required otherwise things get very sad.
Comment #71
Berdir70: 2182239-id-key-70.patch queued for re-testing.
Comment #73
BerdirRe-roll.
Comment #74
fagoyeah, no entity without a bundle ;) So let's keep it then - back to RTBC.
Comment #75
alexpott2182239-id-key-73.patch no longer applies.
Comment #76
BerdirRe-roll, was fairly trivial, just context change, so keeping it at RTBC.
Comment #78
BerdirThose unit tests... ;)
Comment #79
fagoRe-roll looks good. Looking at the patch again, I found the following though:
Should be $duplicate->EntityKeys!
Why not just call it getEntityKey() ? I guess that's a matter of preference though :)
Comment #80
BerdirYeah, see #48 about the name, the problem IMHO is the difference between $entity_type->getKey() and $this/$entity->getEntityKey() but ok, let's try that.
Comment #81
fagogetEntityKeyValue() else maybe? However, as this is protected I do not see that as a problem.
The update looks good, although I'm still wondering why it's necessary to set the field item class on the definition? Do you know? Maybe that would justify a comment for the next poor soul having to fix this test.
Comment #82
BerdirYou mean the unit test change? That is needed because that's how the mainPropertyName lookup works, the definition class calls the static method mainPropertyName() on the field item class. And mocking the field definition class would possibly not be trivial either and this allows us to use more actual code instead of mocks.
Comment #83
fagosure, I see. Yeah, keep using the regular field definition class makes sense to me as well.
Back to RTBC then.
Comment #84
BerdirBtw, I closed #2227711: Consider whether to make ContentEntity properties used by uuid(), bundle(), language(), and getRevisionId() not fields as a duplicate of this because it mostly avoids the problem that is described there.
Comment #85
alexpottCommitted 7ad0c57 and pushed to 8.x. Thanks!