Problem/Motivation
Every time user_attach_accounts() is called, it prepares a new user entity for user 0. That means it creates a new content entity which involves to create default values for all fields, which in turn creates all field objects down into the property objects. That's very, very expensive.
It is called every time a group of nodes are viewed. So if you have multiple views/blocks/.. that display a list of nodes, then this is called many times.
Based on new relic reports, on cold caches, I'm seeing up to 50 calls to Drupal\Core\Entity\ContentEntityStorageBase::doCreate(), that takes 250ms.
The same applies to comment_prepare_author(), that is a bit more complicated though, as it creates anonymous accounts with specific values for name and homepage (which isn't actually a user field..)
Proposed resolution
We don't need default values for the anymous user. Creating such an object is not quite easy at the moment, a helper method User::getAnonymousUser() is added to simplify that.
user_attach_accounts() is actually not needed at all, instead, we can make sure that the author formatter uses the standardized entity reference formatter pre-loading by extending the author formatter from the entity reference formatter base class.
comment_prepare_author() is additionally merged into Comment::getOwner(), so that that already returns the correct object.
Remaining tasks
User interface changes
API changes
comment_prepare_author() and user_attach_accounts() are removed.
Beta phase evaluation
| Issue category | Bug because this is a performance regresssion |
|---|---|
| Issue priority | Major because we loose 5ms * count($nodes) |
| Prioritized changes | Performance |
| Disruption | Two functions are removed that are very likely not called by anyone. |
| Comment | File | Size | Author |
|---|---|---|---|
| #51 | cleanup-author-preloading-2458817-51-interdiff.txt | 1.1 KB | berdir |
| #51 | cleanup-author-preloading-2458817-51.patch | 7.95 KB | berdir |
| #32 | cleanup-author-preloading-2458817-32.patch | 7.95 KB | berdir |
Comments
Comment #1
berdirActually, we might not even need it anymore. It used to be called by comment view builder too, but that's gone since #1548204: Remove user signature and move it to contrib.
The description is completely bogus, as is calling setOwner(), but that just calls ->id() again. The only thing that matters is populating the static cache with a single loadMultiple().
Let's see if this works. It might fail if node uid is NULL, then ->entity won't find anything.
Comment #2
berdirUpdated issue summary with some numbers.
Comment #3
dawehnerDoes that mean a User::load needs 5ms?
Comment #4
berdirAh, comment_prepare_author() is the other related function, and that one is a bit more complicated, because we create an anonymous user with specific values for name and homepage (which is kind of funny, because authenticated users don't have a homepage ;))
Comment #5
wim leers3 instead of 2 spaces.
#1 doesn't yet remove the function.
I wonder why this is actually necessary at all? We're only doing it for Node entities. Why do other entity types don't need this?
Comment #6
berdirAnd uh, we actually killed the pre-loading for that, so we should add it back.
Comment #7
fabianx commentedShould we remove this dangerous helper then?
Comment #8
dawehnerSo user_attach_accounts() also also doing the preloading, why wan't we rewrite in a way that it doens't create the anonymous account unless needed?
Comment #9
fabianx commentedWhy don't we create the anonymous user account once and save it as before?
Yes you could delete uid 0 in previous versions, but we could just re-create it if its missing:
User::loadAnonymous() --> lazily creates the account and saves it in a locked state
Edit:
Or maybe not even saved to the database, but just created once per request, which takes 5 ms, which might be okay I think.
Comment #10
berdirWe have a record for uid 0 in the database anyway, but the anonymous user isn't as locked as you might think. See comment_prepare_author(), that's anonymous user with different name every time.
Comment #11
wim leers"Just 5 ms per request" -> *gasp*! Are you sure you're @dawehner and not some impersonator? :P
Comment #12
berdirSo, this is more or less what I had in mind. User::getAnonymous() isn't too pretty, but it should be fairly fast.
Note that I removed manual preloading in NodeViewBuilder completely and instead made the AuthorFormatter extend from EntityReferenceFormatterBase, which gives us free preloading. We can do the same for comment once that is using a formatter too.
Total wall time varies quite a bit, but on a node with an actual author and two anonymous comments, this saves ~10ms. See uprofiler screenshots.
Also not that this time is with the profiling overhead, and without that, it's smaller than my other numbers, because this is a plain D8 site, while the other one has a bunch of configurable fields on the user.
Comment #13
berdirForgot the screenshots.
Also note that I disabled the render cache, to simulate a cold cache when profiling.
Comment #15
dawehnerI love the idea to simplify stuff and put the logic onto the user itself. Its a special factory of a user object, so putting that onto the user object is a good place.
Comment #16
berdirTwig, that was mean. getAnonymous() conflicted with the magic method detection, and it used that method instead of isAnonymous(), so I renamed it to createAnonymous().
Comment #17
wim leers100% agreed with #15!
Comment #18
wim leersThis patch looks great to me, but this requires somebody with more Entity/Field/Typed Data API knowledge to RTBC.
Comment #19
tim.plunkettI think the only weird part is having to specify uid => 0 here. Why not put that in createAnonymous?
Comment #20
berdirCreated https://www.drupal.org/node/2459807
And yes, adding the uid 0 automatically is a good idea.
Comment #21
fabianx commented+1 to RTBC, but I feel yched would be good to review this.
Comment #22
fagoGreat to see clean-ups and performance improvements at the same time like this. Here a review:
If the referenced entity is already anonymous, why would we create another one?
Big +1 for having that method, but I don't like the name as it unfortunately collides with the create semantics of entities. Afair the factory patch would try to differentiate between create() and createInstance(), so maybe createAnonymousInstance(). Anyway, the previous getAnonymous() would be way better. Can we fix the twig problem somehow instead?
Shouldn't we instead document $values to map to fields only - what's the point in passing in something else? This would be faster also as it saves us from instantiating field objects if they are not used later on.
Comment #23
berdirThanks for the review.
1. Because of the additional values that we set. Alternatively, we could also clone it and set them I guess.
2. No, we can't change twig. We could only go with a different name to prevent a conflict, like getAnonymousUser().
3. Because that's exactly what comment is doing (homepage is not a field...). Yes, that stuff is supper weird and I'd honestly rather have two different templates for it, or something, but that seems a rather big change..
Comment #24
berdir@amateescu liked getAnonymousUser().
We discussed the performance impact a bit, but considering that it is very likely that those values will be accessed anyway (All three calls immediately run it through the username theme function), trying to optimize for that not happening doesn't seems not really worth adding quite a bit more code. Remember that we would also have to key those values by the langcode and so on, so that they match the internal structure.
Comment #25
amateescu commentedI think this looks very good now :)
Comment #26
andypost+1 to removal, the only question about enforceIsNew(FALSE) - it was here for some reason, maybe just not allowing to save the author entity when comment saved
Comment #27
yched commentedNitpick for clarity : s/$entity/$user ?
Makes sense, but then the viewElements() method should use $this->getEntitiestoView(), like other ER formatters ?
Not sure I get why "values might not map to fields", what's special about the anon user regarding values and fields ?
I'm not sure the generic User::getAnonymousUser($values) method should behave differently than User::__construct($values) regarding what happens with $values ?
If that's because Comment.php uses that to assign an adhoc, non-field 'homepage' runtime property, then arguably that belongs to Comment.php ? Like so :
Pushing it one step further : "make an anonymous User with tailor-made 'name' and 'homepage' properties that come from 3rd party data (here, data stored in the comment)" feels a bit weird, User shouldn't have to cater for special cases like this. As far as user.module is concerned, there is only *one* version of "the anon user", right ?
Then shouldn't it be :
- User::getAnonymousUser() : no param, the anon user is always the same, the method creates the object once and statically caches it for subsequent calls
- Comment::getOwner() does :
Comment #28
yched commentedBack to NR for now ?
Comment #29
berdirThanks, great feedback as always ;)
1. Sure ;)
2. Yeah, it should, not sure why I didn't see that.
3. Yeah, comment is weird, not user. user.module actually cares about it in other places too (username preprocess/template has special logic for hostname, despite that only existing for comments, but that's not a reason to add more special stuff). That means we'd completely remove $values then? Makes sense...
Comment #30
berdirComment #32
berdirComment #35
dawehnerLooks like a great solution!
Comment #36
berdirStarted updating the issue summary, needs a beta evaluation I think.
Comment #37
fabianx commentedRTBC + 1
Prioritized changes are performance for beta evaluation.
Comment #38
yched commented@Berdir: yay, thanks !
Comment #39
berdirComment #40
berdirComment #41
alexpottThis issue addresses a major bug and is allowed per https://www.drupal.org/core/beta-changes. Committed 70ea0f4 and pushed to 8.0.x. Thanks!
Comment #42
yched commentedThat comes a little late :-/, but I was thinking : maybe it's not too safe to trust the caller of getAnonymousUser() to clone the object in case it's going to modify it. If he fails to do that, he then pollutes the static object everyone else uses ?
Maybe getAnonymousUser() should always return a clone of its internal the static object ?
Comment #43
fabianx commented#42: Yes, indeed. We should definitely return a clone instead directly.
Follow-up?
Comment #45
alexpottLet's handle #42 in this issue. Reverted.
Comment #47
berdirNot sure. I think comment is a special case. we don't clone entities either when we load them, why should we here?
Comment #48
andypostEach anonymous comment author could have different name
PS: profiling of the function (node with 273 comments) - with reverted patch 14.74%

Comment #49
berdir@andypost: I'm not sure what you are saying.
We already clone the anonymous user object. The only question is if getAnonymousUser() should always clone or if the caller has to decide if he needs to clone or not.
About those numbers, what exactly is 9% and 14%? The function no longer exists with the patch, so it can't be before/after?
Comment #50
andypost@Berdir I wanna show that patch shaves like 7% on rendering of ~300 comments
And +1 to #42
Comment #51
berdirMoved the clone.
Not fully convinced that this makes sense. On one side, it does, as two anonymous users aren't the same user on the other side, it kind of moves back knowledge about "weird anonymous users with additional values" back into user.module.
I'm fine with either patch.
Comment #52
fabianx commentedRTBC, It is safer to return a clone for that common shared object.
Comment #53
alexpottI agree with @Fabianx #51 is safer. Thanks @Berdir.
This issue addresses a major bug and is allowed per https://www.drupal.org/core/beta-changes. Committed 3ab0ef9 and pushed to 8.0.x. Thanks!