Closed (fixed)
Project:
Drupal core
Version:
8.0.x-dev
Component:
configuration entity system
Priority:
Critical
Category:
Task
Assigned:
Unassigned
Reporter:
Created:
20 Feb 2015 at 13:26 UTC
Updated:
22 May 2015 at 14:04 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #1
alexpottHere's the idea sketched out in code.
Comment #3
alexpottComment #4
catchAlternative storage for config such as mongodb shouldn't need to do this (it can query the config object natively), so would be good to ensure we don't add overhead for those implementations in terms of undoing what core does.
Denormalizing like this feels preferable to some of the other SQL-storage tweaks that have been suggested (like XML etc.).
Comment #5
alexpottOkay moved everything inside the QueryFactory and Query code. ConfigEntityStorage is not involved at all.
Now trying to solve the Tour use case.
That look up could get expensive.
Comment #6
alexpottRecursion... with wildcards... my head hurts.
Comment #8
berdir/resurrect
This shows up in #2473983: [meta] Evaluate Entity Field API Scalability, which is partially a duplicate of #2423591: Optimize config entity import.
Comment #9
alexpottRerolled... the failed test seems to pass.
Comment #10
alexpottComment #11
alexpottBumping to major - the performance implications of a slow lookup on UUID are nicely illustrated by #2473983: [meta] Evaluate Entity Field API Scalability
Comment #12
alexpottRunning @Berdir's test from #2473983-18: [meta] Evaluate Entity Field API Scalability
In HEAD and the patch on that issue by the time we're at 300 bundles we're well over 10 secs. Maybe this should be critical?
Comment #13
dawehnerIts a bit odd that contrib can't easily expand the lookup keys, not sure whether its worth to worry about, given its already quite a good improvement.
Comment #14
catchYes since this affects any config entity you could easily run into it with combinations like views + aggregator + fields/bundles, and this is seconds of savings rather than milliseconds, so bumping to critical.
Comment #15
berdirNOT UNIQUE is a bit a special name for this, it sounds like it *must* not be unique (like assertNotUniqueText()).
Do we really need this additional complexity? We're not actually enforcing that it is UNIQUE..
Maybe we should just always allow multiple keys internally?
So this would speed up #2479459: BlockRepository::getVisibleBlocksPerRegion() does an uncached entity query on every request as well, but adding caching there still makes sense I think.
Comment #16
alexpottIn IRC @chx suggested looking at his plan for XML - because of MySQL can do optimised queries on XML - see #2162161: Change DatabaseStorage to use XML instead of serialize. I'm reluctant because of earlier issues with config and XML (but I could be convinced with tests) and also (and probably more importantly) that it's being put forward as a MySQL only optimisation.
Comment #17
andypostAlso it could seriously affects UI (machine name checks) with related #2091871: Add an ConfigEntityForm with an exists() method
Comment #18
alexpottre #15. It looks like removing the unique code is costly. Trying to work out why.
Without the unique/non unique code
With it...
Comment #19
andypostAnother way to manage uniqueness, so it needs to be swappable
Comment #20
alexpottFound out why... doing
Fixed it.
So just as fast with this patch. And in fact non unique look ups will be faster when they return no matches... for example the block lookup :)
Comment #21
alexpottre #19 but that's only if we go for database specific solution - the solution in this issue is database agnostic.
Comment #22
alexpottWe have a ConfigEntityTypeInterface - we need tests now.
Comment #23
berdirlookup vs denormalized keys is a bit confusing, why go for a completely different name here?
Does denormalized really make sense here, is that really what we're it means? The thing is that it's not actually typical denormalization that we're doing here.
What about LookupKeys(), for both the method and property? That's what it means.. having a separate lookup mechanism for those.
The description seems very specific, that we are using key value is kind of an implementation detail and not relevant here?
As disussed yesterday in IRC, again the question of injecting they key value factory or the collection. In this case, it's probably more likely that we will use it.
There's an issue to also support partial (STARTS_WITH, CONTAINS...) matches on the ID. That would be very useful to speed up thinks like field_entity_bundle_field_info(). I'm wondering if that would also be useful here, but I guess not so much.
That issue is #2247379: Optimize config entity query conditions on ID and it's probably going to conflict badly with this.
weird => rare? It's not *that* weird to e.g. have an entity query on entity_type + bundle for example, although for fields/displays, we mostly optimize it to load on the id.
We will need a way to initialize they lookup collection, should we make those methods public for that purpose?
:)
Comment #24
alexpottThanks @Berdir - re #23
(weird).Still need to add tests and hopefully that'll help me improve the documentation I added to address @Berdir's (7).
I also realised that we could remove the condition if
Query::loadRecords()handles it. This should result in less looping and improved performance as well.No performance regression in this patch compared to the most recent runs. I wouldn't expect the condition removal to speed up this query because we're checking for non-existence.
Comment #26
alexpottIgnore the patch in #24
Comment #27
alexpottAdded some tests for the QueryFactory::getKeys() and QueryFactory::getValues(). Although they are protected considering the recursive nature and the ability to use wildcards extremely worth testing.
The changes have no impact on performance (as expected).
Comment #29
berdirmissing space ;)
Note that the key value factory already maintains a "static" cache, so doing this here again shouldn't be necessary, the overhead should be negligible.
Having a method is still useful, I should probably add that to entity manager, to avoid repeating the collection string many times.
might be fixed already, missing "string".
typo in pocessing.
maybe restructure a bit to say that exception is thrown when such a key is passed in?
Comment #30
alexpottre #29 - fixed everything thanks!
And reordered params to be consistent and less confusing.
Comment #32
alexpottFixed fails... silly alexpott.
And added an integration test - which caught the fact I wasn't handling deletes correctly.
I think we have the necessary test coverage.
Comment #35
alexpottDid my own review of the patch and found some unnecessary changes.
Comment #36
berdirI think this is close.
A few nitpicks (sorry) was all I could find. There's still the upgrade path question.
The only thing I'm not sure about is whether this should somehow be more pluggable. But given that AFAIK the query service itself is pluggable, someone could just replace that if they want to provide an alternative implementation?
I think we document this as string[] now...
I guess there's still the question of what to do with this for the upgrade path. I left @amateescu a tell, so he can have a look and decide himself what makes sense.
Missing description.
not exactly sure what that means, but we should probably either get rid of the todo by deciding what to do or have an issue reference as follow-up?
no other change in this file, looks like a left-over.
string missing for @param and description for @return.
Comment #37
alexpottThe interdiff won't apply to #35 because the patch needed a reroll but it does represent the changes I've made to address #36. Thanks for the review @Berdir.
Will work on implementing 2 afrer talking with @Berdir in IRC sometime.
Comment #39
berdirShould use the full namespace I think.
type still missing (@param string $name) :)
Test run was terminated somehow, lets try again...
Comment #41
alexpottDiscussed with @catch, @effulgentsia, @webchick and @xjm. This meets the performance criteria for a critical. The performance improvement is measured in seconds once there are more than a few hundred configuration entities.
Comment #42
alexpottDiscussed the possible race conditions that this patch introduces with @catch, @berdir and @xjm.
If two configuration entities are being saved at the same time, for example blocks and they have the same theme key it's possible the only one with end up in the lookup. Currently on configuration entity save we have race conditions when saving the same entity - this does open possibilities of new race conditions but the solution of locking during a config save would be the same - opening #2485077: Add concurrent editing protection to configuration entities to discuss if we need to do something about this.
Comment #43
alexpottPatches addresses #39. Thanks @Berdir.
Comment #44
berdirI can't find anything to complain about anymore :) This is well tested and documented and provides huge performance improvements with a lot of configuration.
With the follow-up for the race conditions, I think this is RTBC.
Comment #45
catchDiscussed the potential for race conditions. Since this is (mostly) pre-existing and we have a documented follow-up whereas we didn't before, I think that's OK. The probability of sites falling over due to long config lookups is higher than the probability of high-concurrency config writes causing a race condition.
Couple of whitespace issues I fixed on commit, otherwise I also couldn't find anything to complain about, and I'd like to see this in sooner than later just in case something does get flushed out before the next beta. So, committed/pushed to 8.0.x, thanks!
Nit, missing space. Fixed on commit.
Double line, fixed on commit.
Committed/pushed to 8.0.x, thanks!