Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Follow-up to #2658268: Cache persistence after derivative creation
Problem/Motivation
\Drupal\crop\Entity\Crop::findCrop()
is a often-used to find a crop type for the given URI and crop type.
It uses an entity query on "URI", "type" and sorts the results on "cid" field.
Proposed resolution
"cid" seems like unnecessary here. Also, there should be a query index on "URI" and "type" fields.
Comment | File | Size | Author |
---|---|---|---|
#18 | interdiff-2800739-18.txt | 415 bytes | woprrr |
#18 | improve_findcrop-2800739-18.patch | 2.39 KB | woprrr |
#15 | interdiff-2800739-15.txt | 1.18 KB | woprrr |
#15 | improve_findcrop-2800739-15.patch | 2.38 KB | woprrr |
#13 | improve_findcrop-2800739-13.patch | 2.38 KB | woprrr |
|
Comments
Comment #2
slashrsm CreditAttribution: slashrsm at MD Systems GmbH commentedSort by "cid" is there to have predictable results in an (unlikely) case of more crops being in DB for a given image style, uri combination. By using the one with the lowest cid we make sure we always return the oldest one.
I am not sure if that is strictly needed. Maybe we could handle this on save and prevent such duplicates from being inserted in the first place.
Definitely +1 for index.
Comment #3
BerdirBut don't we identify the crop based on the URI?
Ah, I guess there could be duplicates if you use multiple crop modules? But that seems... problematic anyway :)
also, that sorts by cid ascending, so it will always use the first.
anyway, we can keep that and just add that to the index too. See \Drupal\paragraphs\ParagraphStorageSchema for an example that adds a combined index and \paragraphs_update_8008() for the necessary update function.
This is probably a pretty good task for the drupalcon sprints...
Comment #4
matiasmirandaCurrently I'm working in a site with large amount of images cropped using focal_point module.
I'm experienced an overload in the RDS instance of MySQL then I could observe that the overload occurs due to a copy of data to memory, performed by MySQL.
This is an example of MySQL Show Process List output
Currently I have 15k registers in crop table, for that reason when the data is copied the memory is exhausted, then MySQL starts copying the temporary data to disk instead memory and the use of processor reachs 100%, and then the server crashes
In some cases MySQL creates internal temporary tables while processing statements, in our case due to the ORDER BY and the GROUP BY clauses.
The crop table has column "cid" as PRIMARY and Auto_incremet, for that reason we can remove ORDER BY, because in that case the result set is delivered ordered by PRIMARY KEY.
But that is not enough, because range(0,1) in findCrop still adding a GROUP BY clause to the statement, to avoid this is necessary the metaData
simple_query = TRUE
By this way, findCrop will retrieve the oldest cid for a certain URI and Crop Type, without copying data to tmp table in memory or disk.
I attach a patch to be reviewed.
Comment #5
dagmarComment #6
Berdirnot sure the simple_query thing is something that should be set from outside.
I started working on optimizing entity queries in #2875033: Optimize joins and table selection in SQL entity query implementation, would be interesting to see how that affects that query. the patch is definitely not yet ready though.
Comment #7
matiasmirandaInteresting.. maybe the simple_query not should be seted outside, anyway for this case it solves the problem, and will work fine until your optimization in entity query be ready (will be nice to have a retro-compatibility in order to avoid remove the metaData :)
Comment #8
matiasmirandaYou're right @Berdir, is not a good idea use metaData here. So, I rewrote the findCrop method to use a simple query using db api instead entity query.
Here goes a new patch to be reviewed
Comment #10
matiasmirandaOops, the "Reformat Code" of my IDE made a mistake.. I'm sorry for that
Here goes again..
Comment #11
BerdirYes, that's possibly the fastest option, I did something like that as well in redirect.module. However, then it either needs be a method on the storage or a separate service, tagged with backend_overridable, see the redirect.repository service.
So that if someone wants to replace the storage and use MongoDB or something like that, they can provide an alternative implementation of that service.
Comment #12
woprrr CreditAttribution: woprrr as a volunteer and at NeoLynk commentedHi all in that case we can purpose a new Crop Storage like "\Drupal\comment\CommentStorage" way to use that fastest option without loose other system compatibility. I try to purpose you a new one today !
Comment #13
woprrr CreditAttribution: woprrr as a volunteer and at NeoLynk commentedHere the patch with "CropStorage" implemented and use in findCrop() method and that permit overrides Crop handler to change it for MongoDB or something like that.
@Berdir ++ for backend_overridable I don't know that I see that. Which method are better in our case ? EntityStorage or backend_overridable service ?
Comment #14
BerdirYou could consider to already return null or the loaded entity here with $cid ? $this->load($cid) : NULL;
Would make the static findCrop() method a simple one-liner.
Comment #15
woprrr CreditAttribution: woprrr as a volunteer and at NeoLynk commentedYou totally right !! And I really love to make methods more simpler/cleaner as possible. Here you are the patch with refactoring applied.
Comment #16
BerdirI'd write "if nothing matches the search parameters" or so, but other than that, looks fine to me.
Comment #18
woprrr CreditAttribution: woprrr as a volunteer and at NeoLynk commentedHere the grammar fixe + follow-up issue to solve deprecated BaseFieldDefinition breaking the tests #2925305: Change all deprecated BaseFieldDefinition use
Comment #20
woprrr CreditAttribution: woprrr as a volunteer and at NeoLynk commentedComment #21
BerdirLooks good to me, lets get this in for both 8.x-1.x and 8.x-2.x.
Comment #24
woprrr CreditAttribution: woprrr as a volunteer and at NeoLynk commentedMerged in both branches Thanks all :) good job.