Problem/Motivation
When running the entity-usage:recreate drush command with the --use-queue --multi-pass flags, I discovered that memory is exhausted while running the initial query for an entity type.
Steps to reproduce
- Have many, many revisions of paragraphs.
- Run drush entity-usage:recreate --use-queue --multi-pass.
- Drop a breakpoint directly after the query executes in the generateQueueItems() method.
- Observe that the paragraph query exhausts memory before it can even be processed.
Proposed resolution
I was thinking that if we could pass a "chunk" value and use it in the query range, we could limit the query and prevent memory exhaustion on the first run. This might also require moving the query into its own function and sorting out the logic to run it repeatedly until it returns 0 results.
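A minimal sketch of that chunked loop, written in plain PHP so it runs outside Drupal. The function name queue_in_chunks() and both callbacks are hypothetical and are not part of entity_usage; the Drupal entity query shown in the comment is only an assumption about how the fetch callback might be filled in.

```php
<?php

declare(strict_types=1);

/**
 * Queues IDs in fixed-size chunks until a fetch returns no rows.
 *
 * Hypothetical helper, not the module's implementation.
 *
 * @param callable $fetchChunk
 *   Receives (int $offset, int $limit) and returns at most $limit IDs.
 *   In Drupal this might wrap an entity query, assuming a $storage object
 *   and a $revision_key to sort on, for example:
 *   $storage->getQuery()->accessCheck(FALSE)->allRevisions()
 *     ->sort($revision_key)->range($offset, $limit)->execute();
 * @param callable $enqueue
 *   Receives one ID and creates one queue item for it.
 * @param int $chunk
 *   Maximum number of IDs to load per query.
 *
 * @return int
 *   Total number of IDs queued.
 */
function queue_in_chunks(callable $fetchChunk, callable $enqueue, int $chunk = 100): int {
  if ($chunk < 1) {
    throw new InvalidArgumentException('Chunk size must be at least 1.');
  }
  $offset = 0;
  $total = 0;
  while (TRUE) {
    // Only one chunk of IDs is held in memory at a time.
    $ids = $fetchChunk($offset, $chunk);
    if (empty($ids)) {
      // An empty result means every row has been visited.
      break;
    }
    foreach ($ids as $id) {
      $enqueue($id);
      $total++;
    }
    $offset += $chunk;
  }
  return $total;
}

// Stand-in data: 250 fake revision IDs instead of a real database.
$all_ids = range(1, 250);
$queued = [];

$total = queue_in_chunks(
  fn (int $offset, int $limit): array => array_slice($all_ids, $offset, $limit),
  function ($id) use (&$queued): void {
    $queued[] = $id;
  },
  100
);

echo $total, PHP_EOL; // prints 250
```

Offset-based paging like this assumes a stable sort order. Rows created or deleted while the loop runs can still shift the offsets, so it limits memory use but does not by itself guarantee that every entity is covered.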
Remaining tasks
User interface changes
API changes
Data model changes
Issue fork entity_usage-3240349
Comments
Comment #2
weekbeforenext
Comment #3
marcoscano
The 2 initial ideas that come to mind are:
1- Instead of getting all revisions at once in the query, we could try to do that in separate loops, but still in the same PHP request/execution. For example, we could set an "arbitrary" loop size, maybe 100, 500, 1000?, and then create an additional for loop that will keep querying for revisions, getting only this number each time, and then creating only this number of queue items in every loop. I'm not really sure it would avoid the memory issue, but maybe it's worth trying.
2- Use a batch process to create queue items. This would be a more significant refactor, since it would mean changing the whole command to be executed in the background, in batches. Here's an example of a drush command that is executed as a background batch process.
I would be reluctant to just do it in several "passes" without having a way to ensure everything is covered, since it could lead to some entities being missed (for example if they are created more or less at the same time). Now that I think of it, I'm kind of reaching the conclusion that the if ($multi_pass) approach might not have been the best idea in the end... :)
Comment #5
weekbeforenext
With this MR, you can run a command like this to queue up tracked entities:
drush entity-usage:recreate --use-queue
When cron runs, the queued items will be processed and the entity_usage statistics will be recreated, or you can run the queue manually like this:
drush queue:run entity_usage_regenerate_queue --time-limit=120
This is helpful in cases where you have many, many entities that must be processed.
There is a default batch size of 100 at this time, but it can be overridden when you run the command with the batch-size option like this:
drush entity-usage:recreate --use-queue --batch-size=10000
Note: This MR has refactored the way queue logic works, so the --multi-pass option has been removed.
I have successfully tested this with 2,309,543 tracked revisionable entities. Testing of non-revisionable entities is needed.
Comment #6
weekbeforenext
Comment #8
marcoscano
Committed, thanks for helping!