Right now, when you try to do a "select all on all pages", VBO loads the view again, strips all fields except the base one, kills the pager, and loads all ids.
Which is fine, up to a point. I have a test instance with 75 thousand commerce orders, and it just dies in that case (memory limit 128MB). Even when it doesn't die, it is of course slow.

The solution is to batch the selecting of ids.
#1176794: Improve Batch Performance fixes our batch api code to use a queue in the background, with one queue item per a group of ids.
So this issue should make VBO do a batch, and on each batch invocation load all ids on one page (one page per invocation), and then add a:
1) queue item with the serialized ids if batch api is the exection method used
or
2) add a queue item per fetched id, if the "enqueue option" is used.

Of course, this mechanism is used only if we have more than one page to select from.
I've done some initial proof-of-concept code on this, and it looks to me that the added complexity will require a refactoring of execution types (batch, queue, direct) so that the code is not ugly and can actually be followed.
My idea is to move the current (and future) functions that deal with execution to classes with static methods, so we'd have ViewsBulkOperationsExecutionMethodDirect, ViewsBulkOperationsExecutionMethodBatch and ViewsBulkOperationsExecutionMethodQueue, each in its own file, autoloaded when needed, leading to less memory usage for VBO.
The methods need to be static because some of them are used as callbacks (batch api uses call_user_func which supports static methods).

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

bojanz’s picture

Title: Batch the selecting of all ids on all pages, refractor execution types. » Batch the selecting of all ids on all pages, refactor execution types.

Fixed typo in title.

rosinegrean’s picture

Has there been made any progress on this issue?

Thanks,

bojanz’s picture

No. I won't have time for this in the near future (next few months).

skipyT’s picture

Status: Active » Needs review
FileSize
40.37 KB

Hi,

I created a first patch file. I tried to fix the issue.

This is the status of the patch:
- we have a interface for the execution classes called ViewsBulkOperationsExecutionInterface.
- a base abstract class ViewsBulkOperationsExecutionBase.
- and 3 classes for the different execution types: ViewsBulkOperationsExecutionDirect, ViewsBulkOperationsExecutionQueue and ViewsBulkOperationsExecutionBatch.
- I tested the refractoring with search API views, I know that the batch execution type needs some work to be compatible with the database table based views also.
- I kept the old way of the selection adjustment on the direct and queue classes but I'm planning to rewrite that also.

Could you check the current patch and tell me if it is good what I did until now?

skipyT’s picture

I updated a new patch file which fixes some issues:
- autoloading is fixed now (I added the class files to the info file)
- batch is working now for all of the view types not only with search api.

I still need to refractor the direct and queue classes a bit.

bojanz’s picture

Title: Batch the selecting of all ids on all pages, refactor execution types. » Batch the selecting of all ids on all pages, clean up execution types.
Status: Needs review » Needs work
FileSize
17.12 KB

Discussed the patch and the approach with skipyT.

We agreed we want to try a new flow.

Changes to execution types:
1) The direct mode is now only invoked for aggregate operations.
(Previously it was also invoked if the number of selected items is less than $entity_load_capacity).
This allows us to remove it completely in D8 when we remove support for aggregate operations
2) The batch mode now uses a queue in the background, with each queue item containing a group of entity ids to load and process.
This was first discussed back in #1176794: Improve Batch Performance.
3) The queue mode now also has groups of entity ids in each queue item, instead of having one queue item per entity id.
This allows us to unify the "queue item" processing code, reducing code duplication in the process. Should also lead to better performance.

If all items on all pages have been selected, a batch runs that executes the view page by page, and then gathers and enqueues all rows.
This fixes #1784156: Only first page of view results 'rows' passed to action on VBO execution.
If only items on the current page have been selected, then the batch doesn't run and the queue is filled using the values from the current view.

If we're in "queue mode" we stop here (the rest is done on cron). If we're in "batch mode", we run a second batch job that processes the newly filled queue.

Attaching a preliminary patch. It is not complete and will not work.
The code in this patch is still strictly procedural. An OOP conversion like the one in #5 can happen once we know how much code we have left ;)
Will be doing more work in the morning.

skipyT’s picture

Status: Needs work » Needs review
FileSize
18.04 KB

I took your patch file and I worked today on it to have something which is working.

I tested locally on a project for the modify entity values and some csv export action and it seems it is working well.

Could you take a look if is it ok?

podarok’s picture

+++ b/views_bulk_operations.moduleundefined
@@ -62,12 +62,30 @@ function views_bulk_operations_load_action_includes() {
+  db_delete('queue')
+    ->condition('name', db_like('views_bulk_operations_active_queue_'), 'LIKE')
+    ->condition('created', REQUEST_TIME - 864000, '<')
+    ->execute();
+}

good to see here something like error handling

bojanz’s picture

FileSize
36.62 KB

Here's my tonight's progress, based on #7.
Added a bunch of comments, removed a few unneeded functions.
I think it's very close to being committable.

Basic testing (configurable action, selecting across pages) worked.
Enqueuing (so that the item is processed on cron) still seems a bit buggy.
Didn't test the drush integration yet.

@skipyT
Please provide feedback (on how the code looks to you, where further improvements could be made, etc).

bojanz’s picture

FileSize
34.03 KB

Committed #1826758: Remove _views_bulk_operations_operation_do(). and #1826790: Remove the display_result setting. which were originally in this patch.

Here's a reroll.

skipyT’s picture

@bojanz: I will take a look this afternoon on the new patch file.

Thanks!

skipyT’s picture

FileSize
34.07 KB

I've checked your patch. I tested for the enqueuing and I think I fix the issue there.

The code seems ok for me, I don't see the reason to use classes with only static methods now, if the functions are much smaller. I'm voting for merging this patch as we have it here.

skipyT’s picture

Status: Needs review » Reviewed & tested by the community

we applied this patch on our project and seems that it is working. I changed the status of the issue.

bojanz’s picture

Status: Reviewed & tested by the community » Fixed
FileSize
4.52 KB

I've worked on this some more, and added a views_bulk_operations_direct_adjust() function that keeps the old logic for batchless "select all items on all pages" for the direct method (aggregate operations). Without that, "select all items on all pages" was broken for aggregate operations, which would have been a regression.

Also found several problems in my testing:
1) Enqueued items (postponed processing) weren't getting processed because the access function wasn't getting the correct account passed.
2) Drush integration was completely broken.
3) Active queue items (processed by the batch) weren't removed from the database.
4) In some cases the "success" message wasn't set after processing all active queue items.

Fixed that, tweaked the messages shown to the user, committed:
http://drupalcode.org/project/views_bulk_operations.git/commitdiff/6db04...
Attaching an interdiff for #1-4, for reference.

Thanks skipyT, this wouldn't have happened without you.
I've given you full commit credit.

skipyT’s picture

Thanks! But it's your merit also :)

And congrats for VBO merged in Drupal 8 core :)

hernani’s picture

After this one, planning a VBO 3.1 release this week Bojanz?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Anonymous’s picture

Issue summary: View changes

Removed the title from the body.