Love this module, great job! Custom batch ops are finally easy. I do have a couple of questions/feature requests though.

Triggering a batch directly from custom code

Maybe this is already possible, but I can't find anything about it in the docs. The only way to do it in code seems to be from the update or deploy hooks, which are scenarios I rarely need. Cron triggers are great, but a way to manually fire it off in my own code would be nice. The run() command used in the hooks seems to require parameters that may not be available or necessary.

Breaking a large task into multiple batches

The core batch API supports breaking a single batch call into "multiple" batches. Take processing 1000 nodes as an example: a single triggering of the batch operation can break the task into multiple batch operations in succession, say 50 nodes at a time. That's more efficient than making each node a single batch operation. But as near as I can tell, that's what your module does. You load an array of IDs and then the batch processes them 1 at a time. I'm guessing each node is a distinct batch operation, which is fine for simplicity, but not very efficient. Is this the case? Is there a way to process them in groups? Is it already doing that? If so, is there a way to control the number of items in each group?

Comments

cdesautels created an issue. See original summary.

swirt’s picture

Hi cdesautels,
Thank you for taking the time to raise these very good questions.

Triggering from code

Yes I need to make a good example of how to do this. It would look something like this:

// Establish a sandbox to keep state across multiple runs.
$sandbox = [];
// Initiate the Finished state as not even started.
$sandbox['#finished'] = 0;
// If this is custom you likely want it to fail gracefully on errors.
$allow_skip = TRUE;
$script = \Drupal::classResolver('\Drupal\MY_MODULE_NAME\cbo_scripts\SCRIPT_NAME');
do {
  $script->run($sandbox, 'MY CUSTOM NAME', $allow_skip);
} while ($sandbox['#finished'] < 1);

However, this is not a true batch, as it will not invoke the Batch API and progress bar with repeated AJAX calls. This means it is subject to timeouts, since it is just one process running until it times out or finishes, whichever comes first. So I would shy away from this for anything that processes more than a few hundred items. However, even if it did time out, the next time you ran it, it would pick up where it left off, so it might work for your specific use case.

I have it on the roadmap (a feature request exists) to add an option for adding items to a queue, which would be better suited to processing large quantities from custom code.

swirt’s picture

Breaking a large task into multiple batches

There are a couple of ways around this to try to make it more performant.

A:
You don't have to pass node IDs in the list of items from gatherItemsToProcess(). You could actually perform a node load multiple and pass all of the loaded nodes as the array to process. However, if you had a large number of items to load, this would not be recommended, since you may not be able to load that many items into memory at one time.

B:

You could decide on a batch size and then do an array_slice on $sandbox['items_to_process'] inside of processOne(), looping through what you sliced. It requires a bit more active coding though, and could get a little sketchy in terms of making sure each item in your batch-size subset gets logged correctly.

C:
Much of the wiring is in place for this but not all of it yet https://git.drupalcode.org/project/codit_batch_operations/-/blob/1.0.x/s...
Wait for me to figure out the rest or help me figure out the rest (contribute). The main hangup is that even though I know the batch size, I cannot standardize the loading of the items, because each use case might be different: some are loading nodes, some terms, some ... So even if I know how many to load, I don't know how to load them in a way that would allow loading multiple. I likely need to extract the loading from the processing, but then things become harder to explain to people wanting to use it. So for now it defaults to loading and processing one item at a time. It may be a little slower, but it is more reliable and more explainable.
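
For option B, a rough sketch of the slicing idea in plain PHP might look like the following. The helper process_group() and the handler callback are hypothetical names, not part of the module, and the sketch assumes that items handled in a group have to be removed from the list by hand so they are not picked up again. How the module itself tracks and logs processed items may differ.

```php
<?php

/**
 * Handles up to $batch_size items from the front of the remaining list.
 *
 * Hypothetical helper, not part of codit_batch_operations.
 */
function process_group(array &$remaining, int $batch_size, callable $handler): array {
  $results = [];
  // Preserve keys so each result can be tied back to its original item.
  $group = array_slice($remaining, 0, $batch_size, TRUE);
  foreach ($group as $key => $item) {
    $results[$key] = $handler($item);
    // Assumption: handled items are removed by hand from the remaining list.
    unset($remaining[$key]);
  }
  return $results;
}

// Stand-in for the sandbox list; keys play the role of item keys.
$sandbox = ['items_to_process' => [10 => 'a', 11 => 'b', 12 => 'c', 13 => 'd', 14 => 'e']];
$results = process_group($sandbox['items_to_process'], 2, fn($item) => strtoupper($item));
```

Returning the results keyed by the original item keys is one way to keep per-item logging possible even though several items are handled in a single call.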

swirt’s picture

I am going to add a little code to basically wrap up the example code I gave you to make that easier to call with just 2 lines. I should have it for you tonight. :)

swirt’s picture

Title: Is there a way to trigger a batch dircectly from code? » Is there a way to trigger a batch directly from code?

  • swirt committed 01c58426 on 1.0.x
    Issue #3471007 by swirt, cdesautels: Add method to run BatchOperation...
swirt’s picture

Status: Active » Fixed
swirt’s picture

The ability to more cleanly call the BatchOperation from custom code has been added to release 1.0.4

Performing the following in your custom code should run it.

      $script = \Drupal::classResolver('\Drupal\MY_MODULE_NAME\cbo_scripts\SCRIPT_NAME');
      $script->runByCustomCode('CUSTOM EXECUTOR IDENTIFIER', $allow_skip = TRUE);
cdesautels’s picture

Thanks, you really jumped on that. I tested your change in 1.0.4 and it works fine. Just for clarity though, this suffers from the same problem your more verbose code does? That is, it's not a true batch?

As for the other question, I think another approach is to just return the data from gatherItemsToProcess() as a 2-dimensional array, and write the callback in processOne() to loop through each entity in each nested array, which processOne() sees as a single item. I suspect there'll still be issues with logging though.
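
To illustrate what I mean, here is a minimal sketch in plain PHP. The function names gather_groups() and process_group_item() are made up for the example, and the loader is a stand-in closure. In a real script the loader would presumably be an entity storage loadMultiple() call, and the two functions would be the bodies of gatherItemsToProcess() and processOne().

```php
<?php

/**
 * Splits a flat list of IDs into groups, so each group is one "item".
 *
 * Hypothetical stand-in for what gatherItemsToProcess() would return.
 */
function gather_groups(array $ids, int $group_size): array {
  return array_chunk($ids, $group_size);
}

/**
 * Loads and handles every entity in one group.
 *
 * Hypothetical stand-in for the body of processOne(). The returned message
 * lists every ID so the log still shows what was touched.
 */
function process_group_item(array $group, callable $load_multiple, callable $handler): string {
  $done = [];
  foreach ($load_multiple($group) as $id => $entity) {
    $handler($entity);
    $done[] = $id;
  }
  return 'Processed IDs: ' . implode(', ', $done);
}

// Fake loader for the example: returns placeholder "entities" keyed by ID.
$fake_loader = fn(array $ids) => array_combine($ids, array_map(fn($id) => "entity-$id", $ids));

$groups = gather_groups([1, 2, 3, 4, 5], 2);
$message = process_group_item($groups[0], $fake_loader, function ($entity) {});
```

The log would then have one entry per group rather than per entity, which is the logging issue I mentioned.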

swirt’s picture

Yes, this new addition suffers from the same problem as the verbose version... because it is the same thing. It is not a true batch, so it has the potential to be subject to a timeout if the batch or the process is too big/intensive.

For the second one... yah I think I am getting closer to a way to pull off sub-batches, so that loading multiple can make it a bit more performant. Thank you for getting me contemplating that some more.

swirt’s picture

Status: Fixed » Closed (fixed)

I am going to close this issue. If you have other feature requests or questions, feel free to open new issues.