Attached patch fixes scaling issues in batch API.

The batch definition is stored as a serialized array in the {batch} table. During batch process, the whole $batch array is fetched from db, unserialized, processed, re-serialized, stored in db - on each iteration. When that array is too big, this slows down the processing, and more annoyingly causes 'max_allowed_packet' failures on shared hosts, where changing that setting is sometimes not possible. See #434032: Batch update fails with access denied message.

1st issue is that the stored batch $array contains the full array of operations. That's a design flaw, doesn't scale well for large number of operations. The correct design is to store operations separately, and only fetch them one at a time as we process them. That's what the Queue interface is about, so let's use it.

2nd issue is that batches defined within a form submit workflow (most common use case) store the whole $form and $form_state array. Problematic for 'big' forms like the modules page (which triggers a batches .po import) or simpletest list. In D7, $form_state['complete form'] contains the $form, so no need to store both. Additionally, only a few edge cases actually need the full $form_state, in most cases we only need a small subset.

Summery of what the patch does:

  • Moves batch 'operations' to a Queue. The {batch} table still stores the rest of the batch data (custom UI messages, results, sandbox for multipass data...)
  • Defines its own BatchQueue handler class, since the Queue API and SystemQueue handler officially don't ensure ordering, which batch API requires (think update.php without order...). This class extends SystemQueue and thus reuses the {queue} table. Batch API queues names use a 'drupal_batch_' prefix to avoid clashing with other queues.
  • Also defines a memory based BatchStaticQueue handler class, used for non-progressive batches, that are executed in one pass and skip db storage completely.
    Non-progressive batches are used for non-interactive Drupal installation, or when a batch-enabled form is programmatically submitted through form_submit_form() (drupal_execute() in D6).
  • Both progressive and non-progressive batches need a unique batch id (non-progressive batch previously didn't). Batch ids are now generated by db_next_id() (the 'sequences API'), {batch}.bid is no longer a 'serial' column. system_update_7046() does the change.
  • Since update.php now relies on the queue system, the creation of the {queue} table is moved from system_update_7022() to update_fix_d7_requirements().
  • When preparing the processing of a batch set in a form_submit handler, only store the full $form_state when actually needed (rare cases).
  • Comes with an extensive series of tests for the batch processing workflow, derived from the home-made, hand-run test module I used in the D6 cycle. Those tests revealed a couple minor bugs in later additions to the batch API (CSS styles, timer for elapsed time...), which the patch also fixes.
  • Moves the existing batch-related tests in system.test and form.test into batch.test and the newly added batch_test.module.
  • Enhances a few code comments along the critical path. Also fixes a couple 80 chars wrapping errors.
  • no API change whatsoever ;-)

Note that a fair bit of this patch is about tests. The actual code changes are like 25-30k.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

yched’s picture

Here's a patch without tests for easier review.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
66.32 KB

Hm, does this one apply better ?

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Small logic error for batches with no actual ops. Will reroll later today.

Crell’s picture

+1 on the concept.

If BatchQueue is extending SystemQueue, and SystemQueue already implements DrupalQueueInterface, there's no need to re-declare that you're implementing that interface. (It doesn't hurt anything, but it is not necessary.)

BatchStaticQueue looks sufficiently generic that it shouldn't be bound to just the batch system, even by name. Heck, for PHP 5.3 it would make sense to have it use SPL's queue class internally. :-)

yched’s picture

Status: Needs review » Needs work

If BatchQueue is extending SystemQueue, and SystemQueue already implements DrupalQueueInterface, there's no need to re-declare that you're implementing that interface. (It doesn't hurt anything, but it is not necessary.)

Yes, I wasn't sure, so I looked in includes/database ;-)

class DatabaseStatementBase extends PDOStatement implements DatabaseStatementInterface {

In fact a search on implements .* extends bring several matches in includes/database, includes/filetransfer, system/mail.sending.inc, system/system.updater.inc... So I assumed it was at least correct.
[edit: I was a little fast. Of these matches, only a few match the same case of ClassA extends ClassB implements Interface while classB already implements Interface. And none of those in includes/database, my bad :-p]

BatchStaticQueue looks sufficiently generic that it shouldn't be bound to just the batch system, even by name.

Possibly. The classes used by the batch engine need an additional getAllItems() method, that isn't part of the DrupalQueueInterface, so if we take the "static queue" class one level up, we'd still need a batch-specific subclass. Unless there's a clear consensus on that, I'd rather discuss this in a followup and not derail this patch.

yched’s picture

Status: Needs work » Needs review
FileSize
65.06 KB

Attached patch should fix the test failures.

Crell’s picture

Status: Needs work » Needs review

I can't speak for the rest of the instances, but PDOStatement is a PHP internal class while DatabaseStatementInterface is a Drupal interface. So we do have to declare that we implement that interface.

yched’s picture

Crell: yes, I realized that afterwards, and edited the comment - sorry 'bout that :-/
Well, it's just that BatchStaticQueue implements the queue interface but doesn't extend anything, so I kept the 'Implements' on both classes for symmetry in the file.

Crell’s picture

BatchStaticQueue can really be just StaticQueue, I suspect. Or MemoryQueue. Or something like that, which implies it can be used for things other than just the batch system.

yched’s picture

OK, I'll reroll with that. As I said in #7, we'll still need a BatchMemoryQueue to implement getAllItems() method, which batch API needs but is not part of the Queue interface.

"which implies it can be used for things other than just the batch system."
Well, this generic MemoryQueue handler will be of little use with the current DrupalQueue::get() factory method, since queue handlers are pluggable but variable-based (one queue handler for all queues on the site), and the MemoryQueue can hardly be a good candidate for handling all the site's queues.
The Batch API uses its own _batch_queue() "factory" function to circumvent that, and decide which handler needs to be used.

yched’s picture

Rerolled with a generic MemoryQueue handler. I'd rather not have this derail the issue, though, so if anyone objects to it, I would go back at patch #7.

new patch + updated 'no_test' patch without the test parts, for easier code review.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
28.45 KB
66.16 KB

Unlike the generic MemoryQueue, the BatchMemoryQueue handler needs to allow items to be repeatedly claimed (multipass ops).

dman’s picture

Thanks for getting stuck into this. I hit the max_allowed_packet thing on the batch table doing some some huge imports (taxonomy_xml) and was shocked when I found that the queued actions I'd worked so hard to make small and atomic were being smudged together in a big serialized lump anyway!
I ended up hacking together a meta-queue thing just to make sure that I didn't put too many things into batch at once (!) By the time I'd done that, I'd almost removed any benefit of using batch in the first case (apart from the nifty progress bar!)

Using a real queue API sounds great! Good one yched!

brianV’s picture

+1 for this issue. Over the last few weeks, I did 40-50 hour monster batch jobs importing data into several sites. Any performance upgrades are much appreciated!

yched’s picture

Bump...

This gets +1's, and I do think it's a needed change, but I fear it might be difficult to get actual reviews for a patch that touches the guts of Batch API. Unlike the current, it does have tests to prove it works, though ;-)

So, how do we get this in ?

sun’s picture

I spent some time reviewing this patch. It looks very nice and makes perfectly sense.

+++ includes/form.inc	13 Nov 2009 12:18:56 -0000
@@ -3073,11 +3097,28 @@ function batch_process($redirect = NULL,
+    // Assign an arbitrary id: don't rely on a serial column in the 'batch'
+    // table, since non-progressive batches skip database storage completely.
+    $batch['id'] = db_next_id();

Don't we also have a sequences API now? Just mentioning, not sure whether it makes sense here.

+++ modules/system/system.install	13 Nov 2009 11:32:13 -0000
@@ -2358,53 +2360,9 @@ function system_update_7021() {
- * Add the queue tables.
+ * Empty update.

Please revert, but add a note that it has been moved to d7_requirements

After these, this should be ready to fly.

yched’s picture

Thanks for looking at this, sun.

"Don't we also have a sequences API now?" : yup, that API is db_next_id() ;-) - http://drupal.org/cvs?commit=276280

Will reroll for the update PHPdoc.

yched’s picture

Rerolled for system_update_7021() PHPdoc.
Other than that, patch is the same as in #15, so I didn't roll a new 'no_test' patch.

sun’s picture

Status: Needs review » Reviewed & tested by the community

Thank you! Awesome work!

yched’s picture

Status: Reviewed & tested by the community » Needs review

yched requested that failed test be re-tested.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review

HEAD is broken - I got those two failures on a local HEAD, and so does http://drupal.org/node/634440#comment-2277426

yched’s picture

HEAD fixed. Meanwhile, rerolled for a couple missing t()s around assert messages.

Status: Needs review » Needs work

The last submitted patch failed testing.

Status: Needs work » Needs review

yched requested that failed test be re-tested.

yched’s picture

Status: Needs review » Reviewed & tested by the community

OK, back to RTBC as per #22 then.

yched’s picture

Status: Reviewed & tested by the community » Needs work
yched’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
66.12 KB

Just needed to update the test support module to account for the new $form_state['storage'] / ['rebuild'] behavior for multistep.

Patch summary is in the OP, and #15 has a 'no_test' slimmed down version of the patch with only actual code changes.

yched’s picture

Bump. As I wrote in other threads, the tests in here would be precious to ensure that the ongoing '$form_state' overhaul work doesn't break batch API.

Status: Reviewed & tested by the community » Needs review

yched requested that failed test be re-tested.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
66.1 KB

Bumped system_update_N() number.

Status: Needs review » Needs work

The last submitted patch failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
66.1 KB

No luck, two patches adding a system update committed in a row. Try again.

yched’s picture

Status: Needs review » Reviewed & tested by the community

Green, back to RTBC.

yched’s picture

Reroll.

Status: Reviewed & tested by the community » Needs work

The last submitted patch failed testing.

yched’s picture

Assigned: Unassigned » yched

Easier tracking.

yched’s picture

Grumblegrumble... 3rd system_update_N() bump in two days :-p (yes, I can grumble with my tongue out)

yched’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
66.1 KB

Reroll.

yched’s picture

Needs to be updated after #659710: Queue API is missing a releaseItem method (added a new method to the Queue interface, which the Queue handlers in this patch fail to implement).

I won't be able to do it this week, and I'd also like a word from a core committer that this will eventually get in before doing so, since the patch has been ready for a month and just keeps getting stale.

As a reminder, patch description is in the OP - internal refactor only, no API change.
#15 has a 'no_test' patch for easier review (tests are more than 60% of the patch)

moshe weitzman’s picture

This scaling problem just bit us on a big batch API powered migration (a D6 site). Would love to see this committed.

I'll ask Dries to give a statement on this patch.

fgm’s picture

This question raises a bigger issue. From a functional standpoint, a method like releaseItem() should not need to be implemented by all implementors of the interface, but PHP does not implement optional methods in interfaces. And SystemQueue has no base class, but just derives from stdClass and implements DrupalQueueInterface.

If our default queue was based on a default class providing stub/default methods, implementations wouldn't need to implement each and every optional method just to complete an Interface contract. However, from #391340: Job queue API it seems that the whole Queue API was built from its initial incarnation on top of a queue Interface instead of a base class.

The problem with using a base class, of course, is that any class in PHP can only inherit from one parent, but can implement multiple interfaces, so using a base class loses generality by preventing concrete implementations from inheriting from an other class. But, given the nature of this entity, it seems it would be more natural to have concrete queues extend a base class than implement an interface.

AFAICS, there are two possibilities to avoid future problems like this one (and fix this one too)

  • define the optional features as separate Interfaces, probably with those defined as extending DrupalQueueInterface, like: interface ReleasableQueueInterface extends DrupalQueueInterface { public function release(); }. Queue implementations could then implement any such derived interfaces, and their users would still have the base interface available in any implementation
  • keep a rich interface definition, but define a base class implementing default methods for each method in the interface, thus offering interface implementations the ability to extend that base class and avoid having to create unimplemented stub methods. But then that would remove pretty much all interest of the interface itself.
Crell’s picture

Having both an interface definition and a common base class is not at all a bad thing. In fact it's a generally good approach, as it gives developers the maximum flexibility. The DB layer does that a lot for things like the select query builder.

carlos8f’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
51.28 KB

Reroll, with implementation of releaseItem(). Re-running tests.

Status: Needs review » Needs work

The last submitted patch, batch_fix_scaling-629794-49.patch, failed testing.

yched’s picture

Status: Needs work » Needs review
FileSize
66.35 KB

Yay, thks carlos8f.

Patch #49 was missing the batch.test hunks, though. Let's see if this passes.

carlos8f’s picture

Thanks, I noticed that, too. For some reason my patching program left that out. Tests passed locally, though.

yched’s picture

Status: Needs review » Reviewed & tested by the community

Green. Back to RTBC, pending for next reroll :-p.

moshe weitzman’s picture

Quick - lets commit this bug fix before it needs yet another reroll.

moshe weitzman’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
45.24 KB

Reroll since it would not apply cleanly anymore. I just upped the hook_update_N and fixed 2 bad hunks in form.inc

Status: Needs review » Needs work

The last submitted patch, batch.diff, failed testing.

webchick’s picture

Hey, folks. I'm trying to tackle one massive patch each day before alpha. Comments body as field was tonight, I'd like to look at this one tomorrow.

Status: Needs work » Needs review

Re-test of batch.diff from comment #55 was requested by aspilicious.

Status: Needs review » Needs work

The last submitted patch, batch.diff, failed testing.

yched’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
66.13 KB

#55 was missing the newly added files.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, batch_fix_scaling-629794-60.patch, failed testing.

Status: Needs work » Needs review

Re-test of batch_fix_scaling-629794-60.patch from comment #60 was requested by aspilicious.

aspilicious’s picture

The test did not complete due to a fatal error. Completion check locale.test 1195 LocaleUserLanguageFunctionalTest->testUserLanguageConfiguration()

I think i've seen this error with casey's patch today....
He retested it and it worked...

carlos8f’s picture

@aspilicious: FYI, the completion check is something I implemented in #560646: Fatal PHP errors don't cause tests to fail. It happens when the test method doesn't finish executing, i.e. doesn't actually mean a fatal error occurred. It could've been due to an exit() call somewhere in the test code, etc. But if retesting it worked, it's probably because a change in HEAD had fixed it since it was last tested.

yched’s picture

Status: Needs review » Reviewed & tested by the community

Well retesting made the failure disappear here too :-/

OK, back to RTBC.

webchick’s picture

Status: Reviewed & tested by the community » Fixed

So this is a massive patch, but the vast majority of it is re-organizing of test hunks. In the future, it'd be nice to separate things like that out into another patch, cos this one would've looked a hell of a lot less intimidating to review :P, and it also would've been easier to tell what was actually changed here.

I feel underqualified to adequately review the OO architecture here. It makes me a bit hesitant to commit this, since this would normally be a patch I'd ask Dries to chime in on, due to his areas of expertise. However, I see a lot of really smart people who I do trust the OO architecture reviews of, saying that this patch is good, which is a comfort. It's also nice to see the greatly expanded test coverage here.

I also cross-referenced what Batch API is doing here with the SystemQueue that was already committed to HEAD, and the two look very similar. This does indeed seem like the perfect use case for the Queue API.

And best of all, it does all of this refactoring without breaking APIs, which is double++ from me. Although a little discouraging that Moshe didn't seem to find any difference in memory usage at #281405-104: Drupal 7.x can't be installed with memory_limit=32M. Hmmm... Might only really be visible on things like massive imports, though.

At any rate, I could find nothing to complain about in here, so I went ahead and committed this to HEAD. I hope to heaven that I cvs added everything I needed to. :P

yched’s picture

Yay. Thanks Angie.
It does seem the commit is complete :-).

"In the future, it'd be nice to separate things like that out into another patch, cos this one would've looked a hell of a lot less intimidating to review :P, and it also would've been easier to tell what was actually changed here."
True, that's why I also rolled a no-test patch in comment #15. I pointed to it in several of the rerolls above, but I should have reminded that in my last comment knowing that you were planning to review the patch, sorry bout that.

andypost’s picture

Status: Fixed » Needs review

Anyone tested this patch with update.php ?
It seems that after this patch update.php hangs! It stops #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue

yched’s picture

Priority: Normal » Critical
FileSize
4.18 KB

Ouch. I did test that the update and batch mechanisms worked within D7, but I did not test an actual D6->D7 update.

2 issues:

- Batch API now relies on the {sequences} table, that is currently created in an system_update_7044().

Attached patch moves the creation of the table and the initialization of the initial 'sequence_id' into update_fix_d7_requirements().
In addition to max(actions_aid.aid) and max(users.uid), we also look for the current max(batch.bid) to get the initial sequence id.
(Note that I don't get why we look at users.uid, but that's how it was made in #356074: Provide a sequences API)
Might deserve an approval from chx or DamZ.

system_update_7044() just keeps the deletion of the actions_aid table, which obviously does not belong in update_fix_d7_requirements().

- we can't use drupal_write_record() to write to the queue table during the 6->7 update, because we don't have a drupal_get_schema('queue') yet.
BatchQueue could provide its own createItem() method to use a direct db_insert() instead, but I see no real reason not to make the change directly in SystemQueue::createItem().

Testing on an actual D6 db, I hit the current D6-D7 fatal errors, but the batch processing does work and the first couple update funcs run correctly

andypost’s picture

@yched This patch fixes a batch process!

Moving a {sequences} table into update_fix_d7_requirements() is a part of #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue

yched’s picture

@andypost: Then it just means the {sequence} thing was already needed for other reasons :-). #69 fixes the D6->D7 bugs introduced by the patch that got committed here.

Maybe it means it should be merged with #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue, but we'd need chx to feedback on the SystemQueue::createItem() hunk in the patch above, which would be easier if it's not mixed with other unrelated fixes.

Also, note that this patch needs to also move this part of system_update_7044() into update_fix_d7_requirements():

   $max_aid = db_query('SELECT MAX(aid) FROM {actions_aid}')->fetchField();
   $max_uid = db_query('SELECT MAX(uid) FROM {users}')->fetchField();
   db_insert('sequences')->fields(array('value' => max($max_aid, $max_uid)))->execute();

, not just the creation of the table sequence.

andypost’s picture

My +1 to commit it here (after chx feedback) because ugliness of #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue which I think to split to typos and functional fixes.

This issue much better place for both changes in #69 !!!

webchick’s picture

Status: Needs review » Fixed

Committed this follow-up, thanks.

Now hopefully we can make more progress on #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue.

yched’s picture

followup - we can now remove unneeded workaround in simpletest: #683814: remove unneeded call to batch_process()

mfer’s picture

carlos8f’s picture

#684138: DatabaseConnection_sqlite->dbNextId() always returns NULL, breaking batch api, installation is just a silly oversight in the SQLite driver, nothing to worry about here.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.