Attached patch fixes scaling issues in batch API.
The batch definition is stored as a serialized array in the {batch} table. During batch process, the whole $batch array is fetched from db, unserialized, processed, re-serialized, stored in db - on each iteration. When that array is too big, this slows down the processing, and more annoyingly causes 'max_allowed_packet' failures on shared hosts, where changing that setting is sometimes not possible. See #434032: Batch update fails with access denied message.
1st issue is that the stored batch $array contains the full array of operations. That's a design flaw, doesn't scale well for large number of operations. The correct design is to store operations separately, and only fetch them one at a time as we process them. That's what the Queue interface is about, so let's use it.
2nd issue is that batches defined within a form submit workflow (most common use case) store the whole $form and $form_state array. Problematic for 'big' forms like the modules page (which triggers a batches .po import) or simpletest list. In D7, $form_state['complete form'] contains the $form, so no need to store both. Additionally, only a few edge cases actually need the full $form_state, in most cases we only need a small subset.
Summery of what the patch does:
- Moves batch 'operations' to a Queue. The {batch} table still stores the rest of the batch data (custom UI messages, results, sandbox for multipass data...)
- Defines its own BatchQueue handler class, since the Queue API and SystemQueue handler officially don't ensure ordering, which batch API requires (think update.php without order...). This class extends SystemQueue and thus reuses the {queue} table. Batch API queues names use a 'drupal_batch_' prefix to avoid clashing with other queues.
- Also defines a memory based BatchStaticQueue handler class, used for non-progressive batches, that are executed in one pass and skip db storage completely.
Non-progressive batches are used for non-interactive Drupal installation, or when a batch-enabled form is programmatically submitted through form_submit_form() (drupal_execute() in D6). - Both progressive and non-progressive batches need a unique batch id (non-progressive batch previously didn't). Batch ids are now generated by db_next_id() (the 'sequences API'), {batch}.bid is no longer a 'serial' column. system_update_7046() does the change.
- Since update.php now relies on the queue system, the creation of the {queue} table is moved from system_update_7022() to update_fix_d7_requirements().
- When preparing the processing of a batch set in a form_submit handler, only store the full $form_state when actually needed (rare cases).
- Comes with an extensive series of tests for the batch processing workflow, derived from the home-made, hand-run test module I used in the D6 cycle. Those tests revealed a couple minor bugs in later additions to the batch API (CSS styles, timer for elapsed time...), which the patch also fixes.
- Moves the existing batch-related tests in system.test and form.test into batch.test and the newly added batch_test.module.
- Enhances a few code comments along the critical path. Also fixes a couple 80 chars wrapping errors.
- no API change whatsoever ;-)
Note that a fair bit of this patch is about tests. The actual code changes are like 25-30k.
Comment | File | Size | Author |
---|---|---|---|
#69 | batch_fix_6_7_update.patch | 4.18 KB | yched |
#60 | batch_fix_scaling-629794-60.patch | 66.13 KB | yched |
#55 | batch.diff | 45.24 KB | moshe weitzman |
#51 | batch_fix_scaling-629794-51.patch | 66.35 KB | yched |
#49 | batch_fix_scaling-629794-49.patch | 51.28 KB | carlos8f |
Comments
Comment #1
yched CreditAttribution: yched commentedHere's a patch without tests for easier review.
Comment #3
yched CreditAttribution: yched commentedHm, does this one apply better ?
Comment #5
yched CreditAttribution: yched commentedSmall logic error for batches with no actual ops. Will reroll later today.
Comment #6
Crell CreditAttribution: Crell commented+1 on the concept.
If BatchQueue is extending SystemQueue, and SystemQueue already implements DrupalQueueInterface, there's no need to re-declare that you're implementing that interface. (It doesn't hurt anything, but it is not necessary.)
BatchStaticQueue looks sufficiently generic that it shouldn't be bound to just the batch system, even by name. Heck, for PHP 5.3 it would make sense to have it use SPL's queue class internally. :-)
Comment #7
yched CreditAttribution: yched commentedYes, I wasn't sure, so I looked in includes/database ;-)
In fact a search on
implements .* extends
bring several matches in includes/database, includes/filetransfer, system/mail.sending.inc, system/system.updater.inc... So I assumed it was at least correct.[edit: I was a little fast. Of these matches, only a few match the same case of
ClassA extends ClassB implements Interface
while classB already implements Interface. And none of those in includes/database, my bad :-p]Possibly. The classes used by the batch engine need an additional getAllItems() method, that isn't part of the DrupalQueueInterface, so if we take the "static queue" class one level up, we'd still need a batch-specific subclass. Unless there's a clear consensus on that, I'd rather discuss this in a followup and not derail this patch.
Comment #8
yched CreditAttribution: yched commentedAttached patch should fix the test failures.
Comment #9
Crell CreditAttribution: Crell commentedI can't speak for the rest of the instances, but PDOStatement is a PHP internal class while DatabaseStatementInterface is a Drupal interface. So we do have to declare that we implement that interface.
Comment #10
yched CreditAttribution: yched commentedCrell: yes, I realized that afterwards, and edited the comment - sorry 'bout that :-/
Well, it's just that BatchStaticQueue implements the queue interface but doesn't extend anything, so I kept the 'Implements' on both classes for symmetry in the file.
Comment #11
Crell CreditAttribution: Crell commentedBatchStaticQueue can really be just StaticQueue, I suspect. Or MemoryQueue. Or something like that, which implies it can be used for things other than just the batch system.
Comment #12
yched CreditAttribution: yched commentedOK, I'll reroll with that. As I said in #7, we'll still need a BatchMemoryQueue to implement getAllItems() method, which batch API needs but is not part of the Queue interface.
"which implies it can be used for things other than just the batch system."
Well, this generic MemoryQueue handler will be of little use with the current DrupalQueue::get() factory method, since queue handlers are pluggable but variable-based (one queue handler for all queues on the site), and the MemoryQueue can hardly be a good candidate for handling all the site's queues.
The Batch API uses its own _batch_queue() "factory" function to circumvent that, and decide which handler needs to be used.
Comment #13
yched CreditAttribution: yched commentedRerolled with a generic MemoryQueue handler. I'd rather not have this derail the issue, though, so if anyone objects to it, I would go back at patch #7.
new patch + updated 'no_test' patch without the test parts, for easier code review.
Comment #15
yched CreditAttribution: yched commentedUnlike the generic MemoryQueue, the BatchMemoryQueue handler needs to allow items to be repeatedly claimed (multipass ops).
Comment #16
dman CreditAttribution: dman commentedThanks for getting stuck into this. I hit the max_allowed_packet thing on the batch table doing some some huge imports (taxonomy_xml) and was shocked when I found that the queued actions I'd worked so hard to make small and atomic were being smudged together in a big serialized lump anyway!
I ended up hacking together a meta-queue thing just to make sure that I didn't put too many things into batch at once (!) By the time I'd done that, I'd almost removed any benefit of using batch in the first case (apart from the nifty progress bar!)
Using a real queue API sounds great! Good one yched!
Comment #17
brianV CreditAttribution: brianV commented+1 for this issue. Over the last few weeks, I did 40-50 hour monster batch jobs importing data into several sites. Any performance upgrades are much appreciated!
Comment #18
yched CreditAttribution: yched commentedBump...
This gets +1's, and I do think it's a needed change, but I fear it might be difficult to get actual reviews for a patch that touches the guts of Batch API. Unlike the current, it does have tests to prove it works, though ;-)
So, how do we get this in ?
Comment #19
sunI spent some time reviewing this patch. It looks very nice and makes perfectly sense.
Don't we also have a sequences API now? Just mentioning, not sure whether it makes sense here.
Please revert, but add a note that it has been moved to d7_requirements
After these, this should be ready to fly.
Comment #20
yched CreditAttribution: yched commentedThanks for looking at this, sun.
"Don't we also have a sequences API now?" : yup, that API is db_next_id() ;-) - http://drupal.org/cvs?commit=276280
Will reroll for the update PHPdoc.
Comment #21
yched CreditAttribution: yched commentedRerolled for system_update_7021() PHPdoc.
Other than that, patch is the same as in #15, so I didn't roll a new 'no_test' patch.
Comment #22
sunThank you! Awesome work!
Comment #23
yched CreditAttribution: yched commented#583730: How many ways are there to cache a form? got in, let's re-test once.
Comment #26
yched CreditAttribution: yched commentedHEAD is broken - I got those two failures on a local HEAD, and so does http://drupal.org/node/634440#comment-2277426
Comment #27
yched CreditAttribution: yched commentedHEAD fixed. Meanwhile, rerolled for a couple missing t()s around assert messages.
Comment #30
yched CreditAttribution: yched commentedOK, back to RTBC as per #22 then.
Comment #31
yched CreditAttribution: yched commentedWill need to be adapted after #634440: Remove auto-rebuilding magic for $form_state['storage'].
Comment #32
yched CreditAttribution: yched commentedJust needed to update the test support module to account for the new $form_state['storage'] / ['rebuild'] behavior for multistep.
Patch summary is in the OP, and #15 has a 'no_test' slimmed down version of the patch with only actual code changes.
Comment #33
yched CreditAttribution: yched commentedBump. As I wrote in other threads, the tests in here would be precious to ensure that the ongoing '$form_state' overhaul work doesn't break batch API.
Comment #36
yched CreditAttribution: yched commentedBumped system_update_N() number.
Comment #38
yched CreditAttribution: yched commentedNo luck, two patches adding a system update committed in a row. Try again.
Comment #39
yched CreditAttribution: yched commentedGreen, back to RTBC.
Comment #40
yched CreditAttribution: yched commentedReroll.
Comment #42
yched CreditAttribution: yched commentedEasier tracking.
Comment #43
yched CreditAttribution: yched commentedGrumblegrumble... 3rd system_update_N() bump in two days :-p (yes, I can grumble with my tongue out)
Comment #44
yched CreditAttribution: yched commentedReroll.
Comment #45
yched CreditAttribution: yched commentedNeeds to be updated after #659710: Queue API is missing a releaseItem method (added a new method to the Queue interface, which the Queue handlers in this patch fail to implement).
I won't be able to do it this week, and I'd also like a word from a core committer that this will eventually get in before doing so, since the patch has been ready for a month and just keeps getting stale.
As a reminder, patch description is in the OP - internal refactor only, no API change.
#15 has a 'no_test' patch for easier review (tests are more than 60% of the patch)
Comment #46
moshe weitzman CreditAttribution: moshe weitzman commentedThis scaling problem just bit us on a big batch API powered migration (a D6 site). Would love to see this committed.
I'll ask Dries to give a statement on this patch.
Comment #47
fgmThis question raises a bigger issue. From a functional standpoint, a method like releaseItem() should not need to be implemented by all implementors of the interface, but PHP does not implement optional methods in interfaces. And SystemQueue has no base class, but just derives from stdClass and implements DrupalQueueInterface.
If our default queue was based on a default class providing stub/default methods, implementations wouldn't need to implement each and every optional method just to complete an Interface contract. However, from #391340: Job queue API it seems that the whole Queue API was built from its initial incarnation on top of a queue Interface instead of a base class.
The problem with using a base class, of course, is that any class in PHP can only inherit from one parent, but can implement multiple interfaces, so using a base class loses generality by preventing concrete implementations from inheriting from an other class. But, given the nature of this entity, it seems it would be more natural to have concrete queues extend a base class than implement an interface.
AFAICS, there are two possibilities to avoid future problems like this one (and fix this one too)
interface ReleasableQueueInterface extends DrupalQueueInterface { public function release(); }
. Queue implementations could then implement any such derived interfaces, and their users would still have the base interface available in any implementationComment #48
Crell CreditAttribution: Crell commentedHaving both an interface definition and a common base class is not at all a bad thing. In fact it's a generally good approach, as it gives developers the maximum flexibility. The DB layer does that a lot for things like the select query builder.
Comment #49
carlos8f CreditAttribution: carlos8f commentedReroll, with implementation of releaseItem(). Re-running tests.
Comment #51
yched CreditAttribution: yched commentedYay, thks carlos8f.
Patch #49 was missing the batch.test hunks, though. Let's see if this passes.
Comment #52
carlos8f CreditAttribution: carlos8f commentedThanks, I noticed that, too. For some reason my patching program left that out. Tests passed locally, though.
Comment #53
yched CreditAttribution: yched commentedGreen. Back to RTBC, pending for next reroll :-p.
Comment #54
moshe weitzman CreditAttribution: moshe weitzman commentedQuick - lets commit this bug fix before it needs yet another reroll.
Comment #55
moshe weitzman CreditAttribution: moshe weitzman commentedReroll since it would not apply cleanly anymore. I just upped the hook_update_N and fixed 2 bad hunks in form.inc
Comment #57
webchickHey, folks. I'm trying to tackle one massive patch each day before alpha. Comments body as field was tonight, I'd like to look at this one tomorrow.
Comment #60
yched CreditAttribution: yched commented#55 was missing the newly added files.
Comment #63
aspilicious CreditAttribution: aspilicious commentedThe test did not complete due to a fatal error. Completion check locale.test 1195 LocaleUserLanguageFunctionalTest->testUserLanguageConfiguration()
I think i've seen this error with casey's patch today....
He retested it and it worked...
Comment #64
carlos8f CreditAttribution: carlos8f commented@aspilicious: FYI, the completion check is something I implemented in #560646: Fatal PHP errors don't cause tests to fail. It happens when the test method doesn't finish executing, i.e. doesn't actually mean a fatal error occurred. It could've been due to an exit() call somewhere in the test code, etc. But if retesting it worked, it's probably because a change in HEAD had fixed it since it was last tested.
Comment #65
yched CreditAttribution: yched commentedWell retesting made the failure disappear here too :-/
OK, back to RTBC.
Comment #66
webchickSo this is a massive patch, but the vast majority of it is re-organizing of test hunks. In the future, it'd be nice to separate things like that out into another patch, cos this one would've looked a hell of a lot less intimidating to review :P, and it also would've been easier to tell what was actually changed here.
I feel underqualified to adequately review the OO architecture here. It makes me a bit hesitant to commit this, since this would normally be a patch I'd ask Dries to chime in on, due to his areas of expertise. However, I see a lot of really smart people who I do trust the OO architecture reviews of, saying that this patch is good, which is a comfort. It's also nice to see the greatly expanded test coverage here.
I also cross-referenced what Batch API is doing here with the SystemQueue that was already committed to HEAD, and the two look very similar. This does indeed seem like the perfect use case for the Queue API.
And best of all, it does all of this refactoring without breaking APIs, which is double++ from me. Although a little discouraging that Moshe didn't seem to find any difference in memory usage at #281405-104: Drupal 7.x can't be installed with memory_limit=32M. Hmmm... Might only really be visible on things like massive imports, though.
At any rate, I could find nothing to complain about in here, so I went ahead and committed this to HEAD. I hope to heaven that I cvs added everything I needed to. :P
Comment #67
yched CreditAttribution: yched commentedYay. Thanks Angie.
It does seem the commit is complete :-).
"In the future, it'd be nice to separate things like that out into another patch, cos this one would've looked a hell of a lot less intimidating to review :P, and it also would've been easier to tell what was actually changed here."
True, that's why I also rolled a no-test patch in comment #15. I pointed to it in several of the rerolls above, but I should have reminded that in my last comment knowing that you were planning to review the patch, sorry bout that.
Comment #68
andypostAnyone tested this patch with update.php ?
It seems that after this patch update.php hangs! It stops #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue
Comment #69
yched CreditAttribution: yched commentedOuch. I did test that the update and batch mechanisms worked within D7, but I did not test an actual D6->D7 update.
2 issues:
- Batch API now relies on the {sequences} table, that is currently created in an system_update_7044().
Attached patch moves the creation of the table and the initialization of the initial 'sequence_id' into update_fix_d7_requirements().
In addition to max(actions_aid.aid) and max(users.uid), we also look for the current max(batch.bid) to get the initial sequence id.
(Note that I don't get why we look at users.uid, but that's how it was made in #356074: Provide a sequences API)
Might deserve an approval from chx or DamZ.
system_update_7044() just keeps the deletion of the actions_aid table, which obviously does not belong in update_fix_d7_requirements().
- we can't use drupal_write_record() to write to the queue table during the 6->7 update, because we don't have a drupal_get_schema('queue') yet.
BatchQueue could provide its own createItem() method to use a direct db_insert() instead, but I see no real reason not to make the change directly in SystemQueue::createItem().
Testing on an actual D6 db, I hit the current D6-D7 fatal errors, but the batch processing does work and the first couple update funcs run correctly
Comment #70
andypost@yched This patch fixes a batch process!
Moving a {sequences} table into update_fix_d7_requirements() is a part of #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue
Comment #71
yched CreditAttribution: yched commented@andypost: Then it just means the {sequence} thing was already needed for other reasons :-). #69 fixes the D6->D7 bugs introduced by the patch that got committed here.
Maybe it means it should be merged with #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue, but we'd need chx to feedback on the SystemQueue::createItem() hunk in the patch above, which would be easier if it's not mixed with other unrelated fixes.
Also, note that this patch needs to also move this part of system_update_7044() into update_fix_d7_requirements():
, not just the creation of the table sequence.
Comment #72
andypostMy +1 to commit it here (after chx feedback) because ugliness of #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue which I think to split to typos and functional fixes.
This issue much better place for both changes in #69 !!!
Comment #73
webchickCommitted this follow-up, thanks.
Now hopefully we can make more progress on #563106: Cannot upgrade from Drupal 6 to Drupal 7 - meta issue.
Comment #74
yched CreditAttribution: yched commentedfollowup - we can now remove unneeded workaround in simpletest: #683814: remove unneeded call to batch_process()
Comment #75
mfer CreditAttribution: mfer commentedFYI, this patch caused SQLite to not install. See #684138: DatabaseConnection_sqlite->dbNextId() always returns NULL, breaking batch api, installation
Comment #76
carlos8f CreditAttribution: carlos8f commented#684138: DatabaseConnection_sqlite->dbNextId() always returns NULL, breaking batch api, installation is just a silly oversight in the SQLite driver, nothing to worry about here.