Problem/Motivation
1. Define 3 batches with operations.
2. batch_set($batch_0);
3. batch_process();
4. During the processing of the first batch two new batches are being added through batch_set.
5. Because there is an active batch the new batch being added will be not placed at the end but prepended to the batch set stack right after the current active batch, but the operations of the other batches, that are already in the db queue will not be reordered. The batch set list will look now like this - $batch_0, $batch_1, $batch_2, but the queued operations will look like this - drupal_batch:49:0 (batch 0), drupal_batch:49:1 (batch 1), drupal_batch:49:1 (batch 2). This leads to having the batch set stack like 0-1-2, but the operations are queued in the wrong order - 0-2-1.
A current example in core is when installing a new module in hook_update_N:
1.DbUpdateController::triggerBatch() starts the processing of the updates in a batch.
2. During the updates a module is being installed.
3. The installation of a module triggers locale_system_update(), which on its turn will call batch_set() twice.
4. The second batch set from locale_system_update() depends on the first batch set from the same function.
5. The problems starts with the second call to batch_set() in locale_system_update().
6. What has happened now is the following:
6.1. The second batch set from locale_system_update() is put in front of the first batch set from locale_system_update(), which means it will be processed before the first one - wrong, because it depends on the first one and the first one should be executed before the second one.
6.2. The operations from the second batch set from locale_system_update() are being put together into the same queue of the first batch set from locale_system_update(), because the indexes of the batch set stack are the ones being used to determine the queue name and prepending the second batch set and reordering the indexes breaks all the corresponding queue mapping leading to executing operations in a batch set, which do not belong to that batch set.
7. In this case we notice the problem by having warning that the functions locale_translation_batch_status_check(), locale_translation_batch_fetch_download() and locale_translation_batch_fetch_import() could not be found. We see these warnings because the operations are being processed as part of the wrong batch set and not the correct $batch['file'] is being included! The warnings:
call_user_func_array() expects parameter 1 to be a valid callback, function 'locale_translation_batch_status_check' not found or invalid function name [warning]
batch.inc:163
call_user_func_array() expects parameter 1 to be a valid callback, function 'locale_translation_batch_fetch_download' not found or invalid function name [warning]
batch.inc:163
call_user_func_array() expects parameter 1 to be a valid callback, function 'locale_translation_batch_fetch_import' not found or invalid function name [warning]
batch.inc:163
Proposed resolution
1. Ensure that when multiple batches are being added during an active batch, that they will be put into the batch set stack before the ones that are already in the queue, but the order of the batches added during an active batch should be maintained. This means:
- batch_set($batch_1);
- batch_process();
- During the processing of an operation from $batch_1, batch_set() is called twice with $batch_2 and $batch_3.
- $batch_2 should be processed before $batch_3.
2. Ensure that when prepending and appending batch sets during an active batch the batch set queue mapping will be maintained by not breaking the indexes in the batch set stack.
Remaining tasks
User interface changes
API changes
Data model changes
| Comment | File | Size | Author |
|---|---|---|---|
| #76 | interdiff-73-76.txt | 506 bytes | gease |
| #76 | 2664016-76.patch | 17.57 KB | gease |
| #73 | interdiff-58-73.txt | 3.67 KB | gease |
| #73 | 2664016-73.patch | 17.57 KB | gease |
| #58 | interdiff-41-58.txt | 8.49 KB | hchonov |
Comments
Comment #2
danielen commentedComment #3
danielen commentedComment #4
danielen commentedComment #5
cilefen commentedI removed extraneous tags.
Comment #6
jose reyero commentedRelated, https://github.com/drush-ops/drush/issues/1930
Comment #7
swentel commented#2664290: Remove array typehints from batch callbacks will be the fix I think.
Comment #8
hchonovConfirming the bug.
However installing a module with an update is more suitable to be done in hook_post_update_NAME instead in hook_update_N, but atm the problem occurs with both hooks.
Comment #9
hchonovComment #10
dawehnerLet's mark this as duplicate of #2664290: Remove array typehints from batch callbacks then
Comment #11
duaelfrI can reproduce this issue in 8.1.7.
It seems that the
local.batch.incfile is not included in some cases. Shouldn't we have amodule_load_include('batch.inc', 'locale');call right in_locale_translation_batch_status_operations()?Comment #12
duaelfrHere is a quick patch, just in case it helps someone.
Comment #13
hchonovI confirm that the problem still exists.
Comment #14
dawehnerJust conceptually #12 totally makes sense. When a function needs some external code, it should declare that as dependency by loading it.
Comment #15
alexpottThis looks testable.
Comment #16
duaelfrI tried to figure out how to test it but I don't know.
Any help, please?
Comment #18
duaelfrI'd still need some directions to start working on the tests here. Could anyone help me, please?
Comment #20
hchonovI agree with #2664016-14: Adding a new batch set while the batch is running breaks batch order and I couldn't think of any suitable test, therefore putting it back again to RTBC.
I hope this could be accepted as it is without tests as it is an obvious bug and bug fix.
I am sorry if it is not possible to accept it without tests and feel free to put it back to "needs work".
Comment #21
alexpottHow about adding a test where we have an update function that installs a module - we have update path testing. Seems possible to me. Look at the things that extend \Drupal\system\Tests\Update\UpdatePathTestBase - specifically one like \Drupal\system\Tests\Update\UpdatePostUpdateFailingTest would give you hints about how to install a test module prior to running updates - and then this test module should have an hook_update_N() and post update function that both install different modules.
Comment #22
morsokHere's a reroll (the new array syntax broke the patch)
Nothing new besides that, works for me.
Comment #23
hchonovAs we've just ran into this issue again I've started looking more deeper and searching for the exact problem. I've came to the conclusion, that the problem relies in the batch API when setting/adding a new batch(es) while a one is being processed. What happens is the following:
1. Define 3 batches with operations.
2. batch_set($batch_0);
3. batch_process();
4. During the processing of the first batch two new batches are being added through batch_set.
5. Because there is an active batch the new batch being added will be not placed at the end but prepended to the batch set stack right after the current active batch, but the operations of the other batches, that are already in the db queue will not be reordered. The batch set list will look now like this - $batch_0, $batch_1, $batch_2, but the queued operations will look like this - drupal_batch:49:0 (batch 0), drupal_batch:49:1 (batch 1), drupal_batch:49:1 (batch 2). This causes a big problem where the batch sets are being reordered, but their corresponding operations remain with the initial indexes used for the batch sets and each new batch added while a current one is being processed will have its operations queued with the same index.
I was looking for a solution where we can prepend batches, but I am really sure we need that a complex solution. It has not worked yet properly therefore I think it will not be an API break to just define the batches so that a new batch will always be appended to the stack no matter if one is being currently processed or not.
I am even tempted to raise the issue status to critical, as this might under some conditions break systems, but only raising it to major.
Comment #24
hchonovComment #25
hchonovI think that this problem exists since the queue for the batch operations has been introduced in D7 in #629794: Scaling issues with batch API.
Comment #27
hchonovAfter thinking over it again I understand the requirement that when batches are set during an active batch that those new batches are processed before the existing ones in the queue. The problem is that currently this is broken and breaks the order of the queued operations. It also currently works like this that if we set two batches during an active batch the second new one will be placed before the first new one. This is not correct as well. The new patch I am uploading should take care of that. The ProcessingTest should be failing, which I will fix in the next patch version.
Comment #28
tstoecklerYeah, while I'm not yet sure why we do this dance, this certainly is a bug.
Haven't looked at the patch in #27 yet, but #25 is not quite correct. ;-) Unless I'm mistaken the bug has been introduced in the very first Batch API patch, in fact I traced it back all the way back to #56 in #127539: batch - progressive operations (à la update.php).
Comment #29
hchonovI think #27 is the way to go. I've just fixed one of the failing tests, but still have problems with batch_4 in \Drupal\Tests\system\Functional\Batch\ProcessingTest::testBatchForm().
Comment #32
hchonovI've found the reason for the failing test and it was that with the new implementation of batch_set I was destroying the reference to the current batch set and it was impossible to update it if during a batch operation batch_set() has been called.
Comment #33
hchonovExtending the comment to explain the other problem that the patch is solving, which was present as well.
Comment #34
hchonovThe test coverage has been updated :).
Comment #35
hchonovJust a little code refactoring.
Comment #36
hchonovUpdating the issue summary to help better understand the problems we have.
Comment #37
hchonovAdding additional test coverage as requested so by @dawehner in IRC.
Comment #38
hchonovAnd here the test only patch and the full patch. This bug exists in 8.4.x and in 8.3.x, luckily the patch applies to both branches.
P.S. is backport for Drupal 7 needed as well?
Comment #40
hchonovAfter I've woke up today I realized that the current patch has introduced a bug. We are not maintaining ordered numeric keys anymore and it might happen that the keys are like 0, 3, 4, 1, 2 in which case _batch_next_set() will not work correctly, because it is incrementing the index of the current set by 1, but instead we have to get the next key. We need one more test for such a complex case and a fix for this problem.
Comment #41
hchonovThanks to #40 I've discovered one more problem in the new approach for
batch_set()and fixed it as well. I've added additional test coverage for the complex use case :where the batches should be processed in the following order -
I am not really sure why, but I had to reset the static cache of the state service before updating the "test batch stack" as otherwise I was observing some race conditions and the test was passing 1 of 4-5 runs only.
Comment #42
hchonovComment #45
tstoecklerIt's been a while, but I just stumbled over this and had a look at the patch and - while it took me a while to (re-)grok it - it looks absolutely solid. Neither the runtime code nor the test code is particularly easy to read, but when looking at it in context you see that that's really not the fault of the patch. People think render arrays are the Arrays Of Doom (TM), but I'm not sure whether Batch API really isn't worse.
In any case, thanks!
Also resetting to 8.4.x since this is a bugfix.
Comment #46
larowlananyone else think this would be super simplified if it used a proper data structure, e.g. something modeled on \SplStack but that had support for insertBefore/insertAfter?
worth a follow up to experiment if we can replace this with that?
Adding review credits
Comment #47
duaelfr@larowlan Could this help? #66183: Add helper methods to inject items into a particular position in associative arrays
Comment #48
larowlanyes, but arrays really aren't data structures...but pretty sure that horse has bolted in Drupal :-P
Comment #49
tstoeckler@larowlan: I had the same thought when looking at this. Since inserting things at particular places in arrays is a common problem in Drupal (e.g.
#process,hook_module_implements_alter(), ...) I'm not sure if a generic data structure or a batch-specific object would be best, but in any case let's have a follow-up for that. ++Comment #50
alexpottThere's #2401797: Introduce a batch builder class to make the batch API easier to use to try to make batches a bit easier to work with.
Comment #51
catchThere's also #1797526: Make batch a wrapper around the queue system.
Comment #52
plachThis still needs to go into 8.5.x first.
Comment #53
plachCan we move this new code block to an internal helper function to improve readability? Something like
_batch_append_set().I had an hard time parsing this comment the first time I read it for two reasons:
What about the following simplified version?
Typo:
takes(I know this was copy/pasted from existing code, but let's not add new typos ;)
Shouldn't this be
Batch 7?Why are we using
roundhere? It seems unneeded and made me wonder whether$totalcould take other values aside from 10, which does not seem to be the case. Removingroundwould improve readability IMO.It's unfortunate we have to this without knowing the reason, we may be covering an existing bug this way. It would be great to add a @todo and a follow-up to investigate this.
Comment #54
plachOh, I almost forgot: nice detective work! :)
Comment #56
anybodyI can confirm this issue still exists. Just ran into it with the latest webform update under D8.4.4:
drupal/webform (5.0.0-rc2 => 5.0.0-rc3)
Comment #57
spokjeThis is still around (D8.5.1):
Comment #58
hchonovAddressing #53:
1-5: done.
6:
In #41 I've commented about this:
Now I've run all the tests from
\Drupal\Tests\system\Functional\Batch\ProcessingTestmultiple times in a row and no one failed. If there was a bug then it looks like it is not present anymore, therefore I've removed the line reseting the static cache of the state service.@plach, thank you for the review :)
Comment #59
tstoecklerChecked that the interdiff adresses the last review. Thanks! Back to RTBC now.
Comment #61
hchonovIt looks like it was just a random failure => back to RTBC.
Comment #63
alexpottI think this comment could be improved. I don't think we should mention prepend at all. It just confuses things and docs old behaviour and not the current.
A start on a clearer comment might be:
I think this needs work but it still an improvement.
I'm not sure this comment is correct. Well it might be correct but it does not read easy. I would suggest using sentences. I.e.
Submit batches 4 and 7. Batch 4 will trigger batch 2. Batch 7 will trigger batches 5 and 6.Also I think the test would be better if the batch 7 triggered 5 and 6 in the opposite order to prove the order in which the are added takes precedence over anything to do with ordering.
Comment #64
larowlanI was thinking about this some more, in particular the array splice work we're doing to keep the order of the batch_set/batch_get tasks and was thinking that perhaps we should be using a data structure better than an array.
E.g. \SplStack gives us LIFO support, or \SplQueue if we need FIFO.
We're basically trying to mimic those data structures with an array, we should try and leverage existing things?
Comment #65
tstoecklerRe #64: I don't think
\SplStackor\SplQueue, because we need to be able to insert elements after a specific other elements, while also retaining the information after which an element was originally added. It might be possible to concoct a data structure that would actually have that feature while still allowing simple iteration, but it seems that would be a larger task and I would request that we discuss that in a follow-up, as we might have the need for a similar data structure outside of the Batch system as well, so I could easily see something like that leading to longer discussions. And this is a very tricky and nasty bug, so I would like to get this in as soon as possible.To emphasize the last point, I recently hit this with the
drush locale-updatecommand, wherelocaleadds batch operations for each project inside of another batch. Due to this bug the translations did not end up getting imported because the order of operations was all out of whack.Still needs work for #63, though.
Comment #67
saranya ashokkumar commentedI m getting same error in 8.5.
Comment #68
jonnyeom commentedI had this error in 8.6.15 and the patch did resolve this issue for me.
Many thanks for the work!
Comment #69
idebr commentedClosed #3059068: Locale batch issues when installing modules via update_N as a duplicate issue.
Comment #70
pclaitte commentedSame error on 8.7.4 and the patch resolves the issue for me.
Thanks !
Comment #72
handkerchiefany news on that?
Comment #73
gease#63 addressed.
Comment #74
geaseComment #75
cspitzlaystore -> stores
Other than that I think that #63 has been addressed.
Comment #76
gease#75.
Comment #77
cspitzlayComment #78
alexpottCommitted and pushed bb6797a8e2 to 9.0.x and 3b32db50d1 to 8.9.x. Thanks!
Comment #81
alexpottDiscussed with @catch who +1'd the backport.