Block rehashing happens on every cron run [#1693336]

Problem/Motivation

_block_rehash() is a function that loops through all code-declared blocks for a given theme and writes/updates their database entries. It's called by block_flush_caches(), which is invoked on all cron runs when system_cron() attempts to get a list of cache bins to flush. Even if cache_clear_all determines that the block cache doesn't need to be flushed (due to cache minimum lifetime), because _block_rehash() is called directly in the Block module's hook_flush_caches() implementation, it's run on every cron run. That means that there's a DB insert or update for every block declared in code for every active theme on every cron run.

If you happen to have a lot of blocks declared in code, or even a relatively small number of blocks declared in code, but have several active themes, your cron run may take ages to complete, or run out of memory and fail completely, etc. This problem may be especially aggravated if you use the Nodeblock module. This is especially a problem if you run cron often.

Proposed resolution

There may be a few ways to cut down on the number of times _block_rehash() gets run.

Rather than running _block_rehash() in block_flush_caches() (which seems like a hack), we could probably move it to block's own implementation of hook_cron(). In D8, we could make how often it gets run configurable (though maybe not necessary); in D7, we could make it a hidden variable that defaults to the minimum cache lifetime.
Another way would be to wrap the call to _block_rehash() in block_flush_caches() with some logic that respects minimum cache lifetime (by either piggybacking off of cache_flush_cache_block or adding a new, similar variable that respects minimum cache lifetime). This might not make the most sense, since this isn't technically cache-related.
Just makes sure the database does not get updated all the time. There are only three things that can happen to a block coming from the database: the cache gets changed, it goes from enabled to disabled due to invalid region and alter does something. We can track the first two, the third we can compare.

Remaining tasks

Discuss which of the above resolutions is most appropriate, then implement.

User interface changes

Depends on discussion, but probably none.

API changes

If anything, this would be an API addition, but there are ways we could avoid any API changes. See proposed resolutions above.

Comment	File	Size	Author
#23	interdiff-22-23.txt	1.08 KB	David_Rothstein
#23	1693336_23.patch	6.4 KB	David_Rothstein
#23
#22	1693336_22.patch	6.43 KB	chx
#22
#22	interdiff.txt	2.34 KB	chx
#21	interdiff-8-21.txt	4.25 KB	David_Rothstein
#21	1693336_21.patch	7.44 KB	David_Rothstein
#21
#15	beforepatchblockrehash.jpg	116.04 KB	bradallenfisher
#15	afterpatchblockrehash.jpg	53.29 KB	bradallenfisher
#12	block_rehash_deployment.png	40.94 KB	iamEAP
#8	interdiff.txt	3.54 KB	chx
#8	1693336_8.patch	6.52 KB	chx
#8
#5	unchanged_database_blocks.patch	6.88 KB	chx
#5
#4	1693336.patch	1.46 KB	catch
#4

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Comment #1

iamEAP CreditAttribution: iamEAP commented 20 July 2012 at 00:38

Issue tags:

+Performance

Tagging with Performance.

Comment #2

iamEAP CreditAttribution: iamEAP commented 24 July 2012 at 18:40

Version:	8.x-dev	» 7.x-dev
Issue tags:	-Needs backport to D7

Looks like block module now does this using hook_rebuild() in D8. Moving to D7, removing backport tag.

Comment #3

iamEAP CreditAttribution: iamEAP commented 12 July 2013 at 22:00

Status:

Active

» Closed (duplicate)

Gonna mark this as a dupe of #2041107: Excess UPDATE {block} queries from _block_rehash(), which has a patch.

Comment #4

catch

he/him

English

CreditAttribution: catch commented 30 August 2013 at 20:10

Status:

Closed (duplicate)

» Needs review

File	Size
1693336.patch	1.46 KB

I actually think this is the right issue - it'd be good to optimize _block_rehash(), but it should simply not be running on hook_cron() ever.

#710874: Add optional argument to hook_flush_caches() so that cron runs can be identified would have fixed this issue, but is currently won't fix. #891600: Page and block caches are cleared the same way on cron as on content change is related and still a bug against 7.x

Attaching a patch that fixes this by calling debug_backtrace() to skip the rehash if we're inside system_cron(), it's a hack but there's no other way to know what's calling block_flush_caches() without the API addition from the other issue.

Comment #4.0

catch

he/him

English

CreditAttribution: catch commented 30 August 2013 at 20:10

Issue summary:

View changes

More clear about the scope of the issue.

Comment #5

chx CreditAttribution: chx commented 3 October 2013 at 01:53

File	Size
unchanged_database_blocks.patch	6.88 KB

There are only three things that can happen to a block coming from the database: the cache gets changed, it goes from enabled to disabled due to invalid region and alter does something. We can track the first two, the third we can compare.

Comment #6

iamEAP CreditAttribution: iamEAP commented 3 October 2013 at 19:09

Status:

Needs review

» Needs work

Tested #5 and it works very well as advertised. A couple of documentation notes below (#1,2), and a suggestion on code readability (#3).

+++ b/modules/block/block.module
@@ -401,23 +401,37 @@ function _block_rehash($theme = NULL) {
+    if ($cache == $block['cache']) {
+      // In this loop only the cache might change so keep note of unchanged
+      // blocks to avoid re-saving them. The original value is only used for
+      // comparison if the alter is implemented. Otherwise copying the block
+      // is a waste of time and memory.

Minor, but phrasing of this comment is a little ambiguous. Do you mean "In this loop, only the cache might change..." or "In this loop only, the cache might change..."

+++ b/modules/block/block.module
@@ -437,6 +451,8 @@ function _block_rehash($theme = NULL) {
         $block['status'] = 0;
+        // In this loop only this change means a change in the database, too.
+        unset($unchanged_database_blocks[$module][$delta]);

Similar comment ambiguity here. Would also help by explicitly specifying what the second "this" is.

+++ b/modules/block/block.module
@@ -456,7 +472,18 @@ function _block_rehash($theme = NULL) {
+      // If this is still marked as an original block and the alter hook is
+      // implemented then the block might have changed. The original contains
+      // every key that goes into the database and only those so this diff
+      // will correctly decide whether the database records differ.
+      if ($alter_implemented && !empty($unchanged_database_blocks[$module][$delta]) && array_diff_assoc($unchanged_database_blocks[$module][$delta], $block)) {
+        unset($unchanged_database_blocks[$module][$delta]);
+      }
+      if (empty($unchanged_database_blocks[$module][$delta])) {

I feel like the double-negative here is making it a little hard to understand your changes.

The primary action we're concerned with here is writing through changes rather than "not writing through unchanges." If you were to keep track of $changed_database_blocks rather than $unchanged..., and rearrange the conditionals appropriately, I think the whole diff would be a bit easier to grasp.

Comment #7

chx CreditAttribution: chx commented 3 October 2013 at 22:04

Well, I guess. Keeping changes would be doable but as the original is needed but then again it's only needed in the alter case and so the additional keeping would be superb small, okay

Comment #8

chx CreditAttribution: chx commented 3 October 2013 at 23:19

Status:

Needs work

» Needs review

File	Size
1693336_8.patch	6.52 KB

interdiff.txt	3.54 KB

Well this flip was a great idea, it really did made the code and comments much easier to understand, yes.

Comment #9

iamEAP CreditAttribution: iamEAP commented 4 October 2013 at 06:15

Love this! Thanks, chx.

+1 for RTBC from me; someone else (@ultimateboy?) ought to review and flip.

Comment #10

ultimateboy CreditAttribution: ultimateboy commented 4 October 2013 at 17:16

Status:

Needs review

» Reviewed & tested by the community

After many rounds of "apache siege" testing with and without cron runs, I am extremely confident that this patch successfully solves the root issue.

Without this patch, while a (relatively) small load test is being performed, if cron is running, we'd see a spike in "active/locked transactions" and very quickly see connection timeouts. With this patch applied and while cron was running, we saw 100% availability, no spike in MySQL, no failed transactions.

I'm really confident flipping the status to RTBC.

Also, I've gone ahead and marked #2041107: Excess UPDATE {block} queries from _block_rehash() as a duplicate as this patch solves what I was going for over in that issue.

Thank you chx for the patch and iamEAP for the review.

Comment #11

ultimateboy CreditAttribution: ultimateboy commented 7 October 2013 at 21:34

Just a quick follow up. I've deployed this patch to our production environment (~100 Drupal sites) and cron is functioning completely as desired at this time.

Comment #11.0

ultimateboy CreditAttribution: ultimateboy commented 7 October 2013 at 21:34

Issue summary:

View changes

Updated issue summary.

Comment #12

iamEAP CreditAttribution: iamEAP commented 2 November 2013 at 01:19

Issue summary:

View changes

File	Size
block_rehash_deployment.png	40.94 KB

Chiming in again: this absolutely improved things after deploying to prod. See this chart from New Relic:

Block Update before and after deploying this patch

Comment #13

pounard

French

CreditAttribution: pounard commented 4 November 2013 at 08:41

Didn't test the patch myself but +1 seeing this happening.

Comment #14

btopro CreditAttribution: btopro commented 6 November 2013 at 17:54

+1 for patch #8. Applied to 3 different systems both local and production machines and didn't experience any issues. Also noticed if I had two frames open that cron actively running on 1 frame wasn't negatively impacting the performance of the other (not usually the case w/ how many themes and context driven block positions I have on most distros).

Comment #15

bradallenfisher CreditAttribution: bradallenfisher commented 6 November 2013 at 20:00

File	Size
afterpatchblockrehash.jpg	53.29 KB
beforepatchblockrehash.jpg	116.04 KB

Applied path and can validate that the only drop in query count was from the patch.
All updates to the block table that would lock the records were prevented from happening.

Was able to navigate around while cron was running via separate browser.

Comment #16

btopro CreditAttribution: btopro commented 6 November 2013 at 20:32

So is there no longer a way to wipe the cache_block table via a cache_clear_all call? While using the admin_menu module and clicking to kill all caches which just trips cache_clear_all; pre patch #8 it wipes the block caches for everything doing the (painfully slow) block_rehash but after the patch it doesn't appear there is any way to globally clear these caches? Is this intended behavior since we're not running under the assumption that caches are being cleaned up correctly in an on-demand manner?

Comment #17

chx CreditAttribution: chx commented 7 November 2013 at 09:23

There is quite some misunderstanding in #16 both of what the patch does and what block module does.

Tthe patch does not affect any logic, absolutely not in Drupal 7 because it's a stable release. The only thing it changes is not running UPDATEs that do nothing because MySQL is too stupid to optimize them away and not lock. Really, no other change.

cache_block is wiped in system_cron and drupal_flush_all_caches because block_flush_caches does a return array('cache_block');. The patch does not even touch block_flush_caches in any way or form, it changes _block_rehash which in turn never ever had anything to do with cache_block but let me repeat myself: you can't find a single thing that has changed in the behavior of _block_rehash.

Comment #18

btopro CreditAttribution: btopro commented 7 November 2013 at 14:03

Thank you for clarifying, this is a much more layman's explanation of what's being proposed and how it works. I some how knew that and when reviewing the devel query logs to A/B compare started to mix the concepts up in my mind. It also wasn't conceivable to me that something this ineffective / inefficient could be in core so long :).

Drastic performance improvements in my testing for module enable/disable operations, cron runs, and cache clears since it no longer locks the db. Loving this patch. +5 :)

Comment #19

iamEAP CreditAttribution: iamEAP commented 17 December 2013 at 20:12

8: 1693336_8.patch queued for re-testing.

Comment #20

David_Rothstein CreditAttribution: David_Rothstein commented 27 December 2013 at 03:55

Status:

Reviewed & tested by the community

» Needs review

Great patch, and thanks for all the testing. I think we can get this in but see a few issues:

         // Disabled modules are moved into the BLOCK_REGION_NONE later so no
         // need to move the block to another region.
         $block['status'] = 0;
+        // This change needs to be recorded in the database.
+        $changed_blocks[$module][$delta] = TRUE;
       }
       // Set region to none if not enabled and make sure status is set.
       if (empty($block['status'])) {

Shouldn't the block be marked as changed in the second part too (when the region is changed)?

```
+  $alter_implemented = (bool) module_implements('block_info_alter');
```
Themes can implement alter hooks also.

It would be simplest just to remove the $alter_implemented check altogether. It seems like a miniscule performance improvement compared to the massive performance improvement that is the main goal of the patch.
(minor) The test methods are missing docblocks.

Comment #21

David_Rothstein CreditAttribution: David_Rothstein commented 27 December 2013 at 04:14

File	Size
1693336_21.patch	7.44 KB

interdiff-8-21.txt	4.25 KB

Here's a reroll fixing the above issues, and also adding one more very useful assertion to the test (seems like something my fix for point 1 above could easily have been broken).

Good to go?

Comment #22

chx CreditAttribution: chx commented 27 December 2013 at 05:32

File	Size
interdiff.txt	2.34 KB
1693336_22.patch	6.43 KB

Well, if we always do the original_database_blocks thing then bookkeeping locally changed blocks is not necessary. It was not for performance, it was for memory but I think we are good (reference counting and all that).

Comment #23

David_Rothstein CreditAttribution: David_Rothstein commented 27 December 2013 at 18:33

Status:

Needs review

» Reviewed & tested by the community

File	Size
1693336_23.patch	6.4 KB

interdiff-22-23.txt	1.08 KB

Oh, ha, good point.

Yeah, I think the memory hit of an extra array is not likely to be too large. It could also be improved as a followup.

I removed the $cache variable (since after #22 there's no reason to introduce it; it's not being used). But everything looks great to me now, so as long as this passes tests, it should be good to go.

Comment #24

David_Rothstein CreditAttribution: David_Rothstein commented 30 December 2013 at 19:48

Status:	Reviewed & tested by the community	» Fixed
Issue tags:		+7.25 release notes

Committed to 7.x - thanks! http://drupalcode.org/project/drupal.git/commit/31d124b

Comment #25

13 January 2014 at 19:50

Status:

Fixed

» Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

Comment #26

chaunceyt CreditAttribution: chaunceyt commented 9 February 2015 at 17:30

I'm responsible for a site that has over 5,000 bean blocks and every time a bean block is created, updated and/or deleted _block_rehash() is called. The query generated by this function is extremely expensive when you have over 5,000 bean blocks. Our team retransformed the query in an attempt to make it less expensive in the stated context.

Issue: https://www.drupal.org/node/2411917
Patch: https://www.drupal.org/files/issues/2411917-_block_rehash.patch