Problem/Motivation

(Cacheable|Bubbleable)Metadata::merge() always merges, instead of just adopting the values of $other if $this is empty.

This has quite a significant effect, because BubbleableMetadata extends CacheableMetadata, and BubbleableMetadata is what Renderer uses to track/merge/bubble all metadata that needs to bubble!

Proposed resolution

Fix that.

Remaining tasks

None.

User interface changes

None.

API changes

None.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Wim Leers’s picture

XHProf, /node/1

Before
90,164 function calls
After
86,298 function calls

Minus 3866 function calls! Or 4.3% less function calls overall!

Benchmarking, /, UID 1, ab -n 200 -c1

Slightly faster:

dawehner’s picture

So yeah I was wondering whether we could optimize for ages as well.

Wim Leers’s picture

FileSize
1.1 KB

#1 brought it down to 86,298 function calls, and was identical to what I did in #2335661-137: Outbound path & route processors must specify cacheability metadata. This brings it down to 84,748 function calls.

Wim Leers’s picture

FileSize
5.66 KB
5.82 KB
1.96 KB

Redoing the benchmarks with patch #2.

Benchmarking, /, UID 1, ab -n 200 -c1

Slightly faster:

Wim Leers’s picture

Issue summary: View changes

Clarified why the impact is so big in the IS.

We should do something similar for BubbleableMetadata::merge() in a follow-up. Or perhaps here?

dawehner’s picture

Status: Needs review » Reviewed & tested by the community

I love how diagrams replaced readable data. Let's better add a pie chart as well ;)

The last submitted patch, 1: 2335661-1.patch, failed testing.

alexpott’s picture

Status: Reviewed & tested by the community » Needs review

I'm guessing the performance benefits are due to not doing:

\Drupal::service('cache_contexts_manager')->validateTokens($cache_contexts);

However this is the only place we do this so are we now skipping this completely for some contexts?

Wim Leers’s picture

Title: Optimize CacheableMetadata::merge() » Optimize CacheableMetadata::merge() + BubbleableMetadata::merge()
Issue summary: View changes
FileSize
2.18 KB
1.13 KB
5.25 KB
5.74 KB
1.98 KB
339.1 KB

#8: No, that's just a nice additional benefit. The number of array_merge(), array_unique() and sort() function calls saved is also quite large.

Will answer this question better by doing another round of profiling.


Expanding scope to what I suggested in #5. And then checking performance again.


XHProf, /node/1

Before
90,164 function calls
After
82,961 function calls

Minus 7203 function calls! Or 8.0% less function calls overall! Measuring the wall time difference with XHProf is difficult though.

The differences:

Function Name Calls Diff Calls
Diff%
Incl. Wall
Diff
(microsec)
IWall
Diff%
Excl. Wall
Diff
(microsec)
EWall
Diff%
Incl.
MemUse
Diff
(bytes)
IMemUse
Diff%
Excl.
MemUse
Diff
(bytes)
EMemUse
Diff%
Incl.
PeakMemUse
Diff
(bytes)
IPeakMemUse
Diff%
Excl.
PeakMemUse
Diff
(bytes)
EPeakMemUse
Diff%
func_get_args -939 -13.0% 3 1.6% 3 1.6% -549,600 -1017.6% -549,600 -1017.6% -776 -1.8% -776 -1.8%
array_merge -878 -12.2% 37 19.8% 37 19.8% -90,480 -167.5% -90,480 -167.5% 1,408 3.2% 1,408 3.2%
Drupal\Core\Cache\Cache::Drupal\Core\Cache\{closure} -514 -7.1% -3 -1.6% -3 -1.6% -24,672 -45.7% -24,672 -45.7% -48 -0.1% -48 -0.1%
array_unique -439 -6.1% -10 -5.3% -10 -5.3% -51,872 -96.0% -51,872 -96.0% -256 -0.6% -256 -0.6%
sort -439 -6.1% -4 -2.1% -4 -2.1% 0 0.0% 0 0.0% -136 -0.3% -136 -0.3%
Symfony\Component\DependencyInjection\Container::get -434 -6.0% 713 381.3% 79 42.2% 328 0.6% 0 0.0% -7,312 -16.7% -7,904 -18.0%
Drupal::service -434 -6.0% -272 -145.5% -568 -303.7% -120 -0.2% 0 0.0% -2,496 -5.7% -480 -1.1%
Drupal::getContainer -434 -6.0% 6 3.2% 6 3.2% 0 0.0% 0 0.0% -280 -0.6% -280 -0.6%
in_array -257 -3.6% 117 62.6% 117 62.6% -8 -0.0% -8 -0.0% -440 -1.0% -440 -1.0%
array_filter -257 -3.6% -309 -165.2% -349 -186.6% -32,184 -59.6% -7,512 -13.9% -40 -0.1% 8 0.0%
Drupal\Core\Cache\Cache::mergeMaxAges -257 -3.6% -975 -521.4% -642 -343.3% -12,336 -22.8% 132,928 246.1% -816 -1.9% -512 -1.2%
Drupal\Component\Utility\NestedArray::mergeDeepArray -243 -3.4% -6 -3.2% -58 -31.0% -36,256 -67.1% -36,240 -67.1% -752 -1.7% -792 -1.8%
Drupal\Component\Utility\NestedArray::mergeDeep -243 -3.4% -452 -241.7% -389 -208.0% -36,256 -67.1% 152,880 283.1% -1,472 -3.4% 0 0.0%
Drupal\Core\Cache\Cache::mergeContexts -226 -3.1% -1,610 -861.0% -1,213 -648.7% -35,648 -66.0% 177,992 329.6% -4,064 -9.3% -1,208 -2.8%
Drupal\Core\Cache\CacheContextsManager::validateTokens -226 -3.1% 10 5.3% 11 5.9% 0 0.0% 0 0.0% 336 0.8% -176 -0.4%
Drupal\Core\Cache\Cache::validateTags -213 -3.0% 5 2.7% 5 2.7% 0 0.0% 0 0.0% 48 0.1% 48 0.1%
Drupal\Core\Cache\Cache::mergeTags -213 -3.0% -730 -390.4% -748 -400.0% -37,296 -69.1% 175,128 324.3% -16 -0.0% -488 -1.1%
array_merge_recursive -208 -2.9% 8 4.3% 8 4.3% -23,432 -43.4% -23,432 -43.4% 0 0.0% 0 0.0%
Drupal\Core\Render\Renderer::mergeAttachments -208 -2.9% -16 -8.6% -82 -43.9% -33,424 -61.9% -9,984 -18.5% 0 0.0% 0 0.0%
is_string -115 -1.6% 0 0.0% 0 0.0% 0 0.0% 0 0.0% 144 0.3% 144 0.3%
is_integer -18 -0.2% -10 -5.3% -10 -5.3% 0 0.0% 0 0.0% 0 0.0% 0 0.0%
min -8 -0.1% 0 0.0% 0 0.0% 0 0.0% 0 0.0% 0 0.0% 0 0.0%

Benchmarking, /, UID 1, ab -n 200 -c1

Saves ~2 ms/request at /node/1.

Before
Time per request:       112.953 [ms] (mean)
Time per request:       112.953 [ms] (mean, across all concurrent requests)
Transfer rate:          175.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   108  113   2.1    113     119
Waiting:       97  101   2.1    101     108
Total:        108  113   2.1    113     119
After
Time per request:       110.791 [ms] (mean)
Time per request:       110.791 [ms] (mean, across all concurrent requests)
Transfer rate:          179.03 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:   106  111   2.3    110     120
Waiting:       95   99   2.2     98     108
Total:        107  111   2.3    110     120

Wim Leers’s picture

And I want to stress this again (quoting the IS):

This has quite a significant effect, because BubbleableMetadata extends CacheableMetadata, and BubbleableMetadata is what Renderer uses to track/merge/bubble all metadata that needs to bubble!

That is the main reason why the savings are so significant.

Status: Needs review » Needs work

The last submitted patch, 9: 2483433-9.patch, failed testing.

Wim Leers queued 9: 2483433-9.patch for re-testing.

The last submitted patch, 9: 2483433-9.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review

Testbot frustrations++

catch’s picture

I think we can shave off some more function calls - since both $this->whatever and $other->whatever can be empty.

I tested this vs. #9 on the front page/minimal profile/no page_cache/logged out, and saved an extra 294 (see xhprof diff - the run1 is with the above patch applied). More verbose though.

No interdiff since all the lines change.

Wim Leers’s picture

FileSize
2.65 KB

#15: I tried that too locally, but deemed it not worth it for similar reasons: 1) it only presents a very small incremental gain, 2) it makes things much harder to read. My version used double ternary operators. Interdiff against #9.

Wim Leers’s picture

I'm completely fine with #15, BTW. @amateescu said in IRC that he thinks #15 is very clear/understandable. Which is true.

Thoughts from others?

Fabianx’s picture

I like #15, too.

Still need to answer when we would actually validate cache contexts then ...

dawehner’s picture

Well, this now potentially moves the validation to the first place where you have two cache contexts at the same time ... given that I'm wondering whether it should be moved to cache set instead?

Wim Leers’s picture

IMO changing whether validation happens here is out of scope here. This patch currently has basically zero DX impact.

Excluding merging from validating cache contexts would impact the DX, because exceptions thrown there would make it harder to track down the code responsible for setting an invalid cache context (the stack trace would be useless).

dawehner’s picture

Fair at least #9 doesn't, if we check both the right and left side, the merging could happen later.

Wim Leers’s picture

#21: I realized that after I'd posted #20 too. So, yes, the patches here do affect the DX (i.e. the stack trace will be less helpful), but doing #19 would make the stack trace almost completely useless. Or am I seeing that too bleakly?

catch’s picture

It'd be even more verbose, but we could validate individual contexts if either is set, but without merging unless both are set?

Wim Leers’s picture

#23: True. That'd keep the DX at absolute parity. But would also reduce the gain in number of function calls, of course. Then again, reducing the number of function calls for validating cache contexts is being addressed by #2408013: Adding Assertions to Drupal - Test Tools..

Status: Needs review » Needs work

The last submitted patch, 15: 2483433-9.patch, failed testing.

Wim Leers’s picture

Status: Needs work » Needs review
FileSize
3.23 KB
656 bytes

So what do we do with #23?

In the mean time, a reroll to fix the sole test failure.

plach’s picture

Fabianx’s picture

Status: Needs review » Needs work
Issue tags: +Needs technical direction

Needs work, we need to decide on what to do with #23 / #8 still ...

Wim Leers’s picture

Once #2254865: toolbar_pre_render() runs on every page and is responsible for ~15ms/17000 function calls lands, the number of function calls this is able to remove should be reduced a fair bit, because we'll be doing less rendering on each page load.

Fabianx’s picture

On #8 + #23:

I think we need two modes in the end:

- Production mode - where performance trumps
- Debug mode, where you can exactly see what is responsible for the fails

Probably assertions would take care of that.

Wim Leers’s picture

Sounds good. But what's the current status/likelihood of assertions?

catch’s picture

I think for now I'd keep the DX parity here (i.e. validate on set) - and assume that we'll get assertions in before 8.0.0 to further optimize the validation.

If we don't get assertions in, we could always add a follow-up if we think the performance gain is more important than the DX. But then we might as well use assertions at that point.

damiankloip’s picture

Status: Needs work » Needs review
FileSize
9.05 KB
6.68 KB
42.68 KB

Rerolled and made another change to add mergeAttachments to BubbleableMetadata. I spoke to catch and we couldn't see a good reason this was callout out, and needing to be overridable. As the renderer service uses BubbleableMetadata directly, so another implementation if it wanted another implementation would do that there instead anyway?

A quick benchmark looks a bit like this,don't take these figures as gospel please:

UPDATE: Just spoke to Fabian too, he said he spoke to Wim and they came to a similar conclusion too. That sounds like playground stuff, but that's how it happened :)

dawehner’s picture

Nice! Makes also things easier to follow

We don't have any test coverage for that function? I would have expected that things fail

damiankloip’s picture

I didn't look at the tests yet, So I would expect stuff to fail. I was just testing this out.

yched’s picture

+ a lot, felt weird to have BubbleableMetadata::merge() call out to \Drupal::service('renderer') to deal with #attached

Does the new BubbleableMetadata::mergeAttachments() still need to be public ? ::merge() is the real thing, right ?

damiankloip’s picture

I think we might want to keep the mergeAttachments method on the renderer and interface, seems a few things use this directly. It can just call out to the new static method I guess? I just copied the docs do BubbleableMetadata::mergeAttachments and RendererInterface::mergeAttachments have the same docblock atm.

EDIT: sorry, messed up the interdiff :)

Still didn't look at test coverage changes.

The last submitted patch, 33: 2483433-33.patch, failed testing.

Wim Leers’s picture

#33: +1!

I think we want to remove RendererInterface::mergeAttachments() altogether — it is just polluting that interface IMO. It made sense at the time, but everything in the render system has been improved structure-wise.

Almost nothing uses this in core, definitely not in contrib, so disruption will be minimal.

dawehner’s picture

\Drupal\system\Tests\Common\MergeAttachmentsTest is the test coverage you are looking for.
Beside from that call, there is 1 single proper function call, I think its fine to change the API?

Wim Leers’s picture

#40: Yep to both of those.

catch’s picture

I think the API change here is fine. If we add a bc layer that nobody uses then it's instant dead code, which just makes things harder to maintain/follow with no benefit.

damiankloip’s picture

FileSize
38.85 KB
18.65 KB

Let's see how this gets on then.

Removed mergeAttachments from the interface, replaced the 2 usages outside of the renderer, moved the tests to BubbleableMetadataTest, so DrupalUnitTest > UnitTest - nice.

EDIT: It was a lazy man's interdiff. So the delete of MergeAttachmentsTest.php is not in there, but is in the actual patch :)

Status: Needs review » Needs work

The last submitted patch, 43: 2483433-43.patch, failed testing.

damiankloip’s picture

Oh, placeholders.

damiankloip’s picture

Status: Needs work » Needs review
FileSize
37.32 KB
846 bytes

I think postRenderCache just needs to go?

dawehner’s picture

Not a bad improvement. This is /admin/content without views render caching for anonymous user (with page cache disabled)

### FINAL REPORT

=== 8.0.x..8.0.x compared (557186d0ce293..557186fee3ad5):

ct  : 62,911|62,911|0|0.0%
wt  : 126,854|127,702|848|0.7%
mu  : 22,666,216|22,666,200|-16|-0.0%
pmu : 22,794,936|22,794,968|32|0.0%

=== 8.0.x..2483433 compared (557186d0ce293..55718713bc09d):

ct  : 62,911|56,525|-6,386|-10.2%
wt  : 126,854|122,735|-4,119|-3.2%
mu  : 22,666,216|22,665,072|-1,144|-0.0%
pmu : 22,794,936|22,782,776|-12,160|-0.1%

---
ct = function calls, wt = wall time, cpu = cpu time used, mu = memory usage, pmu = peak memory usage

### XHPROF-LIB REPORT

+-------------------+------------+------------+------------+------------+------------+
| namespace         |        min |        max |       mean |     median |       95th |
+-------------------+------------+------------+------------+------------+------------+
| Calls             |            |            |            |            |            |
|                   |            |            |            |            |            |
| 2483433           |     56,525 |     60,990 |     56,570 |     56,525 |     56,525 |
| 8_0_x             |     62,911 |     67,376 |     62,956 |     62,911 |     62,911 |
|                   |            |            |            |            |            |
| Wall time         |            |            |            |            |            |
|                   |            |            |            |            |            |
| 2483433           |    122,735 |    168,099 |    127,682 |    126,084 |    133,862 |
| 8_0_x             |    127,702 |    171,335 |    132,693 |    130,573 |    140,132 |
|                   |            |            |            |            |            |
| Memory usage      |            |            |            |            |            |
|                   |            |            |            |            |            |
| 2483433           | 22,665,072 | 23,353,192 | 22,692,385 | 22,665,072 | 22,915,584 |
| 8_0_x             | 22,666,176 | 23,404,488 | 22,698,745 | 22,666,200 | 23,166,856 |
|                   |            |            |            |            |            |
| Peak memory usage |            |            |            |            |            |
|                   |            |            |            |            |            |
| 2483433           | 22,782,776 | 23,506,152 | 22,810,537 | 22,782,776 | 23,033,096 |
| 8_0_x             | 22,794,968 | 23,568,072 | 22,828,003 | 22,794,968 | 23,296,536 |
|                   |            |            |            |            |            |
+-------------------+------------+------------+------------+------------+------------+
+++ b/core/tests/Drupal/Tests/Core/Render/BubbleableMetadataTest.php
@@ -206,4 +206,394 @@ public function providerTestCreateFromRenderArray() {
+    );
+    $this->assertSame($expected['#attached'], BubbleableMetadata::mergeAttachments($a['#attached'], $b['#attached']), 'Attachments merged correctly.');
...
+    );
+    $this->assertSame($expected['#attached'], BubbleableMetadata::mergeAttachments($b['#attached'], $a['#attached']), 'Attachments merged correctly; opposite merging yields opposite order.');
+
...
+    );
+    $this->assertSame($expected['#attached'], BubbleableMetadata::mergeAttachments($a['#attached'], $b['#attached']), 'Attachments merged correctly; duplicates are retained.');
+
...
+    );
+    $this->assertSame($expected['#attached'], BubbleableMetadata::mergeAttachments($a['#attached'], $b['#attached']));
+

Sounds like a good usecase for a @dataProvider? And maybe dedicated test methods ...

damiankloip’s picture

FileSize
35.71 KB
11.32 KB

mit data providers

dawehner’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs technical direction

Nice descriptive test methods. Thank you!

alexpott’s picture

Status: Reviewed & tested by the community » Fixed

This issue is a major task that will improve performance significantly and the disruption it introduces is limited. Per https://www.drupal.org/core/beta-changes, this is a good change to complete during the Drupal 8 beta phase. Committed dc53be5 and pushed to 8.0.x. Thanks!

  • alexpott committed dc53be5 on 8.0.x
    Issue #2483433 by damiankloip, Wim Leers, catch, dawehner: Optimize...
Wim Leers’s picture

Several significant loose ends were left here, resulting in both dead and inconsistent code. Created #2505171: Follow-up for #2483433, with a patch, to fix that.

  • catch committed 8114a72 on 8.0.x
    Issue #2505171 by Wim Leers: Follow-up for #2483433
    

  • catch committed 848d6b8 on 8.0.x
    Issue #2505171 by neclimdul: Follow-up for follow-up for #2483433 (...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.