We rebuild the registry a lot during the installation process, and most of the time in this rebuild is currently spent calling hash_file() on the content of the files all over again.

I suggest we ignore the files we already have parsed previously in the same request. In my testing, this significantly improves the performance of the installation process (executed through Drush), and it is likely to increase the performance of the automated testing too.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Damien Tournoud’s picture

Proposed patch.

Damien Tournoud’s picture

Status: Active » Needs review
klausi’s picture

Status: Needs review » Reviewed & tested by the community

Not sure we should develop this against D8 since the registry will go away soon, see also #1541674: Remove the registry.

The approach looks sane and I can see that every call to module_enable() will trigger the parsing of all files, which is quite ridiculous.

Could you post some performance benchmarks for your use case? Would be interesting to see how much faster an installation of commerce kickstart or recruiter can be.

RTBC to me.

klausi’s picture

Issue tags: +Needs backport to D7

Tagging.

catch’s picture

Status: Reviewed & tested by the community » Needs work

Please post xdebug or xhprof output for the installer, or even time drush si would do if it's significantly different.

+  // During susequent rebuilds, we only add new files. It makes the rebuilding

Typo for subsequent.

The static flag then calling out to registry_get_parsed_files() meant I had to read the patch twice as well as digging out the function to see what was going on, wouldn't it be simpler to keep a static array of files that have already been parsed? If the answer is "no it's not" then I don't mind though.

NIce to see performance work happening on the installer again.

Damien Tournoud’s picture

FileSize
1.3 MB
1.3 MB

A typical installation of Commerce Kickstart using drush site-install commerce_kickstart --verbose.

Before:

real	7m10.698s
user	2m46.366s
sys	1m10.124s

After:

real	6m7.733s
user	2m27.961s
sys	1m3.828s

Attached the XHProf traces of the two Drush sessions.

Damien Tournoud’s picture

Status: Needs work » Needs review
FileSize
2.57 KB

Reroll with the typo spotted in #5 fixed.

The static flag then calling out to registry_get_parsed_files() meant I had to read the patch twice as well as digging out the function to see what was going on, wouldn't it be simpler to keep a static array of files that have already been parsed? If the answer is "no it's not" then I don't mind though.

It might, but I want to change the structure of this function as less as possible. Every change we make we risk breaking stuff, and there is no point in fine tuning a component that is going away anyway. This patch is purely self-contained and cannot possibly break anything.

Status: Needs review » Needs work

The last submitted patch, 1470670-drupalcachearray-destruct-invalid.patch, failed testing.

amateescu’s picture

Status: Needs work » Needs review
FileSize
1.74 KB

The wrong patch was rerolled in #7, this one should be better :)

sun’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: +Testing system

Nice one. Good to go, I think.

sun’s picture

oh, and this likely makes WebTestBase::preloadRegistry() obsolete...? (We can figure that out in a separate follow-up issue)

Anonymous’s picture

this looks good to me. i tried to think of how this would break, but came up with nothing.

catch’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

Thanks. This was RTBC already so I've gone ahead and committed/pushed to 8.x, moving to 7.x for backport.

tim.plunkett’s picture

Status: Patch (to be ported) » Needs review
FileSize
1.75 KB

Rerolled.

amateescu’s picture

Status: Needs review » Reviewed & tested by the community

Nice and clean reroll :)

David_Rothstein’s picture

Has anyone tested this patch with Drush commands that download/update new code? For example:

$ drush help pm-update
Display available update information for Drupal core and all enabled projects and allow updating to latest recommended releases. Also apply any database updates required (same as pm-updatecode + updatedb).....

I don't know much about the internals of how that command works, but if it's downloading new versions of a module and then trying to execute the updated code in the same request, it could run into problems with this patch, no?

Or actually even some of the simpler Drush commands also: If there's any Drush code that does "rebuild registry + download new code + rebuild registry again" in one request, that seems like this patch would leave the registry out of date until the next cache clear.

Besides the above, I suppose this is pretty safe.

webchick’s picture

Status: Reviewed & tested by the community » Needs review

That sounds like a "needs review."

pounard’s picture

This patch could be improved a bit: the static variable could be called $first_call for example, and a if ($first_call) {} could surround the first 5 lines (requiring database stuff), which would avoid some unnecessary require_once and static Database object calls.

mgifford’s picture

manuelBS’s picture

Any progress here? Patch seams to work very nice.

mgifford’s picture

14: drupal-1470656-14.patch queued for re-testing.

killua99’s picture

Issue summary: View changes
Status: Needs review » Reviewed & tested by the community

I'm working with this patch since .... I can't remember.

This patch has been installed in over more than 4 site (for me) with none problem at all.

My 4 sites are from smallers sites to biggers sites.

None problem at all.

manuelBS’s picture

For me this patch also works in 3 sites. Ready to commit I guess.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 14: drupal-1470656-14.patch, failed testing.

David_Rothstein’s picture

Status: Needs work » Needs review

Looks like that was caused by a testbot fluke, but #16 still needs an answer (and maybe #18 too).

killua99’s picture

FileSize
3.38 KB

What about this approach. Looks like take in consideration comment #18 and the #16 maybe the pm-update is really sensitive command.

We can start test from this patch and so.

Fabianx’s picture

Status: Needs review » Needs work

I think this should have a $seen array, which is static but keyed by filename.

Because while modules could be enabled in the same request, it is very very unlikely that any class file changes dynamically.

Additionally to support updates however we likely should add a drupal_static instead and have it be cleared by the module system.

Then its _really_ safe.

stefan.r’s picture

Issue tags: +Performance

joseph.olstad’s picture

we're still using patch 14

  • catch committed 42e17a1 on 8.3.x
    Issue #1470656 by Damien Tournoud, amateescu: Fixed Registry rebuild...

  • catch committed 42e17a1 on 8.3.x
    Issue #1470656 by Damien Tournoud, amateescu: Fixed Registry rebuild...
Fabianx’s picture

Issue tags: +Drupal bugfix target
joseph.olstad’s picture

This is the exact same patch as patch 14 by @tim.plunkett, and exactly the same changes that were committed to D8 core a while back. The patch was originally for D7 , committed to D8 verbatim.

Need to re-trigger the testbot.

I can't see why this wouldn't pass tests, so lets trigger the testbot.

We've been using patch #14 for a long time now and on D7 7.50 as well as 7.44 and previous releases.

From my review of the patch, it is identical to what was committed (see above commits) into D8 registry.inc

renamed the patch.

joseph.olstad’s picture

Status: Needs work » Needs review

Status: Needs review » Needs work
joseph.olstad’s picture

Status: Needs work » Reviewed & tested by the community

RTBC.
Jenkins tests were re-run, php 5.6 passes on second test
failures were just glitches by the testbot.

Fabianx’s picture

Status: Reviewed & tested by the community » Needs review
Issue tags: +Needs framework manager review

I am still of the opinion that we should use $seen array instead of a global variable. It might be a little more memory needed, but it definitely is safer.

Also #16 still needs an answer.

Overall we might be able to get away with just checking timestamps instead of the whole hash.

joseph.olstad’s picture

@Fabianx , I noticed this was already committed to 8.0.x, 8.1.x , 8.2.x, 8.3.x and 9.x without the $seen array.

Commit 42e17a1 on 9.x, 8.1.x, 8.0.x, 8.2.x, 8.3.x
by catch

As far as I understand the fact that this was already committed 'as is' / 'verbatim' seems logical to me that the next step would be to commit it to D7.
Just trying to figure this out where this fits with the backport policy "if there are good reasons to work on a fix independently (for example divergence between the 7.x and 8.x code bases or a different approach being needed), then the 7.x issue may be worked on and fixed independently."

In this situation the patch was written for D7 and the reroll was a verbatim no code change for D8 and was committed to D8. I'm of the opinion that a new child issue should be created for the new $seen array idea. Otherwise how many years will it take to get this into D7? D7 is going to be around for quite a few more years and it'd be nice to get performance fixes in like this so that the 1 million or so installations can enjoy improved performance a few years before D7 support is dropped.

Fabianx’s picture

#39: The backport policy does not apply to code however that was removed from Drupal 8 again after commit, which is true for all registry related things. Therefore we can (and must) treat it as a new patch.

I am not opposed to the patch, but also don't want to break lots of installations in certain cases, which is why answering #16 is important.

joseph.olstad’s picture

Oh wow, comment #31 caused me some confusion, it says 27 days ago, however the commit it'self mentioned in the comment was actually done in 2012 .

It appeared to me like commit was recently made 27 days ago but it was done 4 years ago...

Wow, took almost 4 years for that commit message to show up.

This caused me a bit of confusion.

Fabianx’s picture

#41: No, the commit message is from when the 8.x-3.x branch was opened, which makes it a new commit on that branch.

But the functionality has been reworked in Drupal 8 by removing the registry in favor of composers autoloader.

Therefore the code that was committed is no longer present even though it had not been reverted.

Introducing that code now in D7 has such a different risk than usual backports, because the functionality was never really tested on D8 and especially not on the released version.

Therefore in this case the backport policy does not apply.

joseph.olstad’s picture

@fabianx , I consider this patch to be low-lying fruit, low risk, its on joelpittets list of patches for low risk, we've been using it for a long time on Drupal 7.38 , 7.39 , 7.40 , 7.44 and now 7.50 , it's greatly improved installation times which is important for our CI testing on travis and also for simplytest.me when our distro is spun up (wetkit distro) for demos.

We dropped our installation times by more than half (the bigger the distro install profile is the more noticeable improvement this patch makes ) by including this patch in the wetkit distro and included a few other of joelpittets favourites , some of which were brought into 7.50 , but I consider this patch to be easy pickings, easy commit credits to drupal core glory for maintainers and adds a lot of value to Drupal , we have millions invested in the development of the wetkit distribution which is still used by many of our government clients and is still being implemented for some new government clients. Although we have less than 100 reported installs of our distribution it is installed by really big clients that have deep pockets and big budgets. One of those sites has over one quarter million pieces of content. There is also a province who's entire site is using a version of this distro.

While currently my main client (a very high profile client) does not use a current version of this distro we do use this patch.

The success of Drupal 8 is depends on the success of Drupal 7 , holding back on improving Drupal 7 may not necessarily convert directly into new Drupal 8 installations and might actually do the opposite and sour developer relations.

Fabianx’s picture

#43: Thanks for giving feedback that it works for you.

However #16 is still not answered.

And it needs to be first.

joseph.olstad’s picture

Status: Needs review » Reviewed & tested by the community

RTBC++

To answer @david_rothsteins question about drush use case in comment #16

Using this patch did the following update to the entity API module using drush pm-update , WORKS GREAT

drush pm-update entity

 drush pm-update entity
Update information last refreshed: Thu, 09/01/2016 - 11:13
 Name                 Installed Version  Proposed version  Message
 Entity API (entity)  7.x-1.6            7.x-1.7           Update available


Code updates will be made to the following projects: Entity API [entity-7.x-1.7]

Finished performing updates. [ok]

filename	name	type	status	schema_version
sites/all/modules/entity/entity.module	entity	module	1	7003

Checked the system table, everything looks as expected, although there were no schema changes between 1.6 and 1.7 so I tried another module, next up: custom_search

drush pm-update custom_search -y
Update information last refreshed: Thu, 09/01/2016 - 11:48
 Name                           Installed Version  Proposed version  Message
 Custom Search (custom_search)  7.x-1.16           7.x-1.20          Update available

Code updates will be made to the following projects: Custom Search [custom_search-7.x-1.20]

Custom_search 7104 Store vids - machine names association.

Before upgrading the schema record in the system table was at 7103
After , it is 7104 (AS EXPECTED) (see query and result)

SELECT filename
      ,name
      ,type
      ,status
      ,schema_version
  FROM system where name like 'custom_search'
filename	name	type	status	schema_version
sites/all/modules/custom_search/custom_search.module	custom_search	module	1	7104

RTBC++

Fabianx’s picture

#45: For a true test case however you need e.g. a renamed class within the same file as that is what we are testing here.

joseph.olstad’s picture

@fabianx, how about your $seen array idea in comment #27 , do you have an example of where this has been done in the past? I guess I could search through the core code.

Actually a better drush pm-update test would be an update of many modules rather than the one-at-a-time that I did.

joseph.olstad’s picture

Status: Reviewed & tested by the community » Needs review

while my tests all passed, and this patch does really make a difference on installation speed, I would have to do a LOT more testing and debugging and searching to really be able to fully answer these questions brought up and investigate the $seen array suggestion.

  • catch committed 42e17a1 on 8.4.x
    Issue #1470656 by Damien Tournoud, amateescu: Fixed Registry rebuild...

  • catch committed 42e17a1 on 8.4.x
    Issue #1470656 by Damien Tournoud, amateescu: Fixed Registry rebuild...
joseph.olstad’s picture

Ok, if its good enough for 8.4.x , it should be good enough for 7.x. We're still using this patch and have been for ages.
Passes all testing, what more can you ask for?

dsutter’s picture

RTBC+ patch #34

gdaw’s picture

Patch 34 RTBC +1,

let's get this in for D7 !

joseph.olstad’s picture

Status: Needs review » Reviewed & tested by the community
Issue tags: -Needs framework manager review +Drupal 7.60 target

still using this with 7.56
it's been committed to D8 , there's nothing left to review.
Do it like they do on the discovery channel.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 34: D7-registry_rebuild_should_not_parse_same_file_twice-1470656-34.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

joseph.olstad’s picture

Status: Needs work » Reviewed & tested by the community

returning status to RTBC.

I triggered some re-tests on 7.57
except one, all of the tests I triggered passed, including php 7.1
It appears that the php 7 fail was a glitch in the test bot. Retriggering php 7 mysql test

Pol’s picture

I'm testing this patch on my local setup since a small week and I haven't found any issue, and indeed, it's faster.

I'm all for it in D7.

joseph.olstad’s picture

tests all pass, any fails were due to a temporary glitch/config regression in the jenkins / drupal.org testbot automation which has since been fixed.
Yet another win here.

joseph.olstad’s picture

Issue tags: -Drupal 7.60 target +Drupal 7.61 target

Bumping to 7.61, this didn't make it into 7.60

joseph.olstad’s picture

joseph.olstad’s picture

Pol’s picture

Updating credits.

Fabianx’s picture

I am okay to include this as is, because Drupal 8 has the same patch, so deviating from it does not seem wise.

  • Fabianx committed ea394f9 on 7.x authored by Pol
    Issue #1470656 by Damien Tournoud, joseph.olstad, Fabianx, catch:...

  • Fabianx committed 333fd50 on 7.x
    Revert "Issue #1470656 by Damien Tournoud, joseph.olstad, Fabianx, catch...
Fabianx’s picture

Status: Reviewed & tested by the community » Needs work

Unfortunately after commit I realised that we break the lookup cache with this.

Proposed change:

+++ b/includes/registry.inc
@@ -66,7 +71,14 @@ function _registry_update() {
+          // Ignore that file for this request, it has been parsed previously
+          // and it is unlikely it has changed.
+          unset($files[$filename]);

Here we should add the file to an $unchanged_files array.

$unchanged_files[$filename] = $file;

Later we should combine the files with the unchanged files again.

$files += $unchanged_files;

and do the lookup then.

Unrelated, but small performance gain: Use isset() instead of in_array($file, array_keys()) for the lookup in $files.

The main thing this skips is to call _registry_parse_files() on all $files again, but the lookup cache should not break as a result of that.

--

Unrelated follow-up:

module_implements('', FALSE, TRUE);
_registry_check_code(REGISTRY_RESET_LOOKUP_CACHE);

should be _outside_ of the transaction not inside. Can someone open a follow-up please?

Pol’s picture

Status: Needs work » Needs review
FileSize
3.45 KB

Updated patch with @fabian 's comment.

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC, the reason this is safe and #16 is no problem is because how this works:

- We just do not check the hash of the files again -- new files (from newly enabled modules) will be picked up and hashed correctly, as they cannot be present in registry_get_parsed_files(), which is what $files is compared against

- Existing files would only be affected _if_ the hash changed during the request, which is the performance gain that is used here

- Old files are deleted correctly -- by the else block -- as they are present in registry_get_parsed_files(), but not in $files [unchanged case]

  • Fabianx committed 007ee8a on 7.x authored by Pol
    Issue #1470656 by Damien Tournoud, joseph.olstad, Pol, Fabianx, catch:...
Fabianx’s picture

Status: Reviewed & tested by the community » Fixed

Committed and pushed to 7.x - Thanks all!

joseph.olstad’s picture

The cheer leading squad has arrived!

 (*) (*)
   \   |
    (0)
      |
    _/ \_

   (*)  (*)
    |   /
    (0)
      |
    _/ \_

 (*)--(0)--(*)
       |
     _/ \_

backflip

          _  _
    (*)   | /
      \   /
      (0)--(*)

   _   _
     \ /
      |
(*)--(0)--(*)

         _  _
          \  /   (*)
            \   /
            (0)
           /
        (*)

          _
         |   (*)
  |__   |    /
      \  ---(0)
          /
       (*)


   (*)  (*)
    |   /
    (0)
      |
    _/ \_


Pol’s picture

LOL ! :-)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.