Problem/Motivation
Right now we don't use the rsync file copier as the default file copier used by Composer Stager for 2 reasons
- We can't test it on DrupalCI
- There might be platform variations that would be hard to test
But rsync might perform better than the PHP file copier. It also might be better at handling symlinks if choose to support them in some form.
Proposed resolution
As a first step to determine whether it is worth trying to see if rsync should be the default copier we should test to see if performs better and if so how much better.
How to test
We can only test this locally because Drupalci doesn't have rsync available
Probably the easier to test this to make branch where record times to a file. We just append the time to the same file at various points in the stage life cycle
The file copier is only used for create and apply the stage
- In \Drupal\package_manager\StageBase::create before and after
$this->beginner->begin($active_dir, $stage_dir, new PathList($event->getExcludedPaths()), NULL, $timeout); - In \Drupal\package_manager\StageBase::apply() before and after
$this->committer->commit($stage_dir, $active_dir, $paths_to_exclude, NULL, $timeout);
I would make a utility function on StageBase that records the time to file so it would be like
$this->recordTimeToFile('before begin');
$this->beginner->begin($active_dir, $stage_dir, new PathList($event->getExcludedPaths()), NULL, $timeout);
$this->recordTimeToFile('after begin');
`recordTimeToFile` should make the file name based on the stage id, the current stage class, developers name(hard coded to tell difference between locals) and the copier being used. that way we will have 1 file per life cycle, know which copier was used and which updater. The directory can just be hardcoded.
We should upload the files to this issues. Each person testing should run multiple. A good way to test to just run \Drupal\Tests\automatic_updates\Build\CoreUpdateTest so we know it the same for everyone
The copier can be set in settings.php via the config `package_manager.settings.file_syncer` see \Drupal\package_manager\FileSyncerFactory::create for values
Even though we are not testing on drupalci we should still push the branch so others can test because it might be different systems
| Comment | File | Size | Author |
|---|---|---|---|
| #21 | Screenshot 2023-05-10 at 4.29.54 PM.png | 149.05 KB | omkar.podey |
| #13 | Screenshot 2023-05-05 at 11.19.55 AM.png | 712.5 KB | yash.rode |
| #10 | UpdateStage_yashrode_rsync.txt | 95 bytes | yash.rode |
| #10 | UpdateStage_yashrode_rsync.txt | 95 bytes | yash.rode |
| #10 | UpdateStage_yashrode_php.txt | 95 bytes | yash.rode |
Issue fork automatic_updates-3304108
Show commands
Start within a Git clone of the project using the version control instructions.
Or, if you do not have SSH keys set up on git.drupalcode.org:
Comments
Comment #2
effulgentsia commentedThank you for this issue! It would be good to test this with scenarios that include a large number of files that need to get copied. For example:
Comment #3
tedbowComment #4
tedbowComment #6
tedbowComment #7
tedbowComment #8
wim leers🚀
Comment #10
yash.rode commentedI tried running testApi() for both php and rsync and following are the results I got, rsync is faster than php file copier.
php
php1
php2
rsync
rsync1
rsync2
Comment #11
yash.rode commentedNeeds testing from someone else,
Change the file path of file_put_contents() in StageBase.php to match to your local and try running
\Drupal\Tests\automatic_updates\Build\CoreUpdateTest::testApiwith php file copier and rsync.Comment #12
wim leers#10 contains raw data. Can you please post your interpretation/analysis of that data? 🙏 😊
Comment #13
yash.rode commentedLeft side are the file contents when php file copier is used and right 2 are the results for rsync. The difference between before and after the begin/commit operation has done is noted and rsync takes ~2 sec lesser time but it can be a problem with my machine, as it's bit slower.
Comment #14
wim leersThanks, #13 is a start for an analysis, but is nowhere near enough.
Missing fundamental pieces:
find,duetc.)Comment #15
yash.rode commentedAs discussed in yesterday's call, I am running
\Drupal\Tests\automatic_updates\Build\CoreUpdateTest::testApiand we want to test this on other machine as we found in past that my machine is slower compared to others. So the first step should be running the same test on @omkar.podey and @kunal.sachdev's machine so that we can confirm that we are getting the correct results and as described in the issue summary if we try to run the same test on different machines we can come to a conclusion from the observations from several developers.Comment #16
kunal.sachdev commentedI also tried running testApi() for both php and rsync and following are the results :
Php1
before begin:1683287269
after begin:1683287287
Total time for begin: 18
before apply:1683287332
after apply:1683287360
Total time for apply: 28
Php2
before begin:1683287635
after begin:1683287654
Total time for begin: 19
before apply:1683287699
after apply:1683287728
Total time for apply: 29
Rsync1
before begin:1683286476
after begin:1683286496
Total time for begin: 20
before apply:1683286542
after apply:1683286575
Total time for apply: 33
Rsync2
before begin:1683286850
after begin:1683286868
Total time for begin: 18
before apply:1683286911
after apply:1683286940
Total time for apply: 29
Comment #17
wim leersGreat call, @yash.rode, to rule out that it's a machine/environment-specific observation that you were seeing! 👍
@kunal.sachdev confirmed @yash.rode's surprising discovery (that
rsyncandphpfile syncer have similar performance).Great! 😊
Now it's time to address my review from #14, to build an understanding of when the performance is comparable between the two: in all circumstances, or only some?
Comment #18
omkar.podey commentedComment #19
omkar.podey commentedI'm not sure if the tests are actually using rsync as i tried to set
but when i tried to get the setting using
i always get
phpinstead ofrsync.Comment #20
wim leersGREAT FIND, @omkar.podey! That completely invalidates the test results in #13 and #16…
Comment #21
omkar.podey commentedMy results show similar times, and just a 2 sec difference in the apply operation, now expanding the test to copy more files/larger files.
Comment #22
effulgentsia commentedDo the begin and apply operations run precondition checks? If so, it's possible that those are a significant portion of the time and are therefore masking the difference between php and rsync for the actual copying part. I think it would be good to run these tests without running the precondition checks (for testing purposes, is it possible to override services in such a way as to remove the precondition checks?).
Also, is this being tested on a Mac? If so, I think it would also be good to test this on Linux, since that's the more common production environment. I think doing that within Docker would be good enough, though another possibility would be to use an account on DigitalOcean or somewhere similar.
Comment #23
wim leersThis is a very good point. I've actually noticed that the Status Report page is horrendously slow when Automatic Updates is installed, so I think this is a very likely theory!
Another great point. Although … I don't see why performance characteristics of
rsyncvs the naïve PHP-based alternative would be so wildly different on different operating systems? You're right that this would be better for sure.But isn't Docker-on-Mac known for its abysmal file system performance? 😅
Comment #24
omkar.podey commentedThis should not be closed, i tried to add an early return in the
\Drupal\package_manager\dispatchto avoid invoking validators, but the test fails to register times from apply operation so it's not as simple to just add an early return and test the file copiers without the validators.So we need to figure out how to disable validators with the test passing.
Comment #25
wim leersOne possible solution: new test-only module that is installed for this test that removes the
event_subscriberservice tag from all services except for the essential ones.