For every contrib project we download during a make we spawn two Drush subprocess from the main process. The first subprocess is the 'make-process' Drush command, and then that process spawns another process, to do the actual downloading of the contrib project.
This seems a bit wasteful to me.
Even if you set --concurrency=1 (the default in Drush 6) you are still spawning lots and lots of subprocesses, when actually you could do it all in a single process.
Here's a patch that changes how make splits up the downloading of contrib projects, so that we use at most drush_get_option('concurrency', 4)
new sub-processes to run 'make-process'. This means that we still benefit from the possible concurrency, but don't waste time spawning sub-processes and bootstrapping Drush where it's not needed.
In casual testing this makes things a lot quicker on my machine. I've not run this patch through the test suite though.
Comment | File | Size | Author |
---|---|---|---|
#4 | drush-2012362-make-fewer-subprocesses.patch | 3.01 KB | Steven Jones |
drush-make-fewer-subprocesses.patch | 3.02 KB | Steven Jones |
Comments
Comment #1
moshe weitzman CreditAttribution: moshe weitzman commentedWe recently defaulted concurrency to 1. I'm not sure if this diff is intentional:
Comment #2
jhedstromTests are passing with this, and it looks like a reasonable approach. Also wondering about the comment in #1--is this intentionally being moved back to 4?
Comment #3
greg.1.anderson CreditAttribution: greg.1.anderson commentedPrevious tests showed that the performance of make with concurrency was very similar to the performance without concurrency; we should not go back to a default value > 1 without performance numbers that show a benefit of doing so.
Comment #4
Steven Jones CreditAttribution: Steven Jones commentedYeah, so the concurrency thing really does make a difference because the whole process puts a lot less emphasis on bootstrapping Drush, so concurrency does make a difference:
In my simple testing downloading 16 simple projects in a make file with this patch, and different values of concurrency:
Cold cache:
Warm cache:
However, I don't want this patch to get held up by discussion around concurrency, when I think it just makes the whole process more efficient in any case, so here's a patch that doesn't include the default concurrency change, I've also introduced a temporary variable so that if we do decide to change the default concurrency in a follow up issue, we only have to do so in one place.
Comment #5
moshe weitzman CreditAttribution: moshe weitzman commentedHopefully jhedstrom can take a look.
Comment #6
jhedstromThanks Steven. I committed #4 in 9a666f6. Let's move the default concurrency discussion back to #1982502: Reduce complexity of backend invoke; consider removing concurrency, as that's where it was removed. The benchmarks in #4 have me leaning back to restoring it if the backend messages have been sorted out.