Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
I recently upgraded from 5.8 to 5.9 and now I'm getting DRUSH_BACKEND messages like this with "drush make make_file".
>> field_group-7.x-1.1 downloaded.[ok]
>> strongarm-7.x-2.0 downloaded.[ok]
>> view_unpublished-7.x-1.1 downloaded.[ok]
>> link-7.x-1.1 downloaded.[ok]
>> jquery_update-7.x-2.3 downloaded.[ok]
>> DRUSH_BACKEND:{"type":"notice","message":"Command dispatch complete","timestamp":1367242226.1479,"memory":7515724,"error":null,"packet":"log"}
>>
>> DRUSH_BACKEND:{"type":"memory","message":"Peak memory usage was 8.19 MB","timestamp":1367242226.1485,"memory":7516408,"error":null,"packet":"log"}
>> Project xmlsitemap contains 9 modules: xmlsitemap_taxonomy, xmlsitemap_modal, xmlsitemap_i18n, xmlsitemap_node, xmlsitemap_user, xmlsitemap_menu, xmlsitemap_custom, xmlsitemap_engines, xmlsitemap.
>> webmaster_menu-7.x-1.1 downloaded.[ok]
>> DRUSH_BACKEND:{"type":"ok","message":"xmlsitemap-7.x-2.0-rc2 downloaded.","timestamp":1367242226.2942,"memory":7577432,"error":null,"packet":"log"}
>>
>>
>> DRUSH_BACKEND:{"type":"memory","message":"Peak memory usage was 8.19 MB","timestamp":1367242226.2985,"memory":7516388,"error":null,"packet":"log"}</em>
>> Project views_bulk_operations contains 2 modules: actions_permissions, views_bulk_operations.
>> diff-7.x-3.2 downloaded.[ok]
>> module_filter-7.x-1.7 downloaded.[ok]
>> views_bulk_operations-7.x-3.1 downloaded.[ok]
>> DRUSH_BACKEND:{"type":"notice","message":"Command dispatch complete","timestamp":1367242229.1381,"memory":7515772,"error":null,"packet":"log"}
>>
>> DRUSH_BACKEND:{"type":"memory","message":"Peak memory usage was 8.19 MB","timestamp":1367242229.1386,"memory":7516456,"error":null,"packet":"log"}
>> override_node_options-7.x-1.12 downloaded.[ok]
>> DRUSH_BACKEND:{"type":"memory","message":"Peak memory usage was 8.19 MB","timestamp":1367242229.194,"memory":7516572,"error":null,"packet":"log"}
>> masquerade-7.x-1.0-rc5 downloaded.[ok]
>> redirect-7.x-1.0-rc1 downloaded.[ok]
>> Project metatag contains 8 modules: metatag_panels, metatag_twitter_cards, metatag_ui, metatag_context, metatag_dc, metatag_views, metatag_opengraph, metatag.
>> metatag-7.x-1.0-beta7 downloaded.[ok]
>> zen-7.x-5.1 downloaded.[ok]
>> DRUSH_BACKEND:{"type":"notice","message":"Command dispatch complete","timestamp":1367242232.4251,"memory":7516092,"error":null,"packet":"log"}
and on and on...
Comment | File | Size | Author |
---|---|---|---|
#62 | drush-reduce-complexity-1982502-62.patch | 9.21 KB | greg.1.anderson |
#57 | drush-reduce-complexity-1982502-57.patch | 2.97 KB | greg.1.anderson |
#55 | 1982502.patch | 748 bytes | Damien Tournoud |
#54 | 1982502-repro.patch | 651 bytes | Damien Tournoud |
#40 | drush-correct-minor-defects-1982502-40.patch | 5.61 KB | greg.1.anderson |
Comments
Comment #1
moshe weitzman CreditAttribution: moshe weitzman commentedWe can't see the command you ran.
Comment #2
jhedstromI've noticed this as well. Happens on 6.x as well. No exact command/makefile yet as it is sporadic.
Comment #3
dpearcefl CreditAttribution: dpearcefl commentedI added the full command line to the original post.
Comment #4
jhedstromUsing git bisect, I traced this down to commit cc8b76b, for issue #1829418: Drush invoke process can't handle large packets of data.
Comment #5
greg.1.anderson CreditAttribution: greg.1.anderson commentedDoes this only happen for drush make?
Comment #6
jhedstromdrush make is the only place I've seen it. Unfortunately, those messages appear to be sporadic. To bisect, I used a makefile that contained enough projects so it was fairly consistent in outputting those messages (attaching here).
Comment #7
greg.1.anderson CreditAttribution: greg.1.anderson commentedAfter this issue is fixed, it would be good to go back and investigate #1380948: Print errors from caught exceptions during updatedb.
Comment #8
achtonI'm seeing this on upgrading from 5.8 to 5.9 too:
Command: drush -y make --working-copy --no-core --contrib-destination=. ./my-makefile.make
Etc. etc. ..
Comment #9
Damien Tournoud CreditAttribution: Damien Tournoud commentedWe are seeing that happening too.
Looking at the code, it feels like the "out of the band" facility provided by
drush_backend_packet()
is basically broken:drush_backend_packet()
output a "packet" to STDERR,_drush_backend_proc_open()
(that call the backend command), only sets up pipes for STDIN and STDOUT.The facility is only used in a few places, so it's possible that nobody every noticed.
Comment #10
greg.1.anderson CreditAttribution: greg.1.anderson commentedHistorically, Drush redirects stderr to stdout in backend invoke. It could be that there are some places where this is not done.
The best solution would probably be to reserve stderr for log messages & out of band info, and to stop redirecting stderr to stdout in backend invoke. This might be time consuming.
As mentioned in #9, it was probably a mistake that the out of band info was sent to stderr; we could try a quick fix of just changing this to stdout. That would be a much quicker fix, if it worked; however, there is some advantage to the fact that old versions of drush are sending out of band information to stderr; it might make it easier to do the "best" solution in Drush 6, and still work with Drush 5.
Not sure how soon I could take this on; I know we want to release Drush 6 RSN.
Comment #11
Damien Tournoud CreditAttribution: Damien Tournoud commentedI assume something somewhere redirects STDERR to STDOUT when calling the backend command... so it kind-of works most of the time, but I haven't been able to find where.
Comment #12
Damien Tournoud CreditAttribution: Damien Tournoud commentedHere is my hypothesis on how this currently works:
_drush_bootstrap_output_prepare()
) is actually what converts STDERR into STDOUT, at least in most cases.The output buffer set by
_drush_bootstrap_output_prepare()
will either discard everything except backend packets (withdrush_backend_output_discard()
- when --pipe is set), or will collect everything (withdrush_backend_output_collect()
), but in that case backend packets will be discarded a bit later bydrush_backend_output()
.I'm confused how all this is working in the first place. Seems like the whole system has grown out of control.
Comment #13
moshe weitzman CreditAttribution: moshe weitzman commentedIt sounds like this might be more widespread than what we hoped. Thanks for investigating, Damien.
This system might have been out of control when Adrian first wrote it :). Anyway, lets look for simplifications. FWIW, the concurrency that we added has really not worked out as well as I hoped. Only make uses it, and it causes surprises for folks there. I would not be opposed to removing it, if that would simplify the backend code.
Comment #14
greg.1.anderson CreditAttribution: greg.1.anderson commentedBackend has been hairy from the beginning; there have been some improvements, but concurrency and incremental output added additional logic which I think has resulted in a net increase in complexity. In any event, I think that we can all agree that improvements would be helpful.
I don't know that I'm in favor of removing concurrency. I am pretty sure msonnabaum is using it (where's my d.o feature to add someone as a follower to a thread? :> ), and it's useful for speeding up batch processing. I think it is of no practical benefit to make, though; while there is the possibility of some benefit under the right conditions, it rarely works out to any substantial increase in performance in practice.
I think that we could simplify the whole system if we managed to get consistent about stderr usage, and stopped using ob_start / ob_end in backend and pipe mode entirely. We could then also remove all of the places where we redirect stderr to stdout. We could also get rid of the dreaded (and problematic-for-Windows) `tty` construct if we handled stdin in backend invoke. That would involve -not- closing stdin at the beginning of the function, and avoiding the deadlock in some other way. Perhaps Drush could send a new special "start" and "end" log packet pairs. Backend invoke could close stdin on receipt of the "end" packet if "start" is received as the first packet, or on the receipt of any other packet if "start" is not the first packet. I think this should allow Drush 6 to talk to Drush 5 without deadlock.
Comment #15
greg.1.anderson CreditAttribution: greg.1.anderson commentedChanging title.
Comment #16
anarcat CreditAttribution: anarcat commentedfor the record, this is also a significant problem in aegir because some errors just disappear from our task logs, because they are not fed through stdout as they should be.
also note related issue #1721586: Improve Drush's handling of STDERR in backend invoke, which suggests adding an option of suppressing stderr (not a good idea in my opinion, because we may miss stuff) or just redirecting it to stdout in all cases (that I would prefer).
it is unclear to me why those packets go on stderr.
i'd love to discuss this during the code sprint tomorrow.
Comment #17
greg.1.anderson CreditAttribution: greg.1.anderson commentedUnfortunately, I cannot sprint tomorrow. :( I'll try to be in IRC at least, tho.
If you guys want to take out concurrency, I am not opposed.
Comment #18
anarcat CreditAttribution: anarcat commentedIs there a test to cover the problem that was fixed when this problem was introduced?
In other words, is there a test that checks the deadlock mentionned above?
Comment #19
greg.1.anderson CreditAttribution: greg.1.anderson commentedUnfortunately, the problem only happens on Windows. The issue is discussed here: http://drupal.org/node/766080#comment-4309936
Comment #20
moshe weitzman CreditAttribution: moshe weitzman commentedI discussed concurrency with jhedstrom since make is its only customer right now. We agreed that concurrency has not worked out that well and can be removed. Hopefully Greg can help with that.
Comment #21
anarcat CreditAttribution: anarcat commentedactually, i don't know if I can really reproduce this directly in drush 5. what I see is that this may be drush 6 specific for core drush commands *buuut* i definitely see some backend packets output to stderr regularly in aegir. this shouldn't happen: all drush backend output should go out on stdout or stderr, but not both!
this is an api change that breaks everything.
so here's a patch to reverts a tiny part of 510735ac, which was introduced to get #1058874: drush_backend_invoke buffers output till end of command. Show progress. working. Apparently, the reason that data is sent to stderr is to avoid "Cannot modify header information - headers already sent" messages, as those packets are being sent too early. The thing is, that also means that those packets are lost to certain backend handlers. It also makes drush foo --backend | drush bar fail.
So I believe we should fix the "Cannot modify header information" instead of working around it.
I'm running the test suite now to see if this explodes.
Comment #22
greg.1.anderson CreditAttribution: greg.1.anderson commentedBased on some inconclusive tests I started, it also seems to me that there might be something wrong with the handling of the introduced null character. I might try changing this to something non-null later, to make it easier to debug. No time to continue now, though.
Comment #23
greg.1.anderson CreditAttribution: greg.1.anderson commentedHere is a fix that removes a highly-destructive 'trim' that can corrupt backend packet start patterns in the remainder buffer. This should take care of the primary symptom; however, because of the issue with 'drush foo --backend | drush bar', I agree that anarcat's STDERR changes are also necessary. I don't have time to address those at the moment, but I believe that with this patch, we're now pretty close.
Comment #24
greg.1.anderson CreditAttribution: greg.1.anderson commentedCorrect minor defects in backend invoke that can cause large output problems
Comment #25
greg.1.anderson CreditAttribution: greg.1.anderson commented[Made a typo with drush iqs in #24; here is the correction of what should have appeared above.]
It turns out that #21 and #23 merge together quite nicely. All tests pass; I did not have any problems with "Cannot modify header information" or other defects. It seems that it is not necessary to rewrite backend invoke after all.
Comment #26
moshe weitzman CreditAttribution: moshe weitzman commentedCommitted to 8.x-6.x.
1. Does anything need backport?
2. Do we need any new tests?
Once those are resolved, we can mark this Fixed.
Let's release Drush6 with concurrency and evaluate it again for 7.
Comment #27
greg.1.anderson CreditAttribution: greg.1.anderson commentedSince the bug fix turned out to be trivial, I think that backporting to 7.x-5.x is appropriate. I cherry-picked the commit on the Drush 5 branch.
Could still use a new unit test. I think a good way to test this would be to repeatedly call drush_log from within a command called via backend invoke, with each successive call returning a message that is one character longer. The total backend packet length should go from smaller than the packet chunk size (4096, unless set to something else) to somewhat larger than same.
I'll put something together to do this later.
Comment #28
Steven Jones CreditAttribution: Steven Jones commentedSo...this looks to have introduced a regression in Drush 5.x.
We (the Aegir project) have some lame tests, that basically consist of running lots of Aegir specific Drush commands, and making sure that no errors are logged.
We run these tests with Jenkins, and we can run then against lots of different Drush versons, which is handy for testing compatibility.
You can see this matrix test here:
http://ci.aegirproject.org/job/Matrix%20test/76/
Note that Drush 7.x-5.x-dev is failing, but 7.x-5.9 is not.
Seems like the change to sending output to stdout rather than stderr does cause fatal errors for installing Open Atrium.
Also, note that we still have
DRUSH_BACKEND
in the output too :(Comment #29
greg.1.anderson CreditAttribution: greg.1.anderson commentedOkay, I'm going to back out all of #24 from 7.x-5.x, and put the log packets back on STDERR in 8.x-6.x. We can continue to try to isolate DRUSH_BACKEND in the output as well, but it seems clear that moving away from STDERR is a problem.
Perhaps
drush foo --backend 2>&1 | drush bar
will have to do, at least for Drush 5. In Drush 6, perhaps we could consider moving the packets to STDOUT if we cached all of the log messages prior to the first call to drush_print; however, this would not be very desirable for long-running commands that output log messages, but no standard output. Maybe someone else might have a better idea here.Comment #30
greg.1.anderson CreditAttribution: greg.1.anderson commentedHere is a failing test. It sends longer and longer log messages, until a seven-character message causes Drush to drop a backend packet somewhere.
Comment #31
anarcat CreditAttribution: anarcat commentedwhoa... okay, this is so very weird.
we should really consider rewriting this whole packet handling thing from scratch at some point, it seems really confusing to me, and we just keep piling up weirder stuff on top of it all the time... :)
couldn't we just ob_start() around the bootstrap process to get rid of the 'headers already sent' crap?
is anyone going to fix this by friday? it's the first day i'll have free to figure this thing out and i'd like to release aegir beta2 with drush 6 support and have this bug fixed. ;)
Comment #32
greg.1.anderson CreditAttribution: greg.1.anderson commentedI think it would take longer than 'til next Friday to rewrite the backend invoke mechanism; even if we had a candidate implementation, it would probably take longer than that to review. A full rewrite would be welcome, if someone had the time to do it. In the meantime, though, try this patch, which changes the end-of-packet marker from null + eol to eot + eol. Note that there is a code comment that says that we only care about the last null in the trailing string because there is only one null in a packet -- and yet each packet started and ended with a null. Bummer. All of the backend tests pass with this; I will run the full suite next. n.b. #24 was also a necessary fix; there were two packet-corrupting bugs here. Hopefully there is not yet another defect still lurking. ;)
Comment #33
greg.1.anderson CreditAttribution: greg.1.anderson commentedAll unit tests pass with #32. Hopefully this will work for Aegir as well.
Comment #34
greg.1.anderson CreditAttribution: greg.1.anderson commentedThis version is a little cleaner; it puts the terminal null at the very end of the packet, after the eol. This still works, because if the terminal null appears in the text string being processed, then the entire packet is complete and will be handled and removed. Placing the null in the penultimate position left a one-character window that could corrupt the packet.
Comment #35
greg.1.anderson CreditAttribution: greg.1.anderson commentedWe should make sure this is fixed prior to the Drush 6 release.
Comment #36
jhedstromWith #34 I'm still seeing these sort of messages in the output:
Comment #37
greg.1.anderson CreditAttribution: greg.1.anderson commentedI cannot reproduce; I do not see any of these messages when using the makefile attached to #6.
It could have something to do with concurrency; there is no unit test for split packets with concurrency at the moment. Could you try running your test again with --concurrency=1, and see if the problem goes away?
Comment #38
jhedstromRe: #37 with concurrency set to 1, those messages go away--however, they go away with and without this patch.
Comment #39
greg.1.anderson CreditAttribution: greg.1.anderson commented#34 and #24 both correct rare intermittent bugs with split packet handling; neither of these two bugs are related to concurrency. It appears that in addition to these two defects, there is at least one more intermittent defect related to concurrency. By the symptoms, it also seems that this is a split packet defect; however, I cannot easily see the problem by inspections.
We have two ways to proceed:
1. Write a unit test that manages to illustrate the bug every time, to allow isolation and correction of the problem.
2. Remove concurrency.
I'm not sure which route I would take, if I picked this up again first; if someone else wants to submit a patch, either option would be okay with me -- whichever is easier.
Is everyone okay with committing #34?
Comment #40
greg.1.anderson CreditAttribution: greg.1.anderson commentedHere is a third option. This patch includes the fix in #34, and also changes the default concurrency to 1. This will mask the existing concurrency bugs, and still allow those who want to use it the option of turning it ack on. There is little motivation to do that for make, since the performance is about the same for concurrent and non-concurrent runs of make; however, some folks have gotten good performance increases running batch jobs concurrently. I recommend committing this, at least, and continue the discussion on whether to fix or rewrite or remove concurrency - maybe in Drush 7.
Comment #41
moshe weitzman CreditAttribution: moshe weitzman commentedCommitted #40. Leaving this open for further work and discussion. I think we could release as is though.
Comment #42
greg.1.anderson CreditAttribution: greg.1.anderson commentedSounds good; adjusting title (although we could also just decide to remove concurrency to fix this as well).
Comment #43
anarcat CreditAttribution: anarcat commentedWhat's wrong with the idea of sending everything on the same stream (STDOUT) and buffering the bootstrap process?
It seems every time we touch this we make the code more complicated, I feel very unconfortable obscuring it further.
Comment #44
greg.1.anderson CreditAttribution: greg.1.anderson commentedSending everything to STDOUT has no benefit. Why would you want to do that? Is
drush foo --backend | drush bar
even a real use-case? Drush itself does not have any mechanisms to parse the backend output from stdin. Does something in Aegir do that? If so, you could add a 2?&1 if you really needed it.Regarding obscuring the code, the last two commits fixed an off-by-one error, and removed a 'trim' that was stripping characters that should not be stripped. I agree that incremental output and concurrency complicated and obscured backend invoke, but the commits here did not further complicate or obscure it.
If you want to simplify backend invoke without removing features, I would recommend that we keep backend packets on STDERR as they currently are, and adjust backend invoke to read packets from STDERR (remove the 2>&1 from all of the places that Drush adds it when doing backend invokes). If we put the packets all on STDERR, then we could just read lines of text from STDERR (unlimited buffer size, everything up to the eoln at the end of each packet) and discard any line that was not a packet. Then all of the split-packet code would disappear.
Removing concurrency has also been discussed. This feature can be helpful when doing large batch processing, but I still would not object if we removed it.
However, I think that things work here well enough to release. It would be a bad idea to rewrite backend invoke at this point, unless doing so is absolutely necessary, as it would take a long time to write, review, debug and commit. Let's get Drush 6 out, and consider further improvements to backend invoke in Drush 7, as time and availability of contributors permits.
Comment #45
moshe weitzman CreditAttribution: moshe weitzman commentedComment #46
Steven Jones CreditAttribution: Steven Jones commentedSorry to report bad news again, but according to Aegir's tests, this problem still exists:
http://ci.aegirproject.org/job/Matrix%20test/DRUSH_VERSION=8.x-6.x-dev/8...
(then search on that page for 'DRUSH_BACKEND')
You'll see lots of occurrences of the backend packets getting output.
Comment #47
greg.1.anderson CreditAttribution: greg.1.anderson commentedBother. I still recommend shipping Drush-6 like this, and then fix per #44. For more rapid resolution, patches welcome.
Comment #48
moshe weitzman CreditAttribution: moshe weitzman commentedI notice that all the stray messages happen at the beginning of the request or the end. That's probably significant. Maybe output buffer related? I see ob calls at http://api.drush.org/api/drush/includes%21bootstrap.inc/function/_drush_.... I'm not too familiar with this part of Drush
I see both these late log messages a lot:
"type":"notice","message":"Command dispatch complete"
and"type":"memory","message":"Peak memory usage was
The outlier is
"type":"message","message":"Changed permissions of \/var\/aegir\/.drush\/platform_drupal7.alias.drushrc.php to 640"
but perhaps that happens in an exit hook or something?Comment #49
greg.1.anderson CreditAttribution: greg.1.anderson commentedI also noticed that it tended to be later log messages that were lost (and therefore emitted as plain text). The backend invoke buffer is large, so if you see two notices in a row, it probably means there was one incident where a packet was lost, and every remaining packet in the buffer was dumped to stdout.
This could happen if aegir writes a null to stdout near the end of a run. I could probably make this better by fixing that part in backend invoke that says "we only care about the last null...". That seems to be untrue.
Comment #50
Damien Tournoud CreditAttribution: Damien Tournoud commented@moshe: I noticed that also, and my hunch is also that we are somewhat hitting the atomicity limit of the pipe.
Writing to a pipe is only atomic up to
PIPE_BUF
bytes, which is at least 512 bytes by POSIX standards, but 4096 bytes (a page) under linux. I suspect that somewhat Drush sub-processes are outputting everything to the pipe at the same time, and the last few messages end-up being interleaved, and as a consequence do not match the regexp and are just output to the screen.Looking at the code, that should not happen because we are using the
$chunk_size
parameter ofob_start()
. Hm. Or maybe the output buffer is properly flushed, but we need an additionalflush()
call to flush to the pipe?Comment #51
Damien Tournoud CreditAttribution: Damien Tournoud commentedWe could try adding a
flush()
right after:Comment #52
greg.1.anderson CreditAttribution: greg.1.anderson commentedI believe that in #64, concurrency is set to 1, and there is only one process writing log messages at the same time. Perhaps this should be confirmed, though.
Comment #53
greg.1.anderson CreditAttribution: greg.1.anderson commentedAlso, I do not believe that packets are being interleaved; if they were, we would see them interleaved in the output. "Dropped" packets (those that are not interpreted and parsed) are echo'ed to stdout exactly as they are received.
Comment #54
Damien Tournoud CreditAttribution: Damien Tournoud commentedI was poking at all the wrong places. The problem is just that if the master process is too slow to read the output of the children, multiple packets have to be read at the same time and the first and/or last packet in a batch fails to parse.
Here is an easy way to reproduce the problem.
Comment #55
Damien Tournoud CreditAttribution: Damien Tournoud commentedIn that case, the problem is that the first packet is missing its facing
\0
, and the last packet its trailing\0
.Removing the
trim()
within_drush_backend_proc_open()
that occurs before reading the backend output seems to fix the issue.This actually explains all the issues that I have seen: when we read both some packets and the backend output in one go, the facing and/or trailing zeros are discarded, which makes the first and/or last packet fail to parse.
Comment #56
greg.1.anderson CreditAttribution: greg.1.anderson commentedAh yes, you are correct. That trim must go. The code path above only runs when the final buffer is processed -- the one that contains the backend results (or at least the start of them). If this buffer also contains one or more backend packets, then these are discarded.
I'll run the unit tests with #55 and see how they fare.
Comment #57
greg.1.anderson CreditAttribution: greg.1.anderson commented#55 introduces failures, because an extra end-of-line is inserted in the output in instances where there are no packets or output in the final buffer. In this case, the trailing string contains a single null, which the trim was removing. This null is also removed by drush_backend_parse_packets, so we just make sure that we do the 'empty' test again after all of the packets are removed from the data.
This works fine for Drush's unit tests; how do it still do the right thing with Aegir?
Comment #58
Steven Jones CreditAttribution: Steven Jones commentedI'm re-running the tests locally with a patched and non-patched version of Drush and will report back shortly.
Comment #59
Steven Jones CreditAttribution: Steven Jones commentedVictory! I've just tested locally, and the
DRUSH_BACKEND
messages go away with the patch from #57.Any chance that the fix could get backported to 5.x too?
Amazing work everyone, sorry that I was only able to test, and didn't have the time to post patches.
Comment #60
Damien Tournoud CreditAttribution: Damien Tournoud commented^ Let's try to avoid this if we can. What about
$trailing_string === ''
orstrlen($trailing_string) == 0
?Comment #61
greg.1.anderson CreditAttribution: greg.1.anderson commentedCommitted #57, adjusted as suggested in #60.
This could be backported to 7.x-5.x, but note that there are multiple commits here, and we'd need to be sure to include them all.
Comment #62
greg.1.anderson CreditAttribution: greg.1.anderson commentedOkay, think I got them all. All Drush tests pass; feedback on how this works with Aegir would be appreciated.
Comment #63
greg.1.anderson CreditAttribution: greg.1.anderson commented#62 is the backport to 7.x-5.x; setting version.
Comment #64
Steven Jones CreditAttribution: Steven Jones commentedSure, will apply and test within 12 hours.
Comment #65
boombatower CreditAttribution: boombatower commentedSeems to work for me.
Comment #66
Steven Jones CreditAttribution: Steven Jones commentedTested 5.x with the patch from #62 and it seems to have fixed the DRUSH_BACKEND appearing in the output too.
Comment #67
greg.1.anderson CreditAttribution: greg.1.anderson commentedCommitted #62 to 7.x-5.x.
We can still consider removing concurrency and simplifying logic in Drush 7.
Comment #68
Steven Jones CreditAttribution: Steven Jones commentedLovely. FWIW I don't think concurrency is actually too much of an issue, and I don't think it complicates this code path too much, but I do think that there's too much magical string handling going on in backend.inc and it's way too hard to follow. I would suggest that cleaning it up so that it was easier to follow what was going on would be better than just removing features to be able remove the corresponding code.
Comment #69
greg.1.anderson CreditAttribution: greg.1.anderson commentedIt would be helpful to know if Aegir runs correctly with --concurrency=4. I expect it likely does now.
#44 contains an idea for simplifying the code that could work without removing concurrency.
Comment #70
Steven Jones CreditAttribution: Steven Jones commentedI would imagine that Aegir would, but we don't make active use of anything that could be run concurrently except for make, by that I mean that all our backend invokes are for a single command on a single target, so there's nothing to run in parallel.
Comment #71
greg.1.anderson CreditAttribution: greg.1.anderson commentedEarlier it was reported that Aegir was failing with concurrency, but worked when it was set to 1. I think this result was spurious, but it would be good to have confirmation.
Comment #72
Steven Jones CreditAttribution: Steven Jones commentedSorry, do you know exactly where someone said that concurrency made a difference in Aegir land? This has not been the case in my testing, concurrency or not it's not been working correctly.
Comment #73
greg.1.anderson CreditAttribution: greg.1.anderson commentedSorry, my mistake -- setting concurrency to 1 fixed make, not Aegir. I think all is well here -- but simplification would still be a good idea.
Comment #74
Steven Jones CreditAttribution: Steven Jones commentedAh fine. We actually do a make as part of the Aegir tests, and we've seen that go from having DRUSH_BACKEND in it to those disappearing too, so make is looking good too.
Comment #75
anarcat CreditAttribution: anarcat commentedBy the way, I found that using the "interactive" backend option hides the backend packets from us and they get properly integrated into the calling process. My guess is that this is because "interactive" uses proc_open() while ... non-interactive uses system() or something silly.
Yet, interactive just doesn't work in Debian and probably other linuxes, as reported elsewhere, because of the crazy `tty` stuff. I have opened a separate issue to remove this obscenity, see #2025527: > `tty` is evil and killed my cat.
Comment #76
anarcat CreditAttribution: anarcat commentedAlso, just thought I'd mention that right now, on PHP 5.4 Wheezy, the drush 6.x test suite yields a ton of DRUSH_BACKEND packets in the output. This may mean that the whole test suite is compromised and simply not finding errors when it should, because they are lost in those DRUSH_BACKEND packets.
This is reproduced on travis too: https://travis-ci.org/drush-ops/drush/jobs/8107473 - and the test suite is actually failing right now, although travis doesn't detect it:
Comment #77
anarcat CreditAttribution: anarcat commentedAfter fighting with this, I have come to believe we should always call drush with proc_open().
Comment #78
greg.1.anderson CreditAttribution: greg.1.anderson commented#76: While sub-optimal, it is actually expected to see the DRUSH_BACKEND in the unit tests, because the tests are calling drush --backend directly.
Comment #79
greg.1.anderson CreditAttribution: greg.1.anderson commentedAlso, Drush backend invoke "interactive" mode does not use --backend when calling Drush; you therefore don't see any packets in backend invoke, because no packets are generated. You also won't see any structured results from the command; all you get is the status result code from proc_open, and the raw stdout from the command. If you don't need to parse the Drush log output, or get the structured result of a command, then calling Drush in "interactive" mode is fine.
Comment #80
anarcat CreditAttribution: anarcat commentedIt is not merely suboptimal - it is wrong. Errors are being skipped, as I demonstrated above.
Comment #81
greg.1.anderson CreditAttribution: greg.1.anderson commentedThe tests in #76 are not wrong; some tests test the negative; the error messages you are seeing are from Drush, not from the test. The test expects Drush to fail.
This is the test that produces
The shell alias echotest uses replacements "%root".
:So, the situation is suboptimal, as it often tricks people into thinking that there is an error in the test; however, the test is actually working correctly.
Comment #82
anarcat CreditAttribution: anarcat commentedThat is just one of the errors, how about this one:
DRUSH_BACKEND:{"error":"DRUSH_COMMAND_INSUFFICIENT_BOOTSTRAP","message":"Command core-cron needs a higher bootstrap level to run - you will need to invoke drush from a more functional Drupal environment to run this command.","packet":"set_error"}
This is in testDispatchUsingAlias(), I don't think this is an expected failure.
I really don't see why drush packets should be spewed on stderr, it just confuses everything.
Comment #83
greg.1.anderson CreditAttribution: greg.1.anderson commentedMaybe you mean this test, from testTarget() in backendTest.php? This test is expecting core-cron to fail for the reason specified in the log message quoted in #82.
As for your next point, 'spewed' is kind of a charged word; I'm not completely sure what specific change you are suggesting here. If you mean that stderr output should be hidden in the unit tests, then I agree with you. It is standard behavior for phpunit to emit anything written to stderr, but we could take pains to hide stderr in the unit tests so that the output was cleaner. This improvement would be welcome. If you mean that backend packets should go to stdout, I still disagree per #44. In either event, though, the best way to continue the point would be with a patch.
Comment #84
anarcat CreditAttribution: anarcat commentedRight, sorry for the loaded words, I guess I get pretty upset with this. Thank you for your continued response and help with this issue.
I guess I never answered #44, let's see if I can do that now. To clarify: I don't believe drush packets should be sent to stderr, not only in the unit tests, but in drush as a whole.
Yes, there is a real use case for drush | drush. We use the following contraption to safely transmit database credentials during the Debian package install process:
The reason this is done this way is that this is the only way to safely pass credentials so they do not show up in the process list, something drush has problems with elsewhere (e.g. #671906: mysql credentials leak in drush sqlc and archive-dump). --backend is necessary for that reason here - if there would be another way to make drush read JSON from stdin, that would be useful to our use case.
"backend-parse" is something that parses backend output from stdin. The source can be found here, it's quite trivial (and should probably be merged into core): http://drupalcode.org/project/provision.git/blob/HEAD:/parse.backend.inc
But this is not the only use case: a bunch of other places call drush using system() or similar, which doesn't properly connect the stderr pipe, which leaks to stderr. In our case, this is what is happening with drush hosting-queued, an alternative queue runner:
http://drupalcode.org/project/hosting.git/blob/refs/heads/6.x-2.x:/queue...
I suspect there are other occurrences in drush, but I am not familiar enough with all the commands to get one on top of my head.
So yes, there are use cases. :)
You're right, this was uncalled for, and I apologize. I do have the feeling this code is way too complicated, but you can hardly be blamed for that. Thanks again for trying to fix it. :)
If we put packets on stderr, we need to use proc_open() everywhere, that's for sure. That said, I don't understand why we really need to send data on STDERR. It seems like just a workaround, as the packets are lost instead of parsed... In my mind, we should use output buffering to avoid the "headers already sent message"... maybe I missed why this wasn't appropriate.
If concurrency works, I don't mind it being there. :) But in my mind, I can do concurrency on top of drush, not sure it should handle multiple children itself...
Agreed. I wish I had time to rewrite the whole backend system again, but the truth is that I am struggling to get Aegir 2.x out the door and I can't afford to spend time on a full rewrite now. :)
Should we close this issue and open a new one specifically for refactoring backend.inc?
Comment #85
greg.1.anderson CreditAttribution: greg.1.anderson commentedThis issue is specifically about reducing the complexity of backend invoke, so I think we can put refactoring work here.
I can agree with a lot of what is in #84. I don't think that ob_start is necessarily going to work here; note that _drush_bootstrap_drupal_full() is already using ob_start, and the message still is coming through. I think that my idea to suppress Drush log messages until after the Drupal bootstrap is dubious; it could run into problems with Drush commands that don't need to bootstrap Drupal, or that call bootstrap_max. If you come up with a patch that works, by one technique or another, we could consider it; it seems much simpler to keep the log messages on STDERR. We could make Drush read log messages from STDERR, and commands that want to pipe can use 2>&1.
I don't think that we really need drush backend-parse (except maybe for debugging?); what would be more useful would be a way to parse options from stdin without using --backend, which I think is your use-case above. It's harder to prevent credentials leaks in remote sql-syncs, but I agree that should ideally be fixed (separate issue, as quoted above).
Comment #86
greg.1.anderson CreditAttribution: greg.1.anderson commentedThis issue was marked
closed (won't fix)
because Drush has moved to Github.If desired, you may copy this bug to our Github project and then post a link here to the new issue. Please also change the status of this issue to
closed (duplicate)
.Please ask support questions on Drupal Answers.
Comment #86.0
greg.1.anderson CreditAttribution: greg.1.anderson commentedAdding full command line
Comment #87
greg.1.anderson CreditAttribution: greg.1.anderson commentedContinuation is at https://github.com/drush-ops/drush/issues/1270