Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
Running drush hosting-task outputs weird drush packets, for example:
aegir@marcos:~/hostmaster-HEAD/profiles/hostmaster$ drush @hostmaster hosting-task 296
DRUSH_BACKEND:{"type":"message","message":"Changed group ownership of <code>sites\/test1.orangeseeds.org\/private\/temp<\/code> to www-data","timestamp":1351798138.4491,"memory":13152528,"error":null,"packet":"log"}DRUSH_BACKEND:{"type":"notice","message":"Exception Object\n(\n [message:protected] => We were unable to find any installation profiles. Installation profiles tell us what modules to enable and what schema to install in the database. A profile is necessary to continue with the installation process.\n [string:Exception:private] => \n [code:protected] => 0\n [file:protected] => \/home\/aegir\/platforms\/drupal-7.14\/includes\/install.core.inc\n [line:protected] => 1034\n [trace:Exception:private] => Array\n (\n [0] => Array\n (\n [file] => \/home\/aegir\/platforms\/drupal-7.14\/includes\/install.core.inc\n [line] => 473\n [function] => install_select_profile\n [args] => Array\n (\n [0] => Array\n (\n [parameters] => Array\n (\n [profile] => \n [locale] => \n [q] => \n )\n\n [settings_verified] => 1\n [forms] => Array\n (\n [install_settings_form] => Array\n (\n [driver] => mysql\n [database] => test1orangeseeds\n [username] => test1orangeseeds\n [password] => XXXXXXXX\n [host] => localhost\n [port] => 3306\n [db_prefix] => \n )\n\n [install_configure_form] => Arr
That's an install task.
Comment | File | Size | Author |
---|---|---|---|
#9 | large-backend-packets-9.patch | 8.17 KB | greg.1.anderson |
#8 | large-backend-packets.patch | 5.83 KB | greg.1.anderson |
#5 | drush-1829418-pipe-output.patch | 3.37 KB | Steven Jones |
Comments
Comment #1
Steven Jones CreditAttribution: Steven Jones commentedYeah, we are still seeing these in the testbot logs, and when testing manually in-fact, looks like we need to do some bug hunting in Drush itself.
Comment #2
Steven Jones CreditAttribution: Steven Jones commentedIt's really odd indeed, there are logs like this:
Which looks like the start of the packet was lost somehow?
Comment #3
Steven Jones CreditAttribution: Steven Jones commentedThis is indeed a Drush issue, so moving to the Drush queue.
Here's where the bug is: in
_drush_backend_proc_open
it checks to see if there's anything to read from the child processes, and then when there is it reads the next 4096 characters from the stream. If the packet of data written to the STDOUT in the child process was bigger than 4096 characters, then that packet basically gets logged to the screen, not parsed by Drush, and then the remaining characters are also read and logged to the screen.Simple 'solution' would be to increase 4096 to something larger so that larger packets can be read, but I think we'll just run into an issue somewhere else, or the reading will happen in the middle of a large packet being transferred, and so we'll be back in this weird position.
The code needs to be smarter and if it can't parse the backend packets it needs to buffer them up and appending that string to the next chunk of data it gets I think.
Comment #4
greg.1.anderson CreditAttribution: greg.1.anderson commenteddrush_backend_parse_packets assumes that every packet fits inside the 4096-byte packet size. The solution suggested at the end of #3 is correct; if a packet contains "\0DRUSH_BACKEND:", but does not contain the trailing null, then we should accumulate more data from the stream and try again. More specifically, since it is also possible that the "DRUSH_BACKEND:" might be truncated, the code should also insure that any \0 that occurs within 14 characters or so of the end of a packet will also cause the output to be re-scanned on the next iteration.
The code is already doing this:
In the existing implementation, packets are removed from $string before it is appended to 'output', and 'output' is never edited. Perhaps it would be better to append $string to 'output' prior to removing packets, and edit them out as they are processed. If we then remember the last offset of 'output' that we scanned for packets, we wouldn't need to re-search the whole thing on every iteration. If there are no nulls in 'output', then the last offset would equal the length of the output buffer at the end of each iteration. If there is an unterminated \0DRUSH_BACKEND, then the last offset will be stuck there until the packet is terminated, and if there is any other null in the output, then the offset will be stuck there until there are at least 14 bytes more in the output.
Will give this a go when I have a chance; in the meantime, patches welcome.
Comment #5
Steven Jones CreditAttribution: Steven Jones commentedHere's a very initial stab at this, fixes the original issue of packets not being parsed but being logged, but shifts the logging of non-packets to the eof for that process. I guess that may on each iteration it could check to see if there's something that looks like a packet in the output (maybe a single null character?) and if there isn't then logs the output as before.
Comment #6
greg.1.anderson CreditAttribution: greg.1.anderson commentedThat's not a bad start, and it even looks as if you do log packets incrementally (although I did not test to confirm). However, I am still concerned that there may be memory and performance issues with scanning the output on each iteration. Increasing the buffer size could help here, if so. Could you either measure and confirm that this performs similarly to the old code (ideally by first dumping a lot of text data before logging), or adjust to record the last found packet position to reduce scanning?
Comment #7
greg.1.anderson CreditAttribution: greg.1.anderson commentedHere's an alternate implementation that should avoid re-parsing large output chunks and should also maintain the incremental log message output behavior from the existing code. I have not tried with large packets, but existing test cases pass.
If you could provide a unit test that verifies this works with large packets, then I think it's probably ready to go in.
Tagging as 'Release blocker' so that Jonhattan can decide if he wants a backport of this for the pending Drush-5.9 release.
Comment #8
greg.1.anderson CreditAttribution: greg.1.anderson commentedOh, and here is the patch.
Comment #9
greg.1.anderson CreditAttribution: greg.1.anderson commentedHere is an updated patch with a functional test that confirms that split packets can be re-assembled. This patch adds a new capability to backend invoke that allows the caller to specify a backend option '#process-read-size' that will change the block size that backend invoke packets are processed in, if desired. (If not specified, the previous default of 4096 is still used.) The test case sets this option to a very small value, so that all packets will be split. With the split-packet handling code removed, this test fails. When it is present, it passes.
Please confirm this works with Aegir. If it does, I recommend committing this to Drush-8.x-6.x and also Drush-7.x-5.x for the upcoming Drush-5.9 release.
Comment #10
Steven Jones CreditAttribution: Steven Jones commentedIs the introduction of the null character backportable? Will the backport have to remove the additional of the null character?
Comment #11
greg.1.anderson CreditAttribution: greg.1.anderson commentedI think that the issue of backporting the addition of the null character is subject to discussion. Note that:
1. If the null is removed from the output stream, the existing code will still run, but will be vulnerable to an intermittent bug: drupal backend packets will be fine, but if the final backend output marker is split across a 4096-byte boundary, then the final function result will be lost, with unpredictable results.
2. The existing code uses 'trim' when processing the backend result, so the null disappears in this function.
3. The existing code does not use 'trim' when processing incremental results, so the null may be printed to the console.
Note that although the null may be printed to the console (to no effect, I believe), it never appears in machine-parsed data structures. Also, even if the null is not backported, existing versions of Drush-5.x will still see the null, to the same effect described above, when talking to Drush-6.x with this patch applied.
I therefore recommend backporting this to 5.x with the addition of the null in the output stream. Note also that Drush 5.9 is scheduled to come out soon (around "the end of this week" == today or in the next day or two), and this is the only remaining bug under consideration for inclusion -- all others have been committed. If this fix is desired in Drush 5.9, please provide prompt feedback so that we can drive this to completion. n.b.: It may still be decided to defer this to Drush 5.10, depending on the opinions of other maintainers.
Comment #12
Steven Jones CreditAttribution: Steven Jones commentedInitial testing looks okay to me, applying this patch and running Aegir's test-suite now to test the original issue.
Comment #13
Steven Jones CreditAttribution: Steven Jones commentedSadly I still see 'DRUSH_BACKEND' in the output with this patch applied, so there must be something subtle going on that's not quite addressed by this patch. I'll try to work out what, but I'd suggest that this issue shouldn't block a release.
Comment #14
anarcat CreditAttribution: anarcat commentedI'll take a look at this sucker shortly if you guys don't mind.
Comment #15
Steven Jones CreditAttribution: Steven Jones commentedYeah sorry, I've not been able to get to the bottom of this one.
Comment #16
greg.1.anderson CreditAttribution: greg.1.anderson commentedI haven't had time to return to this since #9; it would be great if you could take a look.
Comment #17
anarcat CreditAttribution: anarcat commentedSo here's an example of the behavior I am seeing:
Notice the weird spacing and the DRUSH_BACKEND bullcrap in there...
Comment #18
anarcat CreditAttribution: anarcat commentedSo after looking at this for about an hour just got me completely and utterly confused.
I am not sure the bug I am seeing is related to the 4096 byte limit.
I am also having trouble simply reproducing the issue outside of Aegir - there isn't a lot of ways to reproduce this bug short of writing full unit tests with the output. Hairy. This is what I will probably end up doing. I have done this:
Then I will try to build a test case for drush_backend_parse* functions by feeding them those output files, which seem to be causing problems, or at least irregular output, in Aegir.
While looking at unit tests, I also noticed that they are themselves spewing out a lot of packets to stderr, example:
This looks fairly harmless, but I think it's actually an error - stderr should be eaten by backend_invoke...
I am fairly confused now, and my test environment here isn't quite up to speed, so I'll try to work more on this next week...
Comment #19
Steven Jones CreditAttribution: Steven Jones commentedTo reproduce, I created two drush commands, one took a single argument, an integer, and outputted a string of that length. Then the other command, invoked the first, passing an argument through, so it was quite easy to test outside of Aegir.
I'll try to remember to dig out the code and post it here.
Comment #20
anarcat CreditAttribution: anarcat commentedRegardless of my above comment, it seems that the patch *DOES* improve the output. I was able to find out what the error was with my wailing task while I couldn't do it before.
The fact that DRUSH_BACKEND packets still show up is a minor inconvenience.. The only concern I have with the current patch is that it complicates code that is already FUBAR'd for me, but it depends whether our priority is to get things working or understandable. :P
Otherwise, let's just commit this, at least to fix the regression in D5.
Comment #21
greg.1.anderson CreditAttribution: greg.1.anderson commentedWent ahead and committed this per #20. If anyone has more to add on any of the observed odd output, please feel free to re-open.
Committed in cc8b76b.
Comment #22
Steven Jones CreditAttribution: Steven Jones commentedI think that this needs backporting to 5.x too?
Comment #23
greg.1.anderson CreditAttribution: greg.1.anderson commentedCommitted on 7.x-5.x branch. Please re-open if any problems are observed.
Comment #24
Steven Jones CreditAttribution: Steven Jones commentedThanks, we shall test this with Aegir's testing scripts, which seemed to show the issue quite clearly.
Comment #26
jhedstromThere appears to be a bit of a regression in the output for all make commands after this commit. #1982502: Reduce complexity of backend invoke; consider removing concurrency.