This issue has a history starting in D6 #245990: HTTP request testing unreliable and may be undesirable

Problem/Motivation & Proposed resolution

(Note: this is the original summary)
Summary of Problem: Drupal's attempts to detect and report when a site cannot make outgoing HTTP requests do not always work correctly, and probably never will. The problem in Drupal is a result of the function system_check_http_request() reporting that the connection isn't working when it actually is. As a result dependent features, such as update status, aggregator, adding new feeds etc. fail and performance substantially slows (~30 seconds) on admin pages when it runs.

What to do about the check function system_check_http_request()
D8: Patch to remove system_check_http_request
- Suggest dependent contrib modules log to watchdog on fails
- Update function to delete drupal_http_request_fails variable addressed in issue#
D6/D7: Leave system_check_http_request for dependent modules
but patch to add a timeout to the drupal_http_request() call to improve performance

What to do about failed HTTP requests:
- D6/D7/D8: When drupal_http_request fails apply patch to log message to watchdog instead of using variable_set('drupal_http_request_fails')
- Need to address potential security issues related to logging urls with possible presence of passwords/tokens/etc.

Remaining tasks

    7.x-dev
  • Currently working on the backport to D7
  • Consider the comments in #70 as part of the code review.
  • Needs further discussion concerning the security implications of logging full URLs in D7.
    This issue should remain open while those discussions are on-going.
    8.x-dev
  • The patch from #46 was reviewed and tested (RTBC).
  • The patch from #46 was commited in #58 here is the commit information (from 8.x repo)

User interface changes

None, but some additional watchdog error messages might appear in the system error log.

API changes

The function system_check_http_request() has been marked as deprecated but only in the Doxygen comments, the original code remains intact.

(Note: Intentionally not including closed/fixed issues and duplicates -- please see comments for those)
#1635882: Remove stale drupal_http_request_fails variable
#2012468: Add trusted roles recommendation to 'access site reports'

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

dww’s picture

Yes, please. ;) We should also ensure that the update manager does likewise. I can't do so myself anytime soon, but I hope someone will...

Thanks,
-Derek

hass’s picture

linkchecker also use drupal http request and have it's own status code/error handling logics. Make sure this will not break, please.

catch’s picture

When the current behavior isn't working correctly, it can cause a severe slowdown when visiting the admin pages of the site that makes it extremely difficult to perform site administration tasks. However, the bug only seems to affect a small number of users.

The current logic can result in many, many calls to variable_set(), since it keeps trying to set the same variable over and over until there's a successful request. There could also potentially be a race condition where the process that sets the variable off again is overwritten by another process. Either way the current method is really bad.

I just marked #959620: drupal_http_request() can lead to variable_set() stampede as duplicate. Switching from a variable to watchdog sounds better than my idea over there though.

IT Guy 3000’s picture

One of my D6.20 sites is suffering from the very slow admin referred to in #1 and #3. Been wading through the many threads but hope someone on this thread know the best patch/solution to remove the error and reset whatever flag needs to be reset in the database. The Update Status works fine, btw, but admin issue is creating enormous headaches. Thanks.

chipzz’s picture

I have had this exact problem.
Short summary:
ipvsadm/ldirectord DNAT load-balancing setup, with the webservers on the internal range 192.168.1.0/24, and the website on $external_ip .

Symptoms:
- The internet -> $external_ip -> 192.168.1.0/24 => works
- 192.168.1.0/24 -> the internet => works
- 192.168.1.0/24 -> $external_ip => times out.
It seems ipvsadm/ldirectord is simply broken.

Solution:
Add this iptables rule on the webservers:
iptables -t nat -A OUTPUT -p tcp -d $external_ip --dport 80 -j DNAT --to-destination `ifconfig | sed -n "s/.*inet addr:\(192\.168\.1\.[0-9]*\).*/\1/p"`
You may have to tweak it slightly to get it to work for you (this assumes your internal range is as mentioned above).

anavarre’s picture

Subscribe

sun’s picture

Version: 7.x-dev » 8.x-dev
Category: bug » task

As this is merely a continuation to improve already fixed things, it's a task.

David_Rothstein’s picture

Category: task » bug

I think this one is still a bug... the bug is "HTTP request checking is unreliable".

(It's not obvious from the fact that the previous issue is marked closed, but there were a number of people who were reporting continuing problems there, even after all the patches in that issue were committed.)

We could try to make it reliable, but so far the only way anyone has figured out how to make it completely reliable is to remove it :)

catch’s picture

We still have this line in drupal_http_request():


    variable_set('drupal_http_request_fails', TRUE);

Still a bug.

catch’s picture

Status: Active » Needs review
Issue tags: +Performance
FileSize
2.49 KB
2.08 KB
xjm’s picture

Tagging issues not yet using summary template.

hass’s picture

Status: Needs review » Needs work

I'm strongly against this patch. In linkchecker I'm still doing drupal_http_request(), but i'm logging a *helpful* message what the reason really was and some other stuff if an action was taken. With this patch just fill watchdog with garbage and eat performance if drupal_http_requests are running and make all just more unreadable. As database performance is a real problem here, the http function themself should not save anything in watchdog. Maybe do it in hook requirement, but this patch is currently not modular and affects all drupal_http_request() requests what is not ok.

David_Rothstein’s picture

Lots of API functions in Drupal use watchdog() internally, so this is nothing new. And database performance is an issue of site configuration (you can send watchdog data anywhere; doesn't have to be the database), so that seems unrelated. This patch will also result in very few watchdog() calls on a typical Drupal site, and the ones that do result are important since they indicate something is broken. I don't see a problem.

However, there are a couple other issues with this we need to consider:

  • Definitely want to log more useful info if we do this (e.g. including the error string and error code).
  • Probably should be WATCHDOG_ERROR?
  • There are other places in drupal_http_request() where the function errors out; perhaps should log the result consistently for all of them.
  • The system_check_http_request() function should also be removed, especially in D8 (when it's called it currently sets the now-obsolete 'drupal_http_request_fails' variable also).
  • Some URLs contain sensitive info (such as tokens or private keys) so we may want to be careful logging it...
  • Is it possible that a hook_watchdog() implementation could call drupal_http_request(), leading to infinite recursion? This might sound far-fetched, but I actually do know a module that comes very very close to doing that.

Not sure how to deal with those last couple points, but one option (which might also address @hass's concerns) would be to add a 'log_errors' parameter to the $options array of drupal_http_request() - which would default to TRUE but could be turned off by callers if they want to bypass the default error handling.

sun’s picture

Count me -1 on the watchdog call as well.

Whether or whether not a failing HTTP request is actually a hard failure or error is application-level logic. Calling code needs to check that. Only application-level logic can decide whether it is a unexpected error.

catch’s picture

The database load from watchdog calls pales in comparison to variable_set(). That said I would have no problem simply not logging failures internally and expecting calling code to handle it.

hass’s picture

I'm checking ~200-500 or more links per cron run (depends on setting). This could result in 500 watchdog entries + a few hundred other db transactions. It's already expensive enough, don't add more slowliness, please. The function drupal_http_request() should not log the watchdog calls. Make the error handling outside of drupal_http_request() e.g in the code where the drupal_http_request() has been called. I must have a highly performant drupal_http_request() and i do not like to clone the function only because of modularity issues. This drupal_http_request() is an api function for http requests not a reliability checking function. For me it's very important to get all resulting status codes back in a consistent way. This consistency has already been broken in d7 and require conditions now, if status code is 3xx. But i can workaround this inconsistencies. :-(

Don't try to make drupal_http_request() intelligent... Just make it simple - call drupal_http_request() with parameters, check the link (in a reliable way), may follow a redirection, and provide all results back to the caller function, do status code handling outside the drupal_http_request() function in the calling function.

catch’s picture

@hass - while it might make sense to have no reporting in the function, your performance argument makes no sense. The watchdog here is only replacing a variable_set() call, which is considerably worse (merge query, cache clear, cache rebuild then set). Additionally no errors are supressed here, that was the only behaviour change.

hass’s picture

Maybe, but a variable call does not fill a watchdog table with 1.000.000 or more entries only for the reason that other modules make use of the function. As sun also said, error handling is application logic not api function logic. I'm also for removing the variable set. I very often see network issue errors with negative numbers, but the local system is absolutly fine. Triggering a functionality check for every negative error code - also have negative side effects.

David_Rothstein’s picture

We could just skip doing the watchdog() altogether; however, the reason for proposing it is that currently, core tries to detect when your connection to the outside internet is broken and tell you about it (and at least for most sites that check probably does work OK). So if we don't record an error in the logs, that means we'll stop trying to tell the site administrator about these problems at all.

I think not wanting to log an error here is definitely an edge case. What's wrong with my suggestion above to have a 'log_errors' parameter that allows callers to opt out of the logging if they want to, but log them by default? The alternative of having every single caller of drupal_http_request() deal with logging its own errors independently (most of which would be the exact same code) doesn't seem that appealing.

catch’s picture

How about this:

We remove the watchdog from drupal_http_request()

But then, we add back a system_requirements() that does the same check as now, but logs to watchdog if it fails.

Also we flood limit the system requirements() check to once every 24 hours or similar so it isn't done every request on /admin (could possibly include the variable, or a cache entry or something rather than rate limit so if it passes it only happens occasionally rather than flood).

And we add some documentation to drupal_http_request() to make it explicit that functionality that expects the check to work needs to log its own errors etc.

hass’s picture

#20 sounds good.

David_Rothstein’s picture

But the bug here is that for a significant minority of sites, the current check doesn't work at all. It reports that the connection isn't working when it actually is, and it also leads to extreme slowdowns (~30 seconds) on admin pages when it runs.

So reducing the frequency of the check will reduce the frequency of the slowdowns but won't solve the underlying problem. Also, if we reduced to once every 24 hours then there would also need to be some kind of "retry" button or otherwise if you were trying to debug the error on the requirements page and fix your system, you wouldn't be able to get any immediate feedback that it was fixed.

Overall, doesn't seem like a profitable direction to me.

catch’s picture

Well I'd be fine with removing all error logging from drupal_http_request(), and the hook_requirements(), and also adding watchdog messages to every place in core that does not handle it properly.

catch’s picture

An example of when you might not want to have logging to watchdog when it fails is doing something like http://www.downforeveryoneorjustme.com/ in Drupal.

catch’s picture

Status: Needs work » Needs review
FileSize
3.49 KB

Not quite what I said in #20 but this might be enough? Restricts the system_requirements() check to once every 24 hours, removes the variable, no new messages anywhere so no string changes.

Status: Needs review » Needs work

The last submitted patch, http.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
3.46 KB
Dries’s picture

Status: Needs review » Needs work

Nice clean-up. One small nit:

+++ b/modules/system/system.install
@@ -466,14 +466,17 @@ function system_requirements($phase) {
+    if (flood_is_allowed('system_http_request', 1, $window = 86400, 'system_requirements')) {
+      flood_register_event('system_http_request', 86400, 'system_requirements');

The $window = 86400 is unnecessary (and located a bit unconventional).

catch’s picture

FileSize
3.45 KB

Weird, fixed that.

Note this patch does not yet go through core checking for real instances of drupal_http_request() that might fail and need a watchdog message, that should probably be added to update status if it's not already. Re-rolled with that change though.

David_Rothstein’s picture

This looks like a definite improvement, but if we're going to go through core and add watchdog messages where necessary, we don't need the system_check_http_request() stuff at all. That's still fundamentally broken for some sites, and although running it once per 24 hours is better than always, it's still broken :)

We also need an update function to get rid of the 'drupal_http_request_fails' variable.

catch’s picture

FileSize
4.53 KB

Here's a start on that, but I ran out of steam, some places are already checking for errors, some just ignore failures altogether.

Full list at http://api.drupal.org/api/drupal/includes--common.inc/function/drupal_ht...

catch’s picture

Issue tags: +Needs backport to D7

Tagging for backport.

catch’s picture

Status: Needs work » Needs review
FileSize
5.12 KB

Did some more stragglers, everything else was already checking errors in one way or another.

Status: Needs review » Needs work
Issue tags: -Performance, -Needs issue summary update, -Needs backport to D7

The last submitted patch, drupal_http_request.patch, failed testing.

MustangGB’s picture

Status: Needs work » Needs review
Issue tags: +Performance, +Needs issue summary update, +Needs backport to D7

#33: drupal_http_request.patch queued for re-testing.

MustangGB’s picture

Status: Needs review » Needs work

The last submitted patch, drupal_http_request.patch, failed testing.

MustangGB’s picture

Status: Needs work » Needs review
FileSize
5.24 KB

Rerolled

Status: Needs review » Needs work

The last submitted patch, drupal_http_request.patch, failed testing.

MustangGB’s picture

Status: Needs work » Needs review
FileSize
5.68 KB

Lets see if this helps

catch’s picture

#40: drupal_http_request.patch queued for re-testing.

jhodgdon’s picture

There's some junk at the end of that patch that is changing file permissions or something. Was that intended?

catch’s picture

FileSize
5.27 KB

That just looks like cruft to me, removed.

MustangGB’s picture

Ooops, sorry was unintended junk

klausi’s picture

Status: Needs review » Needs work
+++ b/core/modules/aggregator/aggregator.admin.inc
@@ -339,6 +339,9 @@ function aggregator_form_opml_submit($form, &$form_state) {
+    else {
+      watchdog('aggregator', 'HTTP request to !url failed with error: !error', array('!url' => $form_state['values']['remote'], '!error' => $response->error));
+    }

$response->error can contain stuff provided by a third party, i.e. it is untrusted user provided input. Therefore the "@" placeholder must be used for sanitization. I would suggest to just always use "@", also for the URL.

catch’s picture

Status: Needs work » Needs review
FileSize
4.66 KB

Yeah treating these as untrusted input makes sense, patch makes those changes. Patch size is slightly smaller since we no longer have a clean URL check in 8.x.

catch’s picture

D7 and D6 backports. I left the system_check_http_request() function in the backports just in case a contrib module is using it.

MustangGB’s picture

If you're leaving in system_check_http_request() for backports then do we need to reopen #926426: system_check_http_request() causes deadlock on single threaded webserver because, as you say, it could get called by contrib.

catch’s picture

Hmm I'd probably instead want to file issues against those contrib modules (if they exist) suggesting they just log to watchdog on fails instead, since we decided here that the function doesn't make sense anyway.

The only reason to leave it in is just in case some code somewhere happens to be calling it so people don't get fatal errors after upgrade, even if we check all of contrib it's not impossible a custom module somewhere is calling it.

dww’s picture

Maybe for the backport, just leave the function in place as:

function system_check_http_request() {
  watchdog('php', t('The system_check_http_request() function is deprecated. See <a href="@url">the issue where it was deprecated</a> for more.', array('@url' => 'http://drupal.org/node/965078')));
}

Or something. ;) Doesn't cause a fatal, but nags them to fix their stuff, and doesn't leave us carrying around dead code.

Just a thought,
-Derek

hass’s picture

But not in D6/D7, please.

dww’s picture

We're only talking about D6/D7 here. Either this function is dangerous and shouldn't be used, or we fix it. We're deciding against fixing it. Hence, deprecating it. If your code relies on it, you're probably screwed and need to change your code.

hass’s picture

I have two projects in public releases that use the function in hook_requirement. Both modules would break; if this is silently removed. ~15.000 installations.

catch’s picture

Yeah I think just

Hmm I'd probably instead want to file issues against those contrib modules (if they exist) suggesting they just log to watchdog on fails instead, since we decided here that the function doesn't make sense anyway.

works for that then. We can add @deprecated to the D7 docs.

What's the chance of an RTBC for the 8.x patch?

dww’s picture

Status: Needs review » Reviewed & tested by the community

@hass: Both modules are already broken, since the function they're calling is unreliable and broken. But that's your problem, not mine. ;)

@catch: #46 looks good to my eyes, the bot is happy. I'm not particularly trusting of update manager test coverage, but the diff to update.fetch.inc looks fine to me. I doubt anyone knows if update manager actually works in D8 yet. ;)

So yeah, for D8, #46 is hereby RTBC. I love patches that remove a lot more lines of code then they add and keep the same level of functionality.

Cheers,
-Derek

Dries’s picture

catch seems to be very involved with this patch so I'll let him handle it.

I skimmed the patch and it seems like it is a nice simplification, in addition to being a bug fix.

hass’s picture

This patch may creates some support troubles for Link checker. Link checker for sure - expect to see failed requests to be able to notify authors about these failed links.

With this patch I cannot say for sure if the module works reliable or not - as the underlying infrastructure may be broken, but I have no indicator for this. This is really bad as past code - helped a lot when it comes to support analysis. It would much more helpful to have a real status flag that tells me if a Drupal site can issue reliable HTTP requests or not. Than I'm able to exit before running 1000+ HTTP requests in one cron run as I know all of them will fail. With the patch I can only guess by the number of same fails and strange errors that something may be wrong or not with the server. I remember it was not only failed to create socket type issues in past... :-(

Why is there no way to make a "reliability check" where a system variable can be evaluated by other modules; if the module may work or not?

catch’s picture

Version: 8.x-dev » 7.x-dev
Status: Reviewed & tested by the community » Patch (to be ported)

@hass: there's no way to make a reliability check, because various combinations of server configuration can cause either false positives or false failures for the check. See #245990: HTTP request testing unreliable and may be undesirable and #926426: system_check_http_request() causes deadlock on single threaded webserver for some of the reasons.

I've gone ahead and committed/pushed this to 8.x, moving to 7.x for backport since we definitely want to backport most of this.

catch’s picture

Status: Patch (to be ported) » Needs review
FileSize
2.1 KB

I'd already done a backport in #47, re-uploading that with a bot-testable name.

Status: Needs review » Needs work

The last submitted patch, update_cron_notify_1263040.patch, failed testing.

catch’s picture

Status: Needs work » Needs review
FileSize
3.93 KB

Completely different patch from a completely different issue..

c960657’s picture

If we decide to keep system_check_http_request() in D6 and D7 for BC reasons, I suggest adding a timeout on, say 10 seconds, to the drupal_http_request() call.

catch’s picture

FileSize
4.68 KB

Here's a patch that adds that.

David_Rothstein’s picture

So if we're not removing system_check_http_request() (which I think is correct), should we add a note to the PHPDoc that it's deprecated?

It also occurs to me that for Drupal 7, this is the only remaining place in the code that will still use the 'drupal_http_request_fails' variable, and in Drupal 8 it's gone entirely. So perhaps that means Drupal 8 should have an update function to delete the variable.

catch’s picture

Issue tags: +Novice

I opened #1635882: Remove stale drupal_http_request_fails variable for dropping the variable so we can keep the backport smooth here (that also has some retrospective justification for not adding an update, although thinking about it more on balance we should probably add one).

@deprecated sounds good to me, I don't have time to re-roll the patch, but tagging Novice in case someone else does just for adding that to system_check_http_request()

chuangban2012’s picture

#10: http_request_D7.patch queued for re-testing.

Devin Carlson’s picture

Rerolled to add "DEPRECATED:" to the system_check_http_request() function description and the "@deprecated" directive.

MustangGB’s picture

Status: Needs review » Reviewed & tested by the community
David_Rothstein’s picture

David_Rothstein’s picture

Status: Reviewed & tested by the community » Needs review

I think this looks good, though when we do it in Drupal 7 we'll definitely want a change notification. Modules like Link Checker which want to (try to) track this stuff on their own will still be able to, but it will require tiny code changes, right? But I think it's correct not to do it in core given that it's broken for many users.

However, about half of my feedback from #13 looks like it still applies... and I think we should consider addressing that before committing to Drupal 7 because these watchdog calls are our only replacement for the previous functionality, and they are also new translatable strings that we only want to add once.

The parts that look like they still apply (to the watchdog calls now scattered across core) are:

  • Add the error code to the watchdog message.
  • Increase the severity (WATCHDOG_WARNING looks appropriate)?
  • Think a little harder if it's safe to log the full URL in some of these cases (in particular aggregator), due to the possible presence of passwords/tokens/etc. This could be a minor security issue (like http://drupal.org/node/1441252).

Also wondering if we've added watchdog() calls to all the appropriate places... looks like OpenID is missing them (for example). Was that intentional?

David_Rothstein’s picture

Oh, I left this on Drupal 7 for discussion, but we could move back to Drupal 8 first if we need to make additional code changes.

Scott M. Sanders’s picture

The HTTP Request Fail Reset D6 module is very helpful with this if your problem is intermittent and you have no admin access to edit things on the server.

mgifford’s picture

catch’s picture

Version: 7.x-dev » 8.x-dev

I'm not sure WATCHDOG_WARNING is appropriate, some external requests will just fail because a service was down or temporary network issues, and that's not a problem with the Drupal site itself.

Moving back to 8.x for the rest of the discussion though.

mikeytown2’s picture

Version: 8.x-dev » 7.x-dev

Moving back to D7 as D8 is using guzzle.

David_Rothstein’s picture

Version: 7.x-dev » 8.x-dev

This is filed against Drupal 8 for the followup described in #70, which still applies.

kerasai’s picture

Status: Needs review » Needs work
Issue tags: +Performance, +Needs issue summary update, +Novice, +Needs backport to D7

The last submitted patch, replace-http-request-checking-965078-67.patch, failed testing.

patrickd’s picture

Issue tags: +Needs reroll

needs reroll

sassafrass’s picture

DrupalCon Portland Sprint: Working on Issue Summary Update

dbcollies’s picture

Assigned: Unassigned » dbcollies
dbcollies’s picture

Assigned: dbcollies » Unassigned
thomas.fleming’s picture

Assigned: Unassigned » thomas.fleming
thomas.fleming’s picture

Assigned: thomas.fleming » Unassigned
VladSavitsky’s picture

We are working today with this issue during Code Sprint UA

VladSavitsky’s picture

I didn't managed to even start work on this issue during code sprint.

joates’s picture

Version: 7.x-dev » 8.x-dev
Assigned: Unassigned » joates
Status: Needs review » Needs work

FYI: the patch from comment #67 applies cleanly on the current 7.x-dev HEAD (tested @9am GMT on 05-Jun-2013 directly after a "git pull")
because it's a D7 patch !!

it seems the fact that this issue keeps flip-flopping between D8 and D7 is adding to the confusion.

assigning this to myself
i see a few things here that just need formalizing to progress the status of this issue, it seems all the work has been discussed and completed, it's just being held-up and needs someone to take ownership and bring it home
removing "Needs Reroll" and "Novice" tags
as far as i can tell reroll is *not* needed and additionally giving a task like this to a novice is just going to send them down a painful path of time-wasting discovery, when the real issue is that this is a D7 patch failing against the D8 codebase, which is perfectly normal and expected !
removing "Needs issue summary update" tag
because that was tagged in comment #11 on July 29, 2011 at 7:50pm and it seems was never done, although the summary seems perfectly adequate in content, it doesn't use the template format and it could use some extra details to clarify the current position (i will do that)

the new issue #1635882: Remove stale drupal_http_request_fails variable takes care of previous concerns about this unused variable creeping into D8 via a D7 upgrade.
[edit: removed because as catch informs in the next comment, CMI in D8 makes these variables entirely redundant anyway :]

i'm concerned that if i just flip this issue back to 7.x-dev to get the patch commited someone else will revert it back to 8.x-dev (as before), this is new territory for me so i need some time to find out the correct way to get both the working D8 patch & the D7 backport patch tested, reviewed and ultimately commited.

catch’s picture

Version: 8.x-dev » 7.x-dev
Assigned: Unassigned » joates
Issue tags: -Needs issue summary update, -Novice, -Needs reroll

We can actually leave the stale variable in 8.x now, since all variables are stale with the API being removed - although an update to specifically remove it would also be fine - less in that table for people who are checking custom site updates.

The quickest way to get this issue done would be to discuss/resolve the remaining issues from #70.

The watchdog warning and error code both make sense, but that's actually been done already with the Guzzle conversion, see this from aggregator for example:

    catch (BadResponseException $e) {
      $response = $e->getResponse();
      watchdog('aggregator', 'The feed from %site seems to be broken due to "%error".', array('%site' => $feed->label(), '%error' => $response->getStatusCode() . ' ' . $response->getReasonPhrase()), WATCHDOG_WARNING);
      drupal_set_message(t('The feed from %site seems to be broken because of error "%error".', array('%site' => $feed->label(), '%error' => $response->getStatusCode() . ' ' . $response->getReasonPhrase())));
      return FALSE;
    }
    catch (RequestException $e) {
      watchdog('aggregator', 'The feed from %site seems to be broken due to "%error".', array('%site' => $feed->label(), '%error' => $e->getMessage()), WATCHDOG_WARNING);
      drupal_set_message(t('The feed from %site seems to be broken because of error "%error".', array('%site' => $feed->label(), '%error' => $e->getMessage())));
      return FALSE;
    }

Logging the full URL or not I'm not sure, I'd probably add the 'access site reports' permission to the security sensitive ones, since if you're debugging an issue with this, you really want to see the whole string, and in 8.x the string mostly comes from Guzzle itself as the exception message which we can't control the contents of.

I've opened #2012468: Add trusted roles recommendation to 'access site reports' for that in 8.x to discuss.

The error handling/message is going to be slightly different in 7.x so if we wanted to be more circumspect there we could.

Given that, I think we can move this back to 7.x again. Re-rolling the patch with WATCHDOG_WARNING would be a good start to get things moving towards a commit there.

joates’s picture

Status: Needs work » Needs review
FileSize
5.42 KB

@catch thanks very much for those clarifications.

this D7 patch adds the WATCHDOG_WARNING severity level.

with the extra parameter to the watchdog function calls i felt it was a good time to break it down to enable the code to be a bit more "readable".

Status: Needs review » Needs work

The last submitted patch, http-request-checking-d7-backport-965078-89.patch, failed testing.

joates’s picture

Issue summary: View changes

Drupalcon Sprint: Updated Issue summary

joates’s picture

Issue summary: View changes

Updated issue summary (1st pass).

joates’s picture

Issue summary: View changes

Updated issue summary (add a related issue and tidying :).

joates’s picture

Assigned: Unassigned » joates
FileSize
5.42 KB

just my sloppy coding.. whoops!

Important:
If you are reviewing this patch please also consider the comments from #70 as part of the code review.

joates’s picture

Assigned: joates » Unassigned
Status: Needs work » Needs review

updated the issue summary as best i could,
hopefully this will help us to make progress.

standing down.. *salutes*
(but still available for additional work/tasks :)

joates’s picture

re #88:

i'm trying to balance the overhead of an "access site reports" check against just "clipping" the query-string part of the URL (anything after the ?= i.e. name/value pairs / tokens etc) which is fairly trivial.

my argument is that the fact that it's failing is a more useful piece of information than the URL at which it fails (because essentially it will fail for all --external-- URLs regardless of the location or what parameters were passed)

trim it, report it and done.. no security concerns :)

@catch you raise a valid point (when debugging more information is useful) but how to isolate the "security sensitive ones" ? how to test for that ?

catch’s picture

The query string might cause a 500/401/403/404 depending on what's at the other end - there's both network issues and response codes from the endpoint to consider.

It's not just URLs that get into dblog though, PHP errors, node titles - there's lots of situations where there's potential information disclosure to the person reading the logs that they otherwise might not see elsewhere on the site.

catch’s picture

Issue summary: View changes

Updated issue summary (more detailed Remaining tasks).

joates’s picture

Issue summary: View changes

Updated issue summary (i corrected some mistakes made in my initial understanding).

joates’s picture

Issue summary: View changes

Updated issue summary.

joates’s picture

Issue summary: View changes

Updated issue summary.

joates’s picture

had another go at updating the issue summary because there was some confusion in my understanding of the Remaining tasks section. I think it is accurate now.

joates’s picture

Issue summary: View changes

Updated issue summary (more tinkering).

joates’s picture

Assigned: joates » Unassigned

i have been thinking about the 'access site reports' permissions issue and my conclusion is that the logic to implement that functionality belongs in the watchdog module, not in other modules that are generating log messages.

if a module is generating a log message then it's responsibility is to provide the most complete and appropriate information about the situation that it is capable of, the module should not need to be concerned about who might or might not view that message.
(of course it should still perform sanitization of untrusted values as usual)

when the watchdog module detects a message that contains a URL it needs the ability to decide if it will show it or mask it in some way, i think this is outside of scope for this issue and we should spawn a separate issue to handle this permissions check in the watchdog module if that is the way forward.

MustangGB’s picture

Version: 8.x-dev » 7.x-dev
Assigned: joates » Unassigned
Status: Needs work » Needs review

Can I ask what is the reason for patching in watchdog messages in every place that drupal_http_request() is called?

Why not just replace variable_set('drupal_http_request_fails', TRUE); with a watchdog message from within the drupal_http_request() function?

MustangGB’s picture

Issue summary: View changes

Updated issue summary.

MustangGB’s picture

mgifford’s picture

RaulMuroc’s picture

gonna give a try!

Fabianx’s picture

Status: Needs review » Reviewed & tested by the community

I am just gonna be bold and set RTBC again. It still passes tests, the @url within watchdog should not be a real problem - but if it is it can be set back to CNW and we can just remove the URL from the message as before there was no message at all ...

And this is still fixing the cache_set stampede.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 91: http-request-checking-d7-backport-965078-91.patch, failed testing.

David_Rothstein’s picture

  1. +    if (isset($result->error)) {
    +      $message = 'HTTP request to @url failed with error: @error.';
    +      $replace = array(
    +        '@url' => $url,
    +        '@error' => $result->error,
    +      );
    +      watchdog('update', $message, $replace, WATCHDOG_WARNING);
    +    }
    +    elseif (!isset($result->error) && isset($result->data)) {
    

    I'm pretty sure these strings need to be inline with the watchdog() call so that https://drupal.org/project/potx can find them and make them translatable.

    Also (minor), in this case the second isset($result->error) is unnecessary due to the first one.

  2. Still missing error codes in the watchdog messages? (That really would be useful when debugging, to tell the difference between a 403/404/etc.)
  3. Still missing these watchdog() calls in several places in core, for example OpenID? (By the way, to answer akamustang's question in #97 about centralizing these, I'm pretty sure that was already discussed in depth earlier in the issue.)
  4. I'm very wary of committing the full URLs to Drupal 7 without #2012468: Add trusted roles recommendation to 'access site reports' being resolved first. In the past we've considered passwords in the logs to be a security issue (example: https://drupal.org/node/912412) and although passwords in URLs in the logs aren't exactly the same thing, it's pretty close.
David_Rothstein’s picture

Also, regarding the error codes, it looks like this has regressed in Drupal 8. The code from #88 is no longer there; instead I see this:

    try {
      $response = $this->httpClient->send($request);
.....
    }
    catch (RequestException $e) {
      watchdog('aggregator', 'The feed from %site seems to be broken because of error "%error".', array('%site' => $feed->label(), '%error' => $e->getMessage()), WATCHDOG_WARNING);
    }

So that would be good to resolve also (perhaps in a separate issue).

David_Rothstein’s picture

With regard to passwords in URLs, I think that part's probably OK now given that the above issue has been resolved via https://drupal.org/PSA-2014-002.

(Passwords in URLs aren't sensitive in quite the same way as user passwords are since they can be logged other places too already.)

MustangGB’s picture

#2012468: Add trusted roles recommendation to 'access site reports' is fixed, hoping someone is willing to take up this helm again.

MustangGB’s picture

Status: Needs work » Reviewed & tested by the community

Patch still works.

The last submitted patch, 47: drupal_http_request_D6_do_not_test.patch, failed testing.

MustangGB’s picture

Hmm, should be testing #91, not #47.

The last submitted patch, 47: drupal_http_request_D7_do_not_test.patch, failed testing.

Fabianx’s picture

Status: Reviewed & tested by the community » Needs work
Issue tags: +Needs change record

Okay, I thought hard about it and while I did RTBC this before, I am no longer convinced as much.

The biggest trouble is the variable_set() (Performance wise) and that it is very unreliable anyway.

On the other hand this variable_set() has brought down more than one site for me.

While the UX of aggregator, etc. sucks, those are essentially other issues.

For here I first suggest to fix the issue at hand:

+ if (variable_get('drupal_http_request_fails', FALSE) === FALSE) {
    variable_set('drupal_http_request_fails', TRUE);
+}

That would do one variable_set() until the next time the status report is checked for.

That is still not good, but better than to clear the variable_cache again and again.

The aggregator thing, etc. can be nice follow-up issues, but are independent of this one.

Note: This makes it also possible to just do a quick and simple:

$conf['drupal_http_request_fails'] = NULL;

to remove the check completely for custom installations, which we could recommend.

MustangGB’s picture

@Fabianx:

This is a backport issue, so are you suggesting removing the watchdog calls from D8?
If not then what's the problem with having feature parity?

Also whilst it may not work perfectly the current system and proposed watchdog system both allow admins to be flagged of connection issues, whereas $conf['drupal_http_request_fails'] = NULL; would mean this ability is lost.

I don't really care if it appears in the status report or watchdog log, but I'd like to have visibility, arguably watchdog even provides greater visibility.

Fabianx’s picture

#112: What I mean is:

We should split this into:

a) A feature request / task to have watchdog() and co log those messages.

b) A bug fix to not set the variable again and again if we know that a request failed.

I don't care which issue should be used for which, but the D8 backports are new functionality the system gains, while the http_request fails can bring a site down.

That means I don't want to mix a bug fix with new functionality.

The problem is if we remove the http_request fails completely, custom code would not log any message - intentionally, but that means a behavior change and I believe that should be opt-in.

We could expose the http_request fails in the UI in a follow-up after the watchdog message is in, too then:

So three steps is what I propose:

1. Ensure that drupal_http_request fails does no longer bring down and site and can be disabled by setting NULL (settings.php only).

2. This patch by adding all the nice watchdog calls to aggregator and co (Great!)

3. Expose the settings.php setting in the UI to allow to turn off messages once contrib has used watchdog to display messages, too when something fails in drupal_http_request.

catch’s picture

Splitting this in two sounds good. I'd do the quick bug fix here and open a new issue for the watchdog() calls.

David_Rothstein’s picture

The quick bugfix sounds good - however I think the fix that has been discussed here (removing the variable entirely) is better in the long run; it fixes not just the performance bug but also the one in the issue title ("HTTP request checking is unreliable"). So I would suggest keeping this issue focused on the original goal.

The problem is if we remove the http_request fails completely, custom code would not log any message - intentionally, but that means a behavior change and I believe that should be opt-in.

Code which makes an HTTP request that fails unexpectedly really should be logging or recording the failure already if they care about it at all. It is hard for me to imagine that anyone is relying on the message in the status report for this purpose, because (a) that message only gets displayed as a result of a particular type of failure, not others (e.g. a 403 or 404 error) that are more likely, (b) it provide no real useful information about the failure itself, and (c) the message is in many cases completely inaccurate.

Fabianx’s picture

Issue tags: +Drupal bugfix target

I am gonna open an issue for the bug fix "soon" (as soon as I get back to the core queue) - unless someone beats me to it. Adding a tag for issues I want to focus for the next bugfix release on.

  • catch committed 6679cf9 on 8.3.x
    Issue #965078 by catch, akamustang: Fixed HTTP request checking is...

  • catch committed 6679cf9 on 8.3.x
    Issue #965078 by catch, akamustang: Fixed HTTP request checking is...

  • catch committed 6679cf9 on 8.4.x
    Issue #965078 by catch, akamustang: Fixed HTTP request checking is...

  • catch committed 6679cf9 on 8.4.x
    Issue #965078 by catch, akamustang: Fixed HTTP request checking is...
MustangGB’s picture

Any movement here?

John_B’s picture

This remains a major bug, although dormant for years. The check relies on the following line in system.module
drupal_http_request(url('', array('absolute' => TRUE)), array('max_redirects' => 0))
The request is supposed to call the front page and checks for status code between 100 and 600. Without that status code it returns false and sets drupal_http_request_fails to true.

We can check what this code does with drush php-eval "var_dump(drupal_http_request(url('', array('absolute' => TRUE)), array('max_redirects' => 0)));"

There are several common scenarios in which it returns 'false' (and sets drupal_http_request_fails to true) on a working site. If your base url is not set (and for example drush uli outputs http://default/...), the output is

object(stdClass)#19 (2) {
  ["code"]=>
  int(0)
  ["error"]=>
  string(71) "php_network_getaddresses: getaddrinfo failed: Name or service not known"
}

If you set your base URL but your SSL certificate is not valid, the output is:

object(stdClass)#20 (2) {
  ["code"]=>
  int(0)
  ["error"]=>
  string(58) "Error opening socket ssl://mysite.com:443"
}

The results are however unpredictable. On a server where we firewall port 80, opening port 80 causes the message to disappear without altering the results of that drush command (or of calling the checking function which populates the status report using ).

MustangGB’s picture

Issue tags: +Drupal 7.69 target
MustangGB’s picture

Issue tags: -Drupal 7.69 target +Drupal 7.70 target

  • catch committed 6679cf9 on 9.1.x
    Issue #965078 by catch, akamustang: Fixed HTTP request checking is...
MustangGB’s picture

izmeez’s picture

So, For Drupal 7.x is the patch from #91 still "RTBC" or "Needs work" as result of comments #111 - 113 ?

joseph.olstad’s picture

Needs change record, please write a draft CR

izmeez’s picture

Yes, needs a draft CR but also needs work as described in comment #113 which may affect the contents of CR.
Added to #3179845: [meta] Priorities for 2020-12-02 bugfix release of Drupal 7.76 / 7.77 in unsorted to draw attention to it.

kmonty’s picture

Re-rolled the patch against dev, as the current d7 patch did not apply against 7.82