In case your hosting provider has disabled fsockopen() for some reason, here's a patch that can bypass that restriction using curl.
I know the code is ugly, but it was a quick hack for a site I administer. Please add suggestions or improvements.
Code below replaces ping_ping() in ping.module.

function ping_ping($name = '', $url = '') {

  function curl_xmlrpc_ping ($ping_server, $local_name, $local_url) {

    $xml_request  = '<?xml version="1.0"?>';				//  encoding?
    $xml_request .= '<methodCall><methodName>weblogUpdates.ping</methodName>';
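    // Escape both values so characters like & or < cannot break the XML.
    $xml_request .= '<params><param><value>'.htmlspecialchars($local_name).'</value></param>';
    $xml_request .= '<param><value>'.htmlspecialchars($local_url).'</value></param></params></methodCall>';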

    $curl_handle = curl_init();
    $post_header  =  array( 'Content-type: text/xml' );		//  required for xml posting
    curl_setopt ($curl_handle, CURLOPT_URL, $ping_server);		//  hostname with 'http://'
    curl_setopt ($curl_handle, CURLOPT_POST, 1);
    curl_setopt ($curl_handle, CURLOPT_HTTPHEADER, $post_header);
    curl_setopt ($curl_handle, CURLOPT_POSTFIELDS, $xml_request);
    curl_setopt ($curl_handle, CURLOPT_RETURNTRANSFER, 1);
    $curl_buffer = curl_exec($curl_handle);
    curl_close($curl_handle);

    $response = trim(strip_tags($curl_buffer));
    if (preg_match ('/flerror/i',$response)) {
      $return_msg = 'Server response: ';
      // Allow whitespace between the member name and its value.
      if (preg_match ('/flerror\s*0/i',$response)) {
        $return_msg .= 'OK, ';
        $return_code = 0;
      } else {
        $return_msg .= 'ERROR, ';
        $return_code = 1;
      }
      $return_msg .= preg_replace ('/message/i','',stristr ($response, 'message'));
    } else {
      $return_msg = 'ERROR, NOT a weblogUpdates.ping response: ';
      $return_msg .= $response;
      $return_code = 2;
    }
    $return_array['code'] = $return_code;
    $return_array['msg'] = $return_msg;
    return $return_array;
  }

  $pingers = explode("\n", variable_get('ping_pinger_list', 'http://rpc.pingomatic.com'));
  foreach ($pingers as $pinger) {
    $pinger = trim($pinger);
    if (empty($pinger)) {
      continue;
    }
    $result = curl_xmlrpc_ping ($pinger, $name, $url);

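    // Compare with ==; a single = would assign 0 and always take the else branch.
    if ($result['code'] == 0) {
      $watchdog_severity = WATCHDOG_NOTICE;				//log successful pings too
    }
    else {
      $watchdog_severity = WATCHDOG_WARNING;
    }
    watchdog('directory ping', t('Server: %server. Response: %response', array('%server' => $pinger, '%response' => $result['msg'])), $watchdog_severity);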
  }
}
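
The strip_tags() plus regular-expression parsing above depends on how the server lays out whitespace and letter case. Here is a minimal sketch of a stricter alternative, assuming the SimpleXML extension is available and that the server answers with the usual struct containing flerror and message members. sketch_parse_ping_response() is a hypothetical helper, not part of ping.module; it returns the same codes as the patch (0 = OK, 1 = server error, 2 = not a ping response).

```php
<?php
/**
 * Hypothetical helper: parse a weblogUpdates.ping XML-RPC response.
 */
function sketch_parse_ping_response($xml) {
  $not_a_ping = array(
    'code' => 2,
    'msg' => 'ERROR, NOT a weblogUpdates.ping response: ' . trim(strip_tags((string) $xml)),
  );
  // Suppress parser warnings; a FALSE return is handled explicitly.
  $doc = @simplexml_load_string((string) $xml);
  if ($doc === FALSE) {
    return $not_a_ping;
  }
  // Collect struct members by lower-cased name, e.g. flerror / flError.
  $members = array();
  $nodes = $doc->xpath('//member');
  if (!is_array($nodes)) {
    $nodes = array();
  }
  foreach ($nodes as $member) {
    if (isset($member->name, $member->value)) {
      $key = strtolower(trim((string) $member->name));
      $members[$key] = trim(strip_tags($member->value->asXML()));
    }
  }
  if (!isset($members['flerror'])) {
    return $not_a_ping;
  }
  $failed = ($members['flerror'] !== '0');
  $msg = 'Server response: ' . ($failed ? 'ERROR, ' : 'OK, ');
  $msg .= isset($members['message']) ? $members['message'] : '';
  return array('code' => $failed ? 1 : 0, 'msg' => $msg);
}
```

It could stand in for the block that starts at $response = trim(strip_tags($curl_buffer)), taking $curl_buffer as its argument.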

Related issues

#1664784: drupal_http_request() has problems, so allow it to be overridden
#1447736: Adopt Guzzle library to replace drupal_http_request()
#7881: Add support to drupal_http_request() for proxy servers (http not https)

Comments

killes@www.drop.org’s picture

Version: 4.7.0 » x.y.z

no new features for 4.7

pepiqueta’s picture

Oh, sorry.

breyten’s picture

What are the odds that fsockopen is disabled, and curl is enabled? I think it's not very likely.

chx’s picture

Title: Ping without fsockopen or xmlrpc: custom curl function for pinging sites » Make it possible to use other transport mechanisms
Status: Needs review » Active

Core should provide a hook in drupal_http_request through which a curl_http_request can take over. That's a minimal patch. Then curlhttprequest module can be put to contrib (not into core, though).

breyten’s picture

Component: ping.module » base system

Let's make an object out of drupal_http_request ;p

But seriously, chx is right. I think we should make an implementation akin to user_mail.

pepiqueta’s picture

I don't know how often this happens. In fact I know of only one case: mine! I set up a site and when I tried to ping some servers it failed because fsockopen was disabled (it seems that some users on that hosting provider abused it for spamming or something). So I did this for bypassing that restriction. I don't know if this will be helpful for someone, but I just wanted to share this.

I just wanted that feature for ping.module. I agree that it should live somewhere else in the source tree, but I'm a noob to Drupal (well, a PHP newbie too...) so that's too much for my skills. Maybe someone could do this.

BTW, I applied for CVS access a couple of days ago but got no response from admins.

LAsan’s picture

Version: x.y.z » 7.x-dev

Any work done in the patch code?

Moving to cvs.

Dave Reid’s picture

Assigned: Unassigned » Dave Reid

I'm going to pick up this issue. I'd like to be able to over-ride drupal_http_request with curl.

Dave Reid’s picture

Here's my first attempt at this patch. Introduces a new hook: hook_http_request. The main fetching code done in drupal_http_request has been moved to system_http_request. drupal_http_request now acts as the abstract HTTP request function:

  // Perform request.
  $module = variable_get('http_request', 'system');
  $result = module_invoke($module, 'http_request', $url, $options);

Patch also makes use of the new hook_modules_enabled and hook_modules_disabled hooks to override and un-override the http_request variable:

/**
 * Implementation of hook_modules_enabled().
 */
function system_modules_enabled($modules) {
  // Check if modules define hook_http_request().
  // Set override variable if it requests successfully.
  // TODO: Registry has not been rebuilt yet and module_hook FAILS if module does have hook_http_request().
  foreach ($modules as $module) {
    if (module_hook($module, 'http_request') && module_invoke($module, 'check_http_request')) {
      variable_set('http_request', $module);
      break;
    }
  }
}

/**
 * Implementation of hook_modules_disabled().
 */
function system_modules_disabled($modules) {
  // Check if modules defined hook_http_request().
  // Reset override variable if no longer available.
  foreach ($modules as $module) {
    if (module_hook($module, 'http_request')) {
      variable_set('http_request', 'system');
      module_invoke('system', 'check_http_request');
      break;
    }
  }
}

Attached is the patch to core and the first-version curl module to test overriding http requests. Hopefully this can start getting a little momentum and review towards this patch.

Dave Reid’s picture

Status: Active » Needs review

I should note that this patch also changes drupal_http_request to use an $options parameter instead of the $headers, $method, $data, and $retry parameters.
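
The switch from positional parameters to a single $options array can be sketched roughly like this. The function name sketch_http_request_options() is made up for illustration, and the defaults are an assumption that simply mirrors the old positional defaults; the actual keys and values are whatever #337783 settles on.

```php
<?php
/**
 * Hypothetical helper: merge caller-supplied options over defaults.
 */
function sketch_http_request_options(array $options = array()) {
  // Assumed defaults, mirroring the old $headers, $method, $data, $retry.
  $defaults = array(
    'headers' => array(),
    'method' => 'GET',
    'data' => NULL,
    'retry' => 3,
  );
  // The + operator keeps keys from the left operand, so caller values win.
  return $options + $defaults;
}
```

A caller then only names what differs from the defaults, e.g. sketch_http_request_options(array('method' => 'POST', 'data' => 'a=1')).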

Dries’s picture

Mollom is affected by this too so I'd like to see us fix this. For Mollom, we would like to open a secure 'https' connection to mollom.com. That seems impossible to do with the current drupal_http_request() code but would be possible with CURL.

Looking at your patch, I wonder if it needs to be that configurable. It feels a bit over-engineered to me. Why don't we just use CURL, when CURL is available, and fall-back to the current implementation when CURL is not available?

chx’s picture

Someone might want to use the http extension in PECL, mayhaps in PHP core soon? Anyways, those system hooks hurt my eyes. Why not just let modules set their variables in their install hook instead of burdening system with it? Also, why is this in .module instead of a handy system.http.inc or something? Excellent opportunity to remove rarely run code out of every page request!

Anonymous’s picture

Status: Needs review » Needs work

The last submitted patch failed testing.

Dave Reid’s picture

Title: Make it possible to use other transport mechanisms » Pluggable architecture for drupal_http_request()

I should probably model this patch off the work done in #303930: Pluggable architecture for aggregator.module, and give the site administrator the option to select which module to use for drupal_http_request instead of having the modules set variables on install/uninstall. I'm also going to split off the parameter array-itizing into #337783: DX: Array-itize drupal_http_request()'s parameters and hopefully get that accepted first before this issue.

Dave Reid’s picture

Status: Needs work » Postponed

boombatower’s picture

drewish’s picture

subscribing. i'd love to see this feature added.

Dries’s picture

Status: Postponed » Needs work

#337783: DX: Array-itize drupal_http_request()'s parameters has landed, so I'm switching this back to 'code needs work'.

Dries’s picture

When CURL is supported, core should use it. For one, it makes SSL support a lot easier, and it is more robust than our own implementation is. So, OK for making it a pluggable solution (if needed), but let's avoid introducing a barrier here -- we don't want people to install or configure anything when CURL is available, IMO. I think CURL should be the default, with a silent and graceful fallback to our own implementation when CURL is not available.
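
A minimal sketch of the "cURL by default, silent fallback" idea, with no setting for anyone to configure. All function names here are hypothetical, and sketch_native_request() is only a stand-in for the existing fsockopen()-based code, not a port of it.

```php
<?php
/**
 * Hypothetical dispatcher: choose a transport without any configuration.
 *
 * $available can be passed in only so the choice can be exercised in a test.
 */
function sketch_pick_transport($available = NULL) {
  if ($available === NULL) {
    $available = function_exists('curl_init');
  }
  return $available ? 'sketch_curl_request' : 'sketch_native_request';
}

function sketch_curl_request($url) {
  $handle = curl_init($url);
  curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
  // Returns the body as a string, or FALSE on failure.
  return curl_exec($handle);
}

function sketch_native_request($url) {
  // Stand-in only: the current implementation would be called here.
  return @file_get_contents($url);
}

function sketch_http_request($url) {
  $transport = sketch_pick_transport();
  return $transport($url);
}
```

Callers only ever see sketch_http_request(); which transport ran is an implementation detail.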

boombatower’s picture

Considering SimpleTest already has a CURL browser I would recommend we focus on it. I already have a working pluggable backend patch at: #340283: Abstract SimpleTest browser in to its own object.

It still needs a bit of work, but getting close.

chx’s picture

I still think that the system_http_request should use the HTTP(S) wrapper with file_get_contents. We can use stream_context_create to set options and stream_get_meta_data to read the response headers.
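
Roughly what the stream-wrapper approach could look like. This is a sketch under assumptions: the function names and option keys are hypothetical, and it uses fopen() rather than file_get_contents() so that there is a stream handle to pass to stream_get_meta_data().

```php
<?php
/**
 * Hypothetical helper: build an HTTP stream context from request options.
 */
function sketch_http_context(array $options) {
  $header_lines = array();
  foreach ($options['headers'] as $name => $value) {
    $header_lines[] = $name . ': ' . $value;
  }
  $http = array(
    'method' => $options['method'],
    'header' => implode("\r\n", $header_lines),
    'timeout' => $options['timeout'],
    // Hand back the body for 4xx/5xx responses instead of failing.
    'ignore_errors' => TRUE,
  );
  if (isset($options['data'])) {
    $http['content'] = $options['data'];
  }
  return stream_context_create(array('http' => $http));
}

function sketch_stream_request($url, array $options) {
  $stream = @fopen($url, 'r', FALSE, sketch_http_context($options));
  if ($stream === FALSE) {
    return FALSE;
  }
  $meta = stream_get_meta_data($stream);
  $body = stream_get_contents($stream);
  fclose($stream);
  // For the http wrapper, wrapper_data holds the raw response header lines.
  return array('headers' => $meta['wrapper_data'], 'body' => $body);
}
```

The same context also works for https:// URLs when PHP is built with OpenSSL support.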

Dave Reid’s picture

RobLoach’s picture

ibandyop’s picture

subscribing. Looking forward to cURL with Proxy handling

boombatower’s picture

can we focus/look at #20

I think it is rather far along, I've just been too busy to finish it.

pwolanin’s picture

essentially blocked on Crell's mythical handlers in core?

chx’s picture

The way we make things pluggable can now be seen in system.queue.inc and soon cache.inc. (and field_sql_storage.module)

boombatower’s picture

I will be working on #20 and I would like some reviews (hope to have documented patch later today). It will provide the same tools that drupal_http_request() does (we could even keep that as a wrapper), but also has a number of advanced browser tools and a pluggable architecture.

pwolanin’s picture

any progress on this?

pwolanin’s picture

Version: 7.x-dev » 8.x-dev

SqyD’s picture

subscribing

mikeytown2’s picture

Subscribe now that I've created the HTTP Parallel Request Library module.

Alan Evans’s picture

Working on a version of this - aiming to incorporate suggestions of prior comments. Results won't be immediate but I may post partially usable patches as I go for ideas ...

Alan Evans’s picture

So, I have a version of this working to the point where the simpletests all pass for the native HTTP client, but I get several fails for the curl client. Part of the problem here is that the CURL client hides numerous steps from the user, for example during multiple redirects, and the tests depend on knowing the details of the interim steps.

Now, I can either change the test so that it also detects CURL and does slightly different tests in those cases, or I can throw away some of the optimizations that curl uses and handle every request individually (so, instead of allowing curl to follow multiple redirects, make each redirect a single curl call and keep track of history). My feeling is that I need to keep curl's behaviour, otherwise we're just neutralising its advantages, but I don't know entirely if we have a policy on conditionally testing different things based upon available capabilities.
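
For reference, the second option (one request per redirect, keeping a history) could be sketched as below. The names are hypothetical, and $fetch is an assumed callable that performs exactly one request and returns array('code' => ..., 'location' => ...); with cURL that would mean leaving CURLOPT_FOLLOWLOCATION off.

```php
<?php
/**
 * Hypothetical helper: follow redirects one request at a time.
 *
 * Returns the list of visited URLs with the status code each one gave.
 */
function sketch_follow_redirects($fetch, $url, $max_redirects = 3) {
  $history = array();
  // One initial request plus at most $max_redirects follow-ups.
  for ($hops = 0; $hops <= $max_redirects; $hops++) {
    $response = call_user_func($fetch, $url);
    $history[] = array('url' => $url, 'code' => $response['code']);
    $is_redirect = in_array($response['code'], array(301, 302, 303, 307), TRUE);
    if (!$is_redirect || empty($response['location'])) {
      break;
    }
    $url = $response['location'];
  }
  return $history;
}
```

This keeps the interim steps visible to the tests, at the cost of giving up cURL's own redirect handling.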

Alan Evans’s picture

discussed with Gabor in chat:

hi Gabor - if you have a moment, do you have an opinion on this: I've been working on a pluggable system for drupal_http_request that will use curl if available. Now, the simpletests expect redirects to be handled as many individual requests, with detailed info about the interim requests. The curl implementation, if I allow it to use the best of curly goodness, does not do that, so tests fail. The question is: is it better to emulate the native drupal_http_request behaviour to please tests, discarding curl's optimized flow, or fork the tests so that they also detect whether curl is being used and test differently? (don't really like that either - having the tests "unstable") both have downsides to me … I guess there might be a third option: change the tests so that they cover both adequately, but without forking (requires less stringent testing of the native drupal_http_request) Alan Evans @ 1:41:18 PM

yeah, I guess since drupal_http_request() is used at various places in tests, it would be best to modify the original tests instead of forking, no? Gábor Hojtsy @ 1:42:39 PM	

I'd lean toward that third option, yeah … forking sounds like a recipe for disaster (you don't absolutely know which fork of tests will run) and emulating is bad too (removes some of curl's advantages) Alan Evans @ 1:45:05 PM
thanks

Alan Evans’s picture

Attaching a patch warts-and-all in case anyone wants to offer opinions on the general direction this is taking.

Notes:

  • Tests have been tweaked to accommodate curl error codes
  • Tests, with these tweaks in place, are all green currently (but that changes fairly regularly ;) )
  • There is definitely cleanup to do and numerous TODOs noted in code
  • I don't consider this ready for a formal review, just a brief glance to see whether anyone has opinions either way whether this is going generally in the right direction

mikeytown2’s picture

We should implement 2 more options that both CURL and HTTPRL can use. Queue a request and sending all queued requests. This will give us multi support in core and make it available to modules in the future.
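
To make the two operations concrete, here is a rough sketch of "queue a request" and "send all queued requests" on top of the cURL multi interface. The class and method names are hypothetical; this is not how HTTPRL does it, and it leaves out locking, callbacks and error handling.

```php
<?php
/**
 * Hypothetical request queue built on the cURL multi interface.
 */
class SketchRequestQueue {
  protected $urls = array();

  // Queue a request under a caller-chosen key; nothing is sent yet.
  public function queue($key, $url) {
    $this->urls[$key] = $url;
    return $this;
  }

  // Keys of the requests that are waiting to be sent.
  public function pending() {
    return array_keys($this->urls);
  }

  // Send everything queued so far in parallel; returns bodies by key.
  public function send() {
    $multi = curl_multi_init();
    $handles = array();
    foreach ($this->urls as $key => $url) {
      $handles[$key] = curl_init($url);
      curl_setopt($handles[$key], CURLOPT_RETURNTRANSFER, TRUE);
      curl_multi_add_handle($multi, $handles[$key]);
    }
    do {
      $status = curl_multi_exec($multi, $running);
      if ($running) {
        // Wait for socket activity instead of spinning.
        curl_multi_select($multi);
      }
    } while ($running && $status === CURLM_OK);
    $results = array();
    foreach ($handles as $key => $handle) {
      $results[$key] = curl_multi_getcontent($handle);
      curl_multi_remove_handle($multi, $handle);
    }
    curl_multi_close($multi);
    $this->urls = array();
    return $results;
  }
}
```

Usage would be along the lines of $queue->queue('a', $url_a)->queue('b', $url_b) followed by a single $queue->send().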

As for how to code this up we should be looking at locking and queuing right? Or is there a better pattern to use?

mikeytown2’s picture

Assuming that locking and queuing are the correct patterns to use; I've created an issue in HTTPRL for converting that code into something that would be acceptable in core #1593862: 2.x branch and target for core inclusion. Move extra stuff to sub-modules.

Please let me know if this is the right direction or the wrong direction :)

tstoeckler’s picture

+++ b/core/includes/common.inc
@@ -751,264 +751,34 @@ function drupal_access_denied() {
+    // This check mainly applies to user-configured classes. Normally indicates
+    // a typo in the configuration class name option.
+    $interfaces = class_implements($http_client);
+    if(!isset($interfaces['DrupalHTTPClientInterface'])) {
+      throw new Exception('HTTP Client does not implement DrupalHTTPClientInterface');
+    }

This is baby-sitting broken code. We should simply remove that.

Also the new classes and the interface should be PSR-0.

Crell’s picture

We should not reimplement drupal_http_request(), pluggable or otherwise. There are several existing libraries that we could use that would be more featured and more efficient than us writing our own.

https://github.com/guzzle/guzzle
https://github.com/kriswallsmith/Buzz

And Symfony has a partial implementation intended to be extended:
https://github.com/symfony/BrowserKit

mikeytown2’s picture

@Crell
Should we remove drupal_http_request and use something like guzzle by default in core? What is in core barely works and we have CURL code littered throughout contrib modules; we need a default way to issue http requests.

Crell’s picture

mikeytown: That is exactly what I am suggesting.

mikeytown2’s picture

To give everyone an idea of how bad the issue is (drupal_http_request() is not adequate for most use cases), I've used Gotta Download Them All and ran grep -l -r -i "curl_init" ./ on all of the modules as of 2 months ago (the last time I synced up); see attached file for output.

All 3 options are MIT licensed so we would need to special case it (like Symfony) if we wish to include it in core. @Crell where do we go from here?

Crell’s picture

Issue tags: +WSCCI

We need someone to take point on defining the criteria for evaluating such libraries, get feedback on that criteria, then apply that criteria and come back with a report and recommendation.

Assuming the recommendation makes sense, we bring in that library in one patch, then go through and convert stuff to it in follow-up patches. Depending on the recommendation there may also be a "now build some extensions on it" step in there as well.

This isn't strictly speaking part of WSCCI (which is server end), but I'm going to tag it anyway since the main reason to have such a library is to connect to 3rd party web services. Besides, for s2s syndication we'll need a tool like this.

Mikeytown, if you want to volunteer ping me in IRC and let's talk roadmap. :-)

mikeytown2’s picture

List of requirements based off of what I've seen/need.

What's in core:
- Set Headers
- Method (GET, POST, etc)
- Max Redirects
- Data
- Timeout
- Error handling

Nice things to have (what's in httprl):
- Parallel HTTP streams.
- Non Blocking Requests.
- Set callback function with arguments.
- Domain Connections: Maximum number of simultaneous connections to a given domain name.
- Global Connections: Maximum number of simultaneous connections that can be open on the server.
- Global Timeout: A float representing the maximum number of seconds the call may take.
- Option to alter all streams mid execution (example: request 20 urls & break after at least 5 return).
- Chunk Size (needed for writing to REST server on IIS)
- HTTP Version.
- Add new requests to stack mid execution.
- Cookie Parsing.
- Proxy Support.
- Async Connect.
- Handle chunked encoding.
- Handle gzip and deflate.
- Handle more error conditions.
- Automatically handle given data (if not a string use http_build_query and set application/x-www-form-urlencoded content type).

Things that are not in HTTPRL
- Get things from FTP.
- Sending Files.
- Full HTTP 1.1 compliance.
- Support protocols other than HTTP and FTP.

Other projects like HTTPRL:
Webclient
cURL HTTP Request
Async Jobs
Background Process

Modules that use context (stream_context_create()) in drupal_http_request():
Acquia Network Connector in acquia_agent_stream_context_create()
Shorten URLs in _shorten_googl()
Both deal with SSL. All other uses for stream_context_create in contrib do not use drupal_http_request.
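
As an illustration of the "automatically handle given data" item in the list above, a small sketch follows. sketch_prepare_data() and the option keys are hypothetical and are not taken from HTTPRL's code.

```php
<?php
/**
 * Hypothetical helper: normalise request data before sending.
 */
function sketch_prepare_data(array $options) {
  if (isset($options['data']) && !is_string($options['data'])) {
    // Encode arrays or objects as a URL-encoded form body.
    $options['data'] = http_build_query($options['data'], '', '&');
    // Only set a default; an explicit Content-Type from the caller wins.
    if (!isset($options['headers']['Content-Type'])) {
      $options['headers']['Content-Type'] = 'application/x-www-form-urlencoded';
    }
  }
  return $options;
}
```

String data passes through untouched, so callers that already build their own body (XML, JSON) are not affected.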

catch’s picture

Status: Needs work » Postponed

There's already discussion about third party browser libraries in #1447736: Adopt Guzzle library to replace drupal_http_request(), we should just postpone this one on that issue I think, so I'm doing that. mikeytown2 would you mind copying your comment over there? Looks like a good start to me for criteria to evaluate on.

Also note there is #1551600: DBehave! which is talking about using behat for acceptance testing (which includes Mink - not a browser but a browser driver, however existing or easy compatibility with that would be an extra bonus).

effulgentsia’s picture

This issue is great. I'm looking forward to seeing more progress on it. Meanwhile, any chance of getting #1664784: drupal_http_request() has problems, so allow it to be overridden in as an interim step?

pwolanin’s picture

Status: Postponed » Active

we should be working to define good interfaces with minimal implementations in core and allow them to be swapped for those sites that need something more.

Damien Tournoud’s picture

Status: Active » Closed (duplicate)

BrowserKit + Goutte (that is using Guzzle behind the scene) is basically this "good interface with minimal implementation" that @pwolanin is talking about.

So let's duplicate this in favor of #1447736: Adopt Guzzle library to replace drupal_http_request().

pwolanin’s picture

Status: Closed (duplicate) » Active

re-opening this.

Looking at the list above, it's not clear that modules using cURL actually need it, or if someone just copied some example PHP into a module. e.g. what's special about the request handling in http://drupalcode.org/project/commerce_payflow_pro.git/blob/refs/heads/7... ?

effulgentsia’s picture

[Guzzle is the] good interface with minimal implementation that @pwolanin is talking about.

Guzzle is 1MB and tens of thousands of lines of code. It's hard to imagine that that's a minimum implementation. Is it possible to abstract out just the top-level interfaces from Guzzle that are needed by core?

mikeytown2’s picture

It can be implemented in a smaller package (httprl ~80kb) if one doesn't use cURL, doesn't follow PSR-0 and does not take full advantage of HTTP 1.1. We are trying to avoid NIH syndrome because what is out there is a better solution than what we currently have. Guzzle has lots of tests and is PSR-0 and takes full advantage of HTTP 1.1.

See the comparison wiki chart for the different PHP HTTP client libraries out there: http://groups.drupal.org/node/233173
Read the whole thread if you think Guzzle is not the right solution. People agree the thread is a good read: https://twitter.com/webchick/status/205926036305752065, http://theweeklydrop.com/archive/issue-37 (under Drupal8)

The features that I see helping core are: Parallel Requests with Callbacks, Non Blocking Requests, PSR-0, Proxy Support, Testbot uses core's HTTP Client, and taking Full advantage of the HTTP 1.1 specifications. Once we get a better HTTP client in core, a lot of cool things are now possible. Read the comments in the comparison wiki for some of the ideas that will be possible.

mtdowling’s picture

I posted a comment in a related issue that I think applies to this conversation: http://drupal.org/node/1447736#comment-6430890

-Michael

pwolanin’s picture

Right, my argument is that many sites don't need HTTP 1.1 nor any of these features. Can we implement an interface that makes it easy to bring in a library with those features for the sites that need them? I understand guzzle is the best library choice, but that doesn't mean we should ship it with core.

My idea of a minimal implementation is just a forward port of drupal_http_request behind a reasonable interface.

pwolanin’s picture

roughly a version of this interface maybe omitting some of the curl-specific and URI template elements (or making them no-op in core): https://github.com/guzzle/guzzle/blob/master/src/Guzzle/Http/ClientInter...

and maybe this one: https://github.com/guzzle/guzzle/blob/master/src/Guzzle/Http/Message/Req...

and https://github.com/guzzle/guzzle/blob/master/src/Guzzle/Http/Message/Req...

In other words, roughly enough so that core can do everything that's in "HTTP Basics" in https://github.com/guzzle/guzzle/blob/master/README.md

Crell’s picture

The challenge with "My idea of a minimal implementation is just a forward port of drupal_http_request behind a reasonable interface" is that it sounds like a "reasonable interface" that covers both full Guzzle and drupal_http_request() is a lot harder than it sounds; at least that's the impression from mikeytown2, mtdowling, and the discussions on the FIG list.

What we would have to do then is ship core with a "http_client.basic" service that is a forward port of drupal_http_request(), and then let contrib inject a "http_client.advanced" service (or something) and acknowledge, yes, different APIs, deal. We can do that, but that runs into the ever-present problem of "how do we make basic do enough that it's useful without reimplementing advanced"? Do we just assume "anyone who's anyone will use guzzle.module", which just exists to wire up the Guzzle library?

pwolanin’s picture

@Crell - so, my thought was that whatever the interface that core expects, we could make it such that the methods work if you get a real guzzle object, or instead get the simple core implementation that only handles a small subset of the methods and options.
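
Something like the following sketch is what I have in mind. Every name in it is hypothetical, the interface is deliberately tiny and is not Guzzle's ClientInterface, and the accessor function is only a stand-in for whatever service container core ends up with.

```php
<?php
interface SketchHttpClient {
  // Returns the response body, or FALSE on failure.
  public function get($url, array $options = array());
}

class SketchBasicClient implements SketchHttpClient {
  // Options the minimal client honours, with defaults; others are no-ops.
  protected $supported = array('timeout' => 30.0);

  public function filterOptions(array $options) {
    return array_intersect_key($options, $this->supported) + $this->supported;
  }

  public function get($url, array $options = array()) {
    $options = $this->filterOptions($options);
    $context = stream_context_create(array(
      'http' => array('timeout' => $options['timeout']),
    ));
    return @file_get_contents($url, FALSE, $context);
  }
}

// Tiny stand-in for a service container: contrib passes in a replacement.
function sketch_http_client($replacement = NULL) {
  static $client = NULL;
  if ($replacement instanceof SketchHttpClient) {
    $client = $replacement;
  }
  if ($client === NULL) {
    $client = new SketchBasicClient();
  }
  return $client;
}
```

Core code would only ever call sketch_http_client()->get(...); a contrib module wiring up a fuller library would call sketch_http_client($its_client) once, and options the basic client ignores (a proxy setting, say) would start to take effect.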

I think that yes, anyone who needs more depends on the guzzle.module or something else that provides more functionality through the same basic facade.

A significant use case that is still not 100% supported in core is making requests through a forward proxy - so the goal is to be able to tell people "get guzzle.module (or curl.module or whatever) and configure your proxy settings there and everything else in core and contrib will work like magic"

Sure, we could re-write core to require cURL, etc, but I agree about getting away from the NIH syndrome and thus we should continue to ship an implementation that still works on the lowest-common PHP setup.

mikeytown2’s picture

#7881: Add support to drupal_http_request() for proxy servers (http not https) has been in D8, and just got into D7; this took 8 years to do. We could continue to patch drupal_http_request with backports from HTTPRL to improve it (see #1320222: Bring in other drupal patches for some of the fixes made) but this is a slow process. I would much rather replace it with something modern.

@pwolanin Your main objection to Guzzle is cURL. Is that correct?

pwolanin’s picture

@mikeytown2 - the cURL requirement, as well as the size of the code and the corresponding potential challenges around coordination that have been discussed above.

As above - I am not interested in adding more of this functionality to core - I was on #7881 pushing it for several years (and tearing my hair out). I don't think most of the advanced functionality is in the 80% use case, and it shouldn't be in core, but we need to make it much easier for the 20% to adopt guzzle or something similar. See comment #57

mtdowling’s picture

Posted a related comment in the other thread: http://drupal.org/node/1447736#comment-6464452

The plan is to break Guzzle into more fine grained components, so that if Drupal decided to add Guzzle to core, you would only need to require the minimal parts.

Berdir’s picture

Status: Active » Closed (duplicate)

And it happened. Guzzle is now in core. Am I right that this issue can be closed now?

In theory, it would still be possible for a contrib module to provide a different implementation for the http_default_client service, but without an interface defining exactly what such a service supports, that is unlikely to happen.

Berdir’s picture

Issue summary: View changes

added related issues