Hello,

I am attempting to import HTML content with XPath from a password-protected website on which I have an account. Having reviewed the little information available on this issue, I tried the suggested approach of entering 'http://username:password@website' (the username in this case is my email address) in the 'Automatically Add Scheme' section of the HTTP Fetcher advanced settings. This has not worked for me. Does anyone have ideas or module recommendations for logging in and obtaining the HTML from a password-protected site? For reference, the site's sign-in page is: http://pro.energycentral.com/membership/sign_in.cfm. Please note that my XPath importers work fine on sites that require no login; I just need a way to log in as part of the Feeds import process.

Thank you!

Comments

Stefan Lehmann’s picture

Did you try running the username and password through something like http://meyerweb.com/eric/tools/dencoder/ before putting them into the request URL?

I think special characters like '@' in the email address used as the username might break the expected URL format.

It might work in your browser because the browser URL-encodes special characters on the fly, whereas the Feeds importer won't do that automatically.

I like cookies!

martinpsz’s picture

I have changed the '@' in the email/username to "%40" and left the '@' after the password as is. This was based on what I saw from others who had this problem. Taking this step did not fix the issue.
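
The encoding step described here can be sketched in plain PHP. This is only an illustration, not part of Feeds: the function name build_credential_url and the example credentials and host are made up, and it assumes the target site accepts credentials embedded in the URL at all.

```php
<?php
// Hypothetical helper (not part of Drupal or Feeds): builds a URL of the form
// scheme://username:password@host/path with the credentials percent-encoded.
function build_credential_url($scheme, $username, $password, $hostAndPath) {
  // rawurlencode() follows RFC 3986, so '@' becomes %40 and ':' becomes %3A.
  // Only the '@' that separates the credentials from the host stays literal.
  return $scheme . '://' . rawurlencode($username) . ':' . rawurlencode($password) . '@' . $hostAndPath;
}

// Made-up example values.
echo build_credential_url('http', 'jane.doe@example.com', 'p@ss:word', 'example.com/feed'), "\n";
// prints: http://jane.doe%40example.com:p%40ss%3Aword@example.com/feed
```

Encoding both the username and the password, rather than replacing '@' by hand, also covers other reserved characters such as ':' or '/'.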

Stefan Lehmann’s picture

So can you actually access the website through a browser with the credentials prefilled in the URL?

martinpsz’s picture

I can access the site if I enter my credentials on the sign-in page and click log in. I also tried using the direct URL to the article, but that did not produce a successful feed import. One guess I have is that this feature may only work for RSS feeds and not for a regular website. I don't know how Drupal would be able to figure out where to put the credentials with just a link to the sign-in page.

Stefan Lehmann’s picture

The http://username:password@domain URL should work in your browser if you can log in via the login form. I strongly suspect there are special characters somewhere in your username and/or password.

martinpsz’s picture

I tried again and used the encoder you sent. It doesn't work. I can't log in with the format above in the URL. It just leads me to the sign in page.

Stefan Lehmann’s picture

That's all very strange. It's time for the crowbar then. :D

If you look at line 50 in: http://cgit.drupalcode.org/feeds/tree/plugins/FeedsHTTPFetcher.inc?h=7.x...

It's doing: $result = http_request_get($this->url, NULL, NULL, $this->acceptInvalidCert, $this->timeout);

whilst the API documentation here: http://www.drupalcontrib.org/api/drupal/contributions!feeds!libraries!ht...

says that you can pass a username and password in that request.

So I believe that if you temporarily (just for the duration of the import) hack that file and insert your username and password like this:
$result = http_request_get($this->url, 'username', 'password', $this->acceptInvalidCert, $this->timeout);

It should probably start working. Good luck and don't forget to undo that once you're done. :-)
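
For context, a username and password supplied to an HTTP client this way are normally sent using HTTP Basic authentication. The sketch below shows what that header looks like in plain PHP. It is not the Feeds code: the function name basic_auth_header is hypothetical, and whether http_request_get uses exactly this mechanism is an assumption worth checking against the library source.

```php
<?php
// Hypothetical helper (not part of Drupal or Feeds): builds the header that
// HTTP Basic authentication (RFC 7617) sends with each request.
function basic_auth_header($username, $password) {
  // The credentials are joined with ':' and base64-encoded, not encrypted.
  return 'Authorization: Basic ' . base64_encode($username . ':' . $password);
}

// Example credentials taken from RFC 7617.
echo basic_auth_header('Aladdin', 'open sesame'), "\n";
// prints: Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

A site that only logs users in through an HTML form will typically ignore this header, so this approach helps only if the server actually supports Basic authentication.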
