I'm trying to set up a wget script to log in to my Drupal site and check that everything is OK. However, I'm stuck at the login part. The watchdog log shows that I logged in successfully, and the login field in the users table shows the correct time, but the entry in the sessions table has uid 0. I've tried getting a session cookie first and passing that along with the request, but it doesn't help either.

$ wget -S -O 1.html --post-data='name=jon&pass=jon&op=Log%20in&form_id=user_login_block' http://drupal.localdomain/
--09:11:10--  http://drupal.localdomain/
           => `1.html'
Resolving drupal.localdomain... 127.0.0.1
Connecting to drupal.localdomain|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 302 Found
  Date: Tue, 13 Feb 2007 20:11:10 GMT
  Server: Apache/2.0.55 (Ubuntu) PHP/5.1.2
  X-Powered-By: PHP/5.1.2
  Set-Cookie: PHPSESSID=2fd7070b361d0c1b3e77207718ee2283; expires=Thu, 08 Mar 2007 23:44:30 GMT; path=/; domain=.drupal.localdomain
  Expires: Sun, 19 Nov 1978 05:00:00 GMT
  Last-Modified: Tue, 13 Feb 2007 20:11:10 GMT
  Cache-Control: no-store, no-cache, must-revalidate
  Cache-Control: post-check=0, pre-check=0
  Set-Cookie: PHPSESSID=2decc3f388a2eed7658890742ce3b330; expires=Thu, 08 Mar 2007 23:44:30 GMT; path=/; domain=.drupal.localdomain
  Location: http://drupal.localdomain/user/1
  Connection: close
  Content-Type: text/html; charset=utf-8
Location: http://drupal.localdomain/user/1 [following]
--09:11:10--  http://drupal.localdomain/user/1
           => `1.html'
Connecting to drupal.localdomain|127.0.0.1|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 403 Forbidden
  Date: Tue, 13 Feb 2007 20:11:10 GMT
  Server: Apache/2.0.55 (Ubuntu) PHP/5.1.2
  X-Powered-By: PHP/5.1.2
  Set-Cookie: PHPSESSID=2decc3f388a2eed7658890742ce3b330; expires=Thu, 08 Mar 2007 23:44:30 GMT; path=/; domain=.drupal.localdomain
  Expires: Sun, 19 Nov 1978 05:00:00 GMT
  Last-Modified: Tue, 13 Feb 2007 20:11:10 GMT
  Cache-Control: no-store, no-cache, must-revalidate
  Cache-Control: post-check=0, pre-check=0
  Content-Length: 4210
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=utf-8
09:11:10 ERROR 403: Forbidden.

Comments

fluke’s picture

The first problem is that you can't log in unless you already have a PHPSESSID cookie and send it with the form submission, so you need to load the login page first and save the cookie it sends. I'm not sure whether the second problem is a bug in Drupal, a bug in wget, or just wget obeying our explicitly set cookie header, but the 302 response after we log in sends three cookies:

Set-Cookie: PHPSESSID=fcd80d48901b026996f952c980dcb805; expires=Fri, 20 Apr 2007 01:44:52 GMT; path=/; domain=.localdomain
Set-Cookie: PHPSESSID=deleted; expires=Mon, 27-Mar-2006 22:11:31 GMT; path=/
Set-Cookie: PHPSESSID=f4f802870390db7179ccb64e1e73350c; expires=Fri, 20 Apr 2007 01:44:52 GMT; path=/; domain=.localdomain

The first is the cookie I sent; the third is the one for our logged-in session. But wget sends the first one again when it loads the main page (after the redirect), so we are still logged out and get that cookie back:

  Set-Cookie: PHPSESSID=fcd80d48901b026996f952c980dcb805; expires=Fri, 20 Apr 2007 01:44:52 GMT; path=/; domain=.localdomain

The solution then is to remember the third cookie and send that for future requests. Here's a shell script using wget and awk to log in and get the cookies. Watch out for line wrapping.

#!/bin/sh 

site=<url of site, ending with a slash>
name=<username>
pass=<password>

# get the initial session cookie
cookie=`wget -S -O /dev/null "$site" 2>&1 | awk '/Set-Cookie/{print $4}' FS='[ ;]'`

# do the login
cookie=`wget -S -O /dev/null --header="Cookie: $cookie" --post-data="name=$name&pass=$pass&op=Log%20in&form_id=user_login_block" "${site}?q=node&destination=node" 2>&1 | awk '/Set-Cookie/{print $4}' FS='[ ;]' | awk 'NR==3{print}'`

# load the main page
wget --header="Cookie: $cookie" "$site"
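The fixed NR==3 in the login step depends on exactly how many Set-Cookie headers the server happens to send. As a rough sketch (not part of the original script), the helper below instead keeps the last PHPSESSID that is not the "deleted" placeholder. The function name last_session_cookie is made up for illustration, and it assumes the logged-in session is always the last cookie set, as in the headers shown above.

```sh
# Hypothetical helper: read `wget -S` header output on stdin and print the
# last PHPSESSID=... pair that is not the "deleted" placeholder.
# Assumption: the logged-in session is the last cookie the server sets.
last_session_cookie() {
    awk '/Set-Cookie: PHPSESSID=/ {
        sub(/^.*Set-Cookie: */, "")      # strip everything up to the value
        sub(/;.*$/, "")                  # strip expires/path/domain attributes
        if ($0 != "PHPSESSID=deleted") last = $0
    }
    END { if (last != "") print last }'
}

# Intended use in the login step (needs a reachable site, so left commented):
# cookie=`wget -S -O /dev/null --header="Cookie: $cookie" \
#     --post-data="name=$name&pass=$pass&op=Log%20in&form_id=user_login_block" \
#     "${site}?q=node&destination=node" 2>&1 | last_session_cookie`
```

This replaces the two chained awk calls with one filter, so the same line should work whether the server sends two or three Set-Cookie headers.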

kiboro’s picture

I took the above script and can get the homepage while logged in (after changing NR==3 to NR==2). However, if I add -r to recurse my site, the pages I get are the logged-out versions.

Before anyone thinks I'm leeching, I'm trying to get an offline copy for a demo.

Can anyone possibly help?

asciikewl’s picture

I've managed to get it working this way, using wget's own cookie handling:

I'm also referencing the /user page, not a login block (not enabled on my site)

#!/bin/sh

site=http://your site url with a slash on the end/
name=ScriptUser
pass=somethingsecure
cookies=/tmp/cron-cookies.txt

wget -O /dev/null --save-cookies /tmp/ba-cookies.txt --keep-session-cookies --load-cookies $cookies "${site}user"
wget --keep-session-cookies --save-cookies $cookies --load-cookies $cookies -O /dev/null \
        --post-data="name=$name&pass=$pass&op=Log%20in&form_id=user_login" \
        "${site}user?destination=login_redirect"
wget --keep-session-cookies --save-cookies $cookies --load-cookies $cookies "${site}login_redirect"

Maybe with the wget cookie handling you can recurse the site.
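Following up on the recursion idea, here is a minimal sketch, assuming the login script above has already written a valid session to the cookie file. The function name mirror_logged_in and the 'logout' pattern are illustrative assumptions, not something tested in this thread. One thing to watch for is that a recursive crawl follows every link it finds, including the site's own logout link, which would end the session partway through. The --reject-regex option (available in wget 1.14 and later) is used here to skip such URLs.

```sh
# Hypothetical helper: mirror a site using a cookie file from an earlier login.
# $1 = site URL ending with a slash, $2 = cookie file written by the login step.
mirror_logged_in() {
    wget --recursive --page-requisites --convert-links \
        --load-cookies "$2" \
        --reject-regex 'logout' \
        "$1"
}

# Example call (needs a reachable site, so left commented):
# mirror_logged_in "$site" "$cookies"
```

Note that the pattern also skips any other URL that happens to contain "logout", so it may need narrowing for a particular site.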

naught101’s picture

/tmp/ba-cookies.txt < is this a typo?

Could you explain how the script works? Mostly, I don't understand the second and third wget lines.

If you want to avoid putting your password in a plaintext file, replace:

pass=somethingsecure
  with
read -s -p "password: " pass

and change the #!/bin/sh to #!/bin/bash
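If you would rather keep #!/bin/sh, here is a rough POSIX sketch of the same idea using stty instead of bash's read options. It also percent-encodes the password before it goes into --post-data, since a password containing & or = would otherwise break the form body. Both function names are made up for illustration, and encoding every byte is a deliberately blunt choice to keep the helper short.

```sh
# Hypothetical helper: prompt without echoing and store the answer in $pass.
# Assumes an interactive terminal; stty fails when stdin is not a tty.
prompt_password() {
    printf 'password: ' >&2
    stty -echo
    IFS= read -r pass
    stty echo
    printf '\n' >&2
}

# Hypothetical helper: percent-encode every byte of $1 (e.g. "a&b" -> "%61%26%62").
# -v stops od from collapsing repeated output lines into "*".
urlencode() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \t\n' | sed 's/../%&/g'
}

# Intended use (interactive, so left commented):
# prompt_password
# wget ... --post-data="name=$name&pass=$(urlencode "$pass")&op=Log%20in&form_id=user_login" ...
```

If the script is interrupted between the two stty calls, the terminal stays in no-echo mode until you run stty echo by hand.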

jbova’s picture

Thank you asciikewl. That works perfectly.

drupaledmonk’s picture

Thanks a lot, works flawlessly.

rafal.cygnarowski’s picture

Small update. For Drupal 8 one should use:

wget --keep-session-cookies --save-cookies $cookies --load-cookies $cookies -O /dev/null \
        --post-data="name=$name&pass=$pass&op=Log%20in&form_id=user_login_form" \
        "${site}user/login"

Cheers ;-)
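Since the main things that change between the variants posted in this thread are the form_id and the login path, the two-step login can be wrapped in one function. This is a sketch under that assumption; drupal_login and its argument order are made up for illustration and are not part of Drupal or wget.

```sh
# Hypothetical wrapper around the two wget calls used in this thread.
# $1 = site URL ending with a slash   $2 = username   $3 = password
# $4 = form_id   $5 = login path relative to the site   $6 = cookie file
drupal_login() {
    # Fetch the login page first so the cookie file holds a session cookie.
    wget -O /dev/null --keep-session-cookies --save-cookies "$6" "${1}${5}" || return 1
    # Post the login form with that cookie and save the logged-in session.
    wget -O /dev/null --keep-session-cookies --save-cookies "$6" --load-cookies "$6" \
        --post-data="name=$2&pass=$3&op=Log%20in&form_id=$4" \
        "${1}${5}"
}

# Example calls (need a reachable site, so left commented):
# drupal_login "$site" "$name" "$pass" user_login_form user/login "$cookies"   # Drupal 8
# drupal_login "$site" "$name" "$pass" user_login user "$cookies"              # earlier script
```

Later requests can then reuse the same file with --load-cookies.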

nshenry03’s picture

Thanks fluke, your solution was just what I needed to get started with a similar issue!

kim.pepper’s picture

I assume from the URLs that this is for Drupal 5? Does anyone have an example of how this would work for Drupal 6?

Kim

kim.pepper’s picture

It was just the login_redirect bit that was giving me grief. Once I removed that, all was ok.

K

colan’s picture