Migrating from PHPweblog

We recently successfully imported thousands of nodes from a phpweblog cms into drupal. Also, the stories were in portugese, encoded in iso-8859-1, so in order to preserve the accents, the db also had to be run through 'iconv', a converter utility, in order to switch it to utf-8 (drupal's standard encoding.)

Here were our basic steps in case this will help you...

make the backup:

mysqldump -u {db_username} -ppassword {db_name} > weblog.sql

copy the file over to your test site location, and run iconv:

iconv -f iso-8859-1 -t utf8 weblog.sql > weblog_utf8.sql

I added all the old tables right into the drupal tables (although that might not have been the best move speed wise, and all the old tables will have to be dropped when you're done). But it looked like this:

mysql -u {db_username} -ppassword {db_name} < weblog_utf8.sql

The phpweblog stories are all in the table T_Stories.

drupal     phpweblog
node       story
-------    ---------
nid          CAST(REPLACE(Rid,'/','') AS UNSIGNED)
type        'story'
title         Heading
uid          '2'
status      '1'
created     unix_timestamp(Birthstamp)
changed    unix_timestamp(Birthstamp)
comment    '2'
promote     '0'
moderate   '0'
teaser      Summary
body        Content
revisions    ''
sticky       '0'
format      '1'

this mapping converts the phpweblog story id (Rid) into an id compatible
with drupal (nid). For example 05/03/08/179851 --> 50308179851. this
only works if you make "nid" a bigint and increase the length to at
least 15. the only purpose of doing this instead of letting drupal
assign new node ids is that this way we can make it so that the old URLs
(which might have tons of links to them) can still be used with a little
mod_rewrite alchemy:

RewriteCond %{QUERY_STRING} ^story=(.*)/(.*)/(.*)/(.*)$
RewriteRule ^.*$ /index.php?q=node/%1%2%3%4 [L]

So, in mysql, the copy over looked like this:

INSERT INTO node
(nid,type,title,uid,status,created,changed,comment,promote,
moderate,teaser,body,revisions,sticky,format)
SELECT CAST(REPLACE(Rid,'/','') AS
UNSIGNED),'story',Heading,'2','1',unix_timestamp(Birthstamp),
unix_timestamp(Birthstamp),'2','0','0',Summary,Content,'','0','1'
FROM T_Stories;

Also, in the above code, the stories will all appear as authored by 'uid 2', I created a user named "archiver" (who happened to have a uid of 2.) So, to make all the old posts appear from a different user, here is the mysql:

UPDATE node SET uid={whatever number} WHERE uid=2;

And there they all were! If you run cron.php, your site will be re-indexed, and they're all searchable!

Becareful of varchar vs. Int

arthurf - September 23, 2005 - 19:00

Thanks for this tip. Origionally this code did not work for me. Looking at the tables, Rid is a varchar, nid is an int. By changing nid over to a varchar, i was able to import the table into the node table and then switch the nid back over to a bigint.

thanks!

 
 

Drupal is a registered trademark of Dries Buytaert.