Seems to die when it hits output such as the above in the .xml import file. Trying to debug; will add a new 'if' to handle and see; more later :)
| Comment | File | Size | Author |
|---|---|---|---|
| #12 | wordpress.2010-01-17-taussig.xml_.tar_.gz | 25.13 KB | 1kenthomas |
| #11 | wordpress.2010-01-17-mtucker.xml_.tar_.gz | 14.15 KB | 1kenthomas |
Comments
Comment #1
1kenthomas commented
I would say this was my problem (setup), but I can do a dumpXmlReader and it parses the whole file. Therefore I don't know why the existing code causes a hang, but it does :|
Rewriting as case statements.
Comment #2
1kenthomas commented
Looks to be an error in how categories are parsed: @
next($wordpress_import->data['categories']);
Well maybe. Still rewriting.
Comment #3
1kenthomas commented
Ditto tag names. You can't execute a read when you don't know what's coming next :)
Comment #4
1kenthomas commented
Etc... the current code does a while(read()) and then executes further reads inside that loop, assuming it knows what it will read next... bad bad bad... crashes XMLReader.
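A minimal sketch of the single-loop pattern described above, not the module's actual code: one `while ($reader->read())` loop that dispatches on the element name with a `switch`, instead of calling `read()` again inside the loop on the assumption of what comes next. The function name `wxr_collect()` and the two element names handled are illustrative assumptions; `XMLReader::XML()`, `read()`, `readString()` and the `nodeType` / `name` properties are standard PHP XMLReader API.

```php
<?php
// Hypothetical helper (not from the module): collects category and tag
// names from a WXR-like XML string.
function wxr_collect($xml) {
  $found = array('categories' => array(), 'tags' => array());
  $reader = new XMLReader();
  $reader->XML($xml);
  // The only read() call: every node passes through the switch below.
  while ($reader->read()) {
    if ($reader->nodeType !== XMLReader::ELEMENT) {
      continue;
    }
    switch ($reader->name) {
      case 'wp:cat_name':
        // readString() returns the element's text without advancing the cursor.
        $found['categories'][] = $reader->readString();
        break;

      case 'wp:tag_name':
        $found['tags'][] = $reader->readString();
        break;

      // Any other element is simply skipped; nothing is assumed about order.
    }
  }
  $reader->close();
  return $found;
}
```

Because the loop never advances the cursor on its own assumptions, an unexpected element only means one more pass through the `switch`.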
Comment #5
1kenthomas commented
... ok, so even if the read is rewritten & doesn't crash XMLReader, it goes through the importing process and creates no nodes...
more to come :)
Comment #6
finex commented
I confirm the bug :-(
Comment #7
lavamind commented
What version of PHP are you using?
Also, could you please provide a sample export file on which the problem manifests?
Comment #8
lavamind commented
Comment #9
1kenthomas commented
Hi, sorry, buried in other projects.
This was with 5.2.6 and 5.3.1. I thought there might be an issue with my version of XML Parser, so I also switched those in/out.
I'll provide a sample file ASAP; what code rewrites I did do, got me farther (and clearer) but did not resolve (alas).
Thanks for your reply & your help!
-Ken
Comment #10
finex commented
I've used the following PHP version:
PHP 5.2.6-1+lenny4 with Suhosin-Patch 0.9.6.2 (cli) (built: Nov 22 2009 02:38:03)
I cannot provide an example file. Anyway, I solved it by using the old import version (1.1).
Comment #11
1kenthomas commented
Attached as tar.gz.
This file validates (passes test around l. 261, import_read_wxr) but causes WSOD later. One post found.
PHP Version 5.2.6-3ubuntu4.5, though I've tried on PHP up to 5.3.1.
Thanks again-- other XML as I get a chance.
Comment #12
1kenthomas commented
This finds only one post (though there are multiple) and also WSODs w/out import.
I will try exporting from a different WP environment, just in case...
Comment #13
finex commented
edit: I replied to the wrong thread. Sorry.
Comment #14
lavamind commented
There are two problems I found in these XML files.
First, there are atom:link elements. For some reason, WordPress includes these tags in its output without declaring the "atom" namespace, therefore producing malformed XML. For now, try removing all tags beginning with atom:link from your export file.
Secondly, in the "mtucker" example, there's an XML error on line 618: WordPress included the & (ampersand) character as an XML value, but strings containing that character should be enclosed as CDATA, or escaped using &amp;. Try correcting that mistake and your data should import properly.
I will try to see if we can detect these XML errors and refuse to import if any are found.
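The two manual corrections suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions, not part of the module: `wxr_preclean()` is a hypothetical name, the atom:link elements are assumed to be self-closing, and ampersands inside CDATA sections are left alone because they are already legal there. Review the output before importing it.

```php
<?php
// Hypothetical helper (not from the module): returns a cleaned copy of a
// WXR export string.
function wxr_preclean($xml) {
  // Drop self-closing <atom:link ... /> elements (assumed form), since their
  // namespace prefix is not declared in the export.
  $xml = preg_replace('#<atom:link\b[^>]*/>\s*#', '', $xml);

  // Split on CDATA sections; with one capture group, odd indexes hold the
  // CDATA sections themselves and even indexes hold everything else.
  $parts = preg_split('/(<!\[CDATA\[.*?\]\]>)/s', $xml, -1, PREG_SPLIT_DELIM_CAPTURE);
  foreach ($parts as $i => $part) {
    if ($i % 2 === 0) {
      // Escape any "&" that does not already start an entity or a
      // character reference such as &amp; &#38; or &#x26;.
      $parts[$i] = preg_replace('/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/', '&amp;', $part);
    }
  }
  return implode('', $parts);
}
```

Working on a copy of the export file keeps the original available for comparison if the import still fails.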
Comment #15
lavamind commented
Comment #16
lavamind commented
Okay, I tracked it down to this WordPress bug report: http://core.trac.wordpress.org/ticket/9633
In summary, this problem has only been fixed recently in the WordPress export code, and the fix will be released with version 3.0.
Let me reiterate that this is a problem with WordPress export, as in some cases it produces invalid XML. But what is really frightening is that the developers don't seem to really care about generating proper XML, since their own WXR importer is a dumb regexp script.
I'll need to make some modifications to the way XMLReader is used so that XML errors are better tolerated...
Comment #17
lavamind commented
Fixed in 6.x-2.x-dev.
First, the import now aborts if XMLReader doesn't reach the end of the WXR file, which most likely indicates XML problems.
Secondly, the README now documents possible problem causes and solutions.
Thanks!
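For anyone who wants to check an export file before importing, one way to test whether XMLReader can get through it is sketched below. This is an assumption about how such a check could look, not the code committed to 6.x-2.x-dev: `wxr_is_readable()` is a hypothetical name, and the check simply walks the whole document while collecting libxml errors via `libxml_use_internal_errors()` and `libxml_get_errors()`.

```php
<?php
// Hypothetical helper (not from the module): TRUE if XMLReader can walk
// $xml to the end without libxml reporting any error.
function wxr_is_readable($xml) {
  // Collect libxml errors instead of emitting them as PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  libxml_clear_errors();

  $reader = new XMLReader();
  $reader->XML($xml);
  // read() returns FALSE both at the end of the document and on a parse
  // error, so the collected errors are what tell the two cases apart.
  // The @ hides the warning some PHP versions raise on a failed read.
  while (@$reader->read()) {
    // Just walk the document; nothing is imported here.
  }
  $errors = libxml_get_errors();
  $reader->close();

  // Restore the previous error-handling state.
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  return count($errors) === 0;
}
```

A file with an unescaped ampersand in a text value, like the one reported on line 618 of the "mtucker" export, should fail this check.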
Comment #19
nikitas commented
I had the same problem while exporting from WordPress and importing to Drupal.
I used the 6.x-2.x-dev version and removed the atom links plus these xml items from each post, and everything worked just fine!
So if the .xml can't be imported, just remove those items from your posts!