Currently, drupal_get_message() in common.inc does this:

$response = preg_split("/\r\n|\n|\r/", $response);
...
$result->data = implode('', $response);

This is bad news, particularly for parsing XML.
It can make valid XML invalid in the case where an EOL character is the sole
whitespace separator. My fix is to do this:

implode("\n", $response);
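A minimal sketch of the difference between the two joins, assuming PHP 7 or later. The sample XML string and the variable names are made up for illustration; only preg_split() and implode() are taken from the snippet above.

```php
<?php
// Illustrative input: the line breaks are the only whitespace between
// the tag name and its attribute, and between two words of text.
$xml = "<item\nid=\"1\">How\nwe're</item>";

// Same split pattern as the snippet quoted from common.inc.
$lines = preg_split("/\r\n|\n|\r/", $xml);

// Joining with an empty string glues the neighbours together.
$glued = implode('', $lines);

// Joining with "\n" keeps a whitespace separator between them.
$joined = implode("\n", $lines);

echo $glued, "\n";   // <itemid="1">Howwe're</item>
echo $joined, "\n";
```

Note that joining with "\n" restores a separator but normalizes every line ending to "\n", so the result is only byte-identical to the input when the input already used "\n".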

Comment    File                Size       Author
#5         drupal-http.diff    1.37 KB    Kjartan

Comments

mda-1’s picture

Title: drupal_get_message incorrectly removes EOL characters » drupal_http_request incorrectly removes EOL characters

Sorry, I meant drupal_http_request, not drupal_get_message.

mda-1’s picture

The above fix still does not preserve the original body. Here is a fix that does.
$orig_response = $response;
$response = preg_split("/\r\n|\n|\r/", $response);
...
$result->data = preg_replace("/^.*?\015\012\015\012/s", '', $orig_response);
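To see what that regular expression does in isolation, here is a small sketch assuming PHP 7 or later. The function name extract_body() and the sample response are hypothetical; the pattern is the one from the comment above.

```php
<?php
// Hypothetical helper: strips everything up to and including the first
// blank line (CRLF CRLF) that separates HTTP headers from the body.
function extract_body(string $raw): string {
    // The /s flag lets "." match newlines; "^" without /m anchors only
    // at the start, so a later CRLF CRLF inside the body is left alone.
    return preg_replace("/^.*?\015\012\015\012/s", '', $raw);
}

// Illustrative raw response, including a blank line inside the body.
$raw = "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\n\r\n"
     . "<a>How\nwe're</a>\r\n\r\ntail";

$body = extract_body($raw);
```

Because the body is cut from the untouched original string rather than rebuilt from split lines, every line break inside it survives exactly as sent.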

Dries’s picture

Please attach a patch instead.

Kjartan’s picture

Priority: Critical » Normal
File size: 1.37 KB

This patch should leave the original data alone and just extract the headers.
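The attached patch itself is not reproduced in this thread, so the following is only a sketch of the general idea it describes (split off the headers, leave the body untouched), assuming PHP 7 or later. The function parse_response() and its return keys are hypothetical and are not the committed code.

```php
<?php
// Hypothetical helper, not the committed patch: separates headers from
// the body and never rewrites the body.
function parse_response(string $raw): array {
    // Split once at the first blank line; the limit of 2 keeps the
    // whole body in a single piece even if it contains blank lines.
    $parts = explode("\r\n\r\n", $raw, 2);
    $head = $parts[0];
    $body = $parts[1] ?? '';

    // Only the header block is split into lines.
    $lines = preg_split("/\r\n|\n|\r/", $head);
    $status = array_shift($lines);

    $headers = [];
    foreach ($lines as $line) {
        // Split on the first colon only, so header values may contain colons.
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            $headers[trim($pair[0])] = trim($pair[1]);
        }
    }

    return ['status' => $status, 'headers' => $headers, 'data' => $body];
}

$result = parse_response(
    "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\n\r\ncreate\nan"
);
```

With this shape, text such as "create(hard return)an" reaches the XML parser with its line break intact instead of arriving as "createan".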

Dries’s picture

Committed to HEAD and DRUPAL-4-4. Thanks.

moazam’s picture

I've applied this patch to my 4.4.1 distribution but the problem is still not fixed.

For example, if you look at this feed:

http://blogs.sun.com/roller/rss/tucker

The RSS feed is fine and the content is fine...but when Drupal reads it
over, it screws up the text. "How we're" becomes "Howwe're" and "create
an" becomes "createan".

From discussing this with Todd Dailey, it is happening because there are hard returns in the feed. As Todd put it: "So, drupal is taking 'how(hard return)we're' and turning it into 'howwe're'."

I currently have an example of this at www.unixville.com. Look at the feed which is titled "On opening up Solaris".

This behavior can be seen when using the "Blog It" feature and also when
using the general "News Aggregator" page. I currently have filters set to
OFF (Do Not Filter). I see the same behavior even if I set filters to
"Strip Tags" or "Escape Tags".

mda-1’s picture

Have you deleted the existing items from the feed in your Drupal?
Do a "remove items", then an "update items".
I can read that feed fine in my patched version.

-mda

moazam’s picture

mda, you're the man! It all worked and actually fixed another problem I was having with munged href links! Thanks a million.

-Moazam

Anonymous’s picture