I've got an RSS feed from a WordPress blog that only displays a headline and a summary, but the feed also contains a section with the full text of the blog posts.

I want to import the feed and map that section to the body field of my nodes.

How do I do that? Do I need to write a parser for feeds?

Comments

alex_b’s picture

Status: Active » Postponed (maintainer needs more info)

This *may* be fixed by patching the parser you're using.

- What parser are you using?
- Can you post an example feed? Specifically, I don't know what namespace "content:encoded" is in. What is the namespace of 'content' here?

bflora’s picture

Hi, Alex!

Here's the feed: http://www.bearsbeat.com/blog/feed/ (check the source to see what I mean).

For parser, I'm using the common syndication parser. Thanks!

stefan81’s picture

Status: Postponed (maintainer needs more info) » Active

Hi

I have the same issue.

I can import this source through the Common syndication parser (Parse XML feeds in RSS 1, RSS 2 and Atom format).
But I have no mapping source for the full content, which sits in <content:encoded> in the sample below.
I would be most grateful if someone could point me in the right direction.

Here's a sample feed:


<?xml version="1.0" encoding="utf-8" ?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" 
			xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" 
			xmlns:cc="http://web.resource.org/cc/" xml:lang="ja">
<channel rdf:about="http://carbontras.blog136.fc2.com/?xml">
<title>Tras News</title>
<link>http://carbontras.blog136.fc2.com/</link>
<description></description>
<dc:language>ja</dc:language>
<items>
<rdf:Seq>
<rdf:li rdf:resource="http://carbontras.blog136.fc2.com/blog-entry-170.html" />
<rdf:li rdf:resource="http://carbontras.blog136.fc2.com/blog-entry-169.html" />
<rdf:li rdf:resource="http://carbontras.blog136.fc2.com/blog-entry-168.html" />
<rdf:li rdf:resource="http://carbontras.blog136.fc2.com/blog-entry-167.html" />
<rdf:li rdf:resource="http://carbontras.blog136.fc2.com/blog-entry-166.html" />
</rdf:Seq>
</items>
</channel>
<item rdf:about="http://carbontras.blog136.fc2.com/blog-entry-1701.html">
<link>http://carbontras.blog136.fc2.com/blog-entry-1701.html</link>
<title>Rd.15 Japan GP  02 OCTOBER</title>
<description> Rd.15 Japan GP  02 OCTOBERPhoto:RIZLA+.SUZUKI MotoGP 10月2日、ツインリンクもてぎで開催されたMotoGP世界選手権第15戦、リズラスズキのアルバロ・バウティスタは、一時は3番手まで浮上してモトGP自己最高位を走行したが、クラッシュを喫して無念のリタイアとなった。  決勝8番グリッドスタートのバウティスタは、オープニングラップで前を走っていた2台が接触転倒を起こしたことで6番手に浮上。
 </description>
<content:encoded>
<![CDATA[ Rd.15 Japan GP  02 OCTOBER<br />Photo:RIZLA+.SUZUKI MotoGP<br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo001.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo001.jpg" alt="mo001.jpg" border="0" width="448" height="298" /></a><br /><br /> 10月2日、ツインリンクもてぎで開催されたMotoGP世界選手権第15戦、リズラスズキのアルバロ・バウティスタは、一時は3番手まで浮上してモトGP自己最高位を走行したが、クラッシュを喫して無念のリタイアとなった。 <br /> 決勝8番グリッドスタートのバウティスタは、オープニングラップで前を走っていた2台が接触転倒を起こしたことで6番手に浮上。さらに2名のライダーがジャンプ・スタートによるピットスルー・ペナルティを科せられたため、4番手へと順位を上げた。そこへトップを走行していたC・ストーナー(ホンダ)がコースアウト、バウティスタは3番手となり、表彰台を狙う状況となった。6ラップにわたりN・ヘイデン(ドカティ)からのアタックをかわして3番手をキープ、やがてコースアウトから復帰し追い上げてきたストーナーにパスされ4番手となったものの、モトGP自己最高位フィニッシュが期待された。しかしながら13ラップの最終セクションでフロントがスリップ、ハイスピードでのクラッシュを喫した。幸いバウティスタに怪我はなかったものの、リタイアを余儀なくされた。 <br /> 当初4月に開催予定だった日本グランプリは、3月に起きた大震災により延期され、この日の開催を迎えた。観客動員数は3万4000人を越え、訪れた人は皆、ライダーを応援すると共に震災復興について支援した。レースを制したのはD・ペドロサ(ホンダ)、2位に前年チャンピオンのJ・ロレンソ(ヤマハ)、現ポイントリーダーのC・ストーナー(ホンダ)が3位に入った。   <br /> リズラスズキは、一週間のオフののち、環太平洋3大会(日本・オーストラリア・マレーシア)の2戦目となるフィリップアイランドへ向かう。 <br />Rizla Suzuki’s &Aacute;lvaro Bautista crashed out of this afternoon’s Japanese Grand Prix when he was fighting for the best MotoGP finish of his career.<br /><br />Bautista started from eighth on the grid and found himself up into sixth early on after narrowly missing two riders that collided in front of him on the first lap. He was then promoted to fourth as two other riders were forced to complete a ride-through penalty for jumping the start, and then almost immediately found himself in a podium position when race-leader Casey Stoner ran off the track. Bautista held third position for six laps and fought off an attack from Nicky Hayden, before Stoner re-grouped and caught and passed Suzuki’s Spanish racer. Bautista looked comfortable in fourth and began to push to secure his best-ever MotoGP finish, but lost the front near the end of the 13th lap and crashed at high-speed. 
He walked away uninjured, but bitterly disappointed.<br />Today’s Japanese Grand Prix was a re-scheduled race after the initial date was cancelled due to the earthquake and tsunami that struck the country earlier in the year. A crowd of just over 34,000 showed their support for both the MotoGP racers and the people of Japan. The race was won by Dani Pedrosa, with current World Champion Jorge Lorenzo second. Current championship leader Stoner took the final place on the podium.<br />Rizla Suzuki now has one weekend off before heading over the equator to Phillip Island in Australia for the second leg in a trio of Pacific races.<br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo002.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo002.jpg" alt="mo002.jpg" border="0" width="448" height="299" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo003.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo003.jpg" alt="mo003.jpg" border="0" width="294" height="448" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo004.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo004.jpg" alt="mo004.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo005.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo005.jpg" alt="mo005.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo006.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo006.jpg" alt="mo006.jpg" border="0" width="448" height="296" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo007.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo007.jpg" alt="mo007.jpg" border="0" width="448" 
height="299" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo008.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo008.jpg" alt="mo008.jpg" border="0" width="448" height="294" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo009.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo009.jpg" alt="mo009.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo010.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo010.jpg" alt="mo010.jpg" border="0" width="448" height="298" /></a><br /><br /><br />・・・RACE DATA<br />■2011MotoGP第15戦日本GP<br />■開催日:2011年10月2日(日)決勝結果<br />■開催地:もてぎ/日本(4.801km)<br />■周回数:24周(115.224 km)<br />■コースコンディション:ドライ<br />■気温:19度 ■路面温度:31度<br />■PP:C・スト―ナー(1分45秒267/ホンダ)<br />■FL:D・ペドロサ(1分46秒090/ホンダ)<br />Round: Round: 15- Grand Prix of Japan(Classification after 24 laps = 115.224 km)<br />Circuit: Motegi, Japan<br />Date: 2 October 2011<br />Circuit Length: 4.801 km<br />Practice condition: Dry<br />Air: 19 C<br />Ground: 29 C<br />PP: C. Stoner(1'45.267/Honda)<br />FL: D. 
Pedrosa(1'46.090/Honda)<br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo011.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo011.jpg" alt="mo011.jpg" border="0" width="448" height="299" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo012.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo012.jpg" alt="mo012.jpg" border="0" width="298" height="448" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo013.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo013.jpg" alt="mo013.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo014.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo014.jpg" alt="mo014.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo015.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo015.jpg" alt="mo015.jpg" border="0" width="448" height="293" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo016.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo016.jpg" alt="mo016.jpg" border="0" width="448" height="299" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo017.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo017.jpg" alt="mo017.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo018.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo018.jpg" alt="mo018.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo019.jpg" target="_blank"><img 
src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo019.jpg" alt="mo019.jpg" border="0" width="448" height="298" /></a><br /><br />・・・RESULT<br />順位	ライダー	チーム	マシン	タイム <br />1	D・ペドロサ	Repsol Honda Team 	Honda	42'47.481<br />2	J・ロレンソ	Yamaha Factory Racing 	Yamaha	+7.299<br />3	C・スト―ナー	Repsol Honda Team 	Honda	+18.380<br />4	M・シモンチェリ	San Carlo Honda Gresini 	Honda	+23.550<br />5	A・ドビツィオーゾ	Repsol Honda Team	Honda	+23.691<br />6	B・スピース	Yamaha Factory Racing	Yamaha	+37.604<br />7	N・ヘイデン	Ducati Team 	Ducati	+39.167<br />8	C・エドワーズ	Monster Yamaha Tech 3 	Yamaha	+45.023<br />9	青山博一 	San Carlo Honda Gresini 	Honda	+49.074<br />10	R・ド・ピュニエ	Pramac Racing Team	Ducati	+59.022<br />11	C・クラッチロー	Monster Yamaha Tech 3 	Yamaha	+1'13.964<br />12	秋吉耕佑	LCR Honda MotoGP	Honda	+1'21.709<br />13	伊藤 真一	Honda Racing Team	Honda	+1'26.381<br /><br />Pos.	Rider	Team	Machine	Lap<br />1	D. Pedrosa	Repsol Honda Team 	Honda	42'47.481<br />2	J. Lorenzo	Yamaha Factory Racing 	Yamaha	+7.299<br />3	C. Stoner	Repsol Honda Team 	Honda	+18.380<br />4	M. Simoncelli	San Carlo Honda Gresini 	Honda	+23.550<br />5	A. Dovizioso	Repsol Honda Team	Honda	+23.691<br />6	B. Spies	Yamaha Factory Racing	Yamaha	+37.604<br />7	N. Hayden	Ducati Team 	Ducati	+39.167<br />8	C. Edwards	Monster Yamaha Tech 3 	Yamaha	+45.023<br />9	H. Aoyama	San Carlo Honda Gresini 	Honda	+49.074<br />10	R. De Puniet	Pramac Racing Team	Ducati	+59.022<br />11	C. Crutchlow	Monster Yamaha Tech 3 	Yamaha	+1'13.964<br />12	K. Akiyoshi	LCR Honda MotoGP	Honda	+1'21.709<br />13	S. 
Ito	Honda Racing Team	Honda	+1'26.381<br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo020.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo020.jpg" alt="mo020.jpg" border="0" width="448" height="297" /></a><br /><br /><br />アルバロ・バウティスタ DNF<br />「期待された結果が出せずに残念だ。昨日より気温が低く、オープニングラップからできるだけ前に出て行きたかったので、ソフトのリヤタイヤを選んだ。スタートは、ジャンプスタートしたライダーに惑わされて、レッドシグナルが消えた時、少し出遅れた。その後第2コーナーでクラッシュがあって順位を上げた。数ラップして、ジャンプスタートしたライダー達がピットに入ったので、3番手になったが、これは自分の実力というよりは運良く順位が上がっただけだ。それからケーシー(ストーナー)にパスされ、後ろにいたアンドレア(ドヴィツィオーゾ)との距離と残り集回数を考えて、アンドレアとの差をキープするためにケーシーについていこうとした。けれど最終コーナーの手前でフロントを滑らせクラッシュして、レースは終わった。一生懸命取り組んで、フリー走行、予選と徐々に調子を上げ、決勝は良いところを走っていたのに、最後まで運は味方してくれなかった。チームにとっても残念な結果で、申し訳なく思っている。スズキのホームGPで良いレースをして結果を出したかっただけに、スズキにも謝りたい。この週末にポジティブだった点を次のレースでも活かして、結果を出せるよう頑張りたい。」<br /><br />&Aacute;lvaro Bautista:<br />“This was for sure not the result we expected today! I chose the softer rear tyre for the race because the conditions today were colder than yesterday and because for the first laps I needed to be as fast as possible. I didn’t get a good start because some riders jump-started and I was a bit confused by them and when the red light went off I was little bit late. There was a crash on the second corner and I then found myself in a good position. A few laps later some riders entered the pits because they did a jump start and I was then in third, but I knew it was not my real position! When Casey went past me and I saw what the distance was between me and Andrea &#8211; who was the next rider &#8211; and how many laps were left, I tried to follow Casey and keep the gap to Andrea. Near the last corner I lost the front and crashed and that was the end of the race for me. We worked very hard this weekend and improved in all the sessions, and in the race we were in a good position. 
Today the luck was just not with us!<br />“I am sorry for the whole team because this result would have meant so much to them, and I’m sorry for Suzuki at its home Grand Prix because we wanted to make a good race and a positive result. I have to keep the good things from this weekend in my mind and in the next race we have to keep doing the same things we have done here and keep our heads up!”<br />「スズキのホ-ムGPで、4番手を走っていてクラッシュというのは、ただただ残念としか言えない。しかしながら、この週末に良かった点についても評価したい。寒く曇った気象条件下でGSV-Rが素晴らしいパフォーマンスを発揮したこと。予選でアルバロが自身ベストタイの8位と健闘したこと。我々もまた誠心誠意レースに取り組んだ。ホンダやヤマハとのマシン比較はさておき、今日のように4位を狙える状況もある。フィニッシュまであと10ラップ余りとなった時点で、後方のドヴィツィオーゾとの差は7秒だけで、かなり攻めなければ4番手の位置をキープできないことをアルバロはわかっていた。5位や6位でもいいとは考えないし、レースとはそういうものだ。クラッシュはハイスピードで起きたが怪我はないので、次のレースを期待したい。」<br /><br />Paul Denning &#8211; Team Manager:<br />“When your rider crashes out of fourth position at Suzuki’s home Grand Prix it can only be described as disappointing! However, we have to take the positives from this weekend - the GSV-R performed well in cold and overcast conditions, &Aacute;lvaro achieved his equal best qualifying and we were running very strongly in the race itself. We don’t quite have the speed of the Factory Hondas or Yamahas &#8211; at the moment &#8211; but apart from that we can race with anyone in the field, and when the opportunity presents itself &#8211; like today &#8211; fourth was definitely on the cards.<br />“&Aacute;lvaro’s not stupid and with just over 10 laps to go, and Dovizioso only seven seconds behind, he knew he had to push hard to keep fourth place - fifth or sixth wasn’t going to do it as far as he was concerned. 
That’s racing; it was a high-speed crash and &Aacute;lvaro’s completely uninjured, so let’s move on and look forward to the next Grand Prix.”<br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo025.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo025.jpg" alt="mo025.jpg" border="0" width="298" height="448" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo021.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo021.jpg" alt="mo021.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo022.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo022.jpg" alt="mo022.jpg" border="0" width="298" height="448" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo023.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo023.jpg" alt="mo023.jpg" border="0" width="448" height="298" /></a><br /><br /><a href="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo024.jpg" target="_blank"><img src="http://blog-imgs-24-origin.fc2.com/c/a/r/carbontras/mo024.jpg" alt="mo024.jpg" border="0" width="298" height="448" /></a><br /><br /> ]]>
</content:encoded>
<dc:subject>Trasからのお知らせ</dc:subject>
<dc:date>2011-10-09T18:18:35+09:00</dc:date>
<dc:creator>Carbon Tras</dc:creator>
<dc:publisher>FC2-BLOG</dc:publisher>
</item>


stefan81’s picture

I had a look into common_syndication_parser.inc

It seems to be implemented already?

    // Declaratively define mappings that determine how to construct the result object.
    $item = _parser_common_syndication_RDF10_item($rdf_data, array(
      'title'       => array('rss:title', 'dc:title'),
      'description' => array('rss:description', 'dc:description', 'content:encoded'),
      'url'         => array('rss:link', 'rdf:about'),
      'author_name'      => array('dc:creator', 'dc:publisher'),
      'guid'        => 'rdf:about',
      'timestamp'   => 'dc:date',
      'tags'        => 'dc:subject'
    ));

and

    if (isset($news['description'])) {
      $body = "{$news['description']}";
    }
    // Some sources use content:encoded as description i.e.
    // PostNuke PageSetter module.
    if (isset($news['encoded'])) {  // content:encoded for PHP < 5.1.2.
      if (strlen($body) < strlen("{$news['encoded']}")) {
        $body = "{$news['encoded']}";
      }
    }
    if (isset($content['encoded'])) { // content:encoded for PHP >= 5.1.2.
      if (strlen($body) < strlen("{$content['encoded']}")) {
        $body = "{$content['encoded']}";
      }
    }

On line 293 I changed

      'description' => array('rss:description', 'dc:description', 'content:encoded'),

to

      'description' => array('content:encoded'),

Now it picks up content:encoded, so it basically works.

Maybe this is a candidate for a minor patch to fix the flaw?
Unfortunately I am not experienced enough for a serious attempt.

dman’s picture

The fix you applied is in mostly the right place ...

The problem here is that _parser_common_syndication_RDF10_property goes looking for any one of 'rss:description', 'dc:description', 'content:encoded' to place into the body ... because throughout different feeds, these are usually equivalent.
YOUR SAMPLE FEED has TWO of these in it - both rss:description AND content:encoded.
The data extraction function finds the first one and returns it immediately. It can't put both values into one target without overwriting, and you never want to concatenate.

(this behavior is the reverse of what happens in other cases of conflict in feeds - I've also seen where the LAST valid match wins with an overwrite)

So, your state is sort of ambiguous. In this case the description is empty and therefore useless to you, so your expected behavior is to carry on looking for the next candidate. See #1092652: Possible to allow for blank fields and not overwriting existing data?
But ... there is the generic edge case where sometimes a feed actually does need to overwrite an existing value with a null value (though damned if I can think of an example where that would actually be the desired result).
OTOH, maybe it makes sense to just change the order of the fields that get scanned to apply some logical weighting.
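
To make the "carry on looking for the next candidate" idea concrete, here is a minimal sketch. It is not the code from common_syndication_parser.inc: the helper name first_non_empty_property() and the flat array of item properties are assumptions made purely for illustration. It walks the candidates in order and skips any that are missing or contain only whitespace.

```php
<?php
/**
 * Hypothetical helper, not part of Feeds: return the value of the first
 * candidate property that is present and not blank.
 */
function first_non_empty_property(array $properties, array $candidates) {
  foreach ($candidates as $name) {
    // Skip candidates that are missing or contain only whitespace.
    if (isset($properties[$name]) && trim((string) $properties[$name]) !== '') {
      return $properties[$name];
    }
  }
  // No candidate had usable content.
  return NULL;
}

// Assumed, simplified item data: property name => string value.
$item = array(
  'rss:description' => '   ',
  'content:encoded' => '<p>Full text of the post.</p>',
);

$candidates = array('rss:description', 'dc:description', 'content:encoded');

// The blank rss:description is skipped, so content:encoded is used.
$body = first_non_empty_property($item, $candidates);
```

Reordering the fields, as suggested above, would just mean passing the candidates in a different order, for example putting 'content:encoded' first so the full text wins whenever it is present.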

jelo’s picture

Status: Active » Closed (works as designed)

I just tried this out in version 7 and it works fine. Feeds imports from either one. However, the issue I ran into is that the feed does not validate if no description is provided. According to some sources, including the feed validator tool, a plain text summary is expected in the description field AND a full text encoded version in the content:encoded field. Apparently, it is not okay to ONLY provide content:encoded. At least this is what I found. Maybe someone with a better understanding of the existing standards could clarify...

If that is indeed true, Feeds should maybe not treat description and content:encoded as synonymous, but should instead have a mapping option for each, e.g. map description to a summary/teaser field and content:encoded to body.
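
As a rough illustration of keeping the two apart, the sketch below reads a small, made-up RSS 1.0 item with PHP's SimpleXML and stores description and content:encoded under separate keys. This is not how Feeds does it internally; the 'summary' and 'body' keys and the example.com URLs are assumptions. The namespace URI for the content prefix is the one declared in the sample feed earlier in this thread.

```php
<?php
// A small, made-up RSS 1.0 document modelled on the sample in this thread.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <item rdf:about="http://example.com/post-1">
    <title>Example post</title>
    <link>http://example.com/post-1</link>
    <description>Plain text summary.</description>
    <content:encoded><![CDATA[<p>Full <em>HTML</em> body.</p>]]></content:encoded>
  </item>
</rdf:RDF>
XML;

// LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
$feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);

$records = array();
foreach ($feed->item as $item) {
  // Elements with the content: prefix live in their own namespace.
  $content = $item->children('http://purl.org/rss/1.0/modules/content/');
  $records[] = array(
    'title' => (string) $item->title,
    // Hypothetical target keys: summary/teaser versus full body.
    'summary' => trim((string) $item->description),
    'body' => trim((string) $content->encoded),
  );
}
```

With the two values kept separate like this, a mapper could send 'summary' to a teaser field and 'body' to the node body instead of picking one of them.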

In my case I control both the source feed and the destination, so I have abandoned content:encoded and put everything I need to transfer into the description field.

Given the age of this thread and that it appears to work as intended, I changed the status.