On some sites I use a lot of aggregator feeds. Not all of them are well-formed, which results in ugly output and/or incorrect links. I think it would be better to add some checking of the input to prevent bad output. I know the objection: the goal of the module is to retrieve the input and present it as-is. But most Drupal users don't know how to turn that output into quality output when necessary.
The best examples I have found so far:
1. Malformed titles. My language (Dutch) has several accented characters, and the titles often arrive in the wrong byte format (character encoding). The title function can't handle characters in the wrong byte format. My transformation array (which variant to use depends on where it is applied):
from: “, ”, ‘, ’, •, –, , ', &#
into: ", ", ', ', ., -, [blank], ', &#
or: “, ”, ‘, ’, •, –, [blank], ', &#
Maybe there is a PHP function for this conversion.
2. Use of: <a href="../
This leads to references to my own site instead of to the original site.
3. Wrong href= and img src= values: the item's link is missing in front of a root-relative path
<a href="/' translated into <a href="'. $item->link .'/'
<a href='/' translated into <a href=\''. $item->link .'/'
<img src="/' translated into <img src="'. $item->link .'/'
<img src='/' translated into <img src=\''. $item->link .'/'
4. Empty <a ...></a>, mostly intended as a link to the top of the page
" ></a> translated into " >#</a>
"></a> translated into " >#</a>
5. Missing http://
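For item 1, the substitutions could be sketched roughly as below. PHP's strtr() accepts a replacement array, which may be the conversion function asked about above. The function name `example_normalize_title` is hypothetical, the map covers only the clearly readable entries of the "from"/"into" lists, and the Windows-1252 fallback is an assumption about where the wrong bytes come from (it needs the mbstring extension).

```php
<?php
// Hypothetical helper, not part of the aggregator module.
function example_normalize_title(string $title): string {
  // Assumption: titles that are not valid UTF-8 were sent as Windows-1252.
  if (!mb_check_encoding($title, 'UTF-8')) {
    $title = mb_convert_encoding($title, 'UTF-8', 'Windows-1252');
  }
  // Typographic characters mapped to plain ASCII, following the lists above.
  $map = [
    "\u{201C}" => '"',  // left double quotation mark
    "\u{201D}" => '"',  // right double quotation mark
    "\u{2018}" => "'",  // left single quotation mark
    "\u{2019}" => "'",  // right single quotation mark
    "\u{2022}" => '.',  // bullet
    "\u{2013}" => '-',  // en dash
  ];
  return strtr($title, $map);
}
```

Accented letters such as é pass through unchanged; only the characters listed in the map are replaced.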
Attached is an implementation that corrects these problems. Feel free to change the function (I only have a basic knowledge of regular expressions). Currently I call this function in aggregator-item.tpl.php after creating an array with the fields. But, as said before, I think it is better to change the input items before saving them in the database, or before sending them to the output template.
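The attachment itself is not reproduced here. As a rough illustration of items 2 to 5, the following is a minimal sketch and not the attached function. Both function names are hypothetical. Unlike the translations listed in item 3, it prefixes only the scheme and host taken from the item's link rather than the whole `$item->link`, and it treats a URL starting with "www." as the "missing http://" case; both choices are assumptions.

```php
<?php
// Hypothetical helpers, not part of the aggregator module.

// Resolve one href/src value against the item's link.
function example_resolve_url(string $url, string $item_link): string {
  $base = parse_url($item_link);
  if (empty($base['scheme']) || empty($base['host'])) {
    return $url;  // Source site unknown: leave the value alone.
  }
  $root = $base['scheme'] . '://' . $base['host'];
  if (strpos($url, '//') === 0) {
    return $url;  // Protocol-relative URL: already points at a host.
  }
  if (strpos($url, '/') === 0) {
    return $root . $url;  // Item 3: root-relative path.
  }
  if (strpos($url, 'www.') === 0) {
    return 'http://' . $url;  // Item 5: assumed "missing http://" case.
  }
  if (strpos($url, '../') === 0) {
    // Item 2: walk up from the directory of the item's path.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, (int) strrpos($path, '/'));
    $segments = $dir === '' ? [] : explode('/', ltrim($dir, '/'));
    while (strpos($url, '../') === 0) {
      array_pop($segments);
      $url = substr($url, 3);
    }
    return $root . '/' . ($segments ? implode('/', $segments) . '/' : '') . $url;
  }
  return $url;  // Absolute URLs and anything else stay untouched.
}

// Apply the URL fixes to an item's HTML and fill empty anchors.
function example_fix_item_html(string $html, string $item_link): string {
  // Items 2, 3 and 5: rewrite every quoted href/src value.
  $html = preg_replace_callback(
    '~\b(href|src)=(["\'])(.*?)\2~i',
    function (array $m) use ($item_link) {
      return $m[1] . '=' . $m[2] . example_resolve_url($m[3], $item_link) . $m[2];
    },
    $html
  ) ?? $html;
  // Item 4: give empty anchors a visible "#" as link text.
  return preg_replace('~(<a\b[^>]*>)\s*(</a>)~i', '$1#$2', $html) ?? $html;
}
```

Called as `example_fix_item_html($description, $item->link)`, this could run either in the template or before the item is saved. Regular expressions are a blunt tool for HTML, so a DOM-based approach may be more robust for badly broken feeds.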
Comment | File | Size | Author |
---|---|---|---|
| | update_one_item.txt | 3.44 KB | PROMES |
Comments
Comment #1
ParisLiakos commented
Comment #12
Spokje
The aggregator module has been removed from Core in 10.0.x-dev and now lives on as a contrib module.
Issues in the Core queue about the aggregator module, like this one, have been moved to the contrib module queue.
Comment #13
larowlan
Parsers are plugins, so you can implement this without hacking core now.