Closed (duplicate)
Project:
Drupal core
Version:
7.x-dev
Component:
aggregator.module
Priority:
Normal
Category:
Bug report
Assigned:
Issue tags:
Reporter:
Created:
9 Mar 2009 at 08:31 UTC
Updated:
5 Dec 2011 at 14:34 UTC
Jump to comment: Most recent file
Comments
Comment #1
brianV commentedI have a patch in the works to fix this - marking as assigned so we don't duplicate the same work.
Going to sleep on what I have done, and do some tweaking & testing tomorrow.
Comment #2
brianV commentedPatch attached.
This adds a new function to aggregator.parser.inc which runs two preg_replace()'s against the item's description looking for relative links in href or src attributes, and substitutes either the domain, or the domain + path into the links as necessary.
My only concern is that thousands of feed items could be imported in a cron run in some rare cases. Running two regexes against each item could add up to a fair amount of processing time.
Please review, and send your comments. This is my first core patch, so pardon any 'newbie' mistakes :p
Comment #3
brianV commentedEhh, because this is a core patch, attaching a copy generated from Drupal root
Comment #4
brianV commentedUpdated version to comply with Drupal's coding standards.
Comment #5
keith.smith commentedNote that comments must begin with sentence case and end in a period, per the coding standards.
Comment #6
brianV commentedNew version, with proper comment formatting.
Comment #7
brianV commentedSomething similar to this could also be very useful when generating a RSS feed in Drupal. Having relative paths in a RSS feed serves no real purpose when imported on the other end.
Perhaps the function I wrote should be made more general and added to the main API, so that it can be called when an RSS feed is being generated. Thoughts?
Comment #9
brianV commentedRevision 4, fixes a type (oops!)
Comment #10
alex_b commentedIt would be great to get a test for this functionality... : )
I'm still a bit torn between putting this cleanup step on the input (like the patch here) or on the output (a cached filtering process in a preprocess function which has the additional benefit of being overrideable). But as long as #14104: Aggregator should support filter.module does not move, I am fine with leaving this cleanup step on the input side.
Comment #11
brianV commentedalex - perhaps a combination of the two may be of some value.
This code to fix the URLs is specific to Aggregator's needs. I would leave it on the input side, but still push toward allowing aggregator to conform regular input filter system on it's output.
Also, I am not sure how filter.module works, but you need to know the item's URL in order to convert the links from relative to absolute. Can this be done in a filter? Or do the filter hooks only take the body of the content you want to change?
Renaming issue title for better clarity.
Comment #12
brianV commentedI just realized I posted the last patch in the wrong format. Re-upping in the correct format.
Comment #13
cburschkaVery important fix!
This function needs a unit test case...
Some minor points on the comments:
Convert all relative links contained in the src and href elements withinStrictly speaking, href and src are attributes, while a and img are elements.
This is the link to the itemAll other parameters are described without a complete sentence, so "this is" should be removed for consistency.
Comment #14
jeffschulerBrianV, re: your comment #7, just a heads up to this issue from 2005: #29023: This feed is valid, but may cause problems for some users.
Comment #15
brianV commentedOops - thought this was a different issue. The tag still applies, though
Comment #16
brianV commentedNew revision has tests and coding style fixes pointed out by Arancayter.
Comment #17
ngaur commentedLooks promising.
For my purposes, I'm not in a position to correct URLs at the receiving end. I need to be able to publish RSS feeds that will work for sites which don't implement this sort of URL correction.
Also I need D6 implementation, but presumably that will come.
Comment #18
brianV commented@ngaur,
Please open a new issue for 'Converting relative URLs to absolute URLs in output RSS feeds'. I will take a look at it when I have a chance.
Comment #19
catchLooks like #16 was supposed to come with a patch.
Comment #20
brianV commentedGrrr... there *WAS* a patch. It had all the the unit tests and everything. Guess I have to do it again from scratch....
Comment #21
ngaur commentedWhat part of drupal should such a ticket relating to feed generation be filed against?
Comment #22
alex_b commentedngaur: Drupal project, component node.module
Comment #23
grendzy commented#88183: Relative URLs in feeds should be converted to absolute ones
Comment #24
momardieng commented@brianV,
Was a D6 version of this patch ever posted? I looked everywhere on this site but could not find one. I would really love to get my hands on it as I am having this issue with feeds coming from several sites that I really need (for example http://www.aps.sn/aps.php?page=backend&id_secteur=28). Thanks!
Comment #25
brianV commentedMomar,
Unfortunately, this was only ever patched against D7 (HEAD). If and when a patch for this is accepted, it would only then be backported to D6.
I never ended up finishing it off - I had a version at one point with tests, that I suspect would have been just about RTBC-quality. Unfortunately, for some unknown reason, it disappeared from post #16, and I lost my local copy as I reverted my local working copy (since I thought the version in the issue was safely here!).
I never got up the heart to rewrite the tests for this issue as it took a few hours of work. TBH, I think tests are really all that are needed at this point to bring this one to RTBC.
Comment #26
boris mann commentedJust found this issue from http://chrisshattuck.com/blog/save-planet-well-drupal-planet-actually-dr... -- would be nice to be able to consume the RSS feeds that Drupal produces without errors.
Comment #27
gábor hojtsy@BrianV: can you move this patch over to #88183: Relative URLs in feeds should be converted to absolute ones that this was marked a duplicate of? Thanks!
Comment #28
kurtzhong commented#12: aggregator-fix-relative-links-rev-5.patch queued for re-testing.
Comment #30
symbios_zi commentedhow to fix this problem in D6? the actual problem
Comment #31
jvieille commentedPlease, a D6 fix!
Thanks
Comment #32
brianV commentedThis patch is a duplicate. re-marking as such.