Is it the expected behavior of Aggregator that it never re-checks old items for updates?
For example, on this page: http://www.heritage.fastballweb.com/aggregator/sources/1
The item from July 3 has a title: "The Fruit Of The Spirit: Love"
But the RSS feed (http://sermoncloud.monkserve.com/EKK/5266/sermons.xml) shows that more information has since been added to that item's title. However, there seems to be no way to force Aggregator to update its records for old items. On cron, it appears to add records only for items it hasn't seen before.
Am I missing an option somewhere, or should this become a feature request? This could be a useful feature if it's not already there.
Comments
Comment #1
devin carlson commented
Have you tried going to admin/config/services/aggregator and using the "update items" link? If that doesn't work, you could always use the "remove items" link and then the "update items" link.
Aggregator doesn't automatically check older items for performance reasons. Imagine if you set aggregator to never discard old items and you made it read a feed that updated 100 times a day. After a year it would be checking 36500 items per cron run to make sure that they weren't out of date!
I'd imagine that on many shared hosts your account would be disabled after trying to process that many items every hour.
Comment #2
WebmasterDrake commented
I am having the same issue. It seems to me that the purpose of an aggregator is to display what is actually on the feed, not to load the content once. What is the point of the cron run if that is the case?
There should at least be a function to run "remove items" before "update items".
Is there another module that will do this?
Comment #3
parani commented
I am having the same issue. Has anyone found a solution for this?
Comment #4
David_Rothstein commented
Clicking "remove items" followed by "update items" (manually) is a workaround for this, I think.
See also #1294472: Add option to delete removed items, which is an open feature request for this issue, filed around the same time as this one.
Comment #5
sboutas commented
I have found a workaround for this.
I have added the following lines of code in aggregator.module, in function aggregator_refresh($feed), at line 612:
    if ($feed->fid == 'your_feed_number') {
      aggregator_remove($feed);
    }
So every time the aggregator tries to refresh the specified feed with new items, the old ones will be removed first and I will get the items with the new values.
It would be nice if the developers added a hook for this function so that I don't have to alter the module locally, or at least a configuration option listing the feeds whose items should be removed before each refresh.
I hope this goes into the next version of the module.
Comment #6
buddym commented
Thank you sboutas! I was banging my head against the wall with this one. Adding your little hack resolved the issue for me, although I don't like hacking core files! With every core upgrade I am stuck trying to remember which core files I hacked. Hopefully someone will address this in the next aggregator release.
In my scenario, I am using aggregator to pull in a weather feed. The feed has only one item, which updates every 15 minutes. Cron is set to run every hour. Prior to your hack, the only way I could get the item updated was to manually remove it and then manually update it. The lowest aggregator setting for removal is 3 hours, so that wasn't working for me. At times the item would be removed before it was updated, and in the meantime my block would be missing.
Your hack did the trick allowing my weather feed to update every hour via cron. Thanks again!
Comment #7
toby wild commented
I'm having the same issue.
Since hacking core is bad form, a better solution is to turn off Aggregator's update interval entirely, then create a cron task, using either a third-party module or just a Rules action that executes PHP, that first runs aggregator_remove and then aggregator_refresh.
Hope that helps prevent the hacking of core.
EDIT:
I ended up creating a Rule that executes PHP on each cron run.
However, the remove ran fine but the refresh never repopulated the feed.
I fixed that by reloading the feed and overwriting the hash check so that the refresh always ran.
Hope this helps someone:
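A minimal sketch of the approach described above, not the poster's exact code. It assumes Drupal 7, a hypothetical custom module named "mymodule", and a placeholder feed ID of 1. It also assumes that aggregator_refresh() decides whether a feed has changed from the hash, etag and modified properties of the feed object it is given, which is why those are cleared after the items are removed:

```php
<?php

/**
 * Implements hook_cron().
 *
 * Sketch only: "mymodule" and the feed ID below are hypothetical placeholders.
 */
function mymodule_cron() {
  // Hypothetical feed ID; use the fid of the feed that should be rebuilt.
  $fid = 1;

  $feed = aggregator_feed_load($fid);
  if (!$feed) {
    // No such feed, nothing to do.
    return;
  }

  // Delete the items currently stored for this feed.
  aggregator_remove($feed);

  // Blank the change markers on the in-memory feed object so the refresh
  // does not skip parsing because the feed looks unchanged.
  // (Assumption: these are the properties aggregator_refresh() consults.)
  $feed->hash = '';
  $feed->etag = '';
  $feed->modified = 0;

  // Fetch the feed again and store its items with their current values.
  aggregator_refresh($feed);
}
```

This rebuilds the feed on every cron run, so it is probably only suitable for small feeds such as the single-item weather feed mentioned earlier; for large feeds the performance concern raised in comment #1 still applies.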
Comment #8
capysara commented
The core hack allowed me to update the feed with the "update items" button (without manually removing items first). However, when cron ran, it just deleted all the items.
Comment #9
buddym commented
This remains an issue for me (and others?). I am not sure why this is marked as fixed. Each time I update core, I have to go into the aggregator.module file and add the following code snippet to function aggregator_refresh($feed), now located on line 620: