Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
In cases where more than the batch limit of 50 nodes need to be expired, expiry is not scheduled ASAP but after the next 3600. This leads to a back up in nodes that need to be deleted after a while.
Solution: move expiry to source level (integrate it in clear()?), use FeedsSource' batch state tracking.
Comment | File | Size | Author |
---|---|---|---|
#24 | feeds-expiry-batch-930652-24.patch | 24.55 KB | twistor |
#22 | feeds-expiry-batch-930652-22.patch | 24.5 KB | twistor |
#21 | feeds_log_issue_930652_comment_14_patch.pdf | 184.62 KB | byronveale |
#14 | feeds-expiry-batch-930652-14.patch | 20.99 KB | twistor |
#12 | feeds-expiry-batch-930652-12.patch | 21.12 KB | twistor |
Comments
Comment #1
Anonymous (not verified) CreditAttribution: Anonymous commentedI've moved node expiry into FeedsSource and everything appears to be working now. I've attached a patch.
Comment #2
Anonymous (not verified) CreditAttribution: Anonymous commentedHere is an updated patch with an upgrade path and I've removed some other overlooked cruft so I've removed that.
Comment #3
Anonymous (not verified) CreditAttribution: Anonymous commentedMissed that I added feeds_node_expiry.test to info file this shouldn't be added in this patch. Here's an updated patch.
Comment #4
Anonymous (not verified) CreditAttribution: Anonymous commentedMight as well roll expiry tests into this patch as well. Here is an updated patch with expiry tests included.
Comment #5
alex_b CreditAttribution: alex_b commented- Update hook should remove feeds_importer_expire from job_scheduler
- Why is scheduleImport() removed from feeds_source_import()?
- Also scheduleExpire() should be invoked at the end of feeds_source_export().
- scheduleExpire() is using fetcherPeriod() where it shouldn't.
- scheduleExpire() should only schedule a job if
FEEDS_EXPIRE_NEVER != $this->processor->expiryTime()
- scheduleExpire() should remove a job otherwise.
Comment #6
Anonymous (not verified) CreditAttribution: Anonymous commentedAlright I've fixed
scheduleExpire()
and I've added back in somethings I removed accidentally while troubleshooting, likescheduleImport()
. I've also updated the update hook.Comment #7
Anonymous (not verified) CreditAttribution: Anonymous commentedJust noticed one of the comments in my expiry tests were not accurate just a quick update.
Comment #8
Dave ReidComment #9
twistor CreditAttribution: twistor commentedPicking up the torch. Let's see what testbot says.
Comment #10
twistor CreditAttribution: twistor commentedForgot the new test file.
Comment #12
twistor CreditAttribution: twistor commentedSecond stab. This should pass.
Comment #14
twistor CreditAttribution: twistor commentedI moved the new test class into the node processor tests.
C'mon testbot!
Comment #15
ressa CreditAttribution: ressa commentedIs this patch about ready to be included in the dev version? The "Expire nodes" doesn't work in the current release, as can be seen in these posts.
Will this patch delete nodes before import? That would make sense to me.
Comment #16
ressa CreditAttribution: ressa commentedI spoke to soon. I just ran cron, after setting "Expire nodes" to one hour, and after a bit of a wait I went to the log and saw this entry "Type: job_scheduler" - "Finished processing scheduled jobs (0 sek. s, 1 total, 0 failed)."
And indeed, all nodes older than one hour are gone, so it seems "Expire nodes" actually works.
Comment #17
byronveale CreditAttribution: byronveale commentedJust wanted to point out a sort-of similar discussion with patch going on in the "Unpublish/Delete nodes not included in feed" feature request.
I notice 7.x-2.0-alpha8 was last updated on April 22nd; presumably this patch made it in, and this issue can be closed??
Thanks for everyone's efforts on this!
Comment #18
byronveale CreditAttribution: byronveale commentedActually, it looks like this code did not make it in to the latest dev version. Just want to cross-reference other related issues, for users coming here: #1470530 and #1363088.
(Twistor, if you're getting notified every time I leave a comment, sorry if it feels like I'm hounding you!)
Comment #19
byronveale CreditAttribution: byronveale commentedI tried the patch in #14, and am having issues, not exactly sure where they are coming from:
- when manually running feeds import, no feed items were expired, regardless of the expiry setting (1 year, 1 month, 1 hour, etc.), or process setting (background on or off) specified
- when running automatically (periodically, every 15 minutes, scheduled cron entry), feeds items were deleted and recreated, seemingly at random, and could not find a single specific instance of an item that should have been deleted (no longer in feed) being deleted
I tested in in a local Acquia dev instance, Drupal 7.23, MySQL 5.1.66, PHP 5.3.18, Mac OS X 10.8.4. There are 767 items in my feed (coming from LDAP, using the LDAP module, "ldap_feeds" component).
So, can't say specifically whether or not the patch successfully fixed expiry "batching"; but the whole expiry process seems to have issues.
Comment #20
byronveale CreditAttribution: byronveale commentedOkay, another day of testing, and some enlightenment:
- because I was working with the issue of unpublishing nodes no longer in a feed source (#1470530), I was conflating that issue with this one, that is expecting nodes no longer in a feed to be expired, thinking that the "expire nodes" setting would expire a node only after it was no longer in a feed for the specified duration (yeah, I know, I know...!)
- I now get that the "expire nodes" setting removes an item after the specified duration regardless of whether or not it is still in the feed; if the item is still in the feed, it is recreated as new (this explains the "random" behavior I reported in my last comment)
- it appears there may be an issue still with expiration, as brought up in issue #1433170, as after a morning of updates, things have been quiet; whether or not this is part of this batching issue or something else, I couldn't say...
I'm going to let it go overnight (periodic import/cron running every 15 minutes, nodes expiring after an hour, feeds_process_limit = 100). Will report results tomorrow...
Comment #21
byronveale CreditAttribution: byronveale commentedLooks like it's working just fine, had an evening full of expiring and creating 100, 200 nodes at a shot (see attached). Will file under a separate issue, but I think it would be nice if Feeds logged when it expires nodes, in addition to logging their creation. Dare I say this has been "reviewed and tested"...?!
Comment #22
twistor CreditAttribution: twistor commentedAlright, some minor tweaks and fixes.
If this passes, I'm ready to commit.
Comment #24
twistor CreditAttribution: twistor commentedThat's why we have tests.
Comment #25
twistor CreditAttribution: twistor commentedhttp://drupalcode.org/project/feeds.git/commit/a44ecf6
Comment #27
Yorgg CreditAttribution: Yorgg commentedIt may have passed the tests but suddenly nodes don't get expired any longer. Nothing that I'm aware has changed.
This is the second time that it happens since #24 though somehow I managed to get it back to work once.
I have disabled and uninstalled the feeds module, then and (re)-installed. I thought this was the fix but this time it was not.
Should I open another issue?
Comment #28
MegaChriz CreditAttribution: MegaChriz commented@j_nunes
Are you using the latest dev version of the module? If so, can you provide the minimal steps to reproduce the issue? With the current information it can be anything from misconfiguration to bug.
Comment #29
Yorgg CreditAttribution: Yorgg commented@MegaChriz: I am using the latest dev now with xpath parser and json parser.
In the Node Processor, the expiration date is set to 3 hours from "The node's published date will be used for determining the node's age"
My mapping is set so expiration occurs 3 hours from the current published date, meaning nodes should expire 3 hours from import.
I don't see any other configuration other then this.
I went ahead to see the timestamp values with devel on some feed-items and I believe I must have 200 or 300 candidates pending expire action, but they don't.
Am I missing something?
Comment #30
MegaChriz CreditAttribution: MegaChriz commented@j_nunes
Can it be that cron didn't run? I think that expiring nodes depend on cron.
Comment #31
Yorgg CreditAttribution: Yorgg commentedThat's not the case. I even run cron manually on my devel machine to make sure its running properly every time.
Comment #32
Yorgg CreditAttribution: Yorgg commentedIs there any other information you require to reopen this issue again?
Comment #33
donquixote CreditAttribution: donquixote commentedFellows,
the variable
$reschedule
infunction feeds_reschedule()
is messed up.Sometimes it is a boolean, sometimes it is an array (list of importer ids), sometimes a string (importer id).
This is asking for trouble.
The
$importer_id
is not much better. But at least this one is documented.But the combination of different states of these two variables is hard to follow.
And probably buggy.
Comment #34
twistor CreditAttribution: twistor commentedI agree #2497483: Sort out feeds_reschedule()..