Hi,

For my use case, I have an XML feed of used machinery which is for sale, and when it disappears from the feed then it is 'sold' so our client would like to keep the item on their website but mark it as such.

I've noticed that the latest beta release has included some extra options for cleaning entities which no longer appear in the feed, but no way of extending / hooking into what action feeds takes with the entity?

For NodeProcessor we get an option of Skipping, Deleting or Unpublishing - obviously I don't want any of these for my purposes, but what I'd like is to be able to hook onto a hook like hook_feeds_before_clean() and be passed a list of entities which no longer exist in the feed.

As far as I can see this would require a new 'missing action' option on the NodeProcessor (because if you select Skip then it doesn't even build the removeList) - where the removeList is built but no action is taken by the processor itself, other than invoking the hook. We could just build the removeList when FEEDS_SKIP_NON_EXISTENT is selected (but just don't actually call the delete functions), but this would obviously cause overhead for those who literally just want to skip.

For my purpose I will just write a custom Node Processor to do what i want, but I've needed this on three separate projects now so thought I better flag it here!

Summary

Below is a quick summary of my issue an proposed solution.

Problem / use case

I need to set a value on a custom field when an item is removed / missing from the feed.

Proposed solution

  • Add a new hook hook_feeds_before_clean() which expects an array of entites which are about to be cleaned.
  • Implement a new option for $form['update_non_existent'] which builds the removeList and invokes the hook, but is otherwise the same as FEEDS_SKIP_NON_EXISTENT
Support from Acquia helps fund testing for Drupal Acquia logo

Comments

orphans created an issue. See original summary.

mattjones86’s picture

Issue summary: View changes
mattjones86’s picture

Title: Feeds API - feeds_after_clean hook? » Feeds API - feeds_before_clean hook?
mattjones86’s picture

Issue summary: View changes
MegaChriz’s picture

Interesting feature request.

As I dislike adding an option that does nothing unless you implement the new hook, I wonder if this feature could also be implemented in the following way:

  1. Using a form alter, to add an extra option for the processor settings.
  2. By implementing hook_feeds_after_import() and then call $state = $source->state(FEEDS_PROCESS); from there.

For this to work, FeedsProcessor::clean() does need to explicitly check for FEEDS_DELETE_NON_EXISTENT (right now it just deletes entities when the option is not set to FEEDS_SKIP_NON_EXISTENT).

mattjones86’s picture

I think that would work!

I might start a module to provide this functionality - I'm sure other people would find it useful.

MegaChriz’s picture

@orphans
Great! :) Please do that. I can imagine a client of mine would ask for this functionality someday.

Attached is a patch for a change in Feeds which would be required if the feature is implemented as suggested in #5. FeedsProcessor::clean() needs to check explicitly if the option is set to FEEDS_DELETE_NON_EXISTENT and only in that case it should delete entities.

  • MegaChriz committed 237e6ae on 7.x-2.x
    Issue #2822831 by MegaChriz: Fixed only delete non-existent when the...
MegaChriz’s picture

Title: Feeds API - feeds_before_clean hook? » Allow other modules to add other options for dealing with non-existent items
Status: Needs review » Fixed

I've committed #7.

@orphans
I'm looking forward to your module. :)
Let me know if #5 indeed does work.

mattjones86’s picture

I've spent some time fiddling with this today - I've almost got it working apart from one issue.

After injecting my extra fields into the processor config form, they are subsequently stripped out by array_intersect_key() in FeedsConfigurable::addConfig.

I understand why this is required, but is there any way we could modify this to allow custom form keys to be saved?

EDIT:: Perhaps you could provide a new hook for hook_feeds_plugin_config_defaults() which would allow a module to add extra config defaults for a plugin?

MegaChriz’s picture

Ah great!

I also came across the issue that extra configuration fields added to a Feeds plugin via a form alter were not saved. I've been playing with a possible solution for this problem, though I still need to check if that solution is sufficient. See #1962006: Add hook hook_feeds_config_defaults() to allow other modules to add extra configuration.

mattjones86’s picture

Great! That patch works nicely for me, thanks!

Eric_A’s picture

FeedsProcessor::clean() needs to check explicitly if the option is set to FEEDS_DELETE_NON_EXISTENT and only in that case it should delete entities.

This was changed in 237e6ae but I just noticed that the same check is in initEntitiesToBeRemoved(). Still in its old form, so now it seems inconsistent with clean(). Also, there's a smell to having the check in both methods, but I haven't really looked into this.

MegaChriz’s picture

@Eric_A
FeedsProcessor::initEntitiesToBeRemoved() is still right, but maybe slightly badly named. The node processor has an option for unpublishing non-existent items and when that option is selected, the node processor makes use of the same list that is built by FeedsProcessor::initEntitiesToBeRemoved() to determine which nodes to unpublish. See FeedsNodeProcessor::clean().
So yes, the check needs to be in both methods. The first one - ::initEntitiesToBeRemoved() - builds the item list and the second one - ::clean() - performs an action on this list (unpublish [in case of nodes], block [in case of users] or delete [any processor]). ::initEntitiesToBeRemoved() needs to check if it is necessary to build the item list at all and ::clean() needs to check which action to perform.
Perhaps FeedsProcessor::initEntitiesToBeCleaned() would have been a better name.

Eric_A’s picture

Thanks! So FeedsProcessor::clean() is 100% about deleting. FeedsProcessor::initEntitiesToBeRemoved() and FeedsState::removeList are really about maintaining the list of items that are no longer in the feed. In fact I just remembered I have an implementation of hook_feeds_before_update() relying on FeedsState::removeList to do some flagging instead of actual deleting.
I agree that the naming of the method and property don't reflect the current purpose, so perhaps we could add a new method and property at some point and deprecate the current ones. I'd be glad to help.

Thanks again!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.