FEMP 2 brings the possiblity to define the uniqueness of a mapping target. I. e. when a mapping target is unique, the implementer of the mapping target will make sure that only one mapping target with a given value exists in the target system.
Right now, the implementation is flawed: it assumes that any target can be unique, which is not the case. FeedAPI and FeedAPI mapper don't know anything about the mapping targets and it is the active processor's responsibility to heed the 'unique' flag.
We should introduce a $op 'unique supported' and 'unique'. Mappers that return TRUE on 'unique supported' are being called with $op = 'unique' when they've been picked. On $op = 'unique' they need to return TRUE if the item is unique, FALSE if not. (Alternatively, we could merge the 'unique supported' flag into the return value of 'list'. Not sure about that though.)
Further, this change will require a change to the mapping UI. If some mapping targets support being unique, others don't we'll have to change setting the unique flag when adding a mapping to setting the unique flag in a separate step. Only like that we can show a unique checkbox or not depending on its availability.
I. e.: instead of
a) picking a mapping, checking unique or not and then
b) adding a mapping to the mapping set,
you would rather
a) pick a mapping,
b) add it to the mapping set and then you would see a checkbox for uniqueness depending on whether the mapping target you chose supports 'unique' or not.
c) You can then check the box and hit save to save the mapping.
Thinking about this more, the 'Save' button on this form will cause some confusion, as people will inevitably think that the entire mapping won't have effect until they hit save, not just the unique flags. We should think about meeting this expectation by creating mappings in an edit cache and saving them to feedapi_mapper when the user hits 'save'.
Comment | File | Size | Author |
---|---|---|---|
#20 | 536666_mapper_v2_compatibility_7.patch | 9.59 KB | Aron Novak |
#17 | 536666_mapper_v2_compatibility_6.patch | 10.97 KB | Aron Novak |
#15 | 539752_unique_conditions_5.patch | 9 KB | Aron Novak |
#13 | 539752_unique_conditions_4.patch | 9 KB | Aron Novak |
#12 | 539752_unique_conditions_3.patch | 8.94 KB | Aron Novak |
Comments
Comment #1
alex_b CreditAttribution: alex_b commentedThis is critical for the first release of FEMP 2.
Comment #2
Aron NovakThings to do here:
Add a feedapi_mapper_unique() function to feedapi_mapper.module. This function should utilize feedapi_mapper_get_uniques() to retrieve the list of user-configured unique fields, then call the mappers. In the end, the function should return the array of item IDs (the duplicated items's IDs, for example a bunch of node IDs). If that array is empty, the item is unique.
At the mappers:
They should implement hook_feedapi_mapper() $op = 'unique supported' and $op = 'unique'. They should use feedapi_get_settings() do know if they should do site-wide or feed-wide deduping. 'unique supported' means that field can be marked to be unique, with $op = 'unique', the hook should return the array of item IDs (the duplicated items's IDs, for example a bunch of node IDs).
At the UI:
- just as described at the initial post
Comment #3
alex_b CreditAttribution: alex_b commented"They should use feedapi_get_settings() do know if they should do site-wide or feed-wide deduping." - I would add this later. At first we can assume that site wide deduping is required. If this is not a performance problem, we should leave it that way for simplicity's sake.
Comment #4
Aron NovakFirst patch.
See #368561: Mapper for FeedAPI URL/GUID and #536666: FeedAPI Mapper 2 compatibility also to make it work.
Comment #5
alex_b CreditAttribution: alex_b commented#4: Awesome. Read the patch, here are my observations:
A
feedapi_mapper_unique_toggle() will need to use a token to avoid XSS attacks (I know, minor, but still). We should use FAPI for toggling (i. e. make the toggle link a form submission).
B
If there are no uniques in feedapi_mapper_unique() array_pop() will be executed on an empty array and return NULL.
C
feedapi_mapper_unique() needs better documentation: what are the parameters, what return values to expect, when is this function being used?
Comment #6
Aron Novak#5: alright, valid issues. new patch is attached.
Comment #7
Aron NovakComment #8
alex_b CreditAttribution: alex_b commented2 problems:
A For effective cross feed deduping we need feed_nids in the array. We've talked about this on IM and found that a mapper that implements 'unique' should return a value in the format array([id] => array([feed_id], [feed_id], [feed_id]).
B The function merges all results into one array. This points to a bigger issue where FEMP does not know which mappers react on which processors - I've opened an issue here #541814: Breaks when enabling multiple processors - once this is in, we should add a $processor flag to the function signature of feedapi_mapper_unique() that indicates which mappers to query.
Comment #9
Aron NovakHere is a new patch, covers all the thoughts from #8.
I preserved that we want AND logic between the fields, but now it won't cause any problems, because of the processor-specific changes.
Also i changed some of the tests to provide mappings by default to fill title and body, i assume this is not chating at the tests :)
All tests are okay.
Comment #10
Aron NovakRelated patches:
#536666: FeedAPI Mapper 2 compatibility
#368561: Mapper for FeedAPI URL/GUID
Comment #11
Aron NovakA tiny improvement is still needed:
call_user_func_array('array_intersect_key', $fields_ids);
This is not appropiate, the array of feed ids should be handled somehow. Needs some concepting how to do it right.
Comment #12
Aron NovakAfter lots of thinking, i think this is not a task for FEMP. We decided that cross-feed deduping and simple deduping is not something that should be handled here. I added one more layer to the returning array, the serial numbered array is about the fields.
Depending patches need to be updated.
Comment #13
Aron NovakImproved version of #12
The result can be a numeric value (or any other kind) as well for some kind of simplier processors, so the return value of the mappers at $op = 'unique' is valid if it's not empty().
Also feedapi_mapper_unique() returns FALSE to indicate that the user did not configure any unique fields, this means that feedapi's built-in logic should be used.
Comment #14
alex_b CreditAttribution: alex_b commentedGreat.
Springing to the eye:
this is not correct anymore.
Otherwise looking very good. Need to review further.
Comment #15
Aron Novak#14: Yep, good catch, this is only true in the case of node processor.
Comment #16
alex_b CreditAttribution: alex_b commentedMore nitpicking:
1 $url_toggle_unique should be $toggle_unique_path, $url_delete $delete_path - it's a 'path' and the natural order of words makes the variable easier to read. Further it's consistent with $feed_path
2 Don't use the ?: ternary form if the statement doesn't fit neatly on a line
3 There is an indentation problem around "if ($unique_supported == TRUE) {"
4 $unique = 'N/A'; - shouldn't it be t('N/A') ?
5 feedapi_mapper_unique_toggle() should be feedapi_mapper_toggle_unique()
6 "Collects the ids of the duplicated items of $item. Utilized from FeedAPI processors." should be "Collects ids of duplicate items of $item. Most commonly used by FeedAPI processors to determine whether given item is unique."
7 "Feed item to decide about." -> "Feed item to decide on."
8 @return of feedapi_mapper_unique() should be:
9 Let's use better names for $fields_ids and $dup_ids: $result and $duplicates
10 There is an indentation problem with both patches to .test files
I really love how neat feedapi_mapper_unique() is now.
Comment #17
Aron Novak2 Don't use the ?: ternary form if the statement doesn't fit neatly on a line - in which line? FeedAPI Mapper does this elsewhere anyway :)
Others seems to be fixed.
Tests passed.
Comment #18
alex_b CreditAttribution: alex_b commented"2 Don't use the ?: ternary form if the statement doesn't fit neatly on a line - in which line? FeedAPI Mapper does this elsewhere anyway :)" - we should fix that...
Comment #19
Aron Novakplease, which line?
I know that we should fix that, that's why i did not modify the status of the patch.
Comment #20
Aron NovakFixed.
Also there was a non-related snippet in #17 about the date mapper, removed.
Comment #21
alex_b CreditAttribution: alex_b commentedBeautiful. RTBC.
Comment #22
Aron NovakComment #23
Aron Novak