Comments

twistor’s picture

Status: Active » Needs review
StatusFileSize
new1.94 KB

Let's see what happens.

This has the potential to really break people's existing imports.

twistor’s picture

Issue tags: +Needs tests

To elaborate: It's very likely that some people are depending on the existing behavior of being able to create fields with invalid values, knowingly or not.

samwilson’s picture

Could it be an optional setting then, defaulting to the current behaviour of permitting invalid values?

I tried applying this patch, and it works fine, but then I tried changing FeedsMapperFieldTestCase::test() just to see how it'd fail — I changed the mapping for field_delta in that test, so that it'd try to import the text from the body column of the CSV into the number_float field. I expected this to throw an exception, and for the node to not be created, but the node was created (except with nothing in that field).

I am not familiar with the internal workings of Feeds though, so I'm bound to be doing something quite wrong! I'll investigate further tomorrow hopefully. Thank you for working on this!

twistor’s picture

Feeds does validation for each field type it supports, so that won't be an issue.

megachriz’s picture

I've tested the patch in #1 using:

  1. A link field with an invalid URL.
  2. An integer field with a string instead of an integer.
  3. A text field with a maximum length.
  4. A taxonomy reference field with a maximum number of values.
  5. A text field field with a maximum number of values.

Case 1, 3 and 5 resulted into items failing to import. Case 2 didn't result into errors, because Feeds checks if the value to import is numeric. Case 4 didn't result into errors, because in case of the taxonomy reference field mapper, Feeds checks for the field's cardinality.
Note that without the patch, case 1 and 5 do not result into errors and case 3 results into the SQL error "data truncated".

Fatal error

With the patch in #1, a fatal error occured during an import because of the following code:

+++ b/plugins/FeedsProcessor.inc
@@ -126,7 +126,14 @@ abstract class FeedsProcessor extends FeedsPlugin {
+    catch (FieldValidationException $e) {
+      throw new FeedsValidationException(reset($e->errors));
+    }

This assumes that FieldValidationException sums up the errors in a singular array. Instead, FieldValidationException uses a multidimensional array to report the errors:

FieldValidationException Object
(
    [errors] => Array
        (
            [field_link] => Array
                (
                    [und] => Array
                        (
                            [0] => Array
                                (
                                    [0] => Array
                                        (
                                            [error] => link_required
                                            [message] => De opgegeven waarde voor <em class="placeholder">Link</em> is geen geldige URL.
                                            [error_element] => Array
                                                (
                                                    [url] => 1
                                                    [title] => 
                                                )

                                        )

                                )

                        )

                )

        )

    [message:protected] => Fouten bij de validatie van velden
    [string:Exception:private] => 
    [code:protected] => 0
    [file:protected] => /Websites/drupal/drupalsites/drupal7/modules/field/field.attach.inc
    [line:protected] => 801
)

New patch

The attached patch fixes the fatal error by looping through the FieldValidationException's errors array and note any error messages it come across. I've added the method unravelFieldValidationExceptionErrors() to FeedsProcessor to find these error messages. This is a recursive called method, as I'm not sure if the error messages can be found each time at the same level in the array, so this ensures messages are found wherever they are in the array.
I also added two test methods to FeedsMapperFieldTestCase:

  1. testTextFieldValidation()
    Tests if items with a text longer than a certain maximum length fail to import.
  2. testCardinalityValidation()
    Tests if items consisting of more values than the allowed number of values fail to import.

Conclusion

Yes, this can break people's existing imports, most notably when the field is set to contain a maximum number of values and a greater amount of values is provided by the source.

robhardwick’s picture

Re-rolled patch from #6 against latest dev version (branch 7.x-2.x)

joelpittet’s picture

Re-rolled again, conflicted with language stuff that was added.

FeedsProcessor::entityValidate() had some stuff in it so I added this after the language stuff.

@MegaChriz you've added tests to this, do you need more tests or can we remove the tag?

megachriz’s picture

Issue tags: -Needs tests

I think the tag can be removed. As for this issue, I'm kind of concerned that it breaks more than that it fixes. With this patch, Feeds will become stricter in what sources it accepts, resulting into more failing imports (that's also what the tests show). I think it would be better if Feeds would be more tolerant, and fix the incoming source were possible, rather than showing a red stop sign: "No, I won't let you import this item. Field X contains 6 values and I can only accept 5."

From the cases noted in #6, I think case 1 would justify a failing import, but I'm not so sure if that should also be for case 3 and 5 (case 2 and 4 do not cause import failures). Thoughts?

joelpittet’s picture

My thought is maybe if you are really concerned: set a flag, enable it to validate on new installs/default, disable it for existing feeds in an update hook and add it to the settings to enable per feed.

joachim’s picture

> I think it would be better if Feeds would be more tolerant, and fix the incoming source were possible, rather than showing a red stop sign: "No, I won't let you import this item. Field X contains 6 values and I can only accept 5."

Thing is though, there's no way to fix a problem like that. If you have 6 values coming in, and the field can take only 5, your options are:

A. Throw away 1 item. You're losing the user's data, which is bad.

B. Accept the 6 items. But then the next user to edit that node and try to save it will get an error message, because the edit form will run field validation. So you're just postponing the problem.

C. Reject it.

megachriz’s picture

@joelpittet

Set a flag, enable it to validate on new installs/default, disable it for existing feeds in an update hook and add it to the settings to enable per feed.

That's a nice compromise. Updating existing Feeds importers in an update is tricky though, see #2537290: Call to undefined function feeds_importer_load_all() and #2561357: Fatal error when trying to update from 7.x-2.0-alpha9 to 7.x-2.0-beta1. If the Feeds module isn't enabled during updates, configuration of Feeds importers cannot be updated easily. Also, Feeds importers that exist in code will not be updated. Conclusion: if we would introduce a flag, it will be enabled for all importers (both new and existing).

@joachim
All three options have their cons. Before your post, I thought option A would be the best because whatever the outcome is, that sixth value won't make it in the end. It's either not imported while the rest of the values are (and the rest of the item), or the whole item is not imported (but in this case the user is at least notified about the fact).
Now I think we may need to go for option C, because Feeds cannot know whether or not that sixth value is important.

Note that the current behaviour of Feeds is option A. See the following code in mappers/field.inc (function field_feeds_presave()):

// Check that the number of values doesn't exceed the field cardinality.
if ($info['cardinality'] != FIELD_CARDINALITY_UNLIMITED && count($values) > $info['cardinality']) {
  $values = array_slice($values, 0, $info['cardinality']);
}

I think this was done to prevent the situation that is in option B.
.
I need to think about this some more, I'm currently with my thoughts more into other Feeds issues (like the multilingual issue). Thoughts/opinions still welcome!

megachriz’s picture

Reroll.

I changed my mind a bit on implementing a flag to bypass the validation. In the current situation that would mean some validation is bypassed and some is not. Some validation that's being done could be called quite critical, which we should not bypass (like a missing account name, for example). If some validation is bypassed and other is not, this would be inconsistently from an user point of view. I'm therefore letting it out (at least for now).

It would be good though to commit this shortly after the next release, so there is some more time to find out if it indeed causes issues. I hope to make the next release in a few weeks.

Status: Needs review » Needs work

The last submitted patch, 13: feeds-field-attach-validate-2379631-13.patch, failed testing.

megachriz’s picture

Status: Needs work » Needs review
StatusFileSize
new7.57 KB
new1.54 KB

The list float field is a bit tricky. The number to import (3.14159265) has more digits than MySQL saves. But when I manually change the float in the database to for example FLOAT(20,10) I ended up with a different value being imported as in the source, appearantly because floats are not 'exact'. So in this case validation succeeds, but when going to the edit form, no value is selected for the list_float field.

I'll change the value in content.csv to "3.14159" which MySQL can handle and that seem to be the only way to pass the field validation and have the right value saved for the FeedsMapperListTestCase.

The validation failure for the date value seems to be fixable by changing the empty date value from a string to a NULL value as date_field_validate() checks with isset() if there is a value and if so, it expects other columns to be set as well, like 'timezone' and 'value2'. Not sure if changing the empty value to NULL has other implications.

Status: Needs review » Needs work

The last submitted patch, 15: feeds-field-attach-validate-2379631-15.patch, failed testing.

megachriz’s picture

When collecting an end date, date_field_validate() expects that value key 'value2' is always available.

cboyden’s picture

The patch is working well for me. One thing that would be nice to add would be an indicator of which item in the import is the one generating the error. Maybe the row number, GUID, or title of the item? Otherwise if you have a large import, the error messages are not specific enough to track down where the problem is.

megachriz’s picture

Issue tags: +validation
cboyden’s picture

StatusFileSize
new642 bytes

Here's a proof of concept patch update that adds the item GUID to the exception message.

Some things to note:

  • Core fields include the field label in their validation error messages, but modules that provide their own field types may not. There's no way to tell at this point if the field label is included, so it's not something that this patch should add - might result in duplicate text.
  • Validation for required fields doesn't happen in the default field_attach_validate. It seems to only happen in a form context? This validation is doable with the feeds_tamper "Required field" plugin, but the error messaging (if you configure it to log) might look different.
  • If there is no GUID set on a feeds importer, there may be a way we can find a property flagged as unique and use that in the message.
cboyden’s picture

Sorry, patch didn't get attached to the previous comment.

megachriz’s picture

@cboyden
Nice idea, but the GUID can be empty, so in that case the displayed error would be "@error in item ". Items that failed to import are also logged (import/x/log or node/x/log), complete with the details of the whole item and a var export of the entity. The views module is required to display the log. Wouldn't viewing the log be enough to track down which items need to be corrected in the source?

cboyden’s picture

I'll take a look at switching the error phrase depending on whether GUID is set, and also whether some other unique property is available in that context.

The Log tab is only visible if you have the "Administer Feeds" permission, and for the site I'm working on, the primary users of the import function will not have that permission.

cboyden’s picture

StatusFileSize
new978 bytes
new1.04 KB

Here's another patch that checks for the entity title, then the item GUID, then falls back to the exception message.

Status: Needs review » Needs work

The last submitted patch, 24: feeds-field-attach-validate-2379631-24.patch, failed testing.

dsnopek’s picture

Status: Needs work » Needs review
StatusFileSize
new8.6 KB

It looks like @cboyden just uploaded the interdiff twice in #24 - here's a full patch!

cboyden’s picture

Status: Needs review » Needs work

From what I can tell, allowed values functions don't work in this context. If you have an allowed values function on a field, you'll get a validation error on import even if the data is valid.

cboyden’s picture

Status: Needs work » Needs review

As it turns out the above problem was in the allowed values function itself, not the patch. Feeds import field validation is working with allowed values functions.

djdevin’s picture

Worked fine with https://www.drupal.org/project/feeds_entity_processor with a list field. Rejected values that weren't in allowed values.

megachriz’s picture

Because this has the potential of breaking existing imports, this will be committed after the release of Feeds 7.x-2.0-beta3. Feeds 7.x-2.0-beta4 is planned to be a somewhat more experimental release in which some major bugs are hoped to be fixed. See #2826403: Plan for Feeds 7.x-2.0-beta4 release.

megachriz’s picture

I was about to commit this, then realizing that the indication messages that were added by cboyden could use additional tests. When writing these tests I realized that taxonomy terms and users have no title. At first, I replaced the $entity->title by a check for $info['entity keys']['label'] but then that failed for users as that entity type apparently has no label key specified. No I have replaced the title check with entity_label().

Added additional tests for:

  • Text field validation for nodes without a title.
  • Text field validation for terms.
  • Text field validation for users.

If this passes tests, then this should be done.

  • MegaChriz committed 86b838f on 7.x-2.x
    Issue #2379631 by MegaChriz, cboyden, twistor, joelpittet, robhardwick,...
megachriz’s picture

Status: Needs review » Fixed

Alright, #31 committed!

If this ends up causing real big problems, we can decide to revert it or implement an option to bypass validation. (See #13 for the reason why such an option is not included.)

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

klausi’s picture

This broke lots of Feeds importers for us because we can have data to be imported that is not considered valid by the field API.

We opened #3042007: Option to bypass field validation to disable that.