Problem/Motivation
See #2587957: Arbitrary files can be referenced in file fields even if validation using form API would fail
@yched :
- The 'file_extensions' and 'max_filesize' restrictions are currently only ever applied by the field widgets, in the shape of validation routines passed as the '#upload_validators' property of the underlying ManagedFile FAPI element (more precisely, ManagedFile::valueCallback() delegates to an old-school function, file_managed_file_save_upload()).
- And those are checked only on *new uploads*, which explains that you can forge fake form values referencing an existing fid, and bypass the checks completely.
So yeah, the bug is probably two-fold :
- At the FAPI level (ManagedFile element), maybe the constraints should be enforced not only on file uploads, but checked on any submitted fid. That can be argued, I guess, since the property is named '#upload_validators', so we can't really say it's lying, it never claimed doing more than checking uploads :-)
- At the entity level, there should definitely be constraints that ensure validation fails if a field references a file that is not eligible according to the field settings, independantly of whether the widget / form element catches them or not (which it currently doesn't, as per the above). Those validations would fix REST, and make sure our entity forms have no holes either.
Then it's the usual problem of "it's better if the widget doesn't spend too much time validating, since the work will be done by entity validation anyway". The way this was approached in the EntityRef "autocomplete" widget and the underlying EntityAutocomplete FAPI element, is that the element has a #validate_reference flag, that defaults to TRUE, but that the widget sets to FALSE knowing that validation will be performed at the entity level.
Proposed resolution
Remaining tasks
User interface changes
API changes
Data model changes
Comments
Comment #2
berdirThe problem with this is that uploading files over REST, as currently implemented by #1927648: Allow creation of file entities from binary data via REST requests consists of two steps. You first upload a file, then you create the node and reference it.
The problem with that is that those validations are configured on the field, which we can then only validate on the second step. So we can not prevent anyone from uploading any file, just from referencing it in nodes where it doesn't match.
Comment #3
berdirI also think that the references issue adds a constraint that solves this too.
and it tries to solve the problem mentioned above by defaulting files to status 0 and denies edit access on that, so that users can not create a file that is permanent.
Comment #4
yched commentedComment #5
marthinal commentedI think #1927648: Allow creation of file entities from binary data via REST requests fixes the problem using the constraint when referencing a file from REST but from the UI we are validating 2 times.
The ajax callback validates before uploading and then Constraints are executed(too late in this case). So probably if we use this constraint we need to work on a way to solve this problem.
Comment #6
dawehnerI'm curious whether it would make sense to cherry pick these pieces out.
Comment #7
berdirYes,it should definitely be pulled out of #1927648: Allow creation of file entities from binary data via REST requests, but I doubt it makes sense to have this issue separate from #2587957: Arbitrary files can be referenced in file fields even if validation using form API would fail. It's a single constraint that passes the constraints from the field, which includes both things to the file validate API.
Comment #8
effulgentsia commentedSee #2587957-24: Arbitrary files can be referenced in file fields even if validation using form API would fail. Do you think that issue can be solved with this issue plus #2577963: Let entity_ref Selection handlers be in charge of the field validation? If so, what makes this unseparable from that?
In any case, tagging for RC target triage, while we sort out which issue combinations solve what.
Comment #9
drnikki commentedPromoting to critical based on #2587957-26: Arbitrary files can be referenced in file fields even if validation using form API would fail
Comment #10
iantresman commentedIf I attempt to upload
trojan.exe, it should be prevented, suggesting that all file extensions should be blacklisted by default, and only whitelisted extensions allowed.Comment #11
xjmSince this is critical now, it no longer needs to be triaged as an RC target. Thanks!
Comment #12
dawehnerExtract the constraint into this issue and wrote a test.
Comment #13
berdirDidn't look at the patch yet, but I did the same for #2594565: File extension + max_filesize should be validated using a Constraint on file fields :(
Make sure to read my comments there. In short, there is only one problem, this issue. Which is why I posted the patch for this there, after first proving that your test passes when written correctly.
Comment #14
dawehner@berdir
You linked this issue ...
Well this test is pretty much the same, just way more convenient to use at the end of the day.
Comment #15
klausiThe other validators such as ReferenceAccessConstraintValidator and ValidReferenceConstraintValidator always check the $value is not falsy. I can imagine PHP fatal errors here if $value is NULL? Maybe add a comment why we don't have to check for NULL compared to other validators?
Then I tested this with an Article node having an image file field. If I tamper with the submitted data and swap out the file ID to some *.txt file, then I don't get a form validation error. The txt file is saved in the image field. So I think we also need a web test here that confirms that the constraint actually fires when an attacker modifies file IDs. Same scenario as in #2587957: Arbitrary files can be referenced in file fields even if validation using form API would fail.
Comment #16
klausiAh, because we forgot ImageItem. So we should also have a test case for image item.
Next problem with this patch is that I cannot remove invalid files with the "Remove" button, because the file validation triggers. Removing invalid files should always be allowed, so we also need to fix that.
Comment #17
klausiDecided to spin out the Remove button problem to its own issue, not strictly in scope for this issue. See #2602540: File remove button should always work if file validation fails, has a patch ready for review.
Comment #18
klausiLooks like the Remove button problem is only introduced with this patch, so we will need to merge in the patch from #2602540: File remove button should always work if file validation fails.
Comment #19
klausiMerged in the test case from #2602540: File remove button should always work if file validation fails and expanded the new kernel test to cover image fields.
This should fail now and I'm not sure how we should solve the problem with the Remove button; Simply setting #limit_validation_errors = array() does not work, since the file removal happens in a validation handler.
Comment #21
klausiPerfect, the test fail demonstrates that the Remove button triggers validation errors, but it shouldn't.
I tried to reverse-engineer how and where the Remove button changes the form state, so that the #limit_validation_errors = array() solution works, but I'm lost here. I know that in FileWidget::formMultipleElements() $field_state['items'] are empty with that solution, but there should be the other remaining files. My test case is having a node with 3 valid files and then pushing the Remove button on the middle one, which removes all files instead of only the middle one.
Comment #22
marthinal commentedI can remove the file correctly using NULL for limit_validation_error but the validation problem exists... Let's take a look at the test results.
Comment #24
marthinal commentedSet to NULL makes no sense. I'm learning how it works while trying to fix the problem :) Manually this patch works and I think that this is where we are losing the values for form_state. Anyways let's try.
Comment #25
marthinal commentedGreen!
Comment #26
klausiThanks marthinal, that inspired me to dig a bit deeper. I don't think we should change FormValidator with this patch. Instead of fighting with #limit_validation_errors we can also just skip the flagError() process on the widget if the remove button is clicked. That way we never map back validation errors to the form, which is exactly what we want.
FileFieldValidateTest passing locally for me, let's see if anything else breaks. Interdiff is against #19.
Comment #27
dawehnerDo we actually need this, when the Validator never uses it?
This documentation is kinda wrong now, we validate also max size etc.
Should we repeat this test for Image items, just to be sure?
Comment #28
klausiWe are already testing image item with the kernel test, clarified that in the comment. Removed the unused default constraint message and improved docs.
Comment #29
marthinal commented@klausi great!
About the patch attched to #25
If we are using #limit_validation_error and files are disappearing... maybe this is a bug. For example, I have 2 images then I click on remove and the 2 images disappear.
So the expected behavior is to avoid validation and not to remove the files. Maybe we can avoid future problems. No idea about other use cases at this moment
Comment #30
klausi@marthinal: interesting, so we should definitely go with the latest flagErrors() approach where the removal of files works fine.
Comment #31
marthinal commented@klausi sure.
But just clarify that the problem described in #29, is the problem that #19 shows (test fail using #limit_validation_error). This problem is fixed with #25. So this patch is correct to fix this problem because maybe this is another bug(#19).
Comment #32
marthinal commentedOops I mean you can reproduce #29 using #19 and using #limit_validation_error from #2602540: File remove button should always work if file validation fails
Sorry :)
Comment #33
dawehnerI think for now this patch solves the critical issue without introduces some UX regressions.
Comment #34
alexpottSo does this mean we're validating everything twice when uploading a file through the UI with this patch?
Comment #35
marthinal commentedThe problem with the current Constraint is that we are validating when the remove button is clicked.
We have 2 solutions:
1. Use $element['remove_button']['#limit_validation_errors'] = array();. But test fails because when you click on the Remove button on a field where you have multiple files, all the files disappear. This is fixed by #25 and with this patch the remove button works as expected.
2. Use flagErrors() and avoid to show validation errors.
I don't think that we are validating 2 times with the second option AFAIK but correct me if I'm wrong...
And to be honest I'm not sure why #25 is not the best option because it is not a correct behavior to remove all the files when using #limit_validation_errors.
Comment #36
alexpottThe question in #34 was based around the fact I see where the patch adds all the new validation and it looks great. But we're not removing any validation anywhere - so it appears to me that we might be validating twice.
Comment #37
yched commentedIn order to not validate twice we would need to do something as described in #2587957-50: Arbitrary files can be referenced in file fields even if validation using form API would fail : put a killswitch on the FAPI type element so that the widget can use it without the validation.
(It doesn't really help that we're having the discussion in two issues)
Comment #38
klausiWe are not validating files twice with this patch.This patch only adds validation to existing files that are referenced in file fields, which was missing so far.
Why is the new file validation constraint not triggered when the upload or remove button is clicked?
Because those two buttons use #limit_validation_errors, so the general form validation (including the new FileValidationConstraint) is never executed.Because the file ID is never added as field item, so the constraint is never executed.Why is the old file validation not triggered when an invalid file is referenced in the field already? Because file_validate() is only invoked in file_save_upload(), which is only invoked for new files that are uploaded.
So both validation paths complement each other in a way. The constraint triggers on file references: the file was already uploaded and is now referenced with its ID in the file field. The old file validation triggers on upload: the file is new and validated before storing the ID in the file field.
We could try to unify those 2 validation paths, but that would mean copying the code from file_save_upload() to a new function/method file_save_upload_without_validation_crap_we_use_constraints_yo() which does not invoke file_validate(). I think that would be scope creep for this issue, because we only want to add file validation for existing references here, not get into refactoring hell.
Boldly setting this back to RTBC, let me know if we are still missing something here.
Comment #39
yched commentedNote that #limit_validation_errors does not skip any validation step. It just causes some errors to by ignored by FormState::setError() at the last minute. But the rest of the validation flow that leads to that point (from every #element_validate to the form validate, entity::validate() and the subsequent Widget::flagError() that ultimately calls FS::setError()) are fully executed all the same.
If the low level FAPI element and the entity constraint do the same check, the check is done twice. Hence my comment about the low-level EntityAutocomplete #type and the high-level ER widget and constraint, because they had to solve the exact same issue : the generic element #type has an option to not perform the validation, and the entity widget uses that option because it know calidation will be performed by the constraint.
Comment #40
klausiGood point, I tested the flow with an invalid file, where the validation is only done once and I misinterpreted #limit_validation_errors, sorry.
Case 1: invalid file uploaded: file_validate() is only triggered once because file_managed_file_save_upload() does not return a file ID. The file ID is never added to the field on the form, so the constraint validator is never executed for that field item because it never exists.
Case 2: valid file uploaded: file_validate() is triggered twice. Since there are no validation errors no duplicate error messages are shown on the form, because the file is valid.
Case 3: invalid existing file in the field: file_validate() is only triggered once because no new file is uploaded, so only the constraint triggers.
Case 4: valid existing file in the field: Same. file_validate() is only triggered once because only the constraint triggers it.
I still think this is good enough since we never show duplicate errors messages anywhere. There is one case for valid files where we trigger file_validate() twice, which is acceptable to me.
The alternative is to fork file_managed_file_save_upload() into a method on ManagedFile (the low level form element). That method would not do any validation of the file during the valueCallback() and leave that to the constraint. Thinking more about it, we would probably have to fork the whole ManagedFile class when we remove validation from it, because there is probably contrib code out there that relies on the validation in it.
We can also try an option $should_validate = FALSE as yched suggested in ManagedFile, but we will still have to fork file_managed_file_save_upload() to not call file validators.
Let me know if anyone insists on that in this issue, otherwise we should go back to RTBC.
Comment #41
alexpottDiscussed with @catch - neither of us are yay we're validating valid files twice but doing that is definitely a better place than not validating rest file uploads at all. Given there we agree with compromise outlined in #40 and hopefully we can find a solution on #2587957-50: Arbitrary files can be referenced in file fields even if validation using form API would fail soon.
Comment #42
klausiNo objections so far, so let's get on with it.
Comment #43
alexpottOkay getting this in so we can proceed with #2587957: Arbitrary files can be referenced in file fields even if validation using form API would fail and the other related critical. This moves us closer to a release and if we do release with double validation that is sad but it is a better position than HEAD. Committed e4b6c4d and pushed to 8.0.x. Thanks!