validateConfigurationForm calls Feed::translateSchemes when validating the HTTP fetcher feed form. The URL is not required and if accidentally left empty that empty value gets passed through to translateSchemes. translateSchemes doesn't check for the empty case, and the call doesn't catch the generated exception.

Comments

wolffereast created an issue. See original summary.

wolffereast’s picture

Status: Active » Needs review
StatusFileSize
new1.38 KB

Attaching a patch that both makes the source field required and handles the exception thrown by the translateSchemes function. I'm not sure if the source needs to be required if we are handling the exception, and I'm not sure if the exception should kill the save, but barring a good reason not to make the changes I think this is a good idea.

megachriz’s picture

Returning after the exception is hit seems to be right to me. The saving what happens afterwards ($form_state->setValue('source', $url);) should be the corrected URL and since there is no one, there is no point in putting the URL back into form state.

About whether or not the URL should be required, I'm not sure. It was required in the D7 version, though that there had been a feature request to make it optional for D7. Maybe it made more sense in the D7 version because you could make the source URL field part of a node and some people wanted to fill in the rest of the node and take care about the source URL later.

I think that the translatable error message should not contain the list of supported schemes literally. If a new scheme is supported in the future that would mean that the string would need to be retranslated. Instead I think it is better to add a placeholder for supported schemes like I did in this commit for D7: http://cgit.drupalcode.org/feeds/commit/?id=7f5aaf3
I've changed this in the attached patch. Also in the patch is a method added to \Drupal\feeds\Utility\Feeds that returns the supported schemes.

I discovered that Drupal core validates the given URL as well. See \Drupal\Component\Utility\UrlHelper::isValid. Interesting to note is that the URL validator from Drupal core marks the schemes 'webcal' and 'webcalls' as invalid. So it looks like we still have some work here to support these schemes.

To test error message from the patch, try a URL that starts with ftp://. This is the only scheme that passes Drupal core's URL validator, but is not supported by Feeds.

wolffereast’s picture

Tested out the updated patch and it looks good on my end, with the exception of the webcal[s] validation you mentioned. It also doesnt appear to like feeds.

From what I can tell we should be able to update the allowed protocols stored in the UrlHelper object. I added
UrlHelper::setAllowedProtocols(Feed::getSupportedSchemes()); to the form build function and was able to print out the updated protocols in the isValid handler. Unfortunately the isValid handler isnt taking the allowedProtocols into consideration, and is instead relying on the hard coded regex mentioned above. Swapping out the current text for something like

$allowedProtocols = implode('|', UrlHelper::getAllowedProtocols());

    if ($absolute) {
      return (bool) preg_match("
        /^                                                      # Start at the beginning of the text
        (?:" . $allowedProtocols . "):\/\/                                # Look for allowed schemes
        (?:

fixes the issue.

It looks like this is a known issue in core. Added a related issue that is looking into this: https://www.drupal.org/node/2691099

Do we want to postpone this and wait for the core updates, work around the isValid limitation, or call this as fixed as it can be and open another ticket to wait for the core updates?

Status: Needs review » Needs work

The last submitted patch, 3: feeds-invalid-scheme-2850213-3.patch, failed testing.

megachriz’s picture

I think we need at least catch the exception to prevent a fatal error. So committing a fix for this (if the tests pass) and opening a new issue that is postponed on the core issue seems to be the right thing to do.

wolffereast’s picture

Queued the most recent patch for retesting, as the original tests didnt report any errors and my simplytest.me instance came back clean

wolffereast’s picture

Status: Needs work » Needs review

Moving to needs review as all the retests passed. Do you think we need additional unit tests for these items?

megachriz’s picture

An automated test would be welcome for this issue. Do you want to write one?

wolffereast’s picture

I'll give it a go. That's one of the things I need to get better at...

wolffereast’s picture

Status: Needs review » Needs work

Sorry for the delay on this one. I gave this an initial go but I'm running into trouble trying to get test results. I just spun up a simplytest.me instance and ran some of the feeds tests and Im getting the same result as my local - 0 tests run

a small sample:
Overall results: 0 passes, 0 fails, 0 exceptions
Drupal\Tests\feeds\Unit\Feeds\Fetcher\Form\HttpFetcherFormTest: 0 passes, 0 fails, 0 exceptions
Drupal\Tests\feeds\Unit\Feeds\Fetcher\Form\DirectoryFetcherFormTest: 0 passes, 0 fails, 0 exceptions
Drupal\Tests\feeds\Unit\Feeds\Fetcher\Form\DirectoryFetcherFeedFormTest: 0 passes, 0 fails, 0 exceptions
Drupal\Tests\feeds\Unit\FeedImportHandlerTest: 0 passes, 0 fails, 0 exceptions

I based my initial attempt on the DirectoryFetcherFeedFormTest as it seemed similar enough for a rough start. I noticed that all the tests running under the feeds\Tests namespace are firing were as the Drupal\Tests\feeds\Unit items arent firing. Should I be testing them in a different manner?

megachriz’s picture

@wolffereast
I think you'll need to build Drupal with dev dependencies. The Drupal download (as found on this page) no longer includes them.

See https://www.drupal.org/blog/drupal-8-will-no-longer-include-dev-dependen... and following the instructions underneath "Development and continuous integration workflows". You'll need to install Composer and use the command line.

blast0344’s picture

@MegaChriz
Is it possible to add CSV url for download parser like: example.com/test_csv_to_import.csv ?

I get "Invalid Feed Url" but I cannot find such issue here.
I've applied patch from https://www.drupal.org/node/2443471 and I've tested CSV import with uploading file but it seems that in validation form for download parser and detectType function CSV is not included and it fails to validate the source.

If I make a small change to validation it seems that import is working, but I wanted to check if there is an open issue for this one so that I don't duplicate it if it's somewhere?

wolffereast’s picture

Hi @blastoise
I just spun up a clean install on simplytest.me with only feeds 8.x-3.x installed. I created a new feed type with a download fetcher and csv parser, then added a feed with the URL example.com/test_csv_to_import.csv. That doesn't validate because it's missing the protocol, but saving the feed with http://example.com/test_csv_to_import.csv went through with no error. I got an error when the feed tried to run because it isn't a real page, but that makes sense (Client error: `GET http://example.com/test_csv_to_import.csv` resulted in a `404 Not Found). If you try the url with the protocol and still receive an error please open a new issue with the steps to reproduce so that we can dive in.

In regards to tests for the above patch I ran into some issues getting the test suite to run on windows but eventually got around those problems using a docker instance instead. Nothing like sidestepping the problem entirely! I'm getting an odd error regarding a retrn by reference error that needs fixing before I can put a patch up. I'll take another crack at it soon and then, if I cant make it work, I'll drop into IRC to see if I cant rustle up some help.

blast0344’s picture

Hi @wolffereast
I've tried now with simplytest and still got an error when I use real file.
I've found sample data url: http://samplecsvs.s3.amazonaws.com/TechCrunchcontinentalUSA.csv
When I add that as url I get Invalid feed URL.

Sorry to bother. If you can confirm this then we are ready for opening new issue.

wolffereast’s picture

I think I found it. Check your feed type settings and uncheck the 'Auto detect feeds' checkbox in the Fetcher settings tab and see if that works. I think that kicks an error if it cant find something resembling a properly formatted feed in the destination. Since you are using a csv file there is no feed.

blast0344’s picture

That is correct. Thanks for help and sorry that I've added here unrelated comments for reported issue. I missed that auto detect option.

megachriz’s picture

Status: Needs work » Fixed

I committed #3. It would be good to have an automated test, but avoiding a fatal error has more priority.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.