• Some APIs provide resource URLs that do not point directly to a file, but instead redirect the user to the file. For example, https://www.instagram.com/p/B9vy5EdgZX1/media/?size=l is proviuded by the Instagram API, but this actually redirects to https://scontent-lht6-1.cdninstagram.com/v/t51.2885-15/fr/e15/p1080x1080/89738651_155780038891572_5224457300248488192_n.jpg?_nc_ht=scontent-lht6-1.cdninstagram.com&_nc_cat=100&_nc_ohc=rGVubVRCFoAAX8YKjNO&oh=76c6ac182e0432bfffce7539e372631d&oe=5EB77DAB
  • This redirect causes issues for \Drupal\feeds\Feeds\Target\File::getFileName which expects to get a URI that contains a file extension that it can pass to \Drupal\Core\File\FileSystem::basename
  • One fix would be to make a request to $url as part of \Drupal\feeds\Feeds\Target\File::getFileName and update the value of $url whenever a redirect takes place.

Patch incoming :)

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Phil Wolstenholme created an issue. See original summary.

Phil Wolstenholme’s picture

Title: File target fails when the URL contains a redirect to a file rather than being a direct URL of a file resource » File target fails when the URL (e.g. one provided by an API) redirects to a file, rather than being the direct URL of a file resource
Phil Wolstenholme’s picture

Here's a rough patch that uses Guzzle to work out the 'final' URL of a resource after any redirects.

It also includes the changes proposed in https://www.drupal.org/project/feeds/issues/2977565#comment-13262487 to prevent the failures caused by query strings in a URL (another common thing with image resources returned by APIs).

Phil Wolstenholme’s picture

Status: Active » Needs review
Phil Wolstenholme’s picture

Here's an updated patch that adds the missing use statement for use GuzzleHttp\TransferStats;

Status: Needs review » Needs work
Phil Wolstenholme’s picture

I'm not sure wny the test is failing, the patch seemed to work okay for me locally with a range of URLs. Hopefully the patch can help someone else figure out a version that also passes the tests!

Phil Wolstenholme’s picture

Ah, I guess we need to check if we're dealing with a local file path/URI or a remote URL, and only use Guzzle to follow redirects if it's a remote URL.

damondt’s picture

I took a slightly different approach putting the redirect calculation in getFile(). I also made it a selectable option in the target config form because of the performance impact it has.

damondt’s picture

lesleyfernandes’s picture

Updating the patch to work with the latest feeds module updates.

lesleyfernandes’s picture

Status: Needs work » Needs review