I am importing a feed with dynamic enclosures like:
http://productimages.wehkamp.nl/is/image/Wehkamp/?src=Wehkamp%2F415538_pb_01

This type of URL leads to errors during getFile($destination) in FeedsParser.inc:

File temporary://filA163.tmp could not be copied, because the destination directory is not configured correctly.

The reason is that basename($filename) cannot extract a usable file name from my URL.

Once you fetch that image, the server returns it under the file name productimages.wehkamp.nl.jpg in the example above. But basename() cannot derive that from the URL, because it makes no remote call to find out which file name is actually returned.
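
To illustrate, here is a minimal sketch of what PHP's own path functions return for the enclosure URL above. It does not reproduce the exact code path in FeedsParser.inc; it only shows that neither call yields a name with a file extension.

```php
<?php
// The enclosure URL from the feed.
$url = 'http://productimages.wehkamp.nl/is/image/Wehkamp/?src=Wehkamp%2F415538_pb_01';

// basename() on the whole URL returns everything after the last slash,
// which here is the query string.
$from_url = basename($url);

// basename() on the path alone ignores the trailing slash and returns the
// last directory name.
$from_path = basename(parse_url($url, PHP_URL_PATH));

print $from_url . "\n";   // ?src=Wehkamp%2F415538_pb_01
print $from_path . "\n";  // Wehkamp
```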

Comments

marcvangend’s picture

Subscribe.

I have worked around this problem before by placing a custom PHP script (imagewrapper.php) on my server that accepts a clean URL (just like Drupal's index.php does), retrieves the image with cURL and returns it. This way you end up with filenames exactly like you want them, but the downside is the extra request in the chain.

This is what my imagewrapper.php looks like:

<?php
/**
 * @file
 *
 * This script expects to be accessed with an HTTP request like this:
 * http://localhost/image_wrapper/1000x1000/image.jpg
 * In order to achieve that, mod_rewrite needs to be enabled and the following
 * lines must be added to .htaccess:
 * 
 * <IfModule mod_rewrite.c>
 *   RewriteEngine on
 *   # send calls for /image_wrapper directly to imagewrapper.php.
 *   RewriteRule ^image_wrapper/(.*)$ imagewrapper.php?q=$1 [L,QSA]
 * </IfModule>
 * 
 */

// Only accept requests from own server.
if ($_SERVER['SERVER_ADDR'] != $_SERVER['REMOTE_ADDR']) {
  header('HTTP/1.1 403 Forbidden');
  print '<h1>403 Forbidden</h1>';
  exit();
}

// Can't do a thing without input
if (empty($_GET['q'])) {
  header('HTTP/1.1 404 Not found');
  exit();
}

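// Set up variables.
$args = explode('/', $_GET['q']);
$size = array_shift($args);
// Pad the result so a malformed size segment does not trigger a PHP notice.
list($width, $height) = array_pad(explode('x', $size, 2), 2, 0);
$imagepath = implode('/', $args);
// URL-encode the parameters so special characters cannot break the query.
$api_url = 'http://example.com/getimage.php?' . http_build_query(array(
  'value' => $imagepath,
  'width' => (int) $width,
  'height' => (int) $height,
));

// Retrieve image.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $api_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$picture = curl_exec($ch);
curl_close($ch);

// Fail cleanly if the upstream request did not succeed.
if ($picture === FALSE) {
  header('HTTP/1.1 502 Bad Gateway');
  exit();
}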

// Return the image over http.
header('Content-type: image/jpeg');
print $picture;
?>

Note that I do not really recommend this approach; it's only a workaround.

riho’s picture

I had the same problem, so I created a patch for it. It uses the file name it finds in the HTTP response headers, or falls back to the basename.

stewart.adam’s picture

Status: Active » Needs review
FileSize
2.13 KB

Rebased to latest git (7.x-2.x branch). I reworked riho's patch a little bit, as http_request_get() returns its header array using lowercase keys, so checking against ...->headers['Content-Disposition'] would fail.

I've also added a secondary regex to support the case where the server doesn't properly quote the filename, such as some IIS servers.
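
For readers who want to see the idea without opening the patch, the following is a rough sketch, not the patch itself. The function name example_filename_from_headers() and both regular expressions are assumptions made for illustration; the only details taken from this thread are the lowercase header key, the quoted and unquoted forms, and the fallback. The encoded filename*= form is not handled.

```php
<?php
/**
 * Hypothetical helper: extracts a file name from HTTP response headers.
 *
 * @param array $headers
 *   Response headers, keyed by lowercase header name.
 * @param string $fallback
 *   Name to return when no usable header is present.
 *
 * @return string
 *   The file name from the header, or the fallback.
 */
function example_filename_from_headers(array $headers, $fallback) {
  // Checking with isset() first avoids a PHP notice when the header is absent.
  if (!isset($headers['content-disposition'])) {
    return $fallback;
  }
  $value = $headers['content-disposition'];

  // Properly quoted form: filename="photo.jpg".
  if (preg_match('/filename="([^"]+)"/i', $value, $matches)) {
    // basename() strips any directory part the server may have sent.
    return basename($matches[1]);
  }
  // Unquoted form: filename=photo.jpg.
  if (preg_match('/filename=([^;\s]+)/i', $value, $matches)) {
    return basename($matches[1]);
  }
  return $fallback;
}

print example_filename_from_headers(
  array('content-disposition' => 'inline; filename="415538_pb_01.jpg"'),
  'fallback'
) . "\n";
```

With the quoted header value used here, the call prints 415538_pb_01.jpg; with an empty header array it returns the fallback unchanged.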

Status: Needs review » Needs work

The last submitted patch, feeds-get_filename_from_http_headers-1104378-3.patch, failed testing.

stewart.adam’s picture

Status: Needs work » Needs review
FileSize
2.19 KB

Attached patch checks for the presence of the Content-Disposition header to prevent PHP notices... This one should pass the test bot.

thatpixguy’s picture

I need this feature and just made a patch of my own, but before posting it I found this issue request :)

The latest submitted patch seems more complete than my solution, so I won't add noise with mine.

pix

twistor’s picture

Assigned: Unassigned » twistor

hmm yes.

kenorb’s picture

Expanded patch and attached below.

kenorb’s picture

Same as #11, but with a small fix:

                 $extensions = array_keys($mapping["extensions"], $ext_id);
                 foreach ($extensions as $extension) {
-                  if (array_search($extension, $this->allowedExtensions)) {
+                  if (in_array($extension, explode(' ', $this->allowedExtensions), TRUE)) {
                     $filename .= ".$extension";

This is needed because $this->allowedExtensions is a string, not an array.

Tested Feed import and it works fine.
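
A small sketch of why the in_array() form is the safer condition, assuming the allowed extensions are stored as a space-separated string (the value 'jpg png gif' is made up for the example). Even once the string is split into an array, array_search() returns the matching key, and for the first entry that key is 0, which evaluates to FALSE in an if statement.

```php
<?php
// Assumed format: allowed extensions as a space-separated string.
$allowed = 'jpg png gif';
$list = explode(' ', $allowed);

// array_search() returns the key of the match. For the first element the key
// is 0, so using the result directly as a condition wrongly rejects 'jpg'.
$key = array_search('jpg', $list);
var_dump($key);         // int(0)
var_dump((bool) $key);  // bool(false)

// in_array() returns a boolean, so it can be used as a condition as-is.
$found = in_array('jpg', $list, TRUE);
var_dump($found);       // bool(true)
```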

c.dan’s picture

Downloaded latest dev version + applied patch #12. Doesn't save the file.
My links are in this format:
http://picscdn.redblue.de/doi/pixelboxx-mss-62050889/fee_786_587_png/101-Dalmatinerna-Barn-DVD

fougere’s picture

Small modification to the previous patch.

The allowed extension check was done before retrieving the filename from the HTTP headers.
So it fails for URLs that do not already end with the correct extension, like the one in comment #13.
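
The ordering issue can be sketched roughly as follows. This is an illustration only, not the code from the patch: example_resolve_filename() and the header name 'cover.png' are hypothetical, and the allowed extensions are assumed to be a space-separated string.

```php
<?php
/**
 * Hypothetical helper: picks a file name first, then validates its extension.
 *
 * @param string $url_name
 *   Name derived from the URL.
 * @param string|null $header_name
 *   Name found in the HTTP headers, or NULL if there was none.
 * @param string $allowed
 *   Space-separated list of allowed extensions.
 *
 * @return string|false
 *   The resolved file name, or FALSE if its extension is not allowed.
 */
function example_resolve_filename($url_name, $header_name, $allowed) {
  // Step 1: prefer the name from the HTTP headers, if there is one.
  $filename = ($header_name !== NULL) ? $header_name : $url_name;

  // Step 2: only now check the extension against the allowed list.
  $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
  if (!in_array($extension, explode(' ', $allowed), TRUE)) {
    return FALSE;
  }
  return $filename;
}

// The URL part has no extension, but the header supplies one.
var_dump(example_resolve_filename('101-Dalmatinerna-Barn-DVD', 'cover.png', 'jpg png gif'));
// Without a header name there is no extension, so the check fails.
var_dump(example_resolve_filename('101-Dalmatinerna-Barn-DVD', NULL, 'jpg png gif'));
```

Running the extension check against the URL name alone, before step 1, would reject both calls.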

Status: Needs review » Needs work

The last submitted patch, 14: feeds-get_filename_from_http_headers-1104378-9.patch, failed testing.

fougere’s picture

Unit test fix. The watchdog message for invalid extensions was not the same as before.

AlexKirstenZA’s picture

Adjusted the patch submitted in comment #16, in order to allow NewsCred images to be imported into image fields.

MegaChriz’s picture

+++ b/plugins/FeedsParser.inc
@@ -171,20 +171,6 @@ abstract class FeedsParser extends FeedsPlugin {
-
-  /**
-   * Returns if the parsed result can have a title.
-   *
-   * Parser classes should override this method in case they support a source
-   * title.
-   *
-   * @return bool
-   *   TRUE if the parsed result can have a title.
-   *   FALSE otherwise.
-   */
-  public function providesSourceTitle() {
-    return FALSE;
-  }

@@ -741,43 +802,6 @@ class FeedsDateTime extends DateTime {
   /**
-   * Helper function to prepare the object during serialization.
-   *
-   * We are extending a core class and core classes cannot be serialized.
-   *
-   * Ref: http://bugs.php.net/41334, http://bugs.php.net/39821
-   */
-  public function __sleep() {
-    $this->_serialized_time = $this->format('c');
-    $this->_serialized_timezone = $this->getTimezone()->getName();
-    return array('_serialized_time', '_serialized_timezone');
-  }
-
-  /**
-   * Upon unserializing, we must re-build ourselves using local variables.
-   */
-  public function __wakeup() {
-    $this->__construct($this->_serialized_time, new DateTimeZone($this->_serialized_timezone));
-  }
-
-  /**
-   * Returns the string representation.
-   *
-   * Will try to use the literal input, if that is a string. Fallsback to
-   * ISO-8601.
-   *
-   * @return string
-   *   The string version of this DateTime object.
-   */
-  public function __toString() {
-    if (is_scalar($this->originalValue)) {
-      return (string) $this->originalValue;
-    }
-
-    return $this->format('Y-m-d\TH:i:sO');
-  }
-
-  /**

@@ -891,7 +915,6 @@ class FeedsDateTime extends DateTime {
-

@@ -909,16 +932,12 @@ function feeds_to_unixtime($date, $default_value) {
-
-  if ($date instanceof FeedsDateTimeElement) {
+  elseif (is_string($date) && !empty($date)) {
+    $date = new FeedsDateTimeElement($date);
     return $date->getValue();
   }
-
-  if (is_string($date) || is_object($date) && method_exists($date, '__toString')) {
-    if ($date_object = date_create(trim($date))) {
-      return $date_object->format('U');
-    }
+  elseif ($date instanceof FeedsDateTimeElement) {
+    return $date->getValue();
   }
-

The last patch seems to undo a lot of other changes made in other issues.