We have basically took the public/private/temporary schemes and presumed the world is wired like that. It's absolutely not. Let's say you have foo://this/that/whatever.jpg. We presume there is some relation between that and foo://this/that. This assumption has no grounds, whatsoever. It is perfectly possible that this/that/whatever.jpg is an identifier in and itself and that's it. Whether such a scheme allows iterating the "directories" or not does not effect Drupal core because we have no code that iterates the directories after some content (all we are iterating are the local dirs for code).

Fixing this does not need much code wise just moving some code out of file.inc and into stream_wrappers.inc and -- some API change, alas.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

aaron’s picture

Yes, from conversations in IRC, I agree there are some mistaken assumptions in FileAPI. A low-hanging fruit would be file_uri_target() -- having the function call a stream wrapper class defined function, rather than the calls within the class calling out to that. There's no way that file_uri_target() as it's written will be valid for any wrapper other than public & private, except by chance, and that's only because those classes are already massaging it.

I would suggest we add a ->uriTarget() function in the stream wrapper implementations, and have file_uri_target() call it. That should cover any future cases where the URI & directory structure don't necessarily match.

aaron’s picture

Looking at it in more detail, I'm not even entirely sure what file_uri_target is trying to do that's different than the current ->getExternalUrl()? From the descriptions, I'd expect both to do the same thing. However, file_uri_target() is doing what? Just stripping off the stream. Really bad in general -- if I call file_uri_target() on 'youtube://v/-jubiv7QUco' (which is a valid stream already existing in contrib), I get the completely worthless 'v/-jubiv7QUco', rather than the more useful 'http://www.youtube.com/watch?v=-jubiv7QUco' from ->getExternalUrl(). That's a big huh? in my book...

aaron’s picture

Of course, that's not the best example, being a read-only stream. A more useful example would be when we've more fully implemented the YouTube stream wrapper so you could upload videos as well, or chx's hash stream in the works.

chx’s picture

Yeah file_uri_target needs to die a fiery death and everything calling it needs the surrounding code to be moved to the stream wrapper, that's what I was trying to say.

pwolanin’s picture

target and URL are quite different though e.g. a private file it would be a file system path, but the URL would be like system/files

aaron’s picture

So the first thing that would need to be changed would be the documentation: sounds like a 'target' would mean the place where a file resides within a particular stream?

So that public://texts/my-text.txt would return something like sites/default/files/texts/my-text.txt, private://texts/my-text.txt might be ../private/files/texts/my-text.txt, cdn://texts/my-text.txt could be e3scfe778/my-text.txt, etc?

I'm trying to grasp the intentions of this function, and how it might be used.

aaron’s picture

And going further, it seems as though there could be valid cases where a stream might not want to disclose (or have relevant information) about the target as well, such as for many read-only streams, in which case it may be perfectly acceptable for file_uri_target() to return FALSE.

aaron’s picture

although that's not entirely correct either -- in this case both public:// and private:// would return the relative directory, not the absolute directory, assuming the target is meant to be relative to the stream itself (which must be the case for it to have any meaning). which explains how the mistaken assumption was made in the first place.

aaron’s picture

file_stream_wrapper_uri_normalize() is bad too -- makes similar assumptions. seems as though it should be relegating to getExternalURL rather than building it manually. i'll make a fix for that as well in the forthcoming patch.

aaron’s picture

doh, misread that function... any case, i'll fix and submit a patch soon...

aaron’s picture

Status: Active » Needs review
FileSize
13.43 KB

here's the first try. testing locally as we speak, but it looks good at first glance.

aaron’s picture

I'm not certain about the implementations of hook_download: image_download & user_download i believe are in that patch. This patch simply replicates their existing functionality, so will work as it did before, but I'm not entirely certain that implementation was correct.

This is an extensive patch, so I'm sure there can be more improvements as well.

chx’s picture

Status: Needs review » Needs work

Looking at the patch, the normalize and the dirname functions, after realizing there is a valid scheme needs to hand over to the wrapper. Any sort of mucking around with "directory_path" is simply meaningless. On the other hand, it seems I was wrong and file_uri_target could become a handy wrapper for file_stream_wrapper_get_instance_by_scheme(file_uri_scheme($uri))->getTarget($uri);

aaron’s picture

Status: Needs work » Needs review
FileSize
6.91 KB

i *think* this addresses the concerns.

c960657’s picture

Status: Needs review » Needs work

I'm a bit confused about the relationship between DrupalStreamWrapperInterface::realpath(), DrupalStreamWrapperInterface::getTarget() and DrupalLocalStreamWrapper::getLocalPath().

   /**
+   * Returns the local target of the resource within the stream.
+   *
+   * @param $uri
+   *   Optional URI.
+   *
+   * @return
+   *  Returns a string representing the physical location of the file or
+   *  resource as it is referenced within the stream, or FALSE if that
+   *  information is inaccessible or irrelevant.
+   */
+  public function getTarget($uri = NULL);

The term “physical location” needs to be explained. If it is a path, it should be explained whether it is relative or absolute.

+    // Remove erroneous beginning or ending forward slashes.
+    $data[1] = trim($data[1], '\/');

The comment only talks about forward slashes, but also backslashes are trimmed.

+    // Remove all occurrences of the wrapper's directory path.
+    $data[1] = str_replace($this->getDirectoryPath(), '', $data[1]);

“All”? AFAICT the URI passed to getLocalPath() should never include the directory path. In any case, it should only strip the first occurrences. If the temporary wrapper is configured to use e.g. /tmp as base, it is not unlikely that the directory path would appear several times in a filename (e.g. temporary://foo/tmp.txt that maps to /tmp/foo/tmp.txt).

aaron’s picture

"it is not unlikely that the directory path would appear several times in a filename"

I agree, and scratched my head over that too. I'd opted to maintain the previous behavior, but agree it should be pulled. I'll try to get a new patch tomorrow.

Thanks for the review!

chx’s picture

That whole remove the directory part smells like a complete kludge to me and a vestige from the pre-Drupal 7 era something so ancient and dark kept over the (not too many) file API revisions noone knows any more why it's there... I will do a cvs annotate run later.

Edit: no dice. It does not seem to exist before the stream wrapper conversion.

aaron’s picture

Status: Needs work » Needs review
FileSize
6.76 KB

how about this?

pwolanin’s picture

+  public function dirname($uri = NULL) {
+    $data = explode('://', $uri, 2);
+    $target  = $this->getTarget($uri);
+    $dirname = dirname($target);
+
+    if ($dirname == '.') {
+      $dirname = '';
+    }
+
+    return $data[0] . '://' . $dirname;
+  }

Seems a little weird to call getTarget() here for the local implementation - it's one like of code basically. But that's minor.

Also, it would be clearer to use list() with the explode like:

list($scheme, $path) = explode('://', $uri, 2);
chx’s picture

@pwolanin, it's quite imporant to call getTarget because that allows extending -- the whole issue grow out of hash_wrapper, check its code :)

getExternalUrl() needs to change function getExternalUrl() to self::uriTarget because otherwise it creates another hash wrapper instance which is quite honestly braindead and slow and memory hog and whatnot.

Status: Needs review » Needs work

The last submitted patch, stream_wrapper_getTarget.701358.18.patch, failed testing.

pwolanin’s picture

Status: Needs work » Needs review
FileSize
6.64 KB

testbot seems braindead recently...

Here's a patch with minor improvements to the use of explode.

@chx - I don't follow your last comment about what change is needed.

Status: Needs review » Needs work

The last submitted patch, stream_wrapper_getTarget.701358-22.patch, failed testing.

chx’s picture

Status: Needs work » Needs review
FileSize
7.64 KB

So the failing test was testing functionality which we removed in this issue. Fix is easy: remove it. It's a fantastic piece... one of the things file_stream_wrapper_uri_normalize function did "Remove all occurrences of the wrapper's directory path" -- what directory path?? the local wrappers have directory path, otherwise this functionality can not exist as there is simply no directory path .

drewish’s picture

subscribing, interesting.

drewish’s picture

We don't seem to be using the @ingroup php_wrappers else where in that patch.

I'm scratching my head about what this means:

+   * @return
+   *  Returns a string representing the write location or path of the file or
+   *  resource as it is referenced within the stream; or FALSE if that
+   *  information is inaccessible or irrelevant, for instance with read-only
+   *  streams.
+   */
+  public function getTarget($uri = NULL);

Indenting for the docblock on dirname() has an extra space.

Could we say something about how this doesn't work?

+    * PHP's dirname() does not properly pass streams, so this function fills
+    * that gap.

I'd also like to make it clear when this is called so stream wrapper implementors know what constitues a sane return value. realpath() has a similar conceptual problem and it provides a bit more detail but not really enough for my liking.

I think we could rephrase this:

+    // Remove erroneous beginning or ending forward and backslashes.
+    return trim($target, '\/');

"Remove erroneous leading or trailing, forward-slashes and backslashes."

I feel like DrupalLocalStreamWrapper::getDirectoryPath() should become protected so that it's clear that it's not for core or other modules to use and that it's... and while I'm asking does anyone know why DrupalLocalStreamWrapper::$context and $handle are public? Seems like they should be protected.

drewish’s picture

Status: Needs review » Needs work

So to be more specific, I like the proposed changes to file_uri_target() to delegate to the wrapper.

file_stream_wrapper_uri_normalize() seems sane though DrupalStreamWrapperInterface::getTarget() needs to better define its differences from dirname() and realpath().

We should improve the docs for DrupalStreamWrapperInterface::getUri() to make it clear that the URI has scheme and target components and add @see tags to file_uri_scheme() and ::getTarget() functions that would produce the components if they need those.

quicksketch’s picture

Status: Needs work » Needs review
FileSize
8.09 KB

Here's a reroll addressing as much of drewish's comments as I understood them. Absolutely no code changes from the previous patch, just updated docs and formatting issues.

- Fixed all indenting on docblocks.
- Clarified getTarget() docs as I understand them, including a better function description and more concise return description.
- Instead of going in-depth with the dirname() method documentation, I simply point over to drupal_dirname() from which this method is called.
- Rephrased the slashes/backslashes comment per drewish's suggestion.

I did not make getDirectoryPath() protected or do anything with $context and $handle, since it seems like all the problem absolutely central to this issue have already been addressed, we're just looking at formatting/docs issues at this point.

aaron’s picture

thanks for the dox, quicksketch, looks good now

i agree with drewish that some of those functions should probably be protected. however, i think that should be another issue, as it's unrelated to this, and i would rather see it splode elsewhere...

i would mark as rtbc, except i'm not currently able to apply the patch. might be because my system is in the middle updating to lucid lynx, and lots of things are wonky for me right now, and i wouldn't be surprised if that program has been affected as well...

aaron’s picture

Status: Needs review » Reviewed & tested by the community

helps if i use -p0... so yes, it all seems to work for me now.

drewish’s picture

Status: Reviewed & tested by the community » Needs work

The docs for getTarget() don't make much sense to me:

+   * This function should be used in place of calls to realpath() or similar
+   * functions when attempting to determine the location of a file. While
+   * functions like realpath() may return the location of a read-only file, this
+   * method may return a URI or path suitable for writing that is completely
+   * separate from the URI used for reading.

I'm not sure what the read-only stuff means. Can it really return a URI? Why would it do that? Why does it only work for writeable streams?

The docs for dirname() also need some work:

+   * This method is usually access through drupal_dirname(), which wraps around
+   * the normal PHP dirname() function, which does not support stream wrappers.

usually accessed? How about:
This method is typically called by drupal_dirname() to determine the directory that the file exists in.

The various versions of dirname() don't even seem to agree on the convention for @return values. Some return empty strings, some FALSE.

I'll try to spend some time on this this evening.

quicksketch’s picture

Status: Needs work » Reviewed & tested by the community

Hmm, unless drewish has objections, let's move this back to RTBC. Right now we're being held up on extremely minor points of documentation and there are absolutely no further code changes being suggested. The overall structure makes a lot more sense to me than what we have currently and I already went through docs and corrected them to make sense from my perspective.

Dries’s picture

Status: Reviewed & tested by the community » Needs work
+++ includes/stream_wrappers.inc	25 Apr 2010 04:47:01 -0000
@@ -187,6 +205,21 @@
+   * This method is usually access through drupal_dirname(), which wraps around

Should be 'accessed' instead of 'access'?

+++ includes/stream_wrappers.inc	25 Apr 2010 04:47:01 -0000
@@ -492,6 +539,31 @@
+   * This method is usually access through drupal_dirname(), which wraps around

Should be 'accessed' instead of 'access'?

Requires another reroll.

quicksketch’s picture

Status: Needs work » Needs review
FileSize
8.11 KB

Grammar re-roll.

Status: Needs review » Needs work

The last submitted patch, steam_wrappers_get_target.patch, failed testing.

quicksketch’s picture

Status: Needs work » Needs review
FileSize
8.11 KB

Another attempt at a re-roll.

chx’s picture

Status: Needs review » Reviewed & tested by the community

Back to RTBC, thne

Dries’s picture

Personally, I still find this confusing. For example, the difference between dirname() and getDirectoryPath(), both part of the same class, is not instantly clear? Any why do these functions have such a different signature -- one starting with getXXX() and the other not?

While I agree that this needs to be fixed, the API is a bit of a WTF IMO. We might be able to resolve this by picking better function names.

Anyone has any thoughts on the function names?

chx’s picture

Status: Reviewed & tested by the community » Needs work
drewish’s picture

Dries, the real problem with those APIs can be traced back to PHP's interface (StreamWrapperInterface) which is a) some what inconsistent in its naming and b) uses lowercase names with underscores to word separators. In some places we follow the PHP style and in some places we follow Drupal's camel case conventions.

chx’s picture

Status: Needs work » Reviewed & tested by the community

Tentatively back to RTBC -- also we want to change the API as little as possible this late.

Dries’s picture

I'm happy to commit this 'as is' given where we are in the game. However, could we add a @todo to fix the API in D8?

chx’s picture

FileSize
8.44 KB

Added TODO.

aspilicious’s picture

FileSize
14.14 KB

Chx his patch with some extra newlines in the streamwrapper to make the docs consistent.
If this is to much at once you can commit #43.
Afterwards I'll roll a new one afterwards with only the newlines.

If you can handle some newlines in this patch feel free to commit this one.

Dries’s picture

This looks good to me. I'll commit it in the morning -- let's wait for the test bot.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed to CVS HEAD. Thanks.

JacobSingh’s picture

Priority: Normal » Critical
Status: Fixed » Active

Try this:

file_uri_target('http://drupal.com/favicon.ico');

The result? Fatal error:

Fatal error: Call to a member function getTarget() on a non-object in /Users/jacob/work/drupal-7/includes/file.inc on line 215

The file_stream_wrapper_get_class() function only returns "public" and temporary" and doesn't include PHP defined wrappers. I'm not sure exactly what is the desired fix here, but this looks bad to me.

chx’s picture

Status: Active » Fixed

This looks fine to me. That function accepts Drupal stream wrappers. PHP stream wrappers IMO need not to apply? That would be a feature request...

JacobSingh’s picture

Docs say:

/**
* Returns the target of a URI (e.g. a stream).
*
* @param $uri
* A stream, referenced as "scheme://target".
* @return
* A string containing the target (path), or FALSE if none.
* For example, the URI "public://sample/test.txt" would return
* "sample/test.txt".
*
* @see file_uri_scheme()
*/

I would assume from that it takes any URI and returns either a string or FALSE.

Many other file_* functions which take a $uri component are happy to take any supported stream, be it flickr:// ftp:// http:// s3://.

It doesn't say that it @throws a method does not exist error if you pass in a valid uri which Drupal doesn't know about (but the rest of the world does, like http).

I'll make a patch, but I'm recommending that we change docs to not say it just takes any URI, but a "Drupal uri" or whatever we should be calling it so it is clear what those are and where to find them. Also, I think if it doesn't find a class, it should stop there and either throw a reasonable exception which explains that the input was bad.

Reasonable?

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

Damien Tournoud’s picture

Status: Closed (fixed) » Active

This patch doesn't make any sense to me.

DrupalStreamWrapperInterface::getTarget() looks like:

  • In a way it largely overlaps getLocalPath(). The feature that was added there seems specific to local filesystems extending DrupalLocalStreamWrapper, as a consequence it should not be a generic method implemented in DrupalStreamWrapperInterface
  • In other way it is responsible for returning the part after the '://', which seems to be the true purpose of file_uri_target()

I don't understand how those two are related, and my general feeling is that this whole patch should be reverted.

andypost’s picture

also #49 is not fixed

rfay’s picture

subscribing to read later.

Damien Tournoud’s picture

Status: Active » Closed (fixed)

Now that #858528: file_uri_target() purpose is unclear is in, everyone is happy and all is good.

andypost’s picture