We have basically took the public/private/temporary schemes and presumed the world is wired like that. It's absolutely not. Let's say you have foo://this/that/whatever.jpg
. We presume there is some relation between that and foo://this/that
. This assumption has no grounds, whatsoever. It is perfectly possible that this/that/whatever.jpg
is an identifier in and itself and that's it. Whether such a scheme allows iterating the "directories" or not does not effect Drupal core because we have no code that iterates the directories after some content (all we are iterating are the local dirs for code).
Fixing this does not need much code wise just moving some code out of file.inc and into stream_wrappers.inc and -- some API change, alas.
Comment | File | Size | Author |
---|---|---|---|
#44 | 701358-44.patch | 14.14 KB | aspilicious |
#43 | 701358-43.patch | 8.44 KB | chx |
#36 | steam_wrappers_get_target.patch | 8.11 KB | quicksketch |
#34 | steam_wrappers_get_target.patch | 8.11 KB | quicksketch |
#28 | steam_wrappers_get_target.patch | 8.09 KB | quicksketch |
Comments
Comment #1
aaron CreditAttribution: aaron commentedYes, from conversations in IRC, I agree there are some mistaken assumptions in FileAPI. A low-hanging fruit would be file_uri_target() -- having the function call a stream wrapper class defined function, rather than the calls within the class calling out to that. There's no way that file_uri_target() as it's written will be valid for any wrapper other than public & private, except by chance, and that's only because those classes are already massaging it.
I would suggest we add a ->uriTarget() function in the stream wrapper implementations, and have file_uri_target() call it. That should cover any future cases where the URI & directory structure don't necessarily match.
Comment #2
aaron CreditAttribution: aaron commentedLooking at it in more detail, I'm not even entirely sure what file_uri_target is trying to do that's different than the current ->getExternalUrl()? From the descriptions, I'd expect both to do the same thing. However, file_uri_target() is doing what? Just stripping off the stream. Really bad in general -- if I call file_uri_target() on 'youtube://v/-jubiv7QUco' (which is a valid stream already existing in contrib), I get the completely worthless 'v/-jubiv7QUco', rather than the more useful 'http://www.youtube.com/watch?v=-jubiv7QUco' from ->getExternalUrl(). That's a big huh? in my book...
Comment #3
aaron CreditAttribution: aaron commentedOf course, that's not the best example, being a read-only stream. A more useful example would be when we've more fully implemented the YouTube stream wrapper so you could upload videos as well, or chx's hash stream in the works.
Comment #4
chx CreditAttribution: chx commentedYeah file_uri_target needs to die a fiery death and everything calling it needs the surrounding code to be moved to the stream wrapper, that's what I was trying to say.
Comment #5
pwolanin CreditAttribution: pwolanin commentedtarget and URL are quite different though e.g. a private file it would be a file system path, but the URL would be like system/files
Comment #6
aaron CreditAttribution: aaron commentedSo the first thing that would need to be changed would be the documentation: sounds like a 'target' would mean the place where a file resides within a particular stream?
So that public://texts/my-text.txt would return something like sites/default/files/texts/my-text.txt, private://texts/my-text.txt might be ../private/files/texts/my-text.txt, cdn://texts/my-text.txt could be e3scfe778/my-text.txt, etc?
I'm trying to grasp the intentions of this function, and how it might be used.
Comment #7
aaron CreditAttribution: aaron commentedAnd going further, it seems as though there could be valid cases where a stream might not want to disclose (or have relevant information) about the target as well, such as for many read-only streams, in which case it may be perfectly acceptable for file_uri_target() to return FALSE.
Comment #8
aaron CreditAttribution: aaron commentedalthough that's not entirely correct either -- in this case both public:// and private:// would return the relative directory, not the absolute directory, assuming the target is meant to be relative to the stream itself (which must be the case for it to have any meaning). which explains how the mistaken assumption was made in the first place.
Comment #9
aaron CreditAttribution: aaron commentedfile_stream_wrapper_uri_normalize() is bad too -- makes similar assumptions. seems as though it should be relegating to getExternalURL rather than building it manually. i'll make a fix for that as well in the forthcoming patch.
Comment #10
aaron CreditAttribution: aaron commenteddoh, misread that function... any case, i'll fix and submit a patch soon...
Comment #11
aaron CreditAttribution: aaron commentedhere's the first try. testing locally as we speak, but it looks good at first glance.
Comment #12
aaron CreditAttribution: aaron commentedI'm not certain about the implementations of hook_download: image_download & user_download i believe are in that patch. This patch simply replicates their existing functionality, so will work as it did before, but I'm not entirely certain that implementation was correct.
This is an extensive patch, so I'm sure there can be more improvements as well.
Comment #13
chx CreditAttribution: chx commentedLooking at the patch, the normalize and the dirname functions, after realizing there is a valid scheme needs to hand over to the wrapper. Any sort of mucking around with "directory_path" is simply meaningless. On the other hand, it seems I was wrong and file_uri_target could become a handy wrapper for
file_stream_wrapper_get_instance_by_scheme(file_uri_scheme($uri))->getTarget($uri);
Comment #14
aaron CreditAttribution: aaron commentedi *think* this addresses the concerns.
Comment #15
c960657 CreditAttribution: c960657 commentedI'm a bit confused about the relationship between DrupalStreamWrapperInterface::realpath(), DrupalStreamWrapperInterface::getTarget() and DrupalLocalStreamWrapper::getLocalPath().
The term “physical location” needs to be explained. If it is a path, it should be explained whether it is relative or absolute.
The comment only talks about forward slashes, but also backslashes are trimmed.
“All”? AFAICT the URI passed to getLocalPath() should never include the directory path. In any case, it should only strip the first occurrences. If the temporary wrapper is configured to use e.g. /tmp as base, it is not unlikely that the directory path would appear several times in a filename (e.g. temporary://foo/tmp.txt that maps to /tmp/foo/tmp.txt).
Comment #16
aaron CreditAttribution: aaron commented"it is not unlikely that the directory path would appear several times in a filename"
I agree, and scratched my head over that too. I'd opted to maintain the previous behavior, but agree it should be pulled. I'll try to get a new patch tomorrow.
Thanks for the review!
Comment #17
chx CreditAttribution: chx commentedThat whole remove the directory part smells like a complete kludge to me and a vestige from the pre-Drupal 7 era something so ancient and dark kept over the (not too many) file API revisions noone knows any more why it's there... I will do a cvs annotate run later.
Edit: no dice. It does not seem to exist before the stream wrapper conversion.
Comment #18
aaron CreditAttribution: aaron commentedhow about this?
Comment #19
pwolanin CreditAttribution: pwolanin commentedSeems a little weird to call getTarget() here for the local implementation - it's one like of code basically. But that's minor.
Also, it would be clearer to use list() with the explode like:
Comment #20
chx CreditAttribution: chx commented@pwolanin, it's quite imporant to call getTarget because that allows extending -- the whole issue grow out of hash_wrapper, check its code :)
getExternalUrl() needs to change function getExternalUrl() to self::uriTarget because otherwise it creates another hash wrapper instance which is quite honestly braindead and slow and memory hog and whatnot.
Comment #22
pwolanin CreditAttribution: pwolanin commentedtestbot seems braindead recently...
Here's a patch with minor improvements to the use of explode.
@chx - I don't follow your last comment about what change is needed.
Comment #24
chx CreditAttribution: chx commentedSo the failing test was testing functionality which we removed in this issue. Fix is easy: remove it. It's a fantastic piece... one of the things
file_stream_wrapper_uri_normalize
function did "Remove all occurrences of the wrapper's directory path" -- what directory path?? the local wrappers have directory path, otherwise this functionality can not exist as there is simply no directory path .Comment #25
drewish CreditAttribution: drewish commentedsubscribing, interesting.
Comment #26
drewish CreditAttribution: drewish commentedWe don't seem to be using the @ingroup php_wrappers else where in that patch.
I'm scratching my head about what this means:
Indenting for the docblock on dirname() has an extra space.
Could we say something about how this doesn't work?
I'd also like to make it clear when this is called so stream wrapper implementors know what constitues a sane return value. realpath() has a similar conceptual problem and it provides a bit more detail but not really enough for my liking.
I think we could rephrase this:
"Remove erroneous leading or trailing, forward-slashes and backslashes."
I feel like DrupalLocalStreamWrapper::getDirectoryPath() should become protected so that it's clear that it's not for core or other modules to use and that it's... and while I'm asking does anyone know why DrupalLocalStreamWrapper::$context and $handle are public? Seems like they should be protected.
Comment #27
drewish CreditAttribution: drewish commentedSo to be more specific, I like the proposed changes to file_uri_target() to delegate to the wrapper.
file_stream_wrapper_uri_normalize() seems sane though DrupalStreamWrapperInterface::getTarget() needs to better define its differences from dirname() and realpath().
We should improve the docs for DrupalStreamWrapperInterface::getUri() to make it clear that the URI has scheme and target components and add @see tags to file_uri_scheme() and ::getTarget() functions that would produce the components if they need those.
Comment #28
quicksketchHere's a reroll addressing as much of drewish's comments as I understood them. Absolutely no code changes from the previous patch, just updated docs and formatting issues.
- Fixed all indenting on docblocks.
- Clarified getTarget() docs as I understand them, including a better function description and more concise return description.
- Instead of going in-depth with the dirname() method documentation, I simply point over to drupal_dirname() from which this method is called.
- Rephrased the slashes/backslashes comment per drewish's suggestion.
I did not make getDirectoryPath() protected or do anything with $context and $handle, since it seems like all the problem absolutely central to this issue have already been addressed, we're just looking at formatting/docs issues at this point.
Comment #29
aaron CreditAttribution: aaron commentedthanks for the dox, quicksketch, looks good now
i agree with drewish that some of those functions should probably be protected. however, i think that should be another issue, as it's unrelated to this, and i would rather see it splode elsewhere...
i would mark as rtbc, except i'm not currently able to apply the patch. might be because my system is in the middle updating to lucid lynx, and lots of things are wonky for me right now, and i wouldn't be surprised if that program has been affected as well...
Comment #30
aaron CreditAttribution: aaron commentedhelps if i use -p0... so yes, it all seems to work for me now.
Comment #31
drewish CreditAttribution: drewish commentedThe docs for getTarget() don't make much sense to me:
I'm not sure what the read-only stuff means. Can it really return a URI? Why would it do that? Why does it only work for writeable streams?
The docs for dirname() also need some work:
usually accessed? How about:
This method is typically called by drupal_dirname() to determine the directory that the file exists in.
The various versions of dirname() don't even seem to agree on the convention for @return values. Some return empty strings, some FALSE.
I'll try to spend some time on this this evening.
Comment #32
quicksketchHmm, unless drewish has objections, let's move this back to RTBC. Right now we're being held up on extremely minor points of documentation and there are absolutely no further code changes being suggested. The overall structure makes a lot more sense to me than what we have currently and I already went through docs and corrected them to make sense from my perspective.
Comment #33
Dries CreditAttribution: Dries commentedShould be 'accessed' instead of 'access'?
Should be 'accessed' instead of 'access'?
Requires another reroll.
Comment #34
quicksketchGrammar re-roll.
Comment #36
quicksketchAnother attempt at a re-roll.
Comment #37
chx CreditAttribution: chx commentedBack to RTBC, thne
Comment #38
Dries CreditAttribution: Dries commentedPersonally, I still find this confusing. For example, the difference between dirname() and getDirectoryPath(), both part of the same class, is not instantly clear? Any why do these functions have such a different signature -- one starting with getXXX() and the other not?
While I agree that this needs to be fixed, the API is a bit of a WTF IMO. We might be able to resolve this by picking better function names.
Anyone has any thoughts on the function names?
Comment #39
chx CreditAttribution: chx commentedComment #40
drewish CreditAttribution: drewish commentedDries, the real problem with those APIs can be traced back to PHP's interface (StreamWrapperInterface) which is a) some what inconsistent in its naming and b) uses lowercase names with underscores to word separators. In some places we follow the PHP style and in some places we follow Drupal's camel case conventions.
Comment #41
chx CreditAttribution: chx commentedTentatively back to RTBC -- also we want to change the API as little as possible this late.
Comment #42
Dries CreditAttribution: Dries commentedI'm happy to commit this 'as is' given where we are in the game. However, could we add a @todo to fix the API in D8?
Comment #43
chx CreditAttribution: chx commentedAdded TODO.
Comment #44
aspilicious CreditAttribution: aspilicious commentedChx his patch with some extra newlines in the streamwrapper to make the docs consistent.
If this is to much at once you can commit #43.
Afterwards I'll roll a new one afterwards with only the newlines.
If you can handle some newlines in this patch feel free to commit this one.
Comment #45
Dries CreditAttribution: Dries commentedThis looks good to me. I'll commit it in the morning -- let's wait for the test bot.
Comment #46
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks.
Comment #47
JacobSingh CreditAttribution: JacobSingh commentedTry this:
The result? Fatal error:
The file_stream_wrapper_get_class() function only returns "public" and temporary" and doesn't include PHP defined wrappers. I'm not sure exactly what is the desired fix here, but this looks bad to me.
Comment #48
chx CreditAttribution: chx commentedThis looks fine to me. That function accepts Drupal stream wrappers. PHP stream wrappers IMO need not to apply? That would be a feature request...
Comment #49
JacobSingh CreditAttribution: JacobSingh commentedDocs say:
/**
* Returns the target of a URI (e.g. a stream).
*
* @param $uri
* A stream, referenced as "scheme://target".
* @return
* A string containing the target (path), or FALSE if none.
* For example, the URI "public://sample/test.txt" would return
* "sample/test.txt".
*
* @see file_uri_scheme()
*/
I would assume from that it takes any URI and returns either a string or FALSE.
Many other file_* functions which take a $uri component are happy to take any supported stream, be it flickr:// ftp:// http:// s3://.
It doesn't say that it @throws a method does not exist error if you pass in a valid uri which Drupal doesn't know about (but the rest of the world does, like http).
I'll make a patch, but I'm recommending that we change docs to not say it just takes any URI, but a "Drupal uri" or whatever we should be calling it so it is clear what those are and where to find them. Also, I think if it doesn't find a class, it should stop there and either throw a reasonable exception which explains that the input was bad.
Reasonable?
Comment #51
Damien Tournoud CreditAttribution: Damien Tournoud commentedThis patch doesn't make any sense to me.
DrupalStreamWrapperInterface::getTarget() looks like:
I don't understand how those two are related, and my general feeling is that this whole patch should be reverted.
Comment #52
andypostalso #49 is not fixed
Comment #53
rfaysubscribing to read later.
Comment #54
Damien Tournoud CreditAttribution: Damien Tournoud commentedNow that #858528: file_uri_target() purpose is unclear is in, everyone is happy and all is good.
Comment #55
andypostFiled follow-up to remove added TODO #3239831: Remove outdated todo in \Drupal\Core\StreamWrapper\LocalStream::getDirectoryPath()