If I have a link to an image derivative (images that get created by applying the effects of an image style) and the derivative has not yet been created, Pathologic handles the link as if it is a Drupal link and adds the language prefix.
This error won't be noticed easily because these links arrive at Drupal and are handled by the image module (because the language part is stripped off by then). The image module will detect that the image already exists and will deliver it. However, this leads to a full Drupal bootstrap and less optimized file transfer then when the call would have been handled directly by the web server.
I think that everything under sites/default/files (or actually the directory the public:\\ scheme points to) should be handled as a file, as not only the image module generates files under that directory (aggregated css and js, js language files). Note that also links to the private file system should be seen as files, though in that case you will always get a full Drupal bootstrap to determine access rights and transfer the file from a directory (that is hopefully outside the Drupal root, outside the document root and even not reachable for the webserver (via aliases)).
Note that the idea of user azinck in #1763696: Paths outside of Drupal root incorrectly detected to use menu_get_item() to see if a link points to a file is (further) invalidated by this issue.
| Comment | File | Size | Author |
|---|---|---|---|
| #5 | pathologic_should-1958122-5.patch | 1.52 KB | fietserwin |
Comments
Comment #1
Garrett Albright commentedThe problem with treating a path like it's a file is that, on servers not using clean URLs, the path won't get the ?q= parameter added to the URL. Which is generally fine and dandy, except that now there's no way for Drupal to notice that the derivative needs to be created, and any query to that path will result in a 404 from the daemon. Welcome to Pathologicville; Population 100,000 Edge Cases.
I suppose a fix for Pathologic could be to not pass a language to url() if it looks like the path that will be created will be under the public files directory. There could theoretically be edge cases where an image derivative filter could be language-dependent, but that would probably be pretty rare…
For now, if you're really worried about this and have some development chops, you could implement hook_pathologic_alter() to tell Pathologic to stop adding language prefixes to file paths, at least until this unmotivated developer gets around to fixing it properly. See the pathologic.api.php file.
Comment #2
fietserwinI'll try to have a look at it. I will assign the issue to myself when I really start working on it.
Comment #3
Garrett Albright commentedCurrently, Pathologic tries to be smarter about not trying to add a language prefix if it looks like the path is to a file. Is this still a problem?
Comment #4
fietserwinOK, I gave it a try as I ran again into this problem. Attached patch solves the problems I had with the current dev version of pathologic:
1 patch is just for this issue, the "full" patch solves:
- this issue.
- contains the most recent patch (and thus solves) as in #348421: Multilingual links with Pathologic.
- solves something that, I think, resembles or is equal to #2071695: local URLs don't work when Drupal is in a subdirectory.
To explain the last point:
The problem is thus that due to the giant if statement the order is not preserved. To solve, I think it is better that the order of local paths (as configured) should not be changed so that site managers can put sub-paths before parent paths (this is a bit difficult to program.
To not add the current paths twice, I added some variables that keeps track of that, though I doubt if it really matters (besides a slightly longer matching loop), it would further clean up though.
Comment #5
fietserwinAttached a rerolled patch (only linenumbers and a some doc punctuation and only or this issue, not a combined patch like above).
Furthermore, I think that #1 is not true:
- Links to image derivatives are always generated like sites/default/files/styles/large/my-image.jpg. This to prevent a full Drupal bootstrap when the file already exists. this is also on systems that do not use clean URL's.
- When the image derivative (or any other file in the public files directory that gets created on first access) gets requested and the file does not yet exist, Drupal gets bootstrapped and the function drupal_environment_initialize() will assign the q parameter. See that function and the comment in .htaccess:
Thus, Drupal will create the derivative, even on non clean URL systems.
Therefore this patch will work on both clean and non-clean URL systems. So, I would like to ask you to consider it for committing.
Comment #6
fietserwinThe bug mentioned in comment #4 has been filed separately under #2545396: Pathologic fails to correctly recognize the current local path when on a sub-directory.
Comment #7
herved commentedHello,
Thanks for the patch.
I have an issue (after a migration) where all paths going to files in content are as follows: sites/default/files/...
However my public directory ended up being sites/something/files/... which I didn't expect.
So additionally to checking if the path starts with the public files directory I suggest to apply a regular expression on the path:
something like
preg_match('|^sites/.+?/files/.+|i', $path)It's an edge case but this doesn't hurt and is quite safe.
What do you think?
Thank you,
Hervé
Comment #8
fietserwinDoesn't seem to be a good idea to me.
The public files directory is a system variable and doesn't have to comply to your regexp at all. Might it be that this setting was corrupted during your migration? Or did you go from a multi to a single site install?
The image derivatives get created using the value of the public file system variable and all other modules that do crate files that have to be accessible should do so as well. If not the error is on their side.
So let's stay with using the public file directory.
If you insist and want to check for both: I don't get the ? in the middle of the regexp and the .+ (or at least the +) at the end seem superfluous (you don't have an "end of string" marker).
Comment #9
Garrett Albright commentedIt's the non-greedy operator. It stops the . from matching the /files/ that follows. For example, if you have /a(.+)c/ and you're matching against "abcc", the capture will be "bc", but if you change it to /a(.+?)c/ the capture will be just "b" because the regex engine sees that the first "c" matches the next part of the pattern, so it doesn't let the dot be greedy and match it. (It's so totally awesome how regex has different meanings for its own special characters depending on their position like this, right?)
That being said, I side with fietserwin here - we can't assume that a site's old files directory was "sites/something/files", even though that will be the case 95% of the time at least with a D6 or D7 site. For your case, implementing hook_pathologic_alter() would probably be the best approach. See the pathologic.api.php file in the Pathologic module directory for more info - it's pretty straightforward.
Comment #10
herved commentedHello and thanks for you explanation.
I'm coming from a single and going to a multisite instance.
Indeed hook_pathologic_alter() would be a good solution for me but I don't see any way to mark the url as a file.
Any suggestion are welcome.
Comment #11
Garrett Albright commentedHmm, yeah, there's no way to alter "this is a file" from an alter implementation, is there? That's an oversight. Whelp, still interested in giving us a patch? :P
Comment #12
fietserwin@herved: it seems you have a problem that is bigger than just this issue. You have content pointing to sites/default/files/... but the resources are not there anymore and won't be created there anymore, so I think you better change all your content (export sql dump, use text editor to search and replace, import sql dump) to point to the new public files directory and move all you files over there. Maybe you could also just create a link at the file system level.
Comment #13
herved commentedHello and thanks for your time.
I finally decided to do it as follows:
I saw the LANGUAGE_NONE trick above is used in pathologic.module:386 (on version 7.x-3.1) so I used the same approach for now but tagging an url as a file (using 'is_file') would be the best. I'll try to find some time after new year to submit a patch for this.
Eventually I should indeed replace all occurrences of sites/default/files in DB. But that's actually another issue (on my side).
Happy new year ;)
Comment #14
fietserwinThis almost sounds like you would be better off writing your own custom filter, I guess the boiler plate code shouldn't be too difficult nor the filter itself (that would even be very easy). But if this is temporary, so you can continue working on your site, this solution will do as well.
Comment #15
fietserwinSo, this issue is back to Needs review for the patch in #5.
Comment #16
Garrett Albright commentedfietserwin, could you step me through some steps with which to test your patch? Give me an exact scenario which is broken before the patch and fixed after it.
Comment #17
fietserwinEnvironment:
<img alt="" class="image-thumbnail" src="/drupal7-test/sites/default/files/styles/thumbnail/public/field/image/20120914_151628.jpg?itok=Pkq1OKdU" height="113" width="150"><img alt="" class="image-thumbnail" src="/drupal7-test/en/sites/default/files/styles/thumbnail/public/field/image/20120914_151628.jpg?itok=Pkq1OKdU" height="113" width="150">Note the .../en/...<img alt="" class="image-thumbnail" src="/drupal7-test/sites/default/files/styles/thumbnail/public/field/image/20120914_151628.jpg?itok=Pkq1OKdU" height="113" width="150">Note that the .../en/... is gone.Thus this patch tries to prevent adding a language prefix to links that point to an image derivative even if that image derivative has not yet been created.
Notes:
Comment #19
Garrett Albright commentedOkay, thanks for that - that helped. I could replicate the problem, though I couldn't entirely test the solution because I couldn't figure out a simple way for Drupal to give me the path of an image derivative with a correct "itok" parameter. I could see where Pathologic stopped affecting the URL, though.
Patch in #5 accepted and pushed. Thanks!
Comment #20
fietserwinThanks for adding this patch.
BTW: You can test image derivatives without itok by adding this line to your settings.php:
$conf['image_allow_insecure_derivatives'] = TRUE;