Hello,

I *think* I've hit upon a problem when using in combination with the Insert module. This is possibly connected to #1444012: After using File Paths images in body content disapear.

If the uploaded file, and then inserted file, has spaces in the file name, the _filefield_paths_replace_path function doesn't seem to construct a proper regex value to find and replace the renamed/moved new file.

Steps to reproduce:
1) Configure node to have image field with filefield_paths and insert.
2) Configure filefield_paths with a directory and cleanup the filename using Pathauto.
3) Upload image that contains spaces in the filename.
4) Insert the image into the body text field. (I believe with or without styles, it shouldn't matter)
5) Save the node.

The resulting inserted <img> tag still uses the old, unaltered, image path.

I believe the issue is caused when _filefield_paths_replace_path() calls $info = parse_url(file_stream_wrapper_uri_normalize($old));
The string value given to prase_url is like: "public://file name with spaces.jpg" and then $info['host'] ends up containing "file name with spaces.jpg" when it needs to contain "file%20name%20with%20spaces.jpg" since that is what comes out of file_create_url($old) a few lines below.

So I think the solution is to change

function _filefield_paths_replace_path($old, $new, $entity) {
  // Build regular expression.
  $info = parse_url(file_stream_wrapper_uri_normalize($old));
  $info['path'] = !empty($info['path']) ? $info['path'] : '';
  $info['path'] = drupal_encode_path($info['path']);
  $absolute = str_replace("{$info['host']}{$info['path']}", '', file_create_url($old));

to

function _filefield_paths_replace_path($old, $new, $entity) {
  // Build regular expression.
  $info = parse_url(file_stream_wrapper_uri_normalize($old));
  $info['path'] = !empty($info['path']) ? $info['path'] : '';
  $info['path'] = drupal_encode_path($info['path']);
  $info['host'] = drupal_encode_path($info['host']);
  $absolute = str_replace("{$info['host']}{$info['path']}", '', file_create_url($old));

At least that's my first take on this. Anyone else having issues with the Insert module and files that have spaces in the name?

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

JonMcL’s picture

Possible patch attached. But needs sanity check by a maintainer.

JonMcL’s picture

Title: Issue with _filefield_paths_replace_path and spaces in old filename. » Failed regex in _filefield_paths_replace_path with spaces in old filename.

Better title (?)

JonMcL’s picture

Issue summary: View changes

Fixed link to other issue #

treksler’s picture

wait, why does $info['host'] contain the filename? seems kind of unintuitive

neRok’s picture

Status: Active » Needs work

I didn't realise the issue I made was a duplicate of this one, so closed it. #2215287: Regex not capturing links with special characters

I have done some testing, and can confirm the issues with the first section of code in _filefield_paths_replace_path

I uploaded an image "draft1a - Copy.JPG". My folder token was "regex test/[node:nid]".

Upon creating a node, $old was coming in as public://draft1a - Copy.JPG.
parse_url was processing this as
$info['scheme'] = 'public'
$info['host'] = 'draft1a - Copy.JPG'
$info['host'] currently doesn't get URL encoded, and hence the links don't get updated.

Upon 'active update'-ing this node, $old was coming in as public://regex test/32/draft1a - Copy.JPG
parse_url was processing this as
$info['scheme'] = 'public'
$info['host'] = ' regex test'
$info['info'] = '/32/draft1a - Copy.JPG'
Again, $info['host'] doesn't get URL encoded, and link update fails.

In situations where the first folder path/token doesn't have special characters, there isn't a problem, hence why it may not have been picked up before.

The code in #1 is probably the simplest solution, however amending it slightly may be better/safer. Perhaps as

$info['path'] = !empty($info['path']) ? drupal_encode_path($info['path']) : '';
$info['host'] = !empty($info['host']) ? drupal_encode_path($info['host']) : '';

Note that drupal_parse_url works around this issue by taking in a relative url, and adding "http://example.com/" to the start of the url, then ignoring the returned host (example.com). Perhaps it would be a better idea to convert $old to a relative url, pass it to drupal_parse_url, and let it do the dirty work?

treksler’s picture

Status: Needs work » Needs review
FileSize
1.39 KB

Hi,

Same issue here. The scheme is "public://" and the file name ends up in $info['host'].
As a quick fix, encoding both the host and the path fixes the problem, BUT only for links that are already encoded and only "one way"

Let me explain.

1) upload a file with spaces
2) link to that file from the node body with %20 encoded spaces
3) link to the file from within the node body, but this time use space characters
4) change configuration to rename file path (eg enable transliteration from filefiled paths settings for the upload field)
5) run retroactive update (or active update by saving the node)

What happens?

1) without the patch, nothing happens. neither link is updated in the node body
2) with the patch, only the link with %20 encoded spaces is updated (i.e. transliterated in the node body in our case)

What do i mean by "one way"?

from this state we are in, (transliterated file name attached to node and one transliterated link and the other link with space characters)
1) change configuration to rename file path (eg disable transliteration from filefiled paths settings for the upload field)
2) run retroactive update (or active update by saving the node)

What happens now?

1) the transliterated link is converted to a link with SPACE characters! (not %20 encoded)
i.e. now both links have spaces and will not be touched by filefield paths updates

Now what?

1) I am including a patch which includes the code cleanup from #4 and encodes the paths "both ways"
2) do we want it to work for both encoded and non encoded links in the node body (and other) fields?

i.e. should links with space characters also be updated, not just %20 encoded links?

Deciphered’s picture

Deciphered’s picture

Status: Needs review » Needs work

Patch created incorrectly and no longer applies.

Deciphered’s picture

Status: Needs work » Needs review
FileSize
4.74 KB

Find attached a failing patch that simply adds tests, no functionality changes.

The purposes of this patch is to be used by anyone who wishes to contribute to this issue with future patches; those patches should include this patch so that if those tests pass I can confirm that the patch (potentially) solves this issue.

The patch tests that the following replacements are made:

As well as a more complex test:

As such, I will likely change the way the replacements are done from a complex regular expression to a single replacement for each variant, and probably at some point remove the functionality from File (Field) Paths completely and move it to it's own, more generic module (unless I can find an existing module that provides the same functionality).

Status: Needs review » Needs work

The last submitted patch, 8: regex_tests-2119789-8.patch, failed testing.

  • Deciphered committed b0acd19 on 7.x-1.x
    #2119789: Fixed issue with complex characters in regex functionality.
    
Deciphered’s picture

Status: Needs work » Fixed

Fixed and committed.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.