If you have any attributes in the img tag before the src attribute, they end up getting replaced with the <img="{cdn url}". An example of this is:

<img style="BORDER-BOTTOM: black 1px solid; BORDER-LEFT: black 1px solid; FLOAT: left; MARGIN-LEFT: 10px; BORDER-TOP: black 1px solid; MARGIN-RIGHT: 30px; BORDER-RIGHT: black 1px solid" src="http://www.thomhartmann.com/sites/default/files/resize/community-hands-257x172.jpg" alt="" width="257" height="172" />

When _cdn_preprocess_page_helper does the replacements it becomes:

<img src="http://www.thomhartmann.com/sites/default/files/resize/community-hands-257x172.jpg" alt="" width="257" height="172" />

The fix I put in was to change the arguments for the img parser in cdn.fallback.inc (line 59):

_cdn_preprocess_page_helper($variables['content'], $pattern, 2, 4, 5, '','');

Not sure if that's the best fix, but so far it is working for me.

Comments

wim leers’s picture

Version: 6.x-2.x-dev » 6.x-2.0-rc5
Assigned: Unassigned » wim leers

Interesting. The code for this comes from the Parallel module, so it probably was broken there as well.

I'll review this soon.

Jamie Holly’s picture

Hmmm just playing around real quick, this could be handled with a preg_replace:

$pattern = "#((<img\s+|<img\s+[^>]*\s+)src\s*=\s*[\"|\'](($root)([^\"|^\']*)(\?.*)?)[\"|\'](.*?)>)#i";
$new = preg_replace($pattern,'$2 src="'.$cdn.'/$5" $7>', $text);

Set $cdn to the CDN URL. In my testing I also added '/' in the root (plus added the trailing slash to $base_url). That then grabbed non absolute images (ie: /sites/default/files) and changed them to the right CDN url. It also captures the image source if it's wrapped in single or double quotes.

wim leers’s picture

Could you maybe roll a patch? That'd make it easier to review :)

Jamie Holly’s picture

StatusFileSize
new2.23 KB

Here's the patch. I modified it from my original idea some. I went ahead and did it with a preg_match_callback so that we can still run hook_file_url on each image. I've tested it in a few posts and it seems to be working fine on root-relative and absolute URLs, plus it keeps all the other attributes in there.

wim leers’s picture

Title: _cdn_preprocess_page_helper breaks some image tags » _cdn_preprocess_page_helper() breaks some image tags (attributes before the src attribute are lost)
Status: Active » Needs review
StatusFileSize
new2.55 KB

I don't like the direction you went with your patch. I want a single function that does the file URL altering, not two.

Patch attached. Please test and report back.

wim leers’s picture

StatusFileSize
new2.88 KB

Minor update to the patch: now also appending the slash to the $base_url, as you were doing.

Jamie Holly’s picture

StatusFileSize
new2.91 KB

It wasn't grabbing root-relative images and it also wasn't grabbing src attributes enclosed with a single quote. It took some playing with the regex pattern, but this patch seems to be getting it all now.

wim leers’s picture

Status: Needs review » Needs work

What module is so messed up that it generates src attributes with a single quote?
Why are you adding the (.*?) near the end of the regular expression? To support root-relative URLs? I can't think of anything else. That's not so clean. Please change the value of the $root value to support root-relative URLs.

Jamie Holly’s picture

StatusFileSize
new2.9 KB

It's not a module that's generating single quotes around attributes, it's from people entering them that way. Single quotes are legal around attributes (src), so IMHO it would be best to catch those as well.

I reworked the pattern some, dropping/combining groups. It cut it from 9 groups to 5, so that should be a nice improvement. I don't see much way to make it any smaller and still be able to catch everything.

As far as root-relative paths, it's catching them for me. I did have one problem, but that is unrelated and should probably be a new issue. I uploaded one image that had an upper case extension on it. Perhaps we should to a strtolower() on the extension/key in _cdn_basic_parse_raw_mapping() and then do another on $file_extension in cdn_basic_get_servers()?

I checked the logs on a live site I put the patch on last night and the only images coming off the main server were those with weird casing in the extensions.

wim leers’s picture

Status: Needs work » Needs review
StatusFileSize
new3.41 KB

It's not a module that's generating single quotes around attributes, it's from people entering them that way. Single quotes are legal around attributes (src), so IMHO it would be best to catch those as well.

Fair enough.

I've refactored a combination of my code and your code that now checks for root-relative URLs in the manner that I asked you in #8.
Please review the attached patch.

I've created a new issue for your (valid) concerns regarding extension filtering and case sensitivity: #974992: File extension filtering in Origin Pull mode is case-sensitive.

wim leers’s picture

StatusFileSize
new5.56 KB

New version of the patch. Identical functionality as in #10, but now with some refactoring to allow #971712: Rewrite image URLs in the node body to use this functionality without code duplication.

wim leers’s picture

Title: _cdn_preprocess_page_helper() breaks some image tags (attributes before the src attribute are lost) » Fallback integration method breaks some image tags (attributes before the src attribute are lost)

Better title.

Jamie Holly’s picture

Just gave it a good testing and all seems to be working very well!

wim leers’s picture

Status: Needs review » Fixed

Thanks for the review, committed! :)

http://drupal.org/cvs?commit=452958

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.