I cannot get PDF files mirrored on CloudFront.

I simply added .pdf to the line of allowed extensions at /admin/settings/cdn/details like this:
http://d26engbtkhfc1t.cloudfront.net|.css .js .ico .gif .jpg .jpeg .png .svg .otf .ttf .swf .pdf

This should rewrite relative PDF links in node bodies just as .jpg and others are rewritten.

I am baffled. This module works so well that I am almost sure it must be me who does something wrong. But what?

Comments

wim leers’s picture

Title: .PDF not recognized in Origin Pull mode? » Add support for any type of file referenced in node bodies (currently only images)
Component: Origin Pull mode » Module
Category: bug » feature
Priority: Normal » Minor
Status: Active » Postponed

This is the reason:

/**
 * Implementation of hook_nodeapi().
 */
function cdn_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  // We implement hook_nodeapi() for $op == 'alter', to alter the final HTML,
  // to ensure that all file URLs (notably for images) in nodes also point to
  // the CDN.
  if ($op == 'alter') {
    cdn_load_include('fallback');
    if (isset($node->body)) {
      cdn_html_alter_image_urls($node->body);
    }
    else if (isset($node->teaser)) {
      cdn_html_alter_image_urls($node->teaser);
    }
  }
}

/**
 * Alter image file URLs in a piece of HTML.
 */
function cdn_html_alter_image_urls(&$html) {
  $url_prefix_regex = _cdn_generate_url_prefix_regex();
  $pattern = "#((<img\s+|<img\s+[^>]*\s+)src\s*=\s*[\"|\'])($url_prefix_regex)([^\"|^\']*)([\"|\'])#i";
  _cdn_html_alter_file_url($html, $pattern, 0, 4, 5, 1, '');
}

Clearly, only images are currently supported in node bodies. It is doable to add PDF support though. But since this is only rarely needed, I'm going to leave this is a feature request that is yet to be implemented. I'd definitely welcome a patch that adds this, of course. To qualify to be committed, it will need to be neatly integrated with the config UI, too, though.

wim leers’s picture

Upon looking at this issue again, I thought I should automatically serve any "downloads" from a CDN. So: presentations (.pdf, .key, .ppt, .pps …), rich text (.doc, .docx, .rtf, .pages …), archives (.zip, .7z, .rar …), but … these files are often served using private file downloads. But often these are served using private downloads.

Plus, there's nobody else requesting this functionality. So I'm about to just close this issue, unless somebody objects.

Stealth’s picture

I am not able to get orgin pull to with with .swf files

vacilando’s picture

But often these are served using private downloads.

True, but that statement could also apply to images.
In other words, the decision whether or not to CDN a file should remain the responsibility of the site admin and not limited by the module.

I would still very much welcome if CDN allowed serving any kind of file (if specified by extension).

wim leers’s picture

Version: 6.x-2.1 » 7.x-2.x-dev

The problem is that it's by definition not supported to serve private files from a CDN, because then the "private" nature of the file can no longer be guaranteed. Some CDNs support this, but they all have different mechanisms…

However, if the files are served publicly, then you're absolutely right.

wim leers’s picture

Private files support is answered here more completely: #1294822-4: Document private files support.

This issue is now solely about supporting non-image files in node bodies using Drupal's "public files".

wim leers’s picture

Status: Postponed » Needs review
StatusFileSize
new1.36 KB

Attached is an initial, untested patch for D6. Likely, you'll have to make changes to the regex.

Note that this is pretty much a non-issue in D7: File Field URLs will automatically be altered by the CDN module, and hence this will just work there. Except when you link to these files from within the body of a node, then a similar patch will be necessary.

vacilando’s picture

Version: 7.x-2.x-dev » 6.x-2.x-dev
Status: Needs review » Needs work

Patched the latest D6 dev, turned out invalid HTML.

The PDF at this link got malformed -- the second quotation mark was encoded into %22 so the URL never finished.
Something like <p><a href="sites/vacilando.net/files/pdf/rocnikovka/rocnikova_praca_19890523.pdf%22%3EPopis%20ro%C4%8Dn%C3%ADkovej%20pr%C3%A1ce%3A%20pripojenie%20...

brettbirschbach’s picture

Just hit this issue as well...was trying to leverage the CDN to serve public PDFs on our site.

wim leers’s picture

Version: 6.x-2.x-dev » 7.x-2.x-dev

#9: then please test #7 and reroll the patch as needed!

brettbirschbach’s picture

#10: On my TODO list, thanks!

brettbirschbach’s picture

Does fixing this issue fundamentally conflict with https://drupal.org/node/1623252? I have a local D7 implementation of the patch from #7, but it undermines the solution for https://drupal.org/node/1623252.

Looking at issue https://drupal.org/node/1623252, I'm not sure what it means that the module is "improperly" converting the link reference to a CDN URL. In my mind the link is only converted if it matches the extensions configured to use the CDN, which I see as desirable. Please help me out with the piece I am missing.

brettbirschbach’s picture

Status: Needs work » Needs review
StatusFileSize
new6.27 KB

Attached is a first go at a D7 patch for this issue.

I coded the solution in such a way that anchor links are converted to CDN links *only* if the extension of the anchor link specifically matches a defined extension for the CDN mapping. That way we don't convert things like /node or /hello.html.

Updated/added tests as well.

brettbirschbach’s picture

StatusFileSize
new6.61 KB

New patch, adding a missing comment and fixing an issue where occasionally a call to cdn_load_include was needed.

meecect’s picture

Issue summary: View changes

We just got hit by this today. We had pdf's uploaded into our public file directory and a content page was created that linked directly to it in the node body. This is not an unusual case for us. What else is needed to get this committed? Testing #14?

Thanks,
Cliff

wim leers’s picture

StatusFileSize
new7.18 KB

Rerolled.

wim leers’s picture

Status: Needs review » Reviewed & tested by the community
StatusFileSize
new1.36 KB
new1.36 KB

Sorry for taking such a long time. Since #14 includes test coverage, I feel confident committing this as-is. I have only tiny nits.

Status: Reviewed & tested by the community » Needs work

The last submitted patch, 17: 1115686-17_D7.patch, failed testing.

wim leers’s picture

Status: Needs work » Reviewed & tested by the community
StatusFileSize
new7.14 KB

Oops. Interdiff was right, patch was not.

wim leers’s picture

Status: Reviewed & tested by the community » Fixed

Thanks again! :)

brettbirschbach’s picture

You're welcome! :)

joelpittet’s picture

I just ran into something I think with this.
URLs seemed to be generated as this:
http://example.com/abc123.cloudfront.net/sites/default/files/js/js_y-bxY...

When the mapping is just 'abc123.cloudfront.net'. Does this mean the protocol is not required in the mapping? Or maybe it always was?

wim leers’s picture

#23: could you perhaps open a new issue? And post both the original markup and the resulting markup? That would answer a few initial questions I have.

joelpittet’s picture

@Wim Leers good call, I had a small discrepancy between staging (which didn't have CDN turned on before testing) and production which had the protocol already in there.

joelpittet’s picture

Added a new issue as a documentation, probably unrelated to this, just the IS on this was helpful.
#2667156: Add help text or documentation on CDN mapping patterns

wim leers’s picture

So, no actual problem, is that what you're saying?

joelpittet’s picture

@Wim Leers may need a different issue for the 'problem' which is protocol-less mappings. But I need to confirm it wasn't something stupid I did first;)

wim leers’s picture

And yes, just providing a domain isn't enough; it must be either http://absolute-url.com or //protocol-relative-url.com. That's always been the case.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

brettbirschbach’s picture

Component: Module » Origin Pull mode