We noticed that video files served from Amazon S3 don't have Expires headers, which means that browsers tend to redownload these large, rarely changing files a little too eagerly. Research indicates that there is a way to request that S3 add certain HTTP headers to a file when it is downloaded. We'd like to add this feature and submit it as a patch, but before I get too deep, I thought I'd poke in here and see whether anyone else has already worked on this, and whether anyone has any tips or suggestions with regard to implementation.

Comments

Garrett Albright

Okay, I've got this working. It was actually pretty simple.

There's a logistical problem, though - namely that the Expires header needs to be set as an absolute date, which will eventually pass. So if today we set it to expire in two weeks, then in two weeks, the expiration date will appear to constantly be in the past - which may be worse than not sending an Expires header at all.
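To make the problem concrete, here is a minimal sketch of how an absolute Expires value might be built. It is not code from the patch; the function name `example_expires_value` is made up for illustration.

```php
<?php
// Hypothetical helper, not part of the Video module or the patch.
// Builds an absolute Expires value $lifetime seconds after $now.
function example_expires_value($now, $lifetime) {
  // HTTP dates are always expressed in GMT.
  return gmdate('D, d M Y H:i:s', $now + $lifetime) . ' GMT';
}

// Two weeks from the moment this runs. Once that moment passes, the stored
// value is stale until something rewrites it.
echo example_expires_value(time(), 14 * 24 * 60 * 60), "\n";
```

Because the result is a fixed point in time, it has to be recomputed and stored again periodically.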

Now it turns out that we can update the metadata (like the headers) for files currently existing in the bucket without reuploading them, but the library included with Video doesn't support this. The official library from Amazon does, though (the docs). So perhaps it's an option to update the Expires headers for all currently-uploaded objects on cron run.

Not sure what to do yet… Might look into patching the library that Video includes to support updating currently-existing objects as well…

Garrett Albright

Status: Active » Needs review
Attachment (new): 5.36 KB

Okay, so it turns out "updating" a file according to the official library is actually just copying it to the same place. We can do that with Video's unofficial one, too. Patch!
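For anyone curious what "copying to the same place" looks like at the HTTP level, here is a rough sketch of the request headers for an S3 copy that replaces an object's metadata. It only builds the header array; it does not use the library bundled with Video, and the function name is hypothetical.

```php
<?php
// Hypothetical helper: returns the request headers for an S3 copy request
// that copies an object onto itself and replaces its metadata.
function example_self_copy_headers($bucket, $key, $expires) {
  // Encode each path segment but keep the slashes between them.
  $encoded_key = implode('/', array_map('rawurlencode', explode('/', $key)));
  return array(
    // Source and destination are the same object.
    'x-amz-copy-source' => '/' . $bucket . '/' . $encoded_key,
    // REPLACE tells S3 to use the metadata sent with this request
    // instead of carrying over the existing metadata.
    'x-amz-metadata-directive' => 'REPLACE',
    'Expires' => $expires,
  );
}

print_r(example_self_copy_headers('bucket', 'path/to/file', 'Fri, 02 Jan 1970 00:00:00 GMT'));
```

As far as I understand it, REPLACE discards the existing metadata, so headers such as Content-Type would need to be sent again along with the new Expires value.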

Thinking of expanding this to handle Cache-Control headers too, though those are a little more complex.

Hoping others can experiment with this and see if it works for them as well. Here's an easy way to test if the Expires header is being added or updated:

curl -I 'http://bucket.s3.amazonaws.com/path/to/file'

The -I tells curl to only fetch and show the HTTP headers and not fetch the actual file data (an HTTP HEAD request).
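If you would rather check the result in code, this sketch takes raw response headers (for example, the text that curl -I prints) and reports whether the Expires date is still in the future. The function name and the sample headers are made up for illustration.

```php
<?php
// Hypothetical helper: returns TRUE if the Expires header is later than $now,
// FALSE if it has already passed, and NULL if no parseable Expires header
// is present.
function example_expires_in_future($raw_headers, $now) {
  foreach (preg_split('/\r\n|\n|\r/', $raw_headers) as $line) {
    // Header names are case-insensitive.
    if (stripos($line, 'Expires:') === 0) {
      $timestamp = strtotime(trim(substr($line, strlen('Expires:'))));
      return $timestamp === FALSE ? NULL : $timestamp > $now;
    }
  }
  return NULL;
}

// Made-up sample headers, not real output from S3.
$sample = "HTTP/1.1 200 OK\r\nExpires: Fri, 02 Jan 1970 00:00:00 GMT\r\n";
var_dump(example_expires_in_future($sample, time()));
```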

Garrett Albright

Title: Amazon S3 and Expires headers » Amazon S3: Expires headers, CloudFront support
Attachment (new): 8.58 KB

We need to change video URLs a little if we want to support Amazon CloudFront. Everybody stand back. I know regular expressions.

We're not using SSL or authentication at this point. If someone who is could give this a try and confirm that it doesn't break anything, let me know.

Garrett Albright

Attachment (new): 10.84 KB

And now for limited Cache-Control header support (namely the max-age parameter) for telling CloudFront edge servers how frequently to look for changed or deleted files.
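Unlike Expires, max-age is a relative lifetime in seconds, so it never goes stale. A minimal sketch of building the value follows; the function name is hypothetical and this is not the code in the patch.

```php
<?php
// Hypothetical helper: builds a Cache-Control value from a lifetime in seconds.
function example_cache_control_value($max_age_seconds) {
  // max-age takes a non-negative integer number of seconds.
  return 'max-age=' . max(0, (int) $max_age_seconds);
}

// One day.
echo 'Cache-Control: ', example_cache_control_value(24 * 60 * 60), "\n";
```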

mrwhizkid

I'm surprised no one has commented on this. After all, for serving up video content through S3, Cloudfront is almost a must.

This seems to work really well! Thanks for your efforts on this.

I just have one request...I'm sure it would be easy to implement although I'm not sure how.

Is there any way for me to be able to enter the FULL cloudfront URL into the s3 settings as opposed to something.cloudfront.net?

I use CNAMES and it would be helpful to be able to use these here.

Thanks again!

Garrett Albright

Status: Needs review » Needs work

Thanks for the kind words, mrwhizkid.

Let me make sure I understand what you're asking. Basically, you're using some DNS trickery to mask your CloudFront URL to something that isn't something.cloudfront.net and you'd rather have the module replace the URL with that instead, so you need me to tweak the field to allow any arbitrary URL instead of just the "something" part of a something.cloudfront.net URL. Does this sound correct?

It's not something we ourselves need, but if that's all, then it should be easy enough to implement.

mrwhizkid

Hi,

Thanks for the reply. That's exactly right. My cloudfront URL is something like 1x8e8d4kd93kdo.cloudfront.net ... I am CNAMING that to something.mysite.com to make it look as if the media is being directly delivered from my server.

It would be helpful if I could input that something.mysite.com in for the cloudfront URL so that I can get rid of the 'ugly' looking cloudfront url.

I've been looking at your code trying to figure out how to do it but I'm fairly new at this kind of thing and I just haven't quite gotten there yet.

Thanks again.

Garrett Albright

Status: Needs work » Needs review
Attachment (new): 10.95 KB

Okay, here's a patch which allows for that. You'll need to revert the previous patch (use the -R option with the patch program to revert a patch) before applying this one. I haven't been able to test it much yet, though… hope you don't mind being a guinea pig.
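The idea can be sketched in isolation like this. It is a standalone illustration, not the patch itself; the function name and the example URL are made up, and the regular expression assumes virtual-hosted-style bucket URLs (bucket.s3.amazonaws.com).

```php
<?php
// Hypothetical helper: swaps the bucket's S3 host name for an arbitrary
// host, such as a CloudFront domain or a CNAME pointing at one.
function example_url_to_cdn($url, $cdn_host) {
  // Matches "<bucket>.s3.amazonaws.com"; the scheme and path are left alone.
  return preg_replace('/[^\/]+\.s3\.amazonaws\.com/', $cdn_host, $url);
}

echo example_url_to_cdn('http://bucket.s3.amazonaws.com/path/to/file', 'something.mysite.com'), "\n";
```

URLs that do not contain a bucket.s3.amazonaws.com host are returned unchanged.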

mrwhizkid

Thanks. I'll try this out and let you know what happens!

chrisschaub

This all looks fine, and the url works for the non-cloudfront streaming. But I get a "not found 200" error when I apply this patch.

If I comment out:

if ($cf_subdomain = variable_get('amazon_s3_cf_domain', FALSE)) {
  //$video->url = preg_replace('/[^\/]+\.s3\.amazonaws\.com/', $cf_subdomain, $video->url);
}

the old url is used and all is well. We are using "Enable Private" and have the cloudfront deployed as "Download." It would be cool to support rtmp btw. So, I'm not sure why it doesn't work. I can access the url from the browser and get a download, so I think it's there, just flowplayer is choking on the cloudfront url maybe?

Garrett Albright

schaub, if you comment out that part of the code, then your video is not in fact being served from CloudFront.

What is giving you the "not found 200" error? Seeing as how the 200 status number actually means "file found and all is well," perhaps what is displaying that error is, in fact, in error.

Maybe double-check that the CloudFront URL is entered correctly? You didn't put "http://" at the beginning of it, right?
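A settings form could guard against that mistake by cleaning up the value before saving it. This is only a sketch of the idea, with a made-up function name; the patch does not necessarily do this.

```php
<?php
// Hypothetical helper: strips a leading scheme and any trailing slashes so
// that only the bare host name is stored.
function example_normalize_cdn_host($input) {
  $host = trim($input);
  // Drop "http://" or "https://" if the user pasted a full URL.
  $host = preg_replace('#^https?://#i', '', $host);
  return rtrim($host, '/');
}

echo example_normalize_cdn_host(' http://something.cloudfront.net/ '), "\n";
```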

chrisschaub

I think it has to do with private files and private paths.

parasox

Are you using the 200 error patch? That helped me (though I don't use cloudfront or this patch).

http://drupal.org/node/955656

Milan0

Indeed, with private files there will be parameters passed in the file URL, so each time the file is served the URL will be unique, and the browser will say: geesh, let me download that NEW FILE.

There is a fix for it; actually, I have already submitted a patch for it. Basically, it will keep the private URL paths the same until the end of the lifetime of the URL. "Stole" it from some Amazon S3 "ultimate guide".
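That patch is not shown in this thread, so the following is only a sketch of one common way to get that effect: round the expiry time up to a fixed window boundary, so every URL signed within the same window carries the same expiry value and therefore the same signature. The function name and the one-hour window are assumptions.

```php
<?php
// Hypothetical helper: returns the end of the current $window-second window.
// Every call made within the same window returns the same timestamp.
function example_stable_expiry($now, $window) {
  return $now - ($now % $window) + $window;
}

// Two requests a few minutes apart usually land in the same one-hour window.
$first = example_stable_expiry(time(), 3600);
$second = example_stable_expiry(time() + 300, 3600);
var_dump($first, $second);
```

One trade-off: a URL generated just before the window ends is only valid for a short time, so a real implementation might add an extra window of slack.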

mrwhizkid

I recently switched to zencoder for transcoding with this module and now the URLs are not being masked with Cloudfront anymore. Any suggestions about how to get this to work with cloudfront?

hypertext200

Status: Needs review » Closed (fixed)

Added to dev

mrwhizkid

Version: 6.x-4.1-rc6 » 6.x-4.2
Status: Closed (fixed) » Active

Hi. Is anyone using Cloudfront with Zencoder? I just turned it on today and it doesn't appear to be working...at least not for existing videos that I have on my server. I am just getting the S3 video url.

I did use this once before (on one of the RC's with the patch) and it did work but I wasn't using zencoder.

The cloudfront URL is simply masking the s3 url so it should apply to existing videos, correct?

mrwhizkid

Update on this:

It doesn't work with new videos either. I just uploaded a flash video...it plays fine off the Amazon S3 url but the Cloudfront URL is not there. And I guess this shows that it isn't a Zencoder problem, because I'm using just one player, a flash player, so my .flv file wouldn't have been sent to Zencoder...just to S3.

Garrett Albright

We're not using Zencoder, so I haven't tested that. Though as you said, I'm not sure why it wouldn't work. Up for working on a patch? :D

mrwhizkid

Hi again -- I'm struggling to make sense of this.

Here is the relevant code from the s3 plugin. I checked everything but I can't see the problem. Am I correct in assuming that function video_s3_video_load(&$video) gets called every time a video is loaded on a node? And this is after the video has been transcoded and sent to S3 by either Zencoder or the video module, correct?

It seems pretty straightforward. It should be pulling in that variable from the admin pages and replacing the amazon s3 url. I don't pretend to be an expert on PHP...

This part here if ($cf_subdomain = variable_get('amazon_s3_cf_domain', FALSE)) { confuses me. I see that we are setting $cf_subdomain to be amazon_s3_cf_domain but I don't understand what the FALSE is for.

function video_s3_video_load(&$video) {
  module_load_include('inc', 'video_s3', '/includes/amazon_s3');
  $s3 = new video_amazon_s3;
  if ($amazon = $s3->get($video->fid)) {
    // Fix our filepath
    $video->filepath = $amazon->filepath;
    // $video->url = $amazon->filepath;
    if (variable_get('amazon_s3_private', FALSE)) {
      $video->files->{$video->player}->url = video_s3_get_authenticated_url($amazon->filename);
    }
    else {
      $video->files->{$video->player}->url = $amazon->filepath;
    }
    // Are we using CloudFront?
    if ($cf_subdomain = variable_get('amazon_s3_cf_domain', FALSE)) {
      $video->files->{$video->player}->url = preg_replace('/[^\/]+\.s3\.amazonaws\.com/', $cf_subdomain, $video->files->{$video->player}->url);
    }
  }
}

mrwhizkid

Ahhh...! I'm getting close here. Maybe Heshan can give me some insight into this but I think I've figured out the problem...

Is it possible that the function that I just copied and pasted above is only used if the video files are being sent directly to Amazon S3 from my server? (in other words, not going through Zencoder). As I understand it, Zencoder actually sends the file to Amazon S3 which means that this function is probably not used as my server isn't involved with the transfer.

Is this why the Cloudfront settings don't work? Are they being completely ignored because the video is going to Zencoder first and then directly to S3?

If this is the case, I need to figure out where I can use the variable 'amazon_s3_cf_domain' as the video is being loaded AFTER it has been converted and come back down from Zencoder to my server.

Jorrit

Status: Active » Postponed (maintainer needs more info)

Please test using 6.x-4.6-rc1, let me know if CloudFront works as expected.

Jorrit

Status: Postponed (maintainer needs more info) » Closed (cannot reproduce)

Closed because of lack of response.