Hello!

In speaking with @rfay about integrating Commerce File with S3, it became clear that we need additional functionality to attach files to Commerce products. It seems that the best solution for this is a helper module that allows for automatic discovery of files in an Amazon S3 bucket and populates the Commerce Product form with an autocomplete widget. The user experience could be the following (this is more of an "ideal" solution for me):

  1. Upload files to S3 (via CloudBerry, Amazon S3 module, direct connection, S3 import/export, etc.)
  2. Trigger a "Refresh files" action, either automatically or by user input.
  3. Edit a node; autocomplete widget will find filenames as the user types.

I think it's important to keep in mind that there are multiple ways to get files into S3, so it should be possible to trigger the file discovery function on demand. (In Ubercart, this was done automatically every time a Product Add File Feature form was loaded, and if there were many files in a bucket - as is the case with ours - this could take some time to complete.)
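Something like this is what I have in mind for the discovery function - a very rough sketch, all names hypothetical, assuming the AWS SDK's get_object_list() and a simple cache table:

  /**
   * Sketch only: refresh the local list of files in the S3 bucket.
   * Function, table, and column names are hypothetical.
   */
  function s3_helper_refresh_files($bucket) {
    $s3 = new AmazonS3(); // AWS SDK for PHP 1.x client, already configured.
    foreach ($s3->get_object_list($bucket) as $key) {
      // Record each object key so an autocomplete widget can search it later.
      db_merge('s3_helper_file')
        ->key(array('filename' => $key))
        ->fields(array('refreshed' => REQUEST_TIME))
        ->execute();
    }
  }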

I'll be working on this kind of a solution in the near future, and would be happy to brainstorm with you (and Randy) to figure out the logistics and best way of handling the functionality included in the proposed helper module.

Comments

justafish’s picture

This kind of functionality isn't something that would be specific to AmazonS3 if it's built for stream wrapper usage. Have you tried http://drupal.org/project/filefield_sources ?

rfay’s picture

I think he's just proposing a new kind of module that would discover remote files (or local ones) in general and make file_managed objects out of them, and *then* filefield_sources might just do the trick. I think we were talking about this in London.

torgosPizza’s picture

Yes, exactly. The idea is to get the Amazon S3 filenames from the Commerce File bucket into a table that could then be used by filefield_sources, or something similar.

rickvug’s picture

#1215724: Any thoughts on importing or batch content creation that uses S3 files is also requesting the same feature. One of these two issues should be marked as a duplicate.

torgosPizza’s picture

Ah, you're right. I hadn't thought to check since that is a completely different title. I'd suggest marking the other one as a duplicate, simply because it started out as a somewhat different issue. (Either way is fine, but if we mark this one as the dupe, we should fix the title of the other one to be more accurate.)

EDIT: Upon further review, I think that issue should possibly be kept separate. It sounds like the OP was asking for a batch creation of nodes based on files uploaded to S3, i.e., with no human intervention. For example I could see this use case:

1) Upload a new photo album.
2) Run a batch script in Drupal.
3) A bunch of nodes are created, possibly based on folder structure.

What they're asking for in that case is much more complex than what I've proposed (and what you asked for in that issue), which is a small module that simply populates an autocomplete field with a single filename.

andyjh122’s picture

subscribe

Yes, this would be great! I've been trying to see if Filefield Sources can attach S3 files (uploaded with an S3 client) to a product. I looked at the DB schemas to try attaching them by hand, but that was challenging.

I'm also looking at Media Mover for D7 as it matures, to see if that can help as well...

patricko67’s picture

subscribe

BrendanP’s picture

subscribe

Yes please, this would be a great help for me. Cheers

magnusk’s picture

re #1: could we connect Amazon S3 to Filefield Sources so the latter can "see" objects stored in S3?

rickmanelius’s picture

I'm still not able to attach an S3 file to the commerce file input form. What am I missing?

rfay’s picture

@rickmanelius you want #1197202-3: Implement authenticated links (and torrent links... and force download header) - that worked fine for me with Commerce File. Eventually we'll get it cleaned up and committed.

rickmanelius’s picture

Much appreciated @rfay! I'll give it a whirl...

justafish’s picture

Commerce File is just a filefield, I can't think why an S3 file wouldn't attach to it regardless of that patch. What error are you getting rickmanelius?

rickmanelius’s picture

Apologies for not getting back to you sooner @justafish. I'll have to try it again as I have a client that is super interested in creating a download store with Drupal Commerce and we need support for large file sizes.

justafish’s picture

Status: Active » Postponed (maintainer needs more info)
deggertsen’s picture

Status: Postponed (maintainer needs more info) » Active

So where exactly are we on this? Is there already a way to make existing S3 files discoverable? I've been reading through this and other threads and still just can't figure it out. I've tried filefield sources, but the autocomplete option doesn't work. It just gives me a field that doesn't do anything. Files are uploading to Amazon S3 just fine, but I can't seem to use those that were already uploaded.

Thanks for the help.

deggertsen’s picture

I've created an issue concerning the autocomplete option with filefield sources, but I'm sure it's completely related to this issue. #1605010: Autocomplete not working with existing Amazon S3 files

I'm willing to pay to have this done if anybody can figure it out.

torgosPizza’s picture

I'm working on this again for our own site. Currently finishing up a "Rebuild S3 Cache" function.

Related issue: #1651096: Adding file metadata to {amazons3_file} fails with filesize > 2GB, use BIGINT instead

markwk’s picture

Ok. I tried the autocomplete thing but it didn't work very well, so I did it in a simpler way for a site on a deadline. Sorry, I completely forgot to post the code earlier. Here is a simple S3 File Widget module. There is a lot of hardcoded stuff in there, but I think it should provide a good starting point for this kind of feature: https://github.com/markwk/s3_file_widget

bradhawkins’s picture

@markwk this looks great and I think it might be helpful on my project, but I was just wondering if you could explain your workflow a little better?

I can see where you've hard coded some of the fields and various elements, but was wondering how exactly you used it. Did this replace your file upload field, or was it just used to populate the database and use file field sources to autocomplete the intended field?

Also were you using it with Commerce File, or just the regular file field?

torgosPizza’s picture

Also would like to know! I'm finally restarting work on this need, to try and create a new Filefield Reference widget (as well as modifications to the AmazonS3 module that allow cached files to be referenced by the widget). I'm still kind of learning the Field API, so it's a bit slower for me. Things that look like they should work tend not to :/

I'd be happy to check out the code in #19 though!

markwk’s picture

So it's just a different input widget: when you type the name of the file you want to add into that field and the node is saved, it attempts to validate the file on the backend and create the reference in Drupal's file tables.

deggertsen’s picture

So with your module here you have required that those inputting the S3 file know the exact file name. Do you think this could be adapted to use an autocomplete widget?

I also tried to change it so that it would serve as a widget for Commerce File, but when I tried to create a product I got an undefined offset notice: Undefined offset: 0 in s3_file_widget_field_widget_form() (line 23 of /sites/all/modules/s3_file_widget/s3_file_widget.module).

I'll keep working on it when I have time. Thanks for the start! Hopefully we'll be able to come up with a more versatile solution.

markwk’s picture

Yeah, it does require you to use the exact title, which isn't ideal. Autocomplete didn't really work for me. I can't recall why, but I think it went into a death spin. I think a better option would be to use something like

I'm not sure about Commerce File since I've never used it. It's not really the most versatile module, so I'd love to see it get better :)

torgosPizza’s picture

The thing with Commerce File is that it really just uses a generic file widget called Commerce File. Since it uses the Field API you can also use Filefield Sources with it. That's what I am basing my modifications on: integrating the AmazonS3 module with Filefield Sources, and extending it so that the cached files from the S3 bucket are available as a source.

deggertsen’s picture

Yes, I like that approach torgosPizza. Looking forward to having something to test out!

torgosPizza’s picture

Just FYI I almost have this working. The process will be:

1) Rebuild the AmazonS3 cache. Maybe we can do this on Cron? [1]
2) Make sure "S3 reference" is selected in your Product Type's File Field configuration.
3) Choose the file from the autocomplete field, hit select, the file is migrated into the Commerce File data tables (and attached to the product).

I hope to post a patch (or a tarball, but probably a patch) in the next day or two.

[1] We'll want to see if there's a way to keep the list of files in sync, but I'm not sure how best to handle that. For instance, if a file is deleted from the S3 bucket, that may negatively affect files and products, but that may be out of the scope of this first attempt at a feature addition. In any event it's a toss-up between constantly re-syncing during Cron or making sure this only re-syncs on demand. I mention this because it could be a lengthy process, depending on how many files you have in your bucket.

torgosPizza’s picture

Status: Needs review » Active
FileSize
58.42 KB

Okay, here is my first run-through. The patch does a lot, so I will lay it out:

1. Does a bit of whitespace fixing. This was from my editor, so I've left it in.

2. Adds a new file "amazons3_reference.inc" which contains all of the filefield_sources functionality.

3. Adds Batch API hooks (for rebuilding the S3 cache). (These could probably also be split into their own .inc file.)

4. Includes the $e (error code from catch()) in some of the Amazon S3 errors, for verbosity.

5. (And this is major.) It changes a bunch of "protected" functions in AmazonS3StreamWrapper.inc into public functions. I had to do this in order to be able to access the metadata-fetching functions, for example. In a future patch I will go through and only make the ones that I need public, since I'm obviously not using ALL of the functions - unless no one objects to keeping them all public; I'm not really sure why you'd want reusable functions such as these to be protected, but I'm still a bit of an OOP novice. (I have a @todo in there which states that we could use reflection, basically extending the wrapper class and using some trickery to make those protected functions accessible. We could do it this way as another solution, but I don't think it's quite as nice as simply reusing what's there. Why reinvent the wheel?)
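For reference, the reflection route would look roughly like this (sketch only; 'getS3' stands in for whichever protected helper is actually needed):

  // Call a protected method on the stream wrapper without editing
  // AmazonS3StreamWrapper.inc. Requires PHP 5.3.2+.
  $wrapper = new AmazonS3StreamWrapper();
  $method = new ReflectionMethod('AmazonS3StreamWrapper', 'getS3');
  $method->setAccessible(TRUE);
  $result = $method->invoke($wrapper);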

This code is kind of messy as I haven't proofed it yet, and there is at least one section I think needs to be improved (grabbing the filename from the widget field uses a nested array that's all sorts of fugly), but I am short on time so I'm posting this here in the hopes that others can see it and improve it.

Let me know how it works for you; I've tested it on my local stack and it seems to work pretty seamlessly.

Note that this patch requires you to install and enable File Field Sources. (Perhaps this should eventually become its own module, so we can use File Field Sources only as a requirement for it, instead of amazons3?)

Setup:
1. Patch your module (naturally). If you can't, let me know and I'll tar up my module and post it.
2. Clear your caches.
3. Under admin/config/media/amazons3, scroll down and click "Rebuild file metadata cache." Note: at the moment this also truncates your {amazons3_file} table, but I don't suppose that is a huge issue. If it is, you should comment it out in the module; future versions of this patch should try to only add newly-uploaded files.
4. Wait for the Batch process to finish. It will cache all of your S3's files in increments of 100 during the batch process. (I will eventually make this value configurable.)
5. Now in your Product Type's file field setup, add the Amazon S3 Reference as a source.
6. Edit your product, and start typing a filename into the S3 Reference autocomplete textfield.
7. Highlight your file and click Select. Continue uploading or Save your product as necessary.

Let me know if you run into any issues.

EDIT: Use the patch in #31, as it is much cleaner.

torgosPizza’s picture

That patch was pretty ugly with the AmazonS3StreamWrapper.inc stuff. Let's try a new one.

EDIT: Yikes, this one is even worse. I've tried fixing it in a few different programs, to no avail. Sorry :/ For some reason my local install of Git just really does not like that include file. I'll try again tonight from my other machine (the one that isn't Windows!)

torgosPizza’s picture

Status: Active » Needs review

I'll set this as "needs review" just so people can have eyes on it :)

torgosPizza’s picture

Okay this patch does look better. I've tested and it seems to work, but as always, let me know if there are any problems.

deggertsen’s picture

Looks like everything works for me too! The patch applied without any problems.

Note: at the moment this also truncates your {amazons3_file} table, but I don't suppose that is a huge issue. If it is, you should comment it out in the module; future versions of this patch should try to only add newly-uploaded files

This seems like a must to me. I definitely think that something needs to be done so that you can cache more than 100 files, though... If I understand correctly from what you said above, it will delete the cache and then run a batch of up to 100 files. Am I wrong?

Anyway, so glad we have something that works!

torgosPizza’s picture

Status: Active » Needs review

Great, thanks! That's awesome news.

If I understand correctly from what you said above, it will delete the cache and then run a batch of up to 100 files. Am I wrong?

Nope, you're wrong! :) It will cache every file in your bucket, but with a Batch process that only grabs 100 objects at a time. (I want to make this value configurable, too.) So as you watch the cache get rebuilt, you'll see the "currently processing..." number jump by 100 every time the progress bar is updated. I did this because with a much higher value you would see a long delay between batches, which looked like an unresponsive website but was really just PHP connecting to the S3 bucket and then looping through its objects.
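Conceptually, the batch worker does something like this (a simplified sketch, not the exact patch code):

  // Fetch up to 100 keys per pass, resuming from the last key seen.
  function amazons3_rebuild_process($bucket, &$context) {
    $s3 = new AmazonS3();
    $marker = isset($context['sandbox']['marker']) ? $context['sandbox']['marker'] : '';
    $keys = $s3->get_object_list($bucket, array('max-keys' => 100, 'marker' => $marker));
    foreach ($keys as $key) {
      // Cache this object's metadata (details omitted).
    }
    if (count($keys) == 100) {
      // Probably more objects left: remember where we stopped and run again.
      $context['sandbox']['marker'] = end($keys);
      $context['finished'] = 0;
    }
    else {
      $context['finished'] = 1;
    }
  }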

To test, I used our backup S3 bucket which has ~5000 objects in it, and all were imported with ease.

One issue I did notice, though, and that I will fix in another patch, is that files with the "s3://" prefix will show up in the autocomplete. A minor thing, but I don't want users to have to see that!

EDIT: I edited my earlier comment for clarity :)

destinationsound’s picture

Thanks for the hard work, guys! I just wanted to let everyone know that I patched the module and it is now communicating with S3 as described. I do, however, have an issue. When I make a test purchase and try to download the file I get an access denied error from S3.

Has anyone else experienced this issue? Is it something simple like a setting within the module or within my Amazon account?

(Note: before I implemented the patch I was doing small test uploads from the commerce_file field to my S3 bucket, and I could, upon purchase, be granted access to the S3 files. Now, with the patch and the autocomplete reference, it doesn't seem to recognize that I have the license to download.)

Edit:
My module versions:
- Commerce file: 7.x-1.0-beta4
- Entity API: 7.x-1.0-rc3
- AmazonS3: 7.x-1.0-beta7

I have not patched any other modules, just the S3 module patch posted in #31.

torgosPizza’s picture

Category: feature » support

Is the patch the only thing that you changed? Or did you update other modules as well (such as Commerce File)?

If you add another product with a file that you upload, does the license still get granted?

I'm currently looking at #1568240: Entity API Update Breaks License Creation which might be related to this, and which mostly seems to indicate that the latest Commerce File handles its entities and Commerce File fields differently.

Can you tell me what versions of the modules you're using? (Entity API and Commerce File in particular?) Also, if you revert the patch, does the problem go away? I'm currently experiencing "licenses not being granted" with files that have been uploaded through the Upload interface as well as files referenced or imported via Migrate.

destinationsound’s picture

When uploading a small .zip file using Drupal's regular upload method I do not get any access denied errors. #31's patch is still in place.

torgosPizza’s picture

It seems that my issue is relating to changes in how Commerce File handles entities. When I upload a file through the Upload interface, I can create a license through the Order Edit page (clicking the "Create" button). But with orders that have Commerce File fields that were imported (and which are hosted on S3) I get the error "No licenses were created."

In your case it sounds like the licenses are being created but you are not getting access to your files. Can you tell me if the access denied error happens at your site, or at Amazon? If it's on the Amazon side, perhaps the file is not being created with proper permissions? Also, are you able to download the file from the Product Edit page? (In other words, once you've attached the file, and the link to the file appears, if you click on it, is that also resulting in an error?)

destinationsound’s picture

My error happens at the Amazon S3 site. I am able to purchase the license on my site; I check my "files" tab within my account, I click the download button, and it sends me to the linked Amazon page, which shows "AccessDenied" and a bunch of code that this forum strips, so I did not include it.

I am a novice when it comes to coding and web design so it might be my error.

torgosPizza’s picture

That's definitely an AmazonS3 error, then, not Drupal, although it could be that you haven't configured your site for private URLs. Try adding an asterisk (*) to the "presigned URLs" at admin/config/media/amazons3. However I have tested it and it does not seem to actually presign URLs, so I think something might be borked in the module itself.

For your testing purposes, log in to your S3 bucket with something like CloudBerry or S3Fox, or even the AWS console, and make the file you'd just uploaded "Public". That way we can confirm that the issue is with your S3 permissions - in other words, that the Presigned URLs functionality is required but isn't working.

destinationsound’s picture

UPDATE:

The issue preventing private files from working is a simple configuration within the AmazonS3 module. Under the "Presigned URLs" section, type 60|/*

here is a link with explanation: http://drupal.org/node/1526848

So, with the patch posted in comment #31 and the above setting, the S3/Commerce File communication is perfectly functional.

Thanks all!

torgosPizza’s picture

Back to the helper module situation:

I have the "Reference" field working pretty well, the only issue is it does seem like old files that I have migrated in (via Commerce Migrate) are failing to be passed as real entities, for some reason. New files that I've added via Commerce File to a product seem to work fine and are granted license access, so I'm continuing to see exactly what I'd missed that could be causing such a discrepancy.

gmenzel’s picture

Thanks for this great patch! Exactly what I was looking for.

But I have a problem: amazons3_rebuild_process does not cache any files. More precisely, get_object_list returns an empty array ($total is 0), despite the assigned bucket containing thousands of files. The AWS SDK config seems to be okay.

I have no idea what the problem could be or what I might be doing wrong. Any help on this would be appreciated.

torgosPizza’s picture

Hmm, interesting. Actually, I think the patch posted here may be behind my local working version. For now, can you tell me if any errors were returned (especially in Watchdog)? If not, perhaps we can communicate directly to try and nail this issue down.

Can you tell me anything more about your bucket? Is there anything in the main root of the bucket, or are they all in folders? Etc.

gmenzel’s picture

Thanks for your readiness to help, torgos. I was just about to amend my post. As it turns out, the problem seems to be NOT with your patch, but rather with the fact that my bucket name contains dots. I see that there is already an open issue concerning this problem. I will have to look into that tomorrow.

But, nevertheless, I would be very much interested in any updates to this patch. Would you perhaps be so kind as to share your current version? This simple approach seems much superior to all the other modules/techniques I have tried so far.

torgosPizza’s picture

Awesome! I'd be happy to share it. I'm still in the process of working on our major D6->D7 migration, so the "amazons3_reference.inc" I've created for the Filefield Sources module may continue to evolve, but it seems to work perfectly for our needs. (Though I haven't yet fully tested whether these files are able to be purchased, which is our initial usage with Drupal Commerce... that's where the testing comes in, I suppose.)

Give me a little bit to post an update!

bmango’s picture

I've just tried applying the patch in #31 but it is not working for me. I saved the patch file into the AmazonS3 module folder and then, using Cygwin (I'm on Windows), tried to apply the patch. It came up with an error: "can't find file to patch at input line 5". I can't see where the problem is, as AmazonS3StreamWrapper.inc is in the same folder. The patch also cannot find the amazons3.module file.

I could be making a basic mistake here and would appreciate any guidance.

The reason I want to use this patch is that I'm uploading videos to Amazon S3, and I think because of their size I can't upload them through Drupal (I was getting a white screen on upload; the site is hosted on a shared server), so I was trying to find a solution where I can upload them directly to S3 and then reference them from the Drupal site.

Edit:

I don't know if it is relevant but I'm not using Commerce...

torgosPizza’s picture

It sounds like you need to use the -p1 or -p0 option, since patch simply can't find the correct files.

was trying to find a solution where I can upload them directly to S3 and then reference them from the Drupal site.

That's exactly how my solution works. Download and enable Filefield Sources, and then for your content type's file fields you can enable S3 Autocomplete for each. (But first you'd need to rebuild the cache in the AmazonS3 settings page. Once that's built, you'll have a list of files for the autocomplete widget to grab.)

--

I know this patch is needed by a few other people, so I will try to post mine here soon once I've tested it a little bit more. It will most definitely need some extra help, though, as my Filefield Sources integration may be a bit janky. Also, it still needs the ability to periodically update the Amazon S3 file list (cache) without user intervention - right now an administrator has to go to the settings page and click a "Rebuild cache" button, but it'd be more elegant if Cron handled this as well.

Additionally I'd love to have support for not JUST autocomplete widgets (as it stands now) but, if a user only has a few S3 files, a select list as well. Of course I could see a bucket with many "directories" being an issue here, but that could be fodder for some follow-up feature requests.

bmango’s picture

Thanks torgosPizza for your suggestion. I'll try that and hopefully get it to work. I'm sure I'm probably missing something basic.

Edit:

Got it to work using the -p1 option. Thanks again for the help.

bmango’s picture

I applied the patch in #31 successfully and followed the steps carefully in #28. Everything is now working for me. Thank you very much for the patch.

It did take me a little while to work out that because my videos are stored in a sub-folder I needed to put in the name of that sub-folder into the auto-complete field first. Maybe it would be useful (at a later date) to have the option for both "Starts with" and "Contains" for the S3 reference field?

bmango’s picture

Is there a way to refresh the list of S3 files being referenced using cron? At the moment when new files are uploaded to S3, to refresh the list in Drupal, I need to rebuild the file metadata cache on the AmazonS3 config page.

torgosPizza’s picture

Is there a way to refresh the list of S3 files being referenced using cron? At the moment when new files are uploaded to S3, to refresh the list in Drupal, I need to rebuild the file metadata cache on the AmazonS3 config page.

At the moment, no - I mentioned this in my comment in #47. This is a feature that needs to come later, as it's mandatory IMO.

Same goes for the other issue regarding subfolders. I'd love to allow more granular control - perhaps even only allowing certain subfolders on a per-field basis, and stripping the folder names out. But this was just a first-draft approach, and as I mentioned, my Field API-foo is not very strong.
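As for the cron piece, when it comes it would probably be a small hook along these lines (sketch; amazons3_rebuild_cache() is a hypothetical helper for the rebuild we already have):

  /**
   * Implements hook_cron(). Sketch only.
   */
  function amazons3_cron() {
    // Throttle the refresh to at most once an hour.
    if (REQUEST_TIME - variable_get('amazons3_cache_refreshed', 0) > 3600) {
      amazons3_rebuild_cache();
      variable_set('amazons3_cache_refreshed', REQUEST_TIME);
    }
  }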

bmango’s picture

Thanks for your reply torgosPizza. I think I was getting a little ahead of myself and forgetting that you're doing this voluntarily.

deggertsen’s picture

Priority: Normal » Major
FileSize
10.19 KB

I had to update the patch for the most recent dev version. I haven't tested it yet, but I think I got everything right in this patch. Note #51 as this patch may still need some work, though I vote that we make that a separate issue and just get this portion applied asap.

Also bumping this to major as I noticed there is another module dependent on this patch. http://drupal.org/project/media_s3

deggertsen’s picture

Oops, found a few mistakes in my previous patch. This one installed and is working just fine for me with minimal testing.

deggertsen’s picture

Status: Needs review » Reviewed & tested by the community
FileSize
10.19 KB

I have now installed this on 3 production sites along with this patch (this module is useless to me without it). I am marking this as reviewed and tested, seeing as it appears to be working so well. I'm no programming expert, but I would at least like to see the maintainers give a nod in this direction. I also fixed a few whitespace errors in the patch.

deggertsen’s picture

Category: support » feature

I just noticed that this is marked as a support request. I think it should be a feature request.

torgosPizza’s picture

Looks great Dave!

Thanks for getting more work done on this. I've been dealing with Commerce issues at the moment and meant to get back to these modules at my next opportunity. Glad to know things are working well - I've had no problems either; in fact, it's one of the few parts of our D7 development site that works without my intervention :)

chellman’s picture

I'd like to test this, but I'm having trouble applying the patch in #55. It doesn't apply cleanly to the current dev or the beta release for me. Does it need to be rerolled, or does my brain need to be rebooted?

deggertsen’s picture

It applied cleanly for me using:
git clone --branch 7.x-1.x http://git.drupal.org/project/amazons3.git

then applying with the following after downloading the patch file to the new amazons3 folder:

git apply -v amazons3_reference-1277152-55.patch

However, I found some issues in the patch file (more whitespace, plus it modified the .info file unnecessarily), so I have once again rerolled the patch. @chellman Make sure you right-click the patch and choose "Save link as" rather than opening it in your browser first and then saving; that has corrupted patches for me in the past.

torgosPizza’s picture

Nice work, Dave! I also just noticed that the patch doesn't alter the .info file to include the Filefield Sources module as a dependency, when it probably should.

deggertsen’s picture

Very good point... One thought here, though: maybe this should be put together as a submodule, so that those who don't need this functionality don't need to install Filefield Sources as a dependency.

Thoughts?

torgosPizza’s picture

I had thought about it, but that would take some refactoring. The ideal solution would not require the Filefield Sources module; my goal was to keep things in line with how I thought it should be done "the Drupal way." But I admit that installing two modules when you only really need one is not great housekeeping.

If you have an idea of how best to achieve it, let me know. The Filefield Sources module does a lot of things under the hood that I was able to leverage (various hooks that provide the widget and things like that), but I don't think it'd be impossible to decouple them.

deggertsen’s picture

Couldn't you simply take everything from line 125 of the patch and move it to a submodule? I'm not super familiar with how that works as I'm more of a themer, but it seems like you could just dump all of that into a submodule. We would still have to change the protected functions to public. You would also have to hook the amazons3_admin() function as well as the amazons3_menu() function to add some things into the submodule, but would that be very difficult to do? Again, I'm still learning how to code modules, so I'm pretty ignorant when it comes to how this all works.

torgosPizza’s picture

Well, I think I understand what you're saying, but you may be confusing two issues. I like the idea of this functionality living in its own submodule, but as it stands now, that submodule would still require Filefield Sources, as it uses several hook_filefield_sources hooks to create its main functionality.

So the patch could be re-rolled so that instead of creating the amazons3_reference.inc file, it just creates a new sub-module, but as I mentioned that sub-module would still require filefield_sources. Does that make sense?

deggertsen’s picture

@torgosPizza Yes, that makes sense, and I think that's the way we should go with this, so that those who don't need the functionality this issue offers won't need to get Filefield Sources. I personally don't mind, and actually think it is great to have Filefield Sources if you do need this functionality. We just need to give people the option of not getting Filefield Sources if they don't need this particular functionality.

I personally probably won't get around to re-rolling the patch anytime soon, but I think it's good we have this discussion in case the maintainers are watching. I would be interested to know their thoughts on what needs to happen in this issue before it gets committed.

destinationsound’s picture

Hey all!
So I've been following this thread for a long time. I deeply appreciate all of the work being done on this module. I agree with a few of the above comments that this needs a few new additions.

What I need (that no one else seems to miss =-) is the ability to utilize the Commerce File module's "access limits" options. For me, and I assume others, this is very important. These options exist to prevent customers from abusing the download system and causing our Amazon bill to skyrocket.

EDIT: good point torgosPizza. Ive created a new feature request for this issue: https://drupal.org/node/2021997

torgosPizza’s picture

@destinationsound: You should probably open a feature request that is separate from this issue for that particular functionality.

deggertsen’s picture

FileSize
27.99 KB

@destinationsound: Maybe I'm not understanding correctly, but I think that what you're asking for is already working... I am able to set access limit options just fine for all my Amazon S3 files for download on a Commerce File field. I've attached a screenshot of part of my product creation form where I have two Commerce File fields. Patch #59 allows me to search for and use files on the S3 server, but I still have all the functionality of Commerce File. Is that not what you want?

torgosPizza’s picture

@deggertsen: Does it actually honor those access limits? AmazonS3 hasn't, at least in my testing, which I'll admit hasn't been extensive. We also use "unlimited" for most things, so we haven't had the need yet.

deggertsen’s picture

Ha! You're right! It doesn't pay any attention to the access limits. It's funny that I hadn't noticed that until now... Any idea how to fix it? Is it at all similar to what is discussed at #772930: Remove fsockopen(), use s3->getAuthenticatedUrl to get files as a force-download?

torgosPizza’s picture

Similar, yeah, although I think that issue was more about performance. Calling fsockopen() basically runs the download through your server (your machine acting like a middleman to "fetch" the download from S3 and serve it to the end user), so that's why we wanted to stop going that route. However, it's still perfectly possible to create an authenticated URL after access has been verified - either with the help of a url_alter function or some other kind of redirection. (For example, clicking a link brings you to example.com/download/filename, which checks access and, upon success, simply redirects you with a drupal_goto() to an authenticated S3 URL.)
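For illustration, that second option could be as small as this (sketch; the path and the license check are hypothetical):

  /**
   * Page callback for example.com/download/filename. Sketch only.
   */
  function mymodule_download_page($filename) {
    // Hypothetical license check.
    if (!mymodule_user_has_license($GLOBALS['user'], $filename)) {
      drupal_access_denied();
      return;
    }
    // Hand out a short-lived authenticated S3 URL instead of proxying
    // the bytes through PHP.
    $s3 = new AmazonS3();
    drupal_goto($s3->get_object_url('mybucket', $filename, '5 minutes'));
  }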

Both options I described above seem feasible but we should probably keep the discussion over at the newly-created Issue, #2021997: Honor access limits when serving files through AmazonS3

deggertsen’s picture

Hey, this patch works with the new 2.x branch of Commerce File as well. I have not yet tested access limits, though. I think the patch in that issue may fix the access limit problem as well: #2049481: Ensure S3 compatibility

deggertsen’s picture

I just figured out that I had left the amazons3_reference.inc file out of my last patch. Here's a new patch with that corrected.

justafish’s picture

Issue summary: View changes

Can we have this as a separate module instead of rolled into amazons3? Happy to list it on the module page!

justafish’s picture

I think you can also achieve this with https://drupal.org/project/filefield_sources and IMCE?

justafish’s picture

Ok sorry, I didn't review that patch properly. I see this is a plugin for filefield_sources. Can we make it a separate installable module as part of amazons3 please?

justafish’s picture

Status: Reviewed & tested by the community » Needs work
justafish’s picture

Priority: Major » Normal
deggertsen’s picture

@torgosPizza Any chance of you doing this? I'm sure I would screw something up...

torgosPizza’s picture

Yes, in fact I'm working on a way to implement this now, with a custom module and a file widget specifically geared to pulling from the Amazon cache. Unfortunately the Field Widget API is new to me, and file uploads make it particularly tricky. I'll be working on it over the next few weeks.

deggertsen’s picture

Great news! I'll be looking forward to seeing and testing what you come up with.

eojthebrave’s picture

Is there a reason that you're querying the amazon_files DB table instead of just asking the S3 API directly for a list of files? I'm sure the DB query is more performant, but I was able to rewrite this to query S3 directly, and that solves the problem of having to do some kind of automated cache warming via cron. In my instance the files are found on S3 and the *_create_record() function you wrote creates a record in {file_managed} as necessary. So far, in my testing at least, it's working pretty slick.

Also, FWIW: you can do this in your *_create_record() function, and that should eliminate the need to make changes to the AmazonStreamWrapper class, which means you can totally move this to its own module without having to change anything about the amazons3 module.

  // Let PHP's stream functions, routed through the s3:// wrapper,
  // pull the metadata straight from S3.
  $file_uri = 's3://' . $filename;
  $size = filesize($file_uri);
  $mime = file_get_mimetype($file_uri);
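
From there, registering the file is just a file_save() - a rough sketch of what my *_create_record() rewrite does, simplified:

  // Register the S3 object as a managed file so fields can reference it.
  $file = new stdClass();
  $file->uri = $file_uri;
  $file->filename = basename($file_uri);
  $file->filemime = $mime;
  $file->filesize = $size;
  $file->uid = $GLOBALS['user']->uid;
  $file->status = FILE_STATUS_PERMANENT;
  file_save($file);
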
torgosPizza’s picture

I'm not doing that because, in our case, we have 5000+ (fairly large) files in our bucket, and querying the S3 API for all of them every time is very slow.

haysuess’s picture

Just throwing my hat in the ring to say I'd love to see this feature.

I have big files, most of which fail to upload via the Drupal interface, so I want to simply upload them to my S3 bucket, then reference them using FileField Sources.

To top that off, I need to upload 75 of them - a few GB worth of stuff - and it would be AMAZING to be able to just reference them instead of uploading a placeholder 1KB ZIP file for each node and then replacing them in my S3 bucket manually.

torgosPizza’s picture

Indeed! I had it working with Filefield Sources and the Reference field there, but that's a couple of additional dependencies that we'd (ideally) like to avoid.

Just a little while longer and I'll be able to focus on this again.

rbennion’s picture

This is such a great module. The ability to upload files and not have to manually refresh the S3 cache would be ideal!

Thanks for all the great work on this module.

haysuess’s picture

Not posting this to rush torgosPizza, but for anyone else that needs an option in the meantime, here's what I did.

  1. Make a dummy ZIP file that is just a renamed TXT file with 1-2 words in it.
  2. Upload the dummy ZIP file via the node edit form.
  3. Replace the dummy ZIP file with the real ZIP file of the same name, uploading directly to my Amazon S3 account (via S3 Browser/Cloudberry/etc).
  4. Edit the file size (in bytes) in the "file_managed" table in the database so the file size is correct on the node edit pages (see the sketch below).
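
Step 4 is just one query - a sketch, with the fid and byte size as placeholders for yours:

  // Correct the recorded size for the swapped-in file.
  db_update('file_managed')
    ->fields(array('filesize' => 52428800))
    ->condition('fid', 1234)
    ->execute();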

It's a little gangster, but it works. I could definitely still use this module, but I had to upload some of the files this way because I kept getting emails about them from customers.

torgosPizza’s picture

I'm finally committing some time to this. Right now I have a widget created with autocomplete lookup, and a "rebuild cache" mechanism that pre-populates the {amazons3_file} table. I'm about to dive into the "attach a file object to the node" part. If anyone has tips or a detailed, advanced tutorial, please share - this might take me a bit. I might go into IRC for some pointers as well, because the documentation for this type of thing is kind of sparse.
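The autocomplete lookup itself is the easy part (sketch; the path, table, and names are simplified):

  /**
   * Implements hook_menu(). Sketch only.
   */
  function mymodule_menu() {
    $items['amazons3/autocomplete'] = array(
      'page callback' => 'mymodule_s3_autocomplete',
      'access arguments' => array('access content'),
      'type' => MENU_CALLBACK,
    );
    return $items;
  }

  // Match cached S3 filenames as the user types.
  function mymodule_s3_autocomplete($string = '') {
    $matches = array();
    $result = db_select('amazons3_file', 'f')
      ->fields('f', array('uri'))
      ->condition('uri', db_like($string) . '%', 'LIKE')
      ->range(0, 10)
      ->execute();
    foreach ($result as $row) {
      $matches[$row->uri] = check_plain($row->uri);
    }
    drupal_json_output($matches);
  }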

bluewallmedia’s picture

This is fantastic. Thanks @torgosPizza
So appreciative to see progress here. Thank you. I'm happy to help test as needed.
Good Luck ~ peter

benjy’s picture

FileSize
17.45 KB

I re-rolled #73 to work against HEAD (34c5dd7).

torgosPizza’s picture

Working on this again, I will provide a patch ASAP.

torgosPizza’s picture

I am almost done with the integration with Filefield Sources. I had to add an additional hook implementation (hook_file_download) to get around a Commerce File limitation (related: #2360383: commerce_file_is_licensable() uses wrong approach, breaks compatibility with Filefield Sources). An added bonus: I figured out how to do what I needed without having to patch the AmazonS3StreamWrapper class to make protected functions public. However, I'm interested to know if there is a faster alternative to stream_stat() for getting metadata from an S3 object.
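For anyone curious, the workaround is shaped roughly like this (sketch only; the real patch's conditions are stricter, and the permission here is illustrative):

  /**
   * Implements hook_file_download(). Sketch only.
   *
   * Commerce File otherwise denies access to not-yet-licensed files,
   * which blocks Filefield Sources from attaching them.
   */
  function amazons3_file_download($uri) {
    if (file_uri_scheme($uri) == 's3' && user_access('administer site configuration')) {
      return array('Content-Type' => file_get_mimetype($uri));
    }
  }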

I was originally working on a new "Autocomplete" file widget for File fields, but eventually decided that Filefield Sources is the correct way to go - at least for now - because it abstracts a lot of the File API and does the majority of the heavy lifting. This means amazons3.module doesn't have to reinvent the wheel just to add a file to a product. That being said, I may take another stab at a new widget just for AmazonS3 so we don't have to require Filefield Sources as a dependency.

It would probably be a good idea, though, to decouple the "populate the AmazonS3 cache" batch functionality from this patch, as that is a function that could be used even when not using S3 as a file field source. In fact I may do that before I post it, because we also need Cron integration for it and a cleaner way to update that table. (To give you an idea: even taking 500 objects at a time, our bucket has 10k+ items in it and it takes some time to populate that table. Either we need a faster solution or a new process that finds only the NEW files in S3 and adds just those to the cache table.)

Will post the patch soon.

torgosPizza’s picture

Okay, attached are two patches:

1. Is for the "rebuild cache" mechanism. I realized with AmazonS3 the cache table {amazons3_file} includes the s3:// prefix in the uri field. My code reflects that.
CAVEAT: I have included a call to stream_stat() for each object that is fetched from the bucket. This is slow, and we need to find a higher-performing method of retrieving file metadata, such as filesize, during this process. (I also wanted to try NOT to hack the stream wrapper file.)

2. This one is for the rest of the Filefield Sources integration. I put it in a new "includes/" folder for better separation, and called it filefield_source.inc just to give it a little more distinction. You'll probably want to delete the old amazons3_reference.inc file from previous patches.

Also, I would like to make the theming a bit better, as we probably don't need the "s3://" prefix to show in the textfield when you type into it or choose a file.

Please test this by rebuilding your AmazonS3 cache and re-selecting the AmazonS3 source in your file field settings.

torgosPizza’s picture

Status: Needs work » Needs review

Setting to Needs Review.

torgosPizza’s picture

Title: Helper module to add/discover S3 files » Integration with Filefield Sources

Changing the title since the issue now has a clear focus. Since I've gotten relatively comfortable with how the API works, if I end up writing an autocomplete field widget that does not require Filefield Sources I will make a separate issue for that.

deggertsen’s picture

@torgosPizza, will this mess up my existing installations that are using old patches? Would I have to go in and re-link all my S3 files? Or will that not be a problem? I've been using the old patches so long with success that I'm afraid to try anything new...

torgosPizza’s picture

No, it won't mess anything up. The only thing that may have changed is the name of the AmazonS3 reference field, because I tried to follow the patterns of Filefield Sources more closely (with regard to naming conventions, etc.). It won't mess up any existing entity relationships because, at the end of the day, file fields are just an array attached to an entity, and since those exist, nothing will break in that sense. The only thing changing is the behind-the-scenes process for adding new files to entities going forward.
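
To illustrate that point, a file field value on an entity boils down to this structure (field name hypothetical):

  // Whatever widget set it, the stored value is just this array,
  // so existing references keep working.
  $product->field_product_file[LANGUAGE_NONE][0] = array(
    'fid' => $file->fid,
    'display' => 1,
    'description' => 'Product download',
  );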

The only things you'll need to do if you're coming from an old version are:

1. Rebuild your AmazonS3 cache. (You want to make sure they all have an "s3://" prefix, since that is how AmazonS3's own "use database caching" mechanism stores them in {amazons3_file}, and I wanted to make sure we didn't have dupes in the table.)

2. You will probably need to re-add AmazonS3 as a source in the file field for your entity.

Of course, if you are going to try this on a production site, I would back up your database before testing; that being said, I'm going to be using these patches with AmazonS3 and Filefield Sources on our live production site immediately, because we need it to work correctly. :) See my related issue in comment #92 - without the hook_file_download() implementation, a "new" file can never be added to a Commerce File field. With this patch it now works perfectly.

I also took the time to fix some mb_strlen errors that were popping up when Drupal attempted to run translations on arrays. I pretty much did a huge cleanup of this patch and reworked it into what I consider relatively solid shape. Please test it on a dev site and, like I said, back up your databases!

torgosPizza’s picture

Apparently I was wrong about one thing: the {amazons3_file} table does not include the s3:// prefix. I'm rewriting my code to reflect this. The confusion came from the fact that I was doing a db_merge as well as a stream_stat(); running stream_stat() does place the file into the table if you have database caching enabled. Enabling DB caching in AmazonS3 is a requirement for this Filefield Sources integration to work, but I'm not sure how obvious that is.

Anyhoo - I will post an updated patch asap.

torgosPizza’s picture

Assigned: Unassigned » torgosPizza
Status: Needs review » Needs work

Finding some issues with the patches I generated... they use the wrong naming scheme and such. Not sure how, since the local copy I was working from was the most recent... I may have accidentally applied an old patch to mine, and that got included in the new patch. Sorry about that! Setting this back to Needs work while I clean it up; I'll re-post it later.

acausing’s picture

Hi torgos,

new 1277152-amazons3_rebuild_cache-93.patch 8.78 KB
new 1277152-amazons3_filefield_source-93.patch 12.09 KB

I have a newly installed amazons3-dev. Can you point me in the right direction as to which patch to apply first? Or is there a flow to the patching that I'm missing?

I tried them against amazons3-7.x-1.x-dev.tar.gz and got some failures applying the patch.

torgosPizza’s picture

Yeah, see my comment above. Patch might not be 100% so I'm rewriting it a little bit. Hang tight.

acausing’s picture

Thanks @torgosPizza,

I'll help with testing; I'm tackling almost the same requirements.

- Commerce License
- Commerce File
- Commerce License File
- AmazonS3
- AwsSDK
- Filefield Sources (hopefully with S3)

Very excited here! Thank you in advance.

torgosPizza’s picture

I'm much more confident about this patch! I decided to combine them both again for ease of application. To test:

1. Backup your database!
2. Apply the patch.
3. Rebuild your AmazonS3 cache @ admin/config/media/amazons3*
4. Configure the "File source" settings for your content types.

Now when you edit a node with this field you can type the filename (or subfolder) and click "Select". Then you can add the description (if configured) for the file. We do this on our Commerce File products and use the description as the link text along with the filesize.

Please test it out and let me know if you run into issues applying it. I've been testing it like crazy and I'm pretty confident with it. Got rid of a bunch of cruft, too.

* NOTE: This might take a while, as we run a stat() on each file. Buckets with tens of thousands of files might take quite some time - we need to find a faster solution for this step.

torgosPizza’s picture

Status: Needs work » Needs review
acausing’s picture

This is what happens when you add products in a product type variation.

An AJAX HTTP request terminated abnormally.
Debugging information follows.
Path: /system/ajax
StatusText: n/a
ResponseText:
Fatal error: Call to undefined function amazons3_filefield_source_value() in /var/www/test/sites/all/modules/filefield_sources/filefield_sources.module on line 300
ReadyState: undefined

acausing’s picture

Conflicting?
amazons3.info line 6 -> files[] = includes/filefield_source.inc
amazons3.module line 9 -> module_load_include('inc', 'amazons3', 'filefield_source');

But I don't have the filefield_source.inc file.
1. I ran the previous patch (1277152-amazons3_filefield_source-93.patch), which created filefield_source.inc

then
2. cd ..; rm -rf amazons3
3. tar -zxvf amazons3-7.x-1.x-dev.tar.gz
4. cd amazons3
5. wget https://www.drupal.org/files/issues/amazons3-1277152-filefield_sources-1...
6. patch < amazons3-1277152-filefield_sources-103.patch
7. tried removing line 6 of amazons3.info
8. drush cc all

Also, I cannot find the function amazons3_filefield_source_value() in filefield_source.inc or amazons3.module. (Maybe you can upload the latest filefield_source.inc?)

An AJAX HTTP request terminated abnormally.
Debugging information follows.
Path: /system/ajax
StatusText: n/a
ResponseText:
Fatal error: Call to undefined function amazons3_filefield_source_value() in /var/www/test/sites/all/modules/filefield_sources/filefield_sources.module on line 300
ReadyState: undefined

Thanks

torgosPizza’s picture

FileSize
16 KB

Ah, so for some reason my patch didn't include the new file. Weird. I had to use the --staged flag, which seems to help.

Sorry about that! Try this patch instead.

eojthebrave’s picture

How do you feel about moving this to its own module? Either as a sub-module of the amazons3 module, or as a new project? justafish alluded to wanting that in her comment at #76. I've got a need for similar functionality and would be willing to help out where I can.

Perhaps amazons3_filefield_sources.module?

torgosPizza’s picture

Not a bad idea. I'll move it to its own project and either re-roll the patch or link to it here.

acausing’s picture

The autocomplete now works for the file field (Yay! Jumping around, ending with a thumbs-up pose).

Scenario: files are uploaded directly to S3, then filefield_sources is used to reference the file.

Test 1
-> "This is the error when i auto-complete the file from S3 and click Select Button"

Error
An AJAX HTTP request terminated abnormally.
Debugging information follows.
Path: /file/ajax/field_product/und/form/commerce_file/und/0/form-j0ZyNq_4uPpxNTf2CJtAqzYkVLcUIZJvioL279dLdpE
StatusText: n/a
ResponseText:
Error
Error message
PDOException: SQLSTATE[23000]: Integrity constraint violation: 1048 Column 'id' cannot be null: INSERT INTO {file_usage} (fid, module, type, id, count) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4); Array
(
[:db_insert_placeholder_0] => 63
[:db_insert_placeholder_1] => file
[:db_insert_placeholder_2] => commerce_product
[:db_insert_placeholder_3] =>
[:db_insert_placeholder_4] => 1
)
in file_usage_add() (line 696 of /var/www/test/includes/file.inc).
The website encountered an unexpected error. Please try again later.

ReadyState: undefined

--------------

Test 2
Creating the product seems to get stuck (when I click the "Create product" button, the circular progress indicator just runs forever). This is not from clicking the file field's "Select" button, but from "Create product".

torgosPizza’s picture

Ah, so it seems to be an issue with creating a new product. Let me test that and work on a solution. I'm going to be traveling today too, so that might make things a bit difficult - and we have to release a new product today. Yay! The timing couldn't be worse! :)

Thanks for testing. I'll be back with you shortly (with a new module!).

acausing’s picture

HI torgos,

Any updates? So excited here!

Regards :D :p

torgosPizza’s picture

Very very soon! I promise! I'm working on making the file discovery faster. Had to read a bunch of S3 documentation, and then I was traveling on business this week. I'll pick it back up and try to put it to bed over the weekend. I just used it to release a few products today so it is working pretty well :)

acausing’s picture

Hi Torgos,

How are you doing? Have you started on the module? I'm really excited to test it out.

More power...

Regards...

torgosPizza’s picture

Yes! It's still being worked out. I'm almost done (I managed to get the "rebuild cache table" stuff working, and much faster than it was before), but I still need to fix the error @deggertsen was facing.

Sorry for the delay; my day job has been stretching me thin lately. I will post it soon, I promise.

acausing’s picture

Wow, thanks Torgos! Is there a way we can help out with development? Exciting...

torgosPizza’s picture

I'm going to post it to drupal.org as a sandbox project in the next day or so.

torgosPizza’s picture

Status: Needs review » Closed (fixed)

Created a new module and project page here: https://www.drupal.org/project/amazons3_ffs

I went with "amazons3_" for the module prefix because some of the naming conventions used by Filefield Sources caused a minor conflict. This way seemed easier (plus I had anticipated making this module a sub-module of AmazonS3).

Anyway, please feel free to test and make note of my TODOs. I will be creating some tasks so I can handle adding the few features that are currently not very robust (for instance, S3 prefixes and the settings for Filefield Sources are not yet worked out).

torgosPizza’s picture

Status: Closed (fixed) » Fixed

I'll just set this to Fixed so people see the green :)

Please submit any issues to the issue queue over at the new module's page. The -dev gz file should be ready soon.

acausing’s picture

Wow, thanks Torgos, we'll test the module...

deggertsen’s picture

So far my tests are positive! I updated to the most recent dev of AmazonS3 and enabled amazons3_ffs; then I just had to go into the fields that used the FFS AmazonS3 selector and save them (I didn't even have to make changes), and everything worked as expected.

Great work torgosPizza!

torgosPizza’s picture

Yes, I should probably mention that the latest -dev of AmazonS3 is recommended.

Glad to hear you're having good results so far. I put a ton of work into this, even though it's a relatively small module! Now that it's out in the wild hopefully we can get more hands on deck to help continue development. Thanks Dave!

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.