Closed (fixed)
Project:
Views data export
Version:
7.x-3.0
Component:
Code
Priority:
Critical
Category:
Bug report
Assigned:
Unassigned
Reporter:
Created:
15 Sep 2016 at 19:03 UTC
Updated:
20 Oct 2016 at 08:34 UTC
Jump to comment: Most recent, Most recent file
Comments
Comment #2
steven jones commentedSad times.
Hmm...okay, so you're just clicking the link to start the export, it proceeds and you get to the download page, but see a 403 Access Denied. Yeah?
Comment #3
ficklecatThat is correct.
Comment #4
steven jones commentedWhat's the export path of your VDE display?
Does it end in .csv or somesuch?
Does your site happen to be behind Varnish or something else that could be stripping the Drupal session cookie on such paths?
Comment #5
ficklecatActually looking into this more this is to do with the way temporary files are created on Windows (sorry). It is creating a short filename so the views_data_export_is_export_file() filename check is invalid:
A quick fix for me was to use 'vde' for the filename prefix instead (see patch). Not sure if changing this has other issues.
Comment #6
ficklecatComment #7
steven jones commentedThanks for the detail, working on a fix for this, and for the issues in #2798759: Ensure access control is performed on batched export downloads
Comment #8
steven jones commentedHere's a patch that uses the file_usage table to keep track of which files were created by views data export.
@ficklecat it would be awesome if you could apply the patch to 7.x-3.0 and see if it fixes it for you.
Comment #9
steven jones commentedComment #10
ficklecatYes this works great for me against 3.0. Thanks for your help and quick response.
Comment #11
steven jones commentedHere's what might be a better patch.
We remove the stuff controlling file access per user, and instead control access to the entire export token via drupal_get_token, which we then validate when using.
This should be less performance intensive, and doesn't need to work out if the file is 'ours' etc.
This should mean that we keep the security checks that were missing before 7.x-3.0 but they're enforced earlier in the process, and instead of getting an access denied, it just starts off the export process again afresh for the new session.
Comment #12
steven jones commentedIf I can get a positive review from someone that this works in their situation, load balancers, private directory instead of temp etc. then I'll commit the patch in #11.
Comment #13
steven jones commentedComment #14
jamsilver commentedTypo: aorund/around
Also, I'm not sure we need to put something into the session? It doesn't sound like it was needed to fix the OP's problem, and it introduces the additional question: should we be tidying up this item from $_SESSION later, and where/when?
Comment #15
steven jones commentedOh, good catch on the typo. Patch attached with that fix.
The documentation for drupal_get_token says essentially we should ensure that a session has been started prior to calling this function.
From my reading of it,
batch_processdoes essentially the same, by setting a value into the session that its use ofdrupal_get_tokenisn't invalidated/changed along the way.It doesn't clean it up as far as I can tell, so people will likely get and retain a session anyway. We're not making that any worse I don't think.
Comment #16
steven jones commentedComment #17
steven jones commentedComment #18
jamsilver commentedOK, fair enough. I think we would need to have a think about when to clear it up. It looks like batch API does clean up the session - https://api.drupal.org/api/drupal/includes%21batch.inc/function/_batch_f...
Not sure exactly when we should do it - we could do it immediately after the user has downloaded the file the first time? That way anonymous users only get one chance to download. But I'm not sure that's too bad?
Comment #19
steven jones commentedOh, whoops, missed that cleanup in batch api.
If the session token changes, then they just get sent through the pipeline again, so yeah, they'd either end up in an endless loop trying to download (because we might have broken something somewhere or somesuch) or they'd go back around once and get a different copy of the file.
Comment #20
steven jones commentedAttached patch cleans up the session in the
execute_finalmethod, which is only used for batching.We clean up the file, and then transfer it immediately.
I wonder if this will break the tests...
Comment #23
steven jones commentedDrupalCI fail :(
Comment #26
steven jones commented@Steven Jones fail actually.
Comment #29
steven jones commentedUrgh, multiple @Steven Jones fails there.
Comment #30
steven jones commentedComment #31
plachI tested the patch and it seems to work fine :)
Typo :)
Do we really need to drop this now it's in 3.0? Wouldn't it make sense to keep file sizes accurate even if they are temp files?
Would it make sense to keep this API function around anyway, given it's in 3.0?
Comment #32
ficklecatThe patch seems works for me too (although not using a load balancer).
Is this still required? If so it will not run on Windows because of the short filename issue mentioned in #5.
Comment #33
steven jones commentedThanks for the reviews!
Here's an additional patch that adds the
views_data_export_is_export_fileback but implements it using thefile_usagesystem.It also restores the file size header functionality. This should be good to go.
Comment #34
ficklecatWorks for me. Thanks.
Comment #35
steven jones commentedGah!
Sorry everyone, having to post patches is confusing me and I'm posting the wrong patches :(
This one has the token work, and the tweaks to the file size and usage stuff.
Comment #38
steven jones commentedHave actually run the code and tests locally this time :)
Have also added a new test for garbage collection to make sure that code is working correctly too.
Comment #39
plachWorks fine here, RTBC +1!
Comment #41
steven jones commentedThanks everyone!
Comment #42
sagashiteirukedo commentedSteven -- sorry to have missed the item up in #12 for feedback on the use of private files. I saw 3.1 came out this morning and snagged that for my testing environment. I can report that 3.1 (which looks to be the culmination of this thread) worked with the use of the private file area for me.
I'm looking forward to seeing how 3.1 handles garbage collection. [I'd been seeing garbage clean-up issues with 3.0 and it's patch on that development system -- the two views_data_export tables wouldn't get flushed out and it wouldn't remove the temporary files since they were reported as in-use by views_data_export still.]
Thanks again for your work on all of this!
Comment #43
tvasilia commentedHi everyone!
Since this fix was committed, VDE temp files are not deleted during garbage collection.
The watchdog has entries noting that the temporary file is still used by VDE module.
Looking at the code, it seems that the temp file is added to the file_usage table for better handling, but it never gets deleted from that table (when all OK).
I'm not sure where the best place is for correcting this, maybe in views_data_export_garbage_collect() ?
Just before file_delete is called, we could call file_usage_delete() for cleaning the usage entry, hence releasing the file for delete.
Any comments ?
Thank you all for your great work on this module !
Comment #44
steven jones commented@tvasilia So the call to file_delete in views_data_export_garbage_collect does actually pass in the force parameter, so that we don't need to clean up the usage before calling file delete.
However, because of our huge lifetime on the exports, the garbage collection doesn't happen for quite some time.
I think we could reduce this, and then your issues would go away I think.
I'll raise a follow up issue along these lines.
Comment #45
tvasilia commentedHi Steven,
thank you for your kind reply.
I apologize, you're absolutely right. I didn't notice the force flag...
Watchdog entries are caused by the system_cron cleanup process, where the file age is set to 6 hours.
Eventually our temp files will be flushed after the VDE views_data_export_gc_expires timeout (which is much larger as you point-out).
Thanks again for your kind reply.
It's not that much of a problem, but yes the lifetime could be lowered if there are no other significant reasons for that high value.
Comment #46
plach