Just upgraded to the 7.x-3.0 release today and noticed that my previous batched export functionality no longer works even for user/1. I get the 'Data Export Successful Message' but the file fails to download.

If I switch back to 7.x-3.0-beta8 it all works fine. I did not see any new permission requirements. I assume it is related to #2798759: Ensure access control is performed on batched export downloads ?

Comments

ficklecat created an issue. See original summary.

steven jones’s picture

Sad times.

Hmm...okay, so you're just clicking the link to start the export, it proceeds and you get to the download page, but see a 403 Access Denied. Yeah?

ficklecat’s picture

That is correct.

steven jones’s picture

What's the export path of your VDE display?
Does it end in .csv or somesuch?
Does your site happen to be behind Varnish or something else that could be stripping the Drupal session cookie on such paths?

ficklecat’s picture

StatusFileSize
new1.22 KB

Actually looking into this more this is to do with the way temporary files are created on Windows (sorry). It is creating a short filename so the views_data_export_is_export_file() filename check is invalid:


// uri: temporary://vie7547.tmp

function views_data_export_is_export_file($uri) {
  return file_uri_scheme($uri) == 'temporary' && strpos(file_uri_target($uri), 'views_data_export') === 0;
}

A quick fix for me was to use 'vde' for the filename prefix instead (see patch). Not sure if changing this has other issues.

ficklecat’s picture

Status: Active » Needs review
steven jones’s picture

Status: Needs review » Needs work

Thanks for the detail, working on a fix for this, and for the issues in #2798759: Ensure access control is performed on batched export downloads

steven jones’s picture

Status: Needs work » Needs review
StatusFileSize
new5.37 KB

Here's a patch that uses the file_usage table to keep track of which files were created by views data export.

@ficklecat it would be awesome if you could apply the patch to 7.x-3.0 and see if it fixes it for you.

steven jones’s picture

ficklecat’s picture

Status: Needs review » Reviewed & tested by the community

Yes this works great for me against 3.0. Thanks for your help and quick response.

steven jones’s picture

Status: Reviewed & tested by the community » Needs review
StatusFileSize
new8.87 KB

Here's what might be a better patch.
We remove the stuff controlling file access per user, and instead control access to the entire export token via drupal_get_token, which we then validate when using.
This should be less performance intensive, and doesn't need to work out if the file is 'ours' etc.
This should mean that we keep the security checks that were missing before 7.x-3.0 but they're enforced earlier in the process, and instead of getting an access denied, it just starts off the export process again afresh for the new session.

steven jones’s picture

If I can get a positive review from someone that this works in their situation, load balancers, private directory instead of temp etc. then I'll commit the patch in #11.

steven jones’s picture

Priority: Major » Critical
jamsilver’s picture

+++ b/plugins/views_data_export_plugin_display_export.inc
@@ -306,9 +315,13 @@ class views_data_export_plugin_display_export extends views_plugin_display_feed
+    // Pop something into the session to ensure it stays aorund.

Typo: aorund/around

Also, I'm not sure we need to put something into the session? It doesn't sound like it was needed to fix the OP's problem, and it introduces the additional question: should we be tidying up this item from $_SESSION later, and where/when?

steven jones’s picture

Oh, good catch on the typo. Patch attached with that fix.

The documentation for drupal_get_token says essentially we should ensure that a session has been started prior to calling this function.
From my reading of it, batch_process does essentially the same, by setting a value into the session that its use of drupal_get_token isn't invalidated/changed along the way.
It doesn't clean it up as far as I can tell, so people will likely get and retain a session anyway. We're not making that any worse I don't think.

steven jones’s picture

StatusFileSize
new8.87 KB
steven jones’s picture

jamsilver’s picture

OK, fair enough. I think we would need to have a think about when to clear it up. It looks like batch API does clean up the session - https://api.drupal.org/api/drupal/includes%21batch.inc/function/_batch_f...

Not sure exactly when we should do it - we could do it immediately after the user has downloaded the file the first time? That way anonymous users only get one chance to download. But I'm not sure that's too bad?

steven jones’s picture

Oh, whoops, missed that cleanup in batch api.

If the session token changes, then they just get sent through the pipeline again, so yeah, they'd either end up in an endless loop trying to download (because we might have broken something somewhere or somesuch) or they'd go back around once and get a different copy of the file.

steven jones’s picture

StatusFileSize
new9.43 KB

Attached patch cleans up the session in the execute_final method, which is only used for batching.
We clean up the file, and then transfer it immediately.

I wonder if this will break the tests...

Status: Needs review » Needs work

The last submitted patch, 20: views_data_export-2800895-eid_tokens.patch, failed testing.

The last submitted patch, 20: views_data_export-2800895-eid_tokens.patch, failed testing.

steven jones’s picture

Status: Needs work » Needs review

DrupalCI fail :(

Status: Needs review » Needs work

The last submitted patch, 20: views_data_export-2800895-eid_tokens.patch, failed testing.

The last submitted patch, 20: views_data_export-2800895-eid_tokens.patch, failed testing.

steven jones’s picture

Status: Needs work » Needs review
StatusFileSize
new9.43 KB

@Steven Jones fail actually.

Status: Needs review » Needs work

The last submitted patch, 26: views_data_export-2800895-eid_tokens.patch, failed testing.

The last submitted patch, 26: views_data_export-2800895-eid_tokens.patch, failed testing.

steven jones’s picture

StatusFileSize
new9.43 KB

Urgh, multiple @Steven Jones fails there.

steven jones’s picture

Status: Needs work » Needs review
plach’s picture

I tested the patch and it seems to work fine :)

  1. +++ b/plugins/views_data_export_plugin_display_export.inc
    @@ -306,9 +315,13 @@ class views_data_export_plugin_display_export extends views_plugin_display_feed
    +    // Pop something into the session to ensure it stays aorund.
    

    Typo :)

  2. +++ b/plugins/views_data_export_plugin_display_export.inc
    @@ -404,9 +417,6 @@ class views_data_export_plugin_display_export extends views_plugin_display_feed
    -        $this->outputfile_update_size();
    

    Do we really need to drop this now it's in 3.0? Wouldn't it make sense to keep file sizes accurate even if they are temp files?

  3. +++ b/views_data_export.module
    @@ -31,36 +31,6 @@ function views_data_export_views_api() {
    -function views_data_export_is_export_file($uri) {
    

    Would it make sense to keep this API function around anyway, given it's in 3.0?

ficklecat’s picture

The patch seems works for me too (although not using a load balancer).

/**
 * Implements hook_file_presave().
 */
function views_data_export_file_presave($file) {
  // Ensure temporary files really are temporary.
  // @see: https://drupal.org/node/2198399
  if (strpos($file->filename, 'views_data_export') === 0) {
    // There is no FILE_STATUS_TEMPORARY.
    $file->status = 0;
  }
}

Is this still required? If so it will not run on Windows because of the short filename issue mentioned in #5.

steven jones’s picture

StatusFileSize
new5.37 KB

Thanks for the reviews!

Here's an additional patch that adds the views_data_export_is_export_file back but implements it using the file_usage system.

It also restores the file size header functionality. This should be good to go.

ficklecat’s picture

Works for me. Thanks.

steven jones’s picture

StatusFileSize
new13.01 KB

Gah!

Sorry everyone, having to post patches is confusing me and I'm posting the wrong patches :(

This one has the token work, and the tweaks to the file size and usage stuff.

Status: Needs review » Needs work

The last submitted patch, 35: views_data_export-2800895-eid_tokens.patch, failed testing.

The last submitted patch, 35: views_data_export-2800895-eid_tokens.patch, failed testing.

steven jones’s picture

Status: Needs work » Needs review
StatusFileSize
new18.14 KB

Have actually run the code and tests locally this time :)
Have also added a new test for garbage collection to make sure that code is working correctly too.

plach’s picture

Works fine here, RTBC +1!

  • Steven Jones committed 43ed58a on 7.x-3.x
    Issue #2800895 by Steven Jones, ficklecat, jamsilver, plach: Access...
steven jones’s picture

Status: Needs review » Fixed

Thanks everyone!

sagashiteirukedo’s picture

Steven -- sorry to have missed the item up in #12 for feedback on the use of private files. I saw 3.1 came out this morning and snagged that for my testing environment. I can report that 3.1 (which looks to be the culmination of this thread) worked with the use of the private file area for me.

I'm looking forward to seeing how 3.1 handles garbage collection. [I'd been seeing garbage clean-up issues with 3.0 and it's patch on that development system -- the two views_data_export tables wouldn't get flushed out and it wouldn't remove the temporary files since they were reported as in-use by views_data_export still.]

Thanks again for your work on all of this!

tvasilia’s picture

Hi everyone!

Since this fix was committed, VDE temp files are not deleted during garbage collection.
The watchdog has entries noting that the temporary file is still used by VDE module.
Looking at the code, it seems that the temp file is added to the file_usage table for better handling, but it never gets deleted from that table (when all OK).

I'm not sure where the best place is for correcting this, maybe in views_data_export_garbage_collect() ?
Just before file_delete is called, we could call file_usage_delete() for cleaning the usage entry, hence releasing the file for delete.

Any comments ?

Thank you all for your great work on this module !

steven jones’s picture

@tvasilia So the call to file_delete in views_data_export_garbage_collect does actually pass in the force parameter, so that we don't need to clean up the usage before calling file delete.

However, because of our huge lifetime on the exports, the garbage collection doesn't happen for quite some time.
I think we could reduce this, and then your issues would go away I think.

I'll raise a follow up issue along these lines.

tvasilia’s picture

Hi Steven,

thank you for your kind reply.
I apologize, you're absolutely right. I didn't notice the force flag...

Watchdog entries are caused by the system_cron cleanup process, where the file age is set to 6 hours.
Eventually our temp files will be flushed after the VDE views_data_export_gc_expires timeout (which is much larger as you point-out).

Thanks again for your kind reply.
It's not that much of a problem, but yes the lifetime could be lowered if there are no other significant reasons for that high value.

plach’s picture

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.