Currently the export functionality of the module can load a lot of data all at once. This is partly due to PHPExcel, which uses a huge amount of memory for a large spreadsheet (Something like 1.17KB per cell).

Using PHP_XLSXWriter for the export would make it possible to export thousands of products with little memory usage.
https://github.com/mk-j/PHP_XLSXWriter

The export will end up using an xlsx file instead of xls but that's fine since #2785127: Allow xlsx files.

Performance comparison between PHPExcel and PHP_XLSXWriter
http://www.zedwood.com/article/php-excel-writer-performance-comparison

Case 1: 10 columns, 10,000 rows in 4 sheets (400,000 cells)
PHP_Excel: 	458MB 	101.55s
PHP_XLSXWriter: 	11MB 	7.56s
Case 2: 10 columns, 10,000 rows in 6 sheets (600,000 cells)
PHP_Excel: 	685MB 	202.72s
PHP_XLSXWriter: 	15MB 	11.31s
Case 3: 10 columns, 20,000 rows in 4 sheets (800,000 cells)
PHP_Excel: 	912MB 	301.31s
PHP_XLSXWriter: 	22MB 	14.94s

 

Requirements:

  • Document the dependency on PHP_XLSXWriter.
  • Implement hook_requirements()
  • Warn the user if the library is not available when on the export page.
  • Load product display nodes and their referenced products in batches.
  • Stop caching loaded node and commerce_product entities.
  • Append rows to the document as we go so we don't need all the spreadsheet data in memory.

We could decide to leave the current export functionality in so it's still possible to export with PHPExcel but it's unlikely anyone would want to do that unless they have a particularly small set of products.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tbradbury created an issue. See original summary.

travis-bradbury’s picture

Issue summary: View changes
travis-bradbury’s picture

Status: Active » Needs review
FileSize
17.89 KB

This patch utilizes PHP_XLSXWriter and also clears the node and commerce_product entity caches periodically. It doesn't really change the way product data is gathered but does use the batch API and saves the export to a file instead of delivering it directly.

You'll need https://github.com/mk-j/PHP_XLSXWriter

smccabe’s picture

Issue summary: View changes
Status: Needs review » Needs work
FileSize
72.03 KB

The requirements hook doesn't seem to work, I get a blank line and a bunch of errors about title

smccabe’s picture

Status: Needs work » Needs review
FileSize
18.03 KB

Ah, looks like you're adding some extra nesting to the array that doesn't need to be there.

I redid the requirements hook based on the cleaner example from PHPExcel, you also don't need to check if the library function exists, as we should just have that as a module requirement. It is by proxy currently since phpexcel requires it.

travis-bradbury’s picture

Ah, the hook_requirements() implementation looks a lot nicer now.

I added a note to the README about the new library requirement.

smccabe’s picture

Status: Needs review » Needs work

Shouldn't we be using this functionality for the import and validation results as well as the template generation? Might as well use it for all writing?

What about importing, do we have a memory issue with that even though we do the batch setup or is that ok?

Also as a patch note, the new export is missing the header coloring the old setup had.

travis-bradbury’s picture

Importing seems to be better than the export for resource usage but we can probably make improvements by switching to another spreadsheet reader if we want to.

https://github.com/nuovo/spreadsheet-reader/ is pretty light and can read from xlsx files without keeping the whole thing in memory. I believe that PHPExcel still keeps everything in memory when using xlsx but I don't know that beyond doubt.

If we went that route (and probably even if we don't go that route) we should switch everything we have over to using xlsx since xls is a hog regardless of the library we use.

PHP_XLSXWriter doesn't support colouring so if we switch the rest of the spreadsheet creation over to it we should have a different way to show errors to the user. This would be beneficial anyway since the messages saved for errors ("SKU already in use" or whatever else) are lost; the colours don't convey why a cell has an invalid value.

smccabe’s picture

Hmm... colors are sort of key to the way the whole thing works though, thats how we show what things errored in a user friendly manner and we intended to add the specific error messages in as comments.

smccabe’s picture

Status: Needs work » Needs review
FileSize
23.27 KB

Added an export test and fixed a couple bugs. Also I've added color support to PHP_XLSXWriter in a pull request, nothing is setup in this module yet.

https://github.com/mk-j/PHP_XLSXWriter/pull/69

smccabe’s picture

FileSize
32.38 KB
travis-bradbury’s picture

Re-rolled for the header aliasing feature.

travis-bradbury’s picture

FileSize
26.77 KB

People should really try applying their patches before uploading them.

  • smccabe committed 9f01376 on 7.x-1.x authored by tbradbury
    Issue #2785625 by tbradbury, smccabe: Reduce memory usage of export
    
smccabe’s picture

Status: Needs review » Fixed

Added child issue for using this for other outputs #2790325: Convert Results and Template to XLSXWriter

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.