Currently the export functionality of the module can load a lot of data all at once. This is partly due to PHPExcel, which uses a huge amount of memory for a large spreadsheet (Something like 1.17KB per cell).
Using PHP_XLSXWriter for the export would make it possible to export thousands of products with little memory usage.
https://github.com/mk-j/PHP_XLSXWriter
The export will end up using an xlsx
file instead of xls
but that's fine since #2785127: Allow xlsx files.
Performance comparison between PHPExcel and PHP_XLSXWriter
http://www.zedwood.com/article/php-excel-writer-performance-comparison
Case 1: 10 columns, 10,000 rows in 4 sheets (400,000 cells) PHP_Excel: 458MB 101.55s PHP_XLSXWriter: 11MB 7.56s Case 2: 10 columns, 10,000 rows in 6 sheets (600,000 cells) PHP_Excel: 685MB 202.72s PHP_XLSXWriter: 15MB 11.31s Case 3: 10 columns, 20,000 rows in 4 sheets (800,000 cells) PHP_Excel: 912MB 301.31s PHP_XLSXWriter: 22MB 14.94s
Requirements:
- Document the dependency on PHP_XLSXWriter.
- Implement
hook_requirements()
- Warn the user if the library is not available when on the export page.
- Load product display nodes and their referenced products in batches.
- Stop caching loaded node and commerce_product entities.
- Append rows to the document as we go so we don't need all the spreadsheet data in memory.
We could decide to leave the current export functionality in so it's still possible to export with PHPExcel but it's unlikely anyone would want to do that unless they have a particularly small set of products.
Comment | File | Size | Author |
---|---|---|---|
#13 | reduce_memory_usage_of-2785625-13.patch | 26.77 KB | travis-bradbury |
Comments
Comment #2
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedComment #3
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedThis patch utilizes PHP_XLSXWriter and also clears the node and commerce_product entity caches periodically. It doesn't really change the way product data is gathered but does use the batch API and saves the export to a file instead of delivering it directly.
You'll need https://github.com/mk-j/PHP_XLSXWriter
Comment #4
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedThe requirements hook doesn't seem to work, I get a blank line and a bunch of errors about title
Comment #5
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedAh, looks like you're adding some extra nesting to the array that doesn't need to be there.
I redid the requirements hook based on the cleaner example from PHPExcel, you also don't need to check if the library function exists, as we should just have that as a module requirement. It is by proxy currently since phpexcel requires it.
Comment #6
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedAh, the
hook_requirements()
implementation looks a lot nicer now.I added a note to the README about the new library requirement.
Comment #7
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedShouldn't we be using this functionality for the import and validation results as well as the template generation? Might as well use it for all writing?
What about importing, do we have a memory issue with that even though we do the batch setup or is that ok?
Also as a patch note, the new export is missing the header coloring the old setup had.
Comment #8
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedImporting seems to be better than the export for resource usage but we can probably make improvements by switching to another spreadsheet reader if we want to.
https://github.com/nuovo/spreadsheet-reader/ is pretty light and can read from xlsx files without keeping the whole thing in memory. I believe that PHPExcel still keeps everything in memory when using xlsx but I don't know that beyond doubt.
If we went that route (and probably even if we don't go that route) we should switch everything we have over to using xlsx since xls is a hog regardless of the library we use.
PHP_XLSXWriter doesn't support colouring so if we switch the rest of the spreadsheet creation over to it we should have a different way to show errors to the user. This would be beneficial anyway since the messages saved for errors ("SKU already in use" or whatever else) are lost; the colours don't convey why a cell has an invalid value.
Comment #9
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedHmm... colors are sort of key to the way the whole thing works though, thats how we show what things errored in a user friendly manner and we intended to add the specific error messages in as comments.
Comment #10
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedAdded an export test and fixed a couple bugs. Also I've added color support to PHP_XLSXWriter in a pull request, nothing is setup in this module yet.
https://github.com/mk-j/PHP_XLSXWriter/pull/69
Comment #11
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedComment #12
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedRe-rolled for the header aliasing feature.
Comment #13
travis-bradbury CreditAttribution: travis-bradbury at Acro Commerce commentedPeople should really try applying their patches before uploading them.
Comment #15
smccabe CreditAttribution: smccabe as a volunteer and at Acro Commerce commentedAdded child issue for using this for other outputs #2790325: Convert Results and Template to XLSXWriter