Hi

Im using webform with webform conditional and I use the "textarea" feature and I have some problems when I export the results to Excel.
First I have to open the XLS file from excel (not open by doubleclicking it) and use the textimport wizard with some minor settings. This is a minor problem.
The big issue here is when someone uses return in a textarea it automatically changes the row in excel making it extremely hard to read. Is there a way of making webform ignore the "return" entries.

Kind regards Marcus

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

vernond’s picture

Hi Marcus, that return thing can be quite an irritation...

The way that webform currently runs the export you are either going to have write code to remove the returns from the input yourself, or use a different exporting methodology.

I found a way around it that I wrote a little bit about here #884922: Export data in BIF. See also, in particular, the last paragraph in Quicksketches first post on that issue.

Marcus 78’s picture

Thanks for the reply vernond.

Im afraid that it might be just above what I can do in coding right now. Gonna try and post here if i succeed.
This must be a common problem.
I would be extremely happy if someone had a code snippet for this?

vernond’s picture

I'm starting work on a 6.3 version of the previous lot in a day or few - I'll give you shout here when I have something useful (probably only on 02 May or thereabouts).

quicksketch’s picture

This issue would also help you solve it much easier if it were implemented: #298784: Run CSV downloads through theme functions.

vernond’s picture

@quicksketch: Mmmmm... Not a bad set of ideas at all... I'll give it a go and see what comes out at the end of it.

vernond’s picture

@Marcus 78: Okay, so my diary was cleared and I had some time to knock a thing together. Coding standards may be a little dodgy as this is my first bona fide Drupal module (up to now I've been silly enough to just hack existing modules, but Quicksketch made me see the light :-D ). Be sure to go through the readme file before you do anything else.

Marcus 78’s picture

Thanks a lot vernond for this module and for proving that open source rocks!
Gonna try it tomorrow morning.
Hope this might help more peolpe aswell.

vernond’s picture

@Marcus 78: So does the module do what you hoped it would?

quicksketch’s picture

Title: Excel export problem » Ignore new-lines in textareas during Excel export (Excel-native exporter)
Category: support » feature

Wow @vernond this module is cool! You know I've always wanted to see another module implement a Webform export type, this is the first I've seen and I'm glad to see that it works! I tried to write this exact handler (using the BIFF format) several times but ended up just using a Tab-separated text file for Excel imports in the end. It looks like you have a webform_biff_exporter class for no reason though, seems like you could just drop the entire class.

I don't quite understand what the original problem described was though. New lines should be read in just fine and stay within the original cell.

vernond’s picture

Thanks quicksketch

1) webform_biff_exporter class : I strongly suspect that you're quite right, it could probably be dropped. If I recall correctly, I had the webform_exporter_biff class extending your original webform_exporter, but this was failing with "class not found". Being naive/ignorant around OOP and all, I just renamed webform_exporter to webform_biff_exporter in my module to prevent the error message for the time-being until I discovered exactly what the error was;

2) Newlines in tsv/csv have caused problems for us as well, from various providers. I'm not certain exactly why it is a problem either, but clearly I'm not the only that has had issues. Using this download mechanism pretty much guarantees that things come out the way you expect them to;

3) As you can tell from point 1 above and the comments in the code, it is not yet as polished as I would like, but I've not had sufficient time to get it done properly (Work - the curse of the hobbyist class!);

It has it's limitations (i.e. 255 string length limit, 65535 rows limit), but has served us well enough over the last year. I'll be giving it a bit more love as time allows, specifically around getting just the 80-or-so lines needed out of the PHPExcel library to offer MS Excel 2003 (pardon the language) support as well, which would relieve us of the string length and max rows limits (definitely NOT considering adding anything along the lines of formulas, text formatting etc!).

[Edit] re point 2: If I remember correctly the issue is most prevalent when the cell containing newlines is the last cell in a row, the interpreters seem to get their knickers in a tangle.

vernond’s picture

FileSize
61.39 KB

I noticed some data type casting issues and cleaned up code a bit, so here is an improved version

Marcus 78’s picture

Sorry for the late reply. Got caught up in the it works "high". Thanks again!

vernond’s picture

:-D Gotta love a happy customer! Be sure to use the second version in post 11 to avoid "funnies" cropping up in the future.

quicksketch’s picture

Version: 6.x-3.9 » 7.x-4.x-dev

I'm writing a Google Docs exporter and have run into an issue where Google doesn't import TSV files correctly (see http://stackoverflow.com/questions/17643567/upload-a-tsv-file-into-googl...). If we had an Excel-native exporter maybe that'd be able to create a file Google could read directly without problems. I wonder what it would take to overcome the 255 character and 65535 rows limits.

quicksketch’s picture

I wonder what it would take to overcome the 255 character and 65535 rows limits.

The 255 character limit can be overcome with a "continue" command in the BIFF format. We could probably adapt this easily from PHPExcel. Since they're also GPL, I don't think we'd have a license problem.

http://phpexcel.codeplex.com/SourceControl/latest#trunk/Classes/PHPExcel...

quicksketch’s picture

Hm, actually that approach only works if you build a table of shared strings and then reference that separately. Bummer.

Another approach we haven't tried here is using the Excel 2007+ format, which is essentially XML. That could be a solution that is easy to implement and wouldn't have any of the existing limitations.

quicksketch’s picture

Status: Active » Needs review
FileSize
7.24 KB

I split out the different exporter classes in #2057719: Move exporter classes to separate files. After that patch, we can add this new Excel-native exporter. This uses the Excel 2007 format, which is XML files in a directory and then zipped up. This XLSX format supports very-large data, handles new lines properly, and is UTF-8 compatible. From Wikipedia:

the number of rows was now 1,048,576 (220) and columns was 16,384 (214; the far-right column is XFD).

I would think that's going to be enough for just about anyone. :D

quicksketch’s picture

Some problems with the current patch:

- Dates are stored as strings.
- Note all systems has ZipArchive available, which is required for this exporter.

I'm also not sure if we should keep the "fake" Excel exporter or not. Maybe use it only if ZipArchive is not available? Or just throw a warning in hook_requirements() and only give people delimited text if ZipArchive is not available?

quicksketch’s picture

This patch removes the display of the old Excel exporter (with a hidden variable to enable it if desired). If you don't have ZipArchive on your system, the normal Excel exporter is not available either. This patch also fixes the storage of dates, so dates/times/datetimes are all displayed in the "short" format, as defined by Excel. Apparently this is locale-specific on the Excel-side, which means the "short" format will adjust based on the computer the Excel file is opened on (which is nice).

While doing this round, I found that editing our embedded XML files was a pain, so in this version we're using the <<<EOL syntax for providing the default text of these XML files.

quicksketch’s picture

Revised patch after testing with Excel 2010 on Windows. Apparently there are a few caveats we need to watch out for with the Windows version of Excel. It's more picky about the font information and [Content_Types].xml file.

I also added hook_requirements() that presents a warning if ZipArchive is not available. For the time being, I think removing the legacy exporter is a good idea in the name of simplicity. It can be re-enabled, but considering it's just a TSV file with the extension renamed I don't think this is going to be a terrible loss. I think this patch is good to go.

quicksketch’s picture

Title: Ignore new-lines in textareas during Excel export (Excel-native exporter) » Excel-native exporter, fixing new line and UTF-8 importing problems
Status: Needs review » Fixed

Gave it one more review. Committed to 7.x-4.x. Yay.

These new export files work great in Excel (Mac 2008+/Windows 2007+), Google Docs, LibreOffice, and OpenOffice. Considering these files have native UTF-8, support for over a million rows, and handle new lines properly, I can see them being the most widely utilized files for importing and exporting.

Other than the funky date handling, overall I'm actually really impressed with this file format. It's no wonder it's so widely supported, considering its considerably easier to implement than the old BIFF format. I'll note that Google Docs doesn't support exporting XLS files either, which makes me feel even better about this decision.

And as mentioned before, the old exporter is still available, it's just hidden from the UI. It can be enabled via a hidden variable in settings.php:

$conf['webform_excel_legacy_exporter'] = TRUE;

But since it's really the same thing as a TSV with a different extension, I don't think this is going to be widely needed.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.