Problem/Motivation

When exporting spreadsheet data (XLS/XLSX) with Strip HTML enabled, cell text is truncated if the content contains a plain-text < or > character (for example, "low pressure, <1 MPa").

In Xls::formatValue() (src/Encoder/Xls.php) the operations run in this order:

  1. Html::decodeEntities($value): e.g. &lt;1 MPa becomes <1 MPa
  2. strip_tags($value): PHP treats the decoded <1 MPa... as the start of an HTML tag and removes or truncates the remainder of the string
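
The truncation can be reproduced outside Drupal with plain PHP. This is only a sketch: it assumes that Html::decodeEntities() behaves like html_entity_decode() with ENT_QUOTES | ENT_HTML5 (an assumption about Drupal core, not something stated in this issue), and the sample string is illustrative.

```php
<?php

// Sample field value: real markup plus an entity-encoded "<" in the text.
$value = '<p>low pressure, &lt;1 MPa</p>';

// Step 1: decode entities first.
// html_entity_decode() is an assumed stand-in for Html::decodeEntities().
$decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
// $decoded now contains a raw "<" directly followed by "1".

// Step 2: strip_tags() reads "<1 MPa..." as the start of a tag and drops it.
$buggy = strip_tags($decoded);

var_dump(strpos($buggy, 'MPa') !== false); // false: the text after "<" is lost
```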

This also affects Views Data Export when the display format is XLS/XLSX and strip-tags behaviour is active (including the encoder default when xls_settings are not passed from the data export style plugin).

Steps to reproduce

  1. Drupal 10.x, xls_serialization 2.1.0, views_data_export.
  2. Create content with a formatted text field containing HTML markup and text such as "Buffer hydrogen gas holder: low pressure, <1 MPa".
  3. Create a View with a Data export display, format XLS or XLSX, with strip-tags behaviour enabled (XLS settings and/or encoder defaults).
  4. Export and open the spreadsheet.

Expected: HTML tags removed; plain < and > preserved in the cell.

Actual: Text after < is removed or truncated (e.g. the export ends at "low pressure,").

Proposed resolution

Swap the order in formatValue() so tags are stripped before entities are decoded:

if ($this->stripTags) {
  $value = strip_tags($value);
  $value = Html::decodeEntities($value);
}

Real HTML tags are removed first; encoded plain-text symbols are decoded afterwards and remain in the export.
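
As a minimal standalone sketch of the corrected order, the hypothetical helper below (stripThenDecode() is not part of the module) uses html_entity_decode() as an assumed stand-in for Html::decodeEntities() and runs the example from this issue through it.

```php
<?php

/**
 * Hypothetical helper mirroring the corrected order: strip tags, then decode.
 * html_entity_decode() stands in for Drupal's Html::decodeEntities() (assumption).
 */
function stripThenDecode(string $value): string {
  // Real tags are removed while "<" and ">" in the text are still encoded.
  $value = strip_tags($value);
  // Decoding afterwards turns &lt; / &gt; into literal characters that remain.
  return html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

echo stripThenDecode('<p>low pressure, &lt;1 MPa</p>'), PHP_EOL;
// prints: low pressure, <1 MPa
```

One consequence of this order is that markup stored in entity-encoded form (for example &lt;b&gt;) is decoded to literal text and is not stripped, which appears to match the expected result of keeping plain < and > in the cell.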

Comments

gtriant created an issue.

mably made their first commit to this issue’s fork.

mably commented:

Status: Active » Needs review
Issue tags: -strip_tags

Thanks @gtriant for your patch.

Can you confirm that this issue's MR fixes your problem?

gtriant commented:

Yes, it does

  • mably committed 7fdf051e on 2.1.x
    fix: #3591045 Xls encoder truncates text with "<" / ">" when Strip HTML...

mably commented:

Version: 2.1.0 » 2.1.x-dev
Status: Needs review » Fixed
