Overview

This bug was introduced in #3467959: SDC and code component props should be able to receive HTML, editable in formatted text fields+widgets.

I have a basic text component. The schema invokes CKEditor:

props:
  type: object
  properties:
    text:
      title: Text
      type: string
      contentMediaType: text/html
      x-formatting-context: block

However when I output this as {{ text }}, I get escaped HTML. I can fix this by doing {{ text|raw }}, but this isn't in the examples, and obviously has security implications. I'd also argue, this negatively affects the developer experience.

We had discussion in Slack at https://drupal.slack.com/archives/C072JMEPUS1/p1759666595931849. One note is that

My concern is that this text component could be used in different page builders outside of Canvas. And the other page builders might not use CKEditor to filter.

This could result in a XSS vulnerability.

Proposed resolution

Output as `markup` so the text doesn't get double escaped.

Issue fork canvas-3550334

Command icon Show commands

Start within a Git clone of the project using the version control instructions.

Or, if you do not have SSH keys set up on git.drupalcode.org:

Comments

mherchel created an issue. See original summary.

effulgentsia’s picture

Version: 1.0.0-beta1 » 1.x-dev
Component: … to be triaged » Shape matching
Assigned: Unassigned » larowlan
Issue tags: +RC blocker

We definitely don't want people to use the |raw filter in Twig to work around this, so tagging as an RC blocker. @larowlan and @Wim Leers have some prior art from a different issue for how to fix this, so assigning to @larowlan because his workday starts sooner. If he can't get to it, we'll make sure that someone else does early this week.

larowlan made their first commit to this issue’s fork.

larowlan’s picture

Title: Text formated with CKEditor within Canvas gets double escaped when output » Text formatted with CKEditor within Canvas gets double escaped when output
Assigned: larowlan » Unassigned
Status: Active » Needs review

Moved the private MR to this issue

larowlan’s picture

Credited @shitalb who also asked about this when the issue was private and was added there.

effulgentsia’s picture

Status: Needs review » Needs work

Thanks! Looks great so far, but needs an upgrade path (post_update function?) for existing Component config entities (anywhere else?) unless those are automatically regenerated with this new expression on cache clear.

larowlan’s picture

Thanks! Looks great so far, but needs an upgrade path (post_update function?) for existing Component config entities (anywhere else?) unless those are automatically regenerated with this new expression on cache clear.

They will autogenerate but I _think_ that will create a new component version, so any content pointing at the old version will likely get double escaped. We might want to do an update hook to update the version of any content pointing to that version.

edwardsay’s picture

The patch from the MR doesn't resolve the issue. Values are still displayed escaped if no |raw is used.

wim leers’s picture

Priority: Normal » Critical
Issue tags: +Needs update path, +Needs update path tests

We might want to do an update hook to update the version of any content pointing to that version.

+1


@edwardsay: It will work only for new component instances. To confirm this: 1) apply patch, 2) drush cr, 3) create component instance, 4) observe that the new component instance works as expected, and pre-existing component instances continue to NOT work as expected.

edwardsay’s picture

Yeah, I tried it. But it doesn't work even for new components. I'm gonna continue debugging this.
screenshot

edwardsay’s picture

Okay, looks like I'm onto something.

web/core/modules/text/src/TextProcessed.php line 59

$item->format is empty for SDC components. So, Plain Text is used.

I'll continue later, don't have time now.

wim leers’s picture

$item->format is empty for SDC components. So, Plain Text is used.

Huh! That shouldn't be the case! It should be using the canvas_html_block text format:

!isset($schema['x-formatting-context']) || $schema['x-formatting-context'] === 'block' => new StorablePropShape(shape: $shape, fieldTypeProp: new FieldTypePropExpression('text_long', 'value'), fieldWidget: 'text_textarea', fieldInstanceSettings: ['allowed_formats' => ['canvas_html_block']]),

\Drupal\canvas\JsonSchemaInterpreter\JsonSchemaType::computeStorablePropShape()

wim leers’s picture

StatusFileSize
new663.12 KB

Debugged. Tests are failing for roughly the same reason as #12.

Guess why? Yet another broken ::generateSampleValue() in core 😅

  public static function generateSampleValue(FieldDefinitionInterface $field_definition) {
    $random = new Random();
    $settings = $field_definition->getSettings();
…
    $values = [
      'value' => $value,
      'summary' => $value,
      'format' => filter_fallback_format(),
    ];
    return $values;
  }

\Drupal\text\Plugin\Field\FieldType\TextItemBase::generateSampleValue()

👆format is always set to the fallback format, instead of respecting the allowed_formats setting 🙄

thoward216 made their first commit to this issue’s fork.

penyaskito made their first commit to this issue’s fork.

wim leers’s picture

The screenshots in #11 match the failure in prop-types.cy.js. IOW: they predicted test failure.

If we look at Canvas' test-only all-props SDC, we see the same:

I bet this is related to Drupal’s “safe markup” concept — SDCs expect pure strings (to conform to JSON schema), but that means Drupal re-escapes it. Dug in and … at least the SDC subsystem is so far not getting in the way: despite receiving FilteredMarkup objects, it does not fail SDC's ComponentValidator:

And it even makes it all the way to the final per-SDC render array:

Which tells me the problem is deeper. 😅

Zero issues against the SDC subsystem for FilteredMarkup, too: https://www.drupal.org/project/issues/drupal?text=filteredmarkup&status=...

wim leers’s picture

FilteredMarkup also survives (i.e. is not downcast to just string which was my original hypothesis):

  • SDC's \Drupal\Core\Render\Element\ComponentElement::preRenderComponent()
  • SDC's \Drupal\Core\Render\Element\ComponentElement::generateComponentTemplate()

… which means my hypothesis was wrong.

The root cause: \Drupal\text\Plugin\Field\FieldType\TextItemBase::applyDefaultValue() in core is wildly broken:

 public function applyDefaultValue($notify = TRUE) {
    // @todo Add in the filter default format here.
    $this->setValue(['format' => NULL], $notify);
    return $this;
  }

That @todo dates back to at least 2013, probably long before (!!!). It should've been solved back when #784672: Allow text field to enforce a specific text format landed.

wim leers’s picture

Status: Needs work » Reviewed & tested by the community
Issue tags: +Needs upstream bugfix
StatusFileSize
new890.9 KB
new92.84 KB

Fix pushed. Works fine locally now.

Tested with SDC:

@jessebaker tested with code components:


We discussed this internally yesterday; and the conclusion between @effulgentsia and I was:

  1. Merge the actual fix FIRST
  2. Then work on the update path

So: will do exactly that.

wim leers’s picture

Title: Text formatted with CKEditor within Canvas gets double escaped when output » [upstream] Text formatted with CKEditor within Canvas gets double escaped when output
Issue tags: +Needs followup

Needs 2 upstream core issues fixed:

  1. default format value based on field instance's allowed_formats setting: CoreBugFixTextItemBaseDefaultValueTrait
  2. respecting that same setting in ::generateSampleValue(): CoreBugFixTextItemBaseGenerateSampleValueTrait
wim leers’s picture

Assigned: Unassigned » penyaskito

I'd like @penyaskito to do a final review pass 😇

wim leers’s picture

Funny, sad and impressive all at once: one critical security bug in Canvas, the fix in Canvas is trivial (the update path won't be as trivial), but … it revealed TWO core bugs! 🤯

(Those required orders of magnitude more work, ironically!)

I should've spotted this bug in #3467959: SDC and code component props should be able to receive HTML, editable in formatted text fields+widgets 😬🫣

penyaskito’s picture

Assigned: penyaskito » wim leers

  • wim leers committed 32ccb02a on 1.x authored by larowlan
    [#3550334] fix(Shape matching): Text formatted with CKEditor within...
wim leers’s picture

Title: [upstream] Text formatted with CKEditor within Canvas gets double escaped when output » [update path] [upstream] Text formatted with CKEditor within Canvas gets double escaped when output
Assigned: wim leers » Unassigned
Status: Reviewed & tested by the community » Needs work

Merged! Back to Needs work for update path.

And probably anything with an update path also needs a change record 😇

wim leers’s picture

Assigned: Unassigned » wim leers

Multi-tasking and merging an MR == bad idea. I broke HEAD 🫣

wim leers’s picture

Assigned: wim leers » Unassigned

HEAD is green again ✅

effulgentsia’s picture

Issue tags: -RC blocker +stable blocker

Yay that the fix itself got in for beta2! The update path doesn't need to block RC. In the meantime, people can manually edit and re-save their existing content that uses HTML props that are getting double escaped. Though since we're already past beta, it would be courteous for us to provide the update path relatively soon and certainly before a stable 1.0 release.

wim leers’s picture

In the meantime, people can manually edit and re-save their existing content that uses HTML props that are getting double escaped.

No, they can't.

They'd need to recreate the component instances.

Only once #3463996: [META] [PP-1] When the field type, storage/instance settings, widget, expression or requiredness for an SDC/code component prop changes, the Content Creator must be able to upgrade is done, what you wrote will be true. So … to be on the safe side, restoring the tag.

effulgentsia’s picture

Issue tags: -RC blocker +rc target

I discussed this with @wim leers and @lauriii. Ideally, since beta1 marked the point where we started promising to not break existing site data, we'd get this update path in before RC1. However, this will be the first Canvas update path that requires updating content entities, so it might take some time to write it and make sure it works properly, and DrupalCon is starting next week. Meanwhile, the lack of this update path doesn't result in a broken site per se, just some component instances (ones with HTML props) continuing to be double-escaped and needing to be replaced with new instances of the same component, copy the values from the old instance to the new one, and delete the old one, in order to fix the double escaping. Definitely annoying, but beta1 hasn't been out that long, so not that many sites affected by it, and for the sites that are, it should only require a few minutes of manual effort. Therefore, downgrading to an RC target, meaning a high priority thing to work on after we tag RC if it doesn't make it in before.

wim leers’s picture

Outline of how this update path can/should work:

  1. Update all Component config entities which have >=1 version with >=1 prop whose expression is either ℹ︎text_long␟processed or ℹ︎text␟processed. Why these 2 expressions? They're the ones specified/hardcoded in \Drupal\canvas\JsonSchemaInterpreter\JsonSchemaType::computeStorablePropShape(), which is what https://git.drupalcode.org/project/canvas/-/merge_requests/194 fixed.

    All affected Component versions must be updated. @penyaskito and I just did something like that in #3532514: Gracefully handle components in active development: ensure great DX!

  2. Crucially, while doing that, track: A) the IDs of the Component config entities that were updated, B) the list of updated Component versions, C) the new active (latest) version of each of those Components.
  3. Then, use that information to update:
    Content entities
    Use that tracked information to call \Drupal\canvas\Audit\ComponentAudit::getContentRevisionIdsUsingComponentIds(array $component_ids, array $version_ids) → this tells us which content revisions need updating.

    However, things are simpler: because we know that no stored data needs to change, only Component versions, AND we know that only the Page content entity type can have a component tree field today, we can instead do something like:

    UPDATE canvas_page__components WHERE components_component_version = old_updated_version_hash SET components_component_version = new_active_version
    


    Config entities
    Use that tracked information to call \Drupal\canvas\Audit\ComponentAudit::getConfigEntityDependenciesUsingComponent(), which unfortunately does not yet support filtering by Component version.
  4. Test coverage

You'll see that many of these things were actually built for \Drupal\Tests\canvas\Kernel\Plugin\Canvas\ComponentSource\ComponentInputsEvolutionTest, which was part of #3523841: Versioned Component config entities (SDC, JS: prop_field_definitions, block: default_setting, all: slots for fallback) + component instances refer to versions ⇒ less data to store per XB field row.

wim leers’s picture

@penyaskito pointed out a weakness in my proposal: for content entities, my proposal assumes that all inputs are collapsed. Which is what #3538487: Don't allow passing uncollapsed inputs if using default expression fixed and provided an update path for.

But then again, the XB → Canvas rename (~2 months after that) means that anybody out there realistically has been running with that all along.

But, it'd still be safer to bring back that update path. Even if untested. (Because we can't really test it anymore, since our reference is now not XB 0.7.x, but Canvas alpha1… unless we fake it of course.)

larowlan changed the visibility of the branch 3550334-text-formated-with to hidden.

larowlan changed the visibility of the branch 3550334-wim-shall-not-merge-while-multitasking to hidden.

penyaskito’s picture

Assigned: Unassigned » penyaskito

penyaskito changed the visibility of the branch 1.x to hidden.

wim leers’s picture

The update path is present and has test coverage. Nice work, @penyaskito!

We dug into the depths of this MR and all it relates to. 99% of what we discussed is captured as MR comments.

High-level conclusions:

  1. Tracking which old Component version became which new version, and then updating all existing Component instances that used the old version to the new version is … nice, but A) unnecessary (because the update path updates the old versions' expression!), B) overkill/out-of-scope, because that's literally what #3463996: [META] [PP-1] When the field type, storage/instance settings, widget, expression or requiredness for an SDC/code component prop changes, the Content Creator must be able to upgrade is about. Let's fix it holistically there, rather than paritally here.
  2. This issue was only ever about StaticPropSources — DynamicPropSources would've been correct already. Unless somebody manually constructed the expressions in DynamicPropSources in recipes/content's DB entries/exported config. But if that happened, you're on your own already.

1+2 together means much of the update logic + test coverage becomes obsolete: the update path only needs to handle Component config entities, and doesn't need to track versions 👍

penyaskito’s picture

This is ready to go.

The only blocker is properly clarify why some updates run multiple times, and why the number of component versions is different between the sdc and the js examples.

nagwani’s picture

Assigned: penyaskito » wim leers
Status: Needs work » Needs review
larowlan’s picture

larowlan’s picture

I postponed #3556327: Don't save `Component` config entities unnecessarily on this because the new tests here will give us coverage for it.

wim leers’s picture

Assigned: wim leers » Unassigned
Status: Needs review » Reviewed & tested by the community
  1. ✅ One crucial thing was missing: validating the updated Components
  2. ✅ I got the tests passing — thanks to @penyaskito for your kind, patient explanations! ❤️
  3. ✅ CR for the prior update path (referenced by the deprecation error requested by @larowlan): https://www.drupal.org/node/3556444 ←I finished this and published it.
  4. 🟡 CR for this issue's update path (referenced by the deprecation error requested by @larowlan): https://www.drupal.org/node/3556442 ←needs to be finished & published 🙏
  5. 🔴 Needs 2 core follow-ups — see #20.

This is ready! 🚀 Thanks for the epic work here, @penyaskito!

wim leers’s picture

Assigned: Unassigned » penyaskito
Status: Reviewed & tested by the community » Patch (to be ported)

Assigning to @penyaskito for the last 3.

I chimed in with proposed next steps at #3556327-6: Don't save `Component` config entities unnecessarily.

penyaskito’s picture

Now that this issue is closed, review the contribution record.

As a contributor, attribute any organization that helped you, or if you volunteered your own time.

Maintainers, credit people who helped resolve this issue.

penyaskito’s picture

Assigned: penyaskito » Unassigned

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.