Since I've applied the 8.3.1 upgrade to my site, every time I paste from word (using the paste from word in CKEditor) the text now retains font-size and font-family styles in containing spans.

If I save the content and then edit and save again then it eventually loses the font styling but keeps the spans.

This behaviour is new since 8.2.x.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

tobykilroy created an issue. See original summary.

Wim Leers’s picture

Title: Since 8.3.x update paste from word retains font » [upstream] Since 8.3.x update paste from word retains font
Status: Active » Postponed (maintainer needs more info)
Issue tags: +Needs upstream bugfix

After you've pasted from word and you see that font-size + font-family: can you please click the "view source" button and paste that HTML here? Then we can see what's going on.

In any case, this is an upstream CKEditor bug. I'll ping them as soon as you've provided that information.

danquah’s picture

CKEditor 4.6.0 just flipped the default for whether to strip word font styling from true to false
http://docs.ckeditor.com/#!/api/CKEDITOR.config-cfg-pasteFromWordRemoveF...

So, if core does not explicitly set the configuration, that might be the explanation.

Does core do anything to setup the advanced content filter the ckeditor documentation mentions?

danquah’s picture

Issue summary: View changes
Status: Postponed (maintainer needs more info) » Active
FileSize
37.95 KB
40.01 KB
28.5 KB

With regards to the test data I just reproduced the issue on 8.3.2

1: Created a google docs document, formatted a single red line of text with a large font size and times new roman as family
2: Pasted the string into a "paste from word" popup from ckeditor

Result: The text was pasted into ckeditor with all font formatting preserved
Expected result: The text was pasted without formatting

"View source"

<p><span style="font-size:36pt"><span style="font-family:&quot;Times New Roman&quot;"><span style="color:#ff0000"><span style="vertical-align:baseline"><span style="white-space:pre-wrap">A times new roman test string</span></span></span></span></span></p>

And just to verify the theory mentioned in #3 if I alter the js-config for ckeditor like this

function my_module_admin_editor_js_settings_alter(array &$settings) {
  if (!empty($settings['editor']['formats']) && is_array($settings['editor']['formats'])) {
    foreach ($settings['editor']['formats'] as $text_format_name => &$format_settings) {
      if ($format_settings['editor'] === 'ckeditor') {
        // CKeditor 4.6 deprecated pasteFromWordRemoveFontStyles and switched
        // its default from true to false. As a fix until core comes up with a
        // better way to solve this (see https://www.drupal.org/node/2871539 ).
        $format_settings['editorSettings']['pasteFromWordRemoveFontStyles'] = true;
      }
    }
  }
}

The text gets pasted correctly ("view source" again):

<p><span style="vertical-align:baseline"><span style="white-space:pre-wrap">A times new roman test string</span></span></p>

With regards to what to do - we could have core set pasteFromWordRemoveFontStyles to true, but as the configuration is deprecated it would only work until the variable gets removed completely. The recommended solution is to use CKEditors "Advanced Content Filter" to strip all font styling. Should we go with the latter approach I guess core would ultimately be choosing what "stripping word styling" actually means.

Can't really tell whether this is problem that should be resolved upstream or in core?

Screenshots for good measure:
paste from word popup
text pasted into ckeditor
resulting source

Wim Leers’s picture

Version: 8.3.1 » 8.4.x-dev

@danquah Thanks for providing the detailed information! Sorry for having missed this at first :( Pinged the CKEditor team!

Comandeer’s picture

The easiest way would be to listen to afterPasteFromWord event and filter evt.data.dataValue value: https://jsfiddle.net/Comandeer/8bn1e82d/
It's the most flexible solution as it allows to choose exactly what styles or elements will be filtered out.

Wim Leers’s picture

Title: [upstream] Since 8.3.x update paste from word retains font » [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font due to change in defaults
Issue tags: -Needs upstream bugfix

Thanks, @comandeer! Go CKSource team! Manual testing confirmed this works great.

So we're effectively undoing a BC break on the CKEditor side by restoring prior CKEditor behavior on the Drupal 8 side. https://ckeditor.com/blog/CKEditor-4.6-released said this:

Note: Some Paste from Word configuration options were either dropped or changed their default values. Please refer to the release notes for more details.

… but that didn't indicate that there was a pretty significant BC break to me.

https://ckeditor.com/release/CKEditor-4.6.0 does actually document this clearly:

The solution in #6 only restored the prior behavior for pasteFromWordRemoveFontStyles, not for pasteFromWordNumberedHeadingToList or pasteFromWordNumberedHeadingToList. We probably want to restore that behavior too.

@Comandeer, surely Drupal is not the only one noticing these differences in defaults? Why not retain the old defaults?

Wim Leers’s picture

@Comandeer: unfortunately, the "Paste from Word" button is completely broken in Drupal 8.4 + Drupal 8.5 due to https://github.com/ckeditor/ckeditor-dev/issues/624. Which means I also can't actually fix this yet :(

Wim Leers’s picture

Issue tags: +Needs upstream bugfix
Wim Leers’s picture

Title: [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font due to change in defaults » [regression] [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font due to change in defaults
Priority: Normal » Minor

(The problem described in #8 only exists with CKEditor 4.7, not CKEditor 4.6. Therefore installing Drupal 8.3 allowed me to continue testing this. Below you can find the result.)

What @danquah did not mention, is that this problem only exists when using the Full HTML text format. Because the Basic HTML text format already has HTML restrictions (one of which is that no style attributes are allowed), which are automatically translated into settings for CKEditor's ACF (Advanced Content Filter), which means that the font is not actually pasted when pasting from Word, when the text format is Basic HTML.

Therefore this problem is solely a regression for text formats using CKEditor that do not have any HTML restrictions. Which out of the box means that the Full HTML text format indeed has the regression explained in #4.

So this is indeed a regression, but a very tightly scoped one. And it could be argued that this is actually more accurate.

@danquah: Why are you using Full HTML? Basic HTML is almost always a better choice.

Wim Leers’s picture

Title: [regression] [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font due to change in defaults » [regression] [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font when using "Full HTML" text format, due to change in CKEditor's defaults

More accurate title.

danquah’s picture

Correct, we saw the issue with a Full HTML "filter"

We usually go with that filter to enable our customers to disable the editor and now use the field for eg. embedding a Google Maps.

So I guess we mostly see the field as something that contain HTML, and that HTML can come from CKEditor, or from a superuser that needs to enter raw markup. The assumption is that our Editors (role) knows that they should stick to ckeditor most of the time and only disable it in very specific situations.

Are we using the ckeditor support wrong?

Wim Leers’s picture

We usually go with that filter to enable our customers to disable the editor

Full HTML is a text format, not a filter. A format applies a list of filters in a particular order. I know, it's a bit confusing!

Are we using the ckeditor support wrong?

Kind of, yes! Full HTML is inherently insecure. It should never be exposed to "regular content editors", only to technical people who very well know what they're doing. It's easy to paste a malicious <script> tag in a text field using the Full HTML format and have it attack all your users!

It's strongly recommended to either add additional HTML tags to the list of allowed tags by the Basic HTML text format, or to create a new text format (e.g. Extensive HTML) and then explicitly allow the tags necessary for Google Maps. I'd strongly recommend using https://www.drupal.org/project/ckeditor_media_embed.

danquah’s picture

Ah yes, it's hard to remember the difference without having the interface in front of me :)

I do agree that it is a risk, but we often let the customers make the choice as they are paying for the extra robustness. That being said I'll definitely take a look at ckeditor_media_embed, if it works as well as it looks we might just go with that in our default setup going forward, thanks for the tip, much appriciated!

The reason I ran in to this issue was that it is not clear that there is a dependency between the CKEditor (in the way it is configured for Drupal anyhow) and text filtering. As I understand it, just running Xss::filter() without any restrictions should be enough right? How about adding a CKEditor specific filter to core that ensures that any filtering that is required to make CKEditor work as expected is done? The filter could be enabled pr default if CKEditor is chosen, and could then subsequently be manually disabled in the situations where one actually needed the unfiltered input.

Wim Leers’s picture

Title: [regression] [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font when using "Full HTML" text format, due to change in CKEditor's defaults » [BC break] [upstream] Since 8.3.x, CKEditor's "Paste from Word" retains font when using "Full HTML" text format, due to change in CKEditor's defaults
Status: Active » Closed (works as designed)

we often let the customers make the choice as they are paying for the extra robustnes

But enabling Full HTML = extra brittleness, not robustness! Make one mistake, and the site is insecure or broken!

The reason I ran in to this issue was that it is not clear that there is a dependency between the CKEditor (in the way it is configured for Drupal anyhow) and text filtering.

There is no hard dependency, but they have to be configured in a way that the text format is compatible with CKEditor, yes. CKEditor can't function if the <p> and <br> tags aren't allowed, that is the bare minimum. And CKEditor automatically respects HTML restrictions imposed by the "Allowed HTML" filter.

As I understand it, just running Xss::filter() without any restrictions should be enough right?

In case of Full HTML, there is no XSS protection.

How about adding a CKEditor specific filter to core that ensures that any filtering that is required to make CKEditor work as expected is done?

No filter is required to make CKEditor work.


To summarize: the "bug" reported here is actually a BC break in CKEditor itself. I can't change anything about that.

The reason I'm suggesting to use Basic HTML, is because it's almost never sane to use Full HTML, because it literally means anything is allowed. ("Raw HTML" would be a better name.) If you were using the Basic HTML text format, then the restrictions imposed by that format would have ensured that the BC break you are seeing would have been mitigated.

I can't do anything else to help you, I'm afraid :( It's a BC break on the CKEditor side, which they consciously made, there's nothing else I can do besides recommending to not use Full HTML. Which I would have recommended anyway.

streever’s picture

So if someone has a client who is used to the functionality, we just tell them sorry, it's gone now?

Andras_Szilagyi’s picture

Hi,
we had the same issue in our drupal 7 instances with the new versions of ckeditor. We solved it with a drupal behaviour, see below:

/**
 * @file
 * AfterPasteFromWord.
 */

(function ($) {
  Drupal.behaviors.multisite_wysiwyg_paste_from_word = {
    attach: function () {
      $(function () {
        var activeFilter = [];

        for (var i in CKEDITOR.instances) {
          CKEDITOR.instances[i].on('paste', function (evt) {
            if (activeFilter[i] !== true) {
              return;
            }

            // Create a standalone filter.
            var rules = 'h2 h3 h4 h5 h6 em strong abrr cite blockquote code ul ol li dl dt table tr td br; a[href]; img[src, alt]; p[align]';
            var filter = new CKEDITOR.filter(rules),
            // Parse the HTML string to a pseudo-DOM structure.
            fragment = CKEDITOR.htmlParser.fragment.fromHtml(evt.data.dataValue),
            writer = new CKEDITOR.htmlParser.basicWriter();
            // Some needed transformations.
            filter.addTransformations([
              [
                {
                  element: 'b',
                  right: function (el, tools) {
                    tools.transform(el, 'strong');
                  }
                }
              ],
              [
                {
                  element: 'i',
                  right: function (el, tools) {
                    tools.transform(el, 'em');
                  }
                }
              ]
            ]);

            filter.applyTo(fragment);
            fragment.writeHtml(writer);
            evt.data.dataValue = writer.getHtml();
            activeFilter[i] = false;
          });

          // Enable filer when button is pressed.
          CKEDITOR.instances[i].on('beforeCommandExec', function (event) {
            if (event.data.name === 'pastefromword') {
              activeFilter[i] = true;
            }
          });

          // Remove browser incompatibility notice.
          CKEDITOR.instances[i].on('instanceReady', function (ev) {
            ev.editor.lang.clipboard.pasteNotification = "Press %1 to paste.";
          });
        }
      });
    }

  }
})(jQuery);

above was added to a file, we used hook_wysiwyg_editor_settings_alter to include this at the right moment into the page.

The way you use this is to click the "paste from word" button, and then use CTRL + V and the filter will apply.

liviokujur’s picture

For Drupal 9 I did the following:
Configuration > Content authoring > Text formats and editors
Full HTML > Configure
Enabled filters SELECT Limit allowed HTML tags and correct faulty HTML