From #1260052-93: Candidate WYSIWYG editors:

Following up on my comment at http://drupal.org/node/1260052#comment-6743142, I had a look at how these two editors compare in terms of pasting from Word. I didn't use the examples in #89 because both look to have been set up to strip pretty much everything, so it wouldn't be much of a comparison.

I used:

http://www.aloha-editor.org/demos/3col/ and
http://nightly-v4.ckeditor.com/3952/samples/inlinebycode.html

I used http://nicedit.com/demos.php as a control since I know it does no stripping, leaving in the weird classes, spans, styles, tags, measurements etc., just to check that Word wasn't giving me good markup in the first place ;-)

Both seemed to work pretty well, stripping out all the worst. A few things I did notice:

  • CKEditor didn't strip out the 'align' attribute.
  • CKEditor allowed various inline styles through, the most prevalent being margin-left with pt measurements for Word indentations
  • Aloha allowed but CKEditor converted to which seems better
  • Aloha performed slightly better at pasting lists from older documents, yet both regularly failed: they pasted a disc character instead and filled the space between the disc and the text with nbsps, and CKEditor applied margin styles as well
  • CKEditor occasionally allowed DIVs to be pasted in; coming from Word, they never made any sense
  • Aloha performed better with tables: CKEditor applied a width to every single cell and applied border, cell padding, cell spacing etc. to tables, which didn't seem desirable in this context. Aloha basically took tables over completely with its plugin, removing or replacing all such stuff.

I get the feeling that many of the above observations with CKEditor may just be down to configuration. In any case, in my testing they both clean up the major cruft very well, with Aloha performing slightly better at retaining formatting where it can and at cleaning up tables, which currently gives it a few extra points for me in terms of 'paste from Word'.
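
The leftovers listed above (align attributes, margin-left styles in pt, stray DIVs, runs of nbsps) can be counted mechanically, which makes it easier to compare editors or configurations. What follows is only a minimal sketch using Python's standard library; the class name, the function name and the finding keys are made up for illustration and are not part of either editor.

```python
import re
from html.parser import HTMLParser


class PasteCruftAudit(HTMLParser):
    """Counts a few kinds of Word paste leftovers in an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True means &nbsp; reaches handle_data as U+00A0.
        super().__init__(convert_charrefs=True)
        self.findings = {"align_attr": 0, "pt_margin_style": 0, "div": 0, "nbsp_run": 0}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if "align" in attr_map:
            self.findings["align_attr"] += 1
        style = attr_map.get("style") or ""
        # Word indentation usually shows up as margin-left with a pt value.
        if re.search(r"margin-left\s*:\s*[\d.]+pt", style):
            self.findings["pt_margin_style"] += 1
        if tag == "div":
            self.findings["div"] += 1

    def handle_data(self, data):
        # Two or more consecutive non-breaking spaces suggest a faked list indent.
        if re.search("\u00a0{2,}", data):
            self.findings["nbsp_run"] += 1


def audit(html):
    """Return a dict of counts for the pasted HTML fragment."""
    parser = PasteCruftAudit()
    parser.feed(html)
    parser.close()
    return parser.findings
```

Calling audit() on the markup each editor produces for the same Word document gives comparable counts per category.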

Upon switching from Aloha Editor to CKEditor, we identified 8 functional gaps (#1260052-148: Candidate WYSIWYG editors). One of them was improved pasting from Word, based on the above feedback.

The key way that was going to be addressed in CKEditor is via https://dev.ckeditor.com/ticket/9829. That's now part of CKEditor 4.1 (already part of Drupal 8 right now). They called it "ACF" (Advanced Content Filter): http://ckeditor.com/blog/CKEditor-4.1-Released.
However, to fully leverage that in their "PFW" (Paste From Word) feature, they still have to tackle this ticket: http://dev.ckeditor.com/ticket/9991. It will only be done after CKE 4.2, which still leaves plenty of time before the Drupal 8 release, but this issue exists to ensure it gets tackled before release.

Reference document to test?

Ideally, we would have a reference Word/Pages document, paste that, and have reference target (cleaned up) markup.
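
Such a reference comparison could be automated along these lines. This is only a sketch under assumptions: the function names are hypothetical, the pasted markup would have to be captured from the editor by some other means, and whitespace between tags is treated as insignificant.

```python
import difflib
import re


def normalize(markup):
    """Collapse whitespace so formatting-only differences are ignored."""
    markup = re.sub(r">\s+<", "><", markup.strip())
    return re.sub(r"\s+", " ", markup)


def diff_against_reference(pasted, reference):
    """Return unified diff lines; the list is empty when the markup matches."""
    # Put each tag boundary on its own line so the diff is readable.
    expected = normalize(reference).replace("><", ">\n<").splitlines()
    actual = normalize(pasted).replace("><", ">\n<").splitlines()
    return list(difflib.unified_diff(expected, actual, "reference", "pasted", lineterm=""))
```

An empty result means the editor produced exactly the reference markup; otherwise the diff lines show which cruft survived the paste.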

Current status

Over at #1936392: Configure CKEditor's "Advanced Content Filter" (ACF) to match Drupal's text filters settings, we're working on configuring ACF to closely match what Drupal's text format filters allow. So, to evaluate Paste From Word, you currently should apply the patch over there. Easily test it on simplytest.me: http://simplytest.me/project/drupal/357ac577dfd1817ad3a72dabee9cb01fe7aad577?patch[]=http://drupal.org/files/ckeditor_acf-1936392-6.patch

With Basic HTML, you'll note that a lot of cruft is stripped out: no class attributes, but there still are empty span tags, for example.

With Full HTML, you'll note that a lot of cruft is still allowed: class attributes, and so on. But this is because ACF also allows that for this text format.

The thing is that even when using Full HTML, we'd expect e.g. the class attributes to be stripped away, even though class attributes are allowed by the text format.

Comment  File          Size       Author
#10      CKEDITOR.gif  451.96 KB  sriharise

Comments

webchick:

Priority: Normal » Major
Status: Postponed » Active

Now that we're post-API freeze, this kind of thing seems like a good thing to start thinking about.

Wim Leers:

Status: Active » Postponed

We can only evaluate this once http://dev.ckeditor.com/ticket/9991 is done. So, for now, this should still be postponed.

Wim Leers:

Issue summary: View changes

Fix simplytest.me link

catch:

Issue tags: -revisit before beta

No idea why this would be tied to the release. Looks like that issue isn't fixed yet...

Wim Leers:

Issue tags: +revisit before release candidate

Agreed, shouldn't block beta, but should be verified & fixed before release.

catch:

Issue tags: -revisit before release candidate

Still not clear why this would hold up the release if it doesn't work, or, if it should hold up the release, why this issue isn't critical.

Wim Leers:

It looks like updating to CKE 4.5.3 (i.e. #2521820: Update CKEditor library to 4.5.3) should fix this.

Wim Leers:

Assigned: Wim Leers » Unassigned
Status: Postponed » Active
Issue tags: +Needs manual testing

#2521820: Update CKEditor library to 4.5.3 landed. We can now test this. I pinged @lightsurge, who did the original testing (cited in the IS) at #1260052-93: Candidate WYSIWYG editors.

Wim Leers:

Status: Active » Fixed

Sadly, we haven't had anybody do this testing yet, and I don't think it will still happen.

OTOH, one particularly annoying Paste-from-Word issue (#2516932: When pasting from Word, empty paragraphs were created) was indeed fixed by upgrading to CKE 4.5.3 (#2521820: Update CKEditor library to 4.5.3). Plus, the CKEditor team has done a lot of work to improve this in CKE 4.5.3:

"This release brings a number of enhancements to the Paste from Word plugin, improving document parsing, content clean-up and error handling. The update is therefore recommended for users who work with the plugin and expect the best compatibility with external text editors. CKEditor is regularly praised for the quality of its Paste from Word solution which is often listed as a deciding factor when choosing a WYSIWYG editor, so we take particular care of this feature and try to polish it whenever we have a chance."

If there are any remaining things, I'm sure they will be raised during the Drupal 8 release cycle, and we'll report them to the CKEditor team.

Therefore, it feels slightly wrong to close this without a big round of validation. But, I think it is fair to consider this "done". So, closing this. Feel free to reopen if you care about this and want to do the manual testing.

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

sriharise:

Attachment: CKEDITOR.gif (451.96 KB)

Tested in Drupal 8.5 and found this is still not working. Attached the GIF for reference.

bbuchert:

I'm still getting a lot of unnecessary span tags in the code pasted from Word.

For now I'm cleaning up the code here by hand: https://www.htmlwasher.com/

bkosborne:

The empty spans you get are most likely because your text filter is configured to only allow certain attributes on span elements. When you paste from Word, it creates a bunch of spans with style attributes to achieve the correct styling, but then the Advanced Content Filter in CKEditor strips those attributes, leaving the bare spans behind.

I'm looking for a solution for this, ideally by processing the HTML after the paste from word plugin and scanning for empty spans to remove them.
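
The transformation itself (dropping spans that carry no attributes while keeping their contents) might look roughly like the sketch below. It is written with Python's standard html.parser purely to illustrate the idea; it is not a CKEditor plugin and does not use any CKEditor API, and the class and function names are invented for this example.

```python
from html.parser import HTMLParser


class BareSpanUnwrapper(HTMLParser):
    """Re-emits HTML, omitting <span> tags that have no attributes."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &nbsp; untouched.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # one flag per open span: True if it was dropped

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            drop = not attrs
            self.span_stack.append(drop)
            if drop:
                return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "span" and self.span_stack and self.span_stack.pop():
            return  # closing tag of a span that was dropped
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def unwrap_bare_spans(html):
    """Return html with attribute-less spans removed and their content kept."""
    parser = BareSpanUnwrapper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Spans that still carry an attribute (for example an allowed style) are left alone, so only the wrappers that the filter has emptied are removed.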

wranvaud:

@bkosborne you are right, repeated span tags are used for styling. It is possible to allow styles in span tags by adding
<span style>
to the text format's "Allowed HTML tags" setting at /admin/config/content/formats/manage/basic_html
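
To see why this setting matters, here is a simplified model of an attribute allowlist, written in Python for illustration only. It is an assumption-laden sketch, not Drupal's filter or CKEditor's ACF: the class and function names are invented, and only attributes are filtered (tags are always kept).

```python
from html import escape
from html.parser import HTMLParser


class AttributeAllowlist(HTMLParser):
    """Re-emits HTML, keeping only attributes allowed for each tag."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=False)
        self.allowed = allowed  # hypothetical shape, e.g. {"span": {"style"}}
        self.out = []

    def handle_starttag(self, tag, attrs):
        keep = self.allowed.get(tag, set())
        parts = []
        for name, value in attrs:
            if name in keep:
                # The parser unescapes attribute values, so escape them again.
                parts.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        self.out.append("<{}{}>".format(tag, "".join(parts)))

    def handle_startendtag(self, tag, attrs):
        # Emit self-closing tags as plain start tags (e.g. <br/> becomes <br>).
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def filter_attributes(html, allowed):
    """Return html with every attribute outside the allowlist removed."""
    parser = AttributeAllowlist(allowed)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

With an empty allowlist, a span pasted from Word loses its style and ends up as a bare span; with {"span": {"style"}} the style survives, which corresponds to adding style to the allowed span attributes in the text format.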