From #1260052-93: Candidate WYSIWYG editors:
Following my comment in http://drupal.org/node/1260052#comment-6743142, I had a look at how these two editors compare in terms of pasting from Word. I didn't use the examples in #89 because both look to have been set up to strip pretty much everything, so it wouldn't be much of a comparison.
I used:
http://www.aloha-editor.org/demos/3col/ and
http://nightly-v4.ckeditor.com/3952/samples/inlinebycode.html
I used http://nicedit.com/demos.php as a control since I know it does no stripping, leaving in the weird classes, spans, styles, tags, measurements etc., just to check that Word wasn't giving me good markup in the first place ;-)
Both seemed to work pretty well, stripping out all the worst. A few things I did notice:
- CKEditor didn't strip out the 'align' attribute.
- CKEditor allowed various inline styles through, the most prevalent being margin-left with pt measurements for Word indentations
- Aloha allowed but CKeditor converted to which seems better
- Aloha performed slightly better at pasting lists from older documents, yet both regularly failed: they pasted a disc character instead and filled the space between the disc and the text with nbsps, and CKEditor applied margin styles as well
- CKEditor occasionally allowed DIVs to be pasted in; coming from Word, they never made any sense
- Aloha performed better with tables, with CKEditor applying a width to every single cell, and applying border, cell padding, cell spacing etc. to tables which didn't seem desirable in this context. Aloha basically took tables completely over with its plugin, removing/replacing all such stuff.
I get the feeling that many of the above observations with CKEditor may just be down to configuration. In any case, in my testing they both clean up the major cruft very well, but Aloha performs slightly better at retaining formatting where it can and at cleaning up tables, which gives it a few extra points for me currently in terms of 'paste from Word'.
Upon switching from Aloha Editor to CKEditor, we identified 8 functional gaps (#1260052-148: Candidate WYSIWYG editors). One of them was improved pasting from Word, based on the above feedback.
The key way that was going to be addressed in CKEditor is via https://dev.ckeditor.com/ticket/9829. That's now part of CKEditor 4.1 (already part of Drupal 8 right now). They called it "ACF" (Advanced Content Filter): http://ckeditor.com/blog/CKEditor-4.1-Released.
However, to fully leverage that in their "PFW" (Paste From Word) feature, they still have to tackle this ticket: http://dev.ckeditor.com/ticket/9991. It will only be done after CKE 4.2. That still leaves plenty of time before the Drupal 8 release, but this issue exists to ensure it gets tackled before release.
Reference document to test?
Ideally, we would have a reference Word/Pages document, paste that, and have reference target (cleaned up) markup.
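One way such a reference document could become a repeatable check is to count cruft markers in whatever the editor produces after the paste. The following is only a minimal sketch in Python (standard library only). `CruftAudit` and `audit` are hypothetical names, and the choice of markers (class, style and align attributes, attribute-less spans, non-breaking spaces) is an assumption drawn from the observations quoted above; none of this is provided by CKEditor or Drupal.

```python
from collections import Counter
from html.parser import HTMLParser


class CruftAudit(HTMLParser):
    """Counts markers of Word cruft in a fragment of pasted HTML (hypothetical helper)."""

    def __init__(self):
        # convert_charrefs=True turns &nbsp; into U+00A0 before handle_data sees it.
        super().__init__(convert_charrefs=True)
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        names = {name for name, _ in attrs}
        # Attributes that Word typically adds and that were seen surviving a paste.
        for attr in ("class", "style", "align"):
            if attr in names:
                self.counts[attr] += 1
        # A span with no attributes at all carries no information.
        if tag == "span" and not attrs:
            self.counts["bare_span"] += 1

    def handle_data(self, data):
        nbsp = data.count("\u00a0")
        if nbsp:
            self.counts["nbsp"] += nbsp


def audit(html):
    """Return a dict mapping each cruft marker found to how often it occurs."""
    parser = CruftAudit()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return dict(parser.counts)


if __name__ == "__main__":
    pasted = '<p class="MsoNormal" style="margin-left:36pt">&nbsp;&nbsp;Hi <span>x</span></p>'
    print(audit(pasted))
```

Running `audit` on the output for each text format (Basic HTML, Full HTML) and comparing the counts against the reference target markup would show at a glance which kinds of cruft still get through.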
Current status
Over at #1936392: Configure CKEditor's "Advanced Content Filter" (ACF) to match Drupal's text filters settings, we're working on configuring ACF to closely match what Drupal's text format filters allow. So, to evaluate Paste From Word, you currently should apply the patch over there. Easily test it on simplytest.me: http://simplytest.me/project/drupal/357ac577dfd1817ad3a72dabee9cb01fe7aad577?patch[]=http://drupal.org/files/ckeditor_acf-1936392-6.patch
With Basic HTML, you'll note that a lot of cruft is stripped out: no class attributes, but there still are empty span tags, for example.
With Full HTML, you'll note that a lot of cruft is still allowed: class attributes, and so on. But this is because ACF also allows that for this text format.
The thing is that even when using Full HTML, we'd expect e.g. the class attributes to be stripped away, even though class attributes are allowed by the text format.
Comment | File | Size | Author |
---|---|---|---|
#10 | CKEDITOR.gif | 451.96 KB | sriharise |
Comments
Comment #1
webchick
Now that we're post-API freeze, this kind of thing seems like a good thing to start thinking about.
Comment #2
Wim Leers
We can only evaluate this once http://dev.ckeditor.com/ticket/9991 is done. So, for now, this should still be postponed.
Comment #2.0
Wim Leers
Fix simplytest.me link
Comment #3
catch
No idea why this would be tied to the release. Looks like that issue isn't fixed yet...
Comment #4
Wim Leers
Agreed, shouldn't block beta, but should be verified & fixed before release.
Comment #5
catch
Still not clear why this would hold up the release if it doesn't work, nor, if it should, why this issue isn't critical.
Comment #6
Wim Leers
It looks like updating to CKE 4.5.3 (i.e. #2521820: Update CKEditor library to 4.5.3) should fix this.
Comment #7
Wim Leers
#2521820: Update CKEditor library to 4.5.3 landed. We can now test this. I pinged @lightsurge, who did the original testing (cited in the IS) at #1260052-93: Candidate WYSIWYG editors.
Comment #8
Wim Leers
Sadly, we haven't had anybody do this testing yet, and I don't think it will still happen.
OTOH, one particularly annoying Paste-from-Word issue (#2516932: When pasting from Word, empty paragraphs were created) was indeed fixed by upgrading to CKE 4.5.3. (#2521820: Update CKEditor library to 4.5.3). Plus, the CKEditor team has done a lot of work to improve this in CKE 4.5.3:
If there are any remaining things, I'm sure they will be raised during the Drupal 8 release cycle, and we'll report them to the CKEditor team.
It feels slightly wrong to close this without a big round of validation, but I think it is fair to consider this "done". So, closing this. Feel free to reopen if you care about this and want to do the manual testing.
Comment #10
sriharise (at TATA Consultancy Services)
Tested in Drupal 8.5 and found this is still not working. Attached the GIF for reference.
Comment #11
bbuchert (as a volunteer)
I'm still getting a lot of unnecessary span tags inside the code pasted from Word.
For now I'm cleaning up the code here by hand: https://www.htmlwasher.com/
Comment #12
bkosborne
The empty spans you get are most likely because your text filter is configured to only allow certain attributes on span elements. When you paste from Word, it creates a bunch of spans with style attributes to achieve the correct styling, but then the Advanced Content Filter in CKEditor strips those attributes, leaving the bare spans behind.
I'm looking for a solution for this, ideally by processing the HTML after the paste from word plugin and scanning for empty spans to remove them.
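The post-processing idea above (scan for spans that no longer carry any attributes and remove them while keeping their contents) can be sketched roughly as follows. This is a standalone illustration in Python using only the standard library, not CKEditor plugin code; an actual fix would have to hook into the editor's paste handling in JavaScript. `SpanUnwrapper` and `unwrap_bare_spans` are hypothetical names.

```python
from html.parser import HTMLParser


class SpanUnwrapper(HTMLParser):
    """Re-emits HTML, dropping <span> tags that have no attributes (hypothetical helper)."""

    def __init__(self):
        # Keep entities such as &nbsp; untouched so they can be re-emitted verbatim.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_kept = []  # one entry per open <span>: True if its start tag was kept

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.span_kept.append(bool(attrs))
            if not attrs:
                return  # drop the bare start tag, keep whatever is inside it
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> pass through unchanged.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "span" and self.span_kept:
            if not self.span_kept.pop():
                return  # the matching start tag was dropped, so drop this too
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def unwrap_bare_spans(html):
    """Return html with attribute-less spans removed and their children preserved."""
    parser = SpanUnwrapper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(unwrap_bare_spans('<p><span>Hello <span class="x">big</span></span> world</p>'))
```

The stack is what keeps nested spans correct: each closing tag is matched against the decision taken for its own opening tag, so a span that still has attributes survives even when it sits inside one that is dropped.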
Comment #13
wranvaud
@bkosborne you are right, repeated span tags are used for styling. It is possible to allow styles in span tags by adding
<span style>
to the text format's "Allowed HTML tags" setting at /admin/config/content/formats/manage/basic_html