Problem/Motivation
As of #1890502: WYSIWYG: Add CKEditor module to core, we have a WYSIWYG editor in Drupal core. But: the module is not yet enabled by default, nor is CKEditor associated yet with Filtered/Full HTML.
Sounds simple, right?
The problem is that right now, both the "Filtered HTML" and "Full HTML" text formats are using the filter_autop
and filter_url
filters. The consequence is that having these enabled while at the same time using CKEditor causes mismatches between what the editor sees in the WYSIWYG editor versus what the reader sees in the (filtered) end result!
Let me clarify:
filter_url
:- create or edit an article node with either the "Filtered HTML" or "Full HTML" text format
- paste this text:
http://drupal.org
into the body field
- save (note that we did not turn this into a link!)
- view: now this has become a link, because of
filter_url
:
<a href="http://drupal.org">http://drupal.org</a>
- edit again: not a link → a persistent mismatch!
Now, this is the weakest case of the two; but note that there is a persistent mismatch between what it looks like in CKEditor and what it looks like to the end-user/reader.
filter_autop
:- create or edit an article node with either the "Filtered HTML" or "Full HTML" text format
- switch CKEditor to "Source mode" (where you can edit the HTML directly) and paste this text:
<p>Paragraph 1.</p><p>Paragraph 2.</p>
into the body field
- go out of "Source mode" — you will now see two paragraphs, as expected
- view: now our content consists of a single paragraph instead of two! Look at the source and you will see this:
<p>Paragraph 1.Paragraph 2.</p>
- edit again: two paragraphs again → another persistent mismatch!
Again a persistent mismatch, but this time with severe consequences: depending on the whitespace in the HTML, the (filtered) end result may be perceived as corrupted! Only if one were to inject the whitespace (newlines:
\n
AKA U+000A LINE FEED (LF)) expected byfilter_autop
, the intended HTML would also be served to the end-user/reader after filtering.
Note that this is in conflict with the HTML spec, which says that whitespace outside of text nodes ("inter-element whitespace") doesn't matter:The space characters, for the purposes of this specification, are U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR).
http://dev.w3.org/html5/spec-LC/common-microsyntaxes.html#space-character
The space characters are always allowed between elements. User agents represent these characters between elements in the source markup as text nodes in the DOM. Empty text nodes and text nodes consisting of just sequences of those characters are considered inter-element whitespace.
Inter-element whitespace, comment nodes, and processing instruction nodes must be ignored when establishing whether an element's contents match the element's content model or not, and must be ignored when following algorithms that define document and element semantics.
http://dev.w3.org/html5/spec-LC/content-models.html#content-models; emphasis mine.
(It's very well possible that I'm quoting the wrong part of the HTML spec — that thing is a huge maze, but the point still stands.)
Proposed resolution
- Non-controversial parts:
- Enable editor.module + ckeditor.module by default.
- Associate CKEditor by default with the Filtered HTML and Full HTML text formats.
- Controversial parts:
- Remove the
filter_autop
andfilter_url
filters from both the "Filtered HTML" and "Full HTML" text formats. - Provide an upgrade path that automatically updates the "old" "Filtered HTML" and "Full HTML" content from Drupal 7 to Drupal 8. Detect if the format is modified, and if that's the case, then don't apply the upgrade path, and disable CKEditor for each modified format.
- Remove the
For keeping the allowed tags in sync with the CKEditor toolbar configuration: that shouldn't be covered in this issue, but over at #1894644: Unidirectional editor configuration -> filter settings syncing.
Remaining tasks
The controversial parts :)
User interface changes
The UI will change insofar that having a WYSIWYG editor enabled by default will "change" the UI — no new or modified UIs though.
API changes
None.
Original report by Wim Leers
From #1890502: WYSIWYG: Add CKEditor module to core, first listed follow-up:
Enable the CKEditor module by default in the standard install profile, enable it as the default Text Editor for the Filtered HTML and Full HTML text formats, provide default configurations of CKEditor for both of those text formats, and potentially modify the filters/filter settings of those text formats.
Comment | File | Size | Author |
---|---|---|---|
#67 | filter_new_formats_FOLLWUP-1911884-67.patch | 2.07 KB | Wim Leers |
#61 | filter_new_formats-1911884-60.patch | 14.61 KB | quicksketch |
#56 | filter_new_formats-1911884-56.patch | 14.95 KB | Wim Leers |
#56 | interdiff.txt | 2.02 KB | Wim Leers |
#55 | filter_new_formats-1911884-55.patch | 14.68 KB | Wim Leers |
Comments
Comment #1
Wim LeersInitial patch, which only implements the non-controversial parts :)
Comment #2
Gábor HojtsyI don't see the controversial parts so much controversial. The new release can make new defaults on new installs. Also looks like you have a plan to detect on upgrade if the previous formats were safe for the new setup. There will likely be other formats on a real site anyway, so some manual intervention will be needed when upgrading. This is a significant change, so I don't think that is a problem.
Comment #3
aspilicious CreditAttribution: aspilicious commentedWhat is the reasoning behind this single dash. Is this a nested array? And why?
Comment #4
Wim Leers#3: it's the first row of the toolbar. A CKEditor toolbar can have multiple rows. So it's like this:
$editor->settings['toolbar']['buttons'][0] = array('Bold', 'Italic', …);
. Good point though :)Comment #6
nedjo@Wim Leers: Thanks for your detailed explanation.
This is an issue that's going to come up a lot: we have a filter system, but there's a built-in contradiction between filtering and in place editing. Besides just removing the filters, what are our options? What will we do for example with a WYSIWYG media asset tool that uses a token in WYSIWYG and then a text filter to generate the relevant markup? Dynamically reverse the filter transformation (substitute back the token for the rendered markup)? Should we be modeling here what contrib can do to harmonize text filtering with in place editing?
Comment #7
Wim Leers#6: interesting that you bring that up. In this issue, please consider only CKEditor's "iframe mode". I.e. in-place editing is out-of-scope for this issue. In particular, because Edit already solves this… by not solving it at all — indeed, when you want to edit a piece of text in-place, it will check if it is filtered, (simplified explanation follows) and if that's the case (i.e. most of the time), then it will go back to the server and fetch the unfiltered version. See
EditController:: getUntransformedText()
for details. This will be leveraged by CKEditor via this patch: #1886566: Make WYSIWYG editors available for in-place editing.But, again, please let us not discuss that here, if you want to discuss it, then please do so at [#1886566. Note that this has been discussed at length earlier, in #1782838: WYSIWYG in core: round one — filter types, so if you want to reopen this discussion, I'm afraid I have to ask you to read all of that issue again.
Comment #8
nedjo@Wim Leers: thanks, I'd misunderstood the issues you were addressing here.
Comment #9
quicksketchThe first problem I don't find to be serious (the auto-linking of URLs). The second problem only occurs because the
<p>
tag isn't in the default list of tags, it's only added by the auto-p filter. If we just add the<p>
tag, the auto-p filter can stay on. However, this would cause other problems in that a piece of content created without paragraph tags in source-code view would loose all its paragraph tags when editing in the WYSIWYG editor, because we don't auto-p on the JavaScript side. This is possible to fix if we want. We can just take the auto-p JavaScript filter from Wordpress (our PHP auto-p filter is also from Wordpress). Overall I don't think this is too bad a problem, but if users who previously were *not* using a WYSIWYG want to switch to CKEditor, there's a good chance a lot of their content *won't* have paragraph tags currently (like here on Drupal.org). I think making a JavaScript auto-p filter is really the best solution. In the mean time we just enable the<p>
tag in the list of allowed tags to fix the original problem Wim described.Comment #10
quicksketchI realize my last post is difficult to follow. Here's a recap example:
<p>
to the list of tags in Filtered HTML format, leave auto-p filter enabled.So the situation is similar, but the point of failure shifts from the front-end to the back-end, where CKEditor is at fault for removing the P tags upon editing, instead of Filter module removing them on output.
On more important thing to consider here. ALL visual editors are incapable of properly maintaining whitespace as the user input them. Because browsers do not make white space accessible in the DOM, when switching updating content through a WYSIWYG, ALL whitespace formatting MUST be lost. You can reformat the whitespace in a consistent manner (CKEditor actually does a good job of making pretty HTML output), but you can't maintain arbitrary new lines of whitespace input by the user.
Wordpress takes the approach of *always* having an auto-p filter on both input and output. So markup in Wordpress never contains P tags in the database-stored version of the content. This has the interesting effect of making it so you *can't* specify
<p>
tags manually. If you insert them manually, they will be stripped out the next time that post is edited in the WYSIWYG and replaced with a plain new line (which then will be converted back to P tag on output). The output stays the same, but manual P tags on input tend to be discarded.Comment #11
quicksketchSorry I'm a post-creating machine. One last thought:
I don't think we should enable CKEditor on the Full HTML format by default, for a few reasons:
<script>
tags provided by 3rd party services, such as advertisements from AdWords, or widgets from social services like Twitter or Facebook. CKEditor never allows script tags for security.I think the first point is by far the most important. Most sites need Full HTML for embed codes from other services. I'd end up needing to either disable CKEditor on Full HTML for every new site, or make another Text Format without an editor.
If we want to have two different levels of editors (a good idea, IMO), we should introduce a 3rd, in-between text format that still has the tag filter enabled but a much longer list of allowed tags.
Comment #12
jcisio CreditAttribution: jcisio commentedI quite agree with quicksketch in #9. In fact, it was discussed multiple times, and CKEditor module already does that in #1050118: Added better support for Line break converter. CKEditor is now able to load properly content where new line characters were used, well not completely, but it mostly works.
About the URL filter, I don't think people care much about url that does not convert to link in the richtext editor, but it does when view. It's a simple, well known functionality and does not matter whether a richtext editor is enabled or not. The only problem is that if someone tries to when viewing a node copy/paste the content into another node, it messes up. But again, it's a infamous problem with filters that exists for age and we are not trying to solve here.
I also agree with quicksketch about not enabling CKEditor on Full HTML format. We can't introduce more tags into the Filtrered HTML format, neither, as it may cause security problem (tags which were filtered will no longer be). A rich text format with most tags allowed sounds wise.
Comment #13
quicksketchJust a clarification, *any* changes we make to the default list of formats would have no effect on existing sites that upgrade to Drupal 8. We're only talking about modifying the default installation profile so that it comes with CKEditor enabled and the default formats modified. Users upgrading from D7 won't have any of their text-formats modified upon upgrading, nor will the CKEditor module be automatically enabled for them upon upgrading. Overlay or Toolbar followed similar patterns for users upgrading from D6. We don't change existing sites upon upgrading, this is just for new sites.
Recently I read an article making the basic suggestion that rel=nofollow should be enabled by default for comments, which makes a lot of sense. So actually maybe rather than making a new more-privileged text format, we could think about adding a less-privileged text format specifically for basic HTML that has the nofollow filter enabled, while at the same time we add more tags to Filtered HTML. Hmmm, I could go either way. The end result should be the same.
I do think adding another text format may be outside the scope of this issue, where we should focus on just fixing the problems with Filtered HTML so we can enable CKEditor on it by default. Is anyone familiar with an existing request to add an additional text format? I couldn't find one with basic searching.
Comment #14
sunThanks for actively discussing this. I disagree with the proposed solution.
So far, we've discussed how changing configuration here and there would resolve this issue and how that would circumvent potential problems. By adding a WYSIWYG editor to core, we inherently added the requirement for D8 to be capable of switching between non-wysiwyg, filter-enabled formats and wysiwyg, filter-less formats. Larger topic. Requires sophisticated solution. Not of any interest here. In turn:
Consequently, this is what we want to do:
(KISS)
Comment #15
sunNote that removing those filters from Full HTML quite potentially requires to update a couple of test assertions/expectations; is so, those tests should ideally be changed to expect the markup that is generated by the default plain_text text format that is shipped with and installed by Filter module.
Comment #16
quicksketch@sun I don't think I agree with your suggestions. A recap of your statements seems to be: A) WYSIWYGs should work only if there's no filtering and B) in order to use the WYSIWYG you should use Full HTML.
This would encourage even more sites to allow all HTML tags for everyone (if that was the only way to get a WYSIWYG). I agree that anonymous users should not have a WYSIWYG by default (that's why I proposed either a more or less privileged text format), but there should be the ability to allow anonymous users a WYSIWYG if so desired. In such a case, we would definitely need to accomodate filtering at the same time as the WYSIWYG.
Perhaps in your statements above, you meant "filtering" as the auto-p filter and the URL-filter should be disabled when a WYSIWYG is used, not *all* filtering and especially the HTML tag filtering.
As I said above, I think keeping the Full HTML format is important for purposes of advertising and embed codes that require iframes or script tags. I've never yet worked on a Drupal site that didn't have inline script tags *somewhere* on the site.
So as an alternative that is equally simple but accomplishes a different goal, we could do the following:
1. Make a new text format (say "Basic HTML") in the default profile that has a long list of available tags and the editor enabled.
2. Permissions-wise it is only accessible to authenticated users. Make it the first text format so it is used by default for all authenticated users.
3. Leave Filtered HTML and Full HTML alone.
This way Filtered HTML becomes the default for anonymous users only (per my suggestion above, this also means we could add rel=nofollow to it by default). At the same time we get a new format that has all the tags we want to support out-of-the-box (e.g. img). In a truly perfect world, we would also hide the "Plain text" format and make it truly a "fallback" format. Then our filter breakdown would look like this:
- Anonymous: Filtered HTML only (no dropdown)
- Authenticated: Basic HTML only (with WYSIWYG, no dropdown)
- Administrator: Basic HTML, Filtered HTML, Full HTML. (with dropdown, Basic by default)
I think the naming here isn't optimal. "Filtered HTML" is more like "Restricted HTML", but we can save that for later.
Comment #17
sunIt sounds like you're interpreting the current "Full HTML" format as "Raw HTML (developers only)" format?
That never occurred to me. The Full HTML format is typically accessible to the site admins and editors/moderators (if any). And those are the users that want to use a WYSIWYG editor.
Why do I need yet another format? Why do I need any filters in that format? "Full HTML" cuts it to the max: This is Full HTML. Select it, use it, do whatever you want, it's yours.
The Full HTML format also never had a HTML filter and thus was never restricted to any particular tags. That's fine, why change it?
We can discuss whether it makes sense to add a new default text format to Standard profile that sits between Filtered HTML and Full HTML, and is basically just Filtered HTML with an enabled editor, but that's a completely separate issue and not really relevant here.
This is just default config for the Standard install profile. Users can and will futz with the text formats either way. And developers and site builders don't use the Standard install profile in the first place, because it's way too heavy, involves too many Drupalisms, and only has the purpose of show-casing/demonstrating Drupal to first-time users.
Lastly, I definitely did not say that an editor should only work for a format that has no filters. That should absolutely work, but that's irrelevant for the default configuration of the Standard install profile. We can simply remove the filters and enable the editor instead. For the default config, I do not see a point in adding paragraph/URL filters to a format, whose text values are populated by a WYSIWYG editor, which produces paragraphs and URLs already.
Comment #18
quicksketchMy belief is that out of the box, when a user goes to create a piece of content, a WYSIWYG editor is already there. No need to change the format to "Full HTML". If we go with your approach, it sounds like in order to achieve this effect we would need to make Full HTML the default format (set its weight to the lowest value). That doesn't seem like a good course of action as it's making more content prone to security risks and again, likely to cause more sites to just use Full HTML for everything because it's the one that has an editor on it out of the box.
I don't agree with this statement either. I've personally never installed a Drupal site with the minimal profile. If you're the type of person who doesn't use the standard profile, perhaps you're speaking from a viewpoint that doesn't reflect the expectations of an average site-builder.
For the most part, maybe. But in my experience the editorial team is frequently tasked with advertising on a site and commonly uses Full HTML for embed codes. So I definitely would not refer to it as "Developers Only". I'm saying that it's common to need a format that doesn't have a WYSIWYG enabled on it so that ad and embed codes may be used. The users who input this data are regular, non-technical people who just know how to copy and paste.
Comment #19
jcisio CreditAttribution: jcisio commentedI'm not sure what do you mean by "HTML filter". There are not many filters in core though. But do you think that filters like http://drupal.org/project/inline should be enabled on Full HTML format, with RTE editor enabled? I think yes.
Comment #20
quicksketchI'm fairly sure "HTML Filter" refers to "Limit allowed HTML tags" filter. Internally its machine name is "filter_html", so it's easy to call it the HTML Filter for short in a human-readable way.
I very rarely encourage any users to utilize the Full HTML filter. Setting it as the default for administrators is problematic if the site has less-privileged users who share a burden of content creation. If a user with the "administrator" role creates a piece of content, but then the content team finds a typo, they should be empowered to correct that mistake. I've built a lot of sites where the user is trusted in their earnest to create content but not in their technical prowess. Giving them Full HTML is dangerous to the site, but making it so they can't edit content created by a user with the administrator role is undesirable.
So in summary, I think both the following are true:
- A WYSIWYG *should not* be enabled for anonymous users by default.
- A WYSIWYG *should* be enabled for authenticated users by default.
- Authenticated users *should not* have have Full HTML as the default format.
If these statements are true, then that means we need a format that is not given to anonymous users and *is* given to authenticated users that is *not* Full HTML. Thus, I think a 3rd out of box format would be the best solution. However at the same time, I think limiting the available formats is also important. Really users should never be troubled with format at all unless they're an administrator. See #16 for my recap there.
Comment #21
quicksketchRegardless of if we add a new text format to the standard profile or not, I think this issue needs revisiting: #788114: Unprivileged users should only get one text format by default. I've filed a new patch over there that eliminates the fallback format from being presented in the text format drop down.
Naming convention-wise, I think we should make new text formats that match the following:
- Basic HTML: The new text format. Weight 0, the default text format for authenticated users. Has WYSIWYG enabled.
- Restricted HTML: The renamed "filtered_html". It's only accessible to anonymous users only. Since Basic HTML eclipses its functionality entirely, an authenticated user would have no need to have access to it. This format typically would be used for anonymous comments only.
- Full HTML: No changes from today. Only admins have access to this text format by default.
Combined with #788114: Unprivileged users should only get one text format by default, this results in the following displays:
Anonymous:
Authenticated:
Administrator:
This way we get the WYSIWYG for authenticated users by default. Anonymous users do not get the WYSIWYG by default. Plus we can tweak the differences between these two formats to cater to their audiences better, such as adding images by default for authenticated users (basic_html) and adding rel=nofollow for anonymous users (restricted_html). Neither user-level is shown the name of the format because it's the only format to which they have access.
This reflects the common filter setup for most sites the have WYSIWYGs today. It's what I do on almost every site (including my personal blog, but also huge client, multi-user sites). It's also been commonly recommended throughout the Drupalsphere, and heavily at Drupal conferences. Examples:
Jen Lampton has presented her best-practices WYSIWYG talk that sets up a dedicated WYSIWYG text format a total of 10 times, including Paris 2009, DrupalCamp Austin 2011 - video, and DrupalCon Munich
Andrew Mallis have given his presentation What you see isn't always what you get (but it can be) twice at BADCamp in 2011 and 2012, and DrupalCon Denver.
Setting up a dedicated format is also recommended at Drupalize.me (though unfortunately only in the not-free portion of videos).
As far as I've seen, there's a large group of experienced Drupal developers who always set up a dedicated format for WYSIWYG and also recommend that to broad audiences. Therefor I think it would be quite reasonable to provide this configuration in the out-of-box-experience.
Comment #22
catchI don't understand why this is marked as a major task.
Comment #23
Damien Tournoud CreditAttribution: Damien Tournoud commentedFull HTML needs to die. HTML is *never* what you want in a CMS (do you want to be able to use
<iframe>
,<meta>
,<head>
,<style>
,<canvas>
or<button>
in your content? no you don't), what you want is something I call "Rich Text". It might or might not be a subset of HTML.If we go with the proposal from #21, I would like to see the mention of "HTML" removed from the name of the text formats.
Comment #24
quicksketchComment #25
aspilicious CreditAttribution: aspilicious commentedYeah same for the projects I work on. In most cases they need to run some custom stuff in an iframe.
Comment #26
catchOK we can do this then if you want to keep it major. I don't think features that have just gone in should block other work (unless they've introduced major regressions like the contextual links stuff but at a certain point we'd roll back if those never got resolved).
Comment #27
Bojhan CreditAttribution: Bojhan commentedI do not think, this should be labeled as a feature request - the fact will be that, Drupal 8 will be marketed as "now comes with a WYSIWYG". That you are then required to go through various difficult configuration steps, to get it working out of the box will be a really bad representation of Drupal and confirmation that Drupal is hard to use, because we do not provide them with an easy experience for such a fundamental feature out of the box.
Comment #28
sunThe "HTML" suffix exists to clarify for visitors/users that the expected input is HTML and not BBCode or WikiSyntax or Markdown or whatever other input format other popular sites on the net might be using. As a visitor/user of other (most often non-Drupal) sites, I continue to get lost and confused when there is no clarification on what kind of input is expected. Most often, only trial & error via Preview is the last resort. Worst case, preview isn't possible. Therefore, I disagree with its removal.
The problem I see with #21 is that there's barely a difference between the Filtered HTML format and the Basic HTML format. Technically, the formats are pretty much identical. You just want to have an editor for authenticated users.
Can we add permissions to configured editors instead? Or simply a global editor.module permission "Use editor [plugin]"?
Also happy to rename "Filtered HTML" to "Basic HTML" or "Some HTML" (seen the latter more often on the net recently).
Comment #29
quicksketchAnonymous users should have rel=nofollow added to their links and not be able to post images. That combined with the editor being enabled and configured makes "Restricted HTML" quite a bit different from "Basic HTML", so I think adding another text format makes a lot of sense here.
Just to be clear and as seen in the screenshots above, if #788114: Unprivileged users should only get one text format by default is completed, then regular users won't be shown the name of the text format at all, since the select list isn't shown if they only have access to a single format. So in general, I think we should try to make the list of format names make sense to administrators, since they're the only ones that see the list of formats by default.
The filter tips are still displayed though, which I think is plenty of indicator that HTML tags are allowed for anonymous users. As a separate issue, we should address moving/removing the filter tips when a WYSIWYG is present, since listing HTML tags only adds to confusion if a WYSIWYG is present. Users can't actually type the tags listed because they'd be converted to HTML entities by the WYSIWYG. The list of tags is only helpful when in source code view or with the WYSIWYG disabled.
Comment #30
quicksketchHere's an initial patch that does the following:
<p> <span> <img>
. It has the secure image filter turned on.On my local I was getting several problems in the testing suite around OpenID and the Locking tests, which didn't seem related to the text formats. I'll let test bot have a try at it and see if it encounters the same problems.
Comment #32
quicksketchHm, reroll using command line. Testbot seems to be increasingly sensitive about patches created by Eclipse IDE.
Comment #33
sunMost of these test changes are unnecessary and can be reverted.
Only tests that rely on the Standard profile configuration need to be adjusted. And out of those, the majority should be changed to not rely on the filtered_html/full_html formats but instead use the plain_text format (which potentially means that the entire test can be changed to not use the Standard profile).
This and related tests need to be updated for the changed expectations instead. Let's do that in the other issue first.
The upgrade path must not be changed - D7 sites still have a filtered_html format.
Hm. The restricted_html format is not accessible to authenticated users.
This means that content moderators (or rather all authenticated users) are not able to edit content that has been submitted by anonymous users.
Can we find a better label for this? "Restricted" doesn't sound friendly.
Comment #34
quicksketchThanks for the review @sun.
Yeah, the tests are a bit funny right now, since they use the names "filtered_html" and "full_html", but they're not actually configured differently. In most places, they're just named formats with no non-default configuration.
Yes, this is true. This doesn't seem like a likely scenario though. This would mean that either the authenticated user role would need to have "edit any X content" or "administer comments" permissions, which is likely to be only allowed by more privileged roles. If you had a non-administrator role that was responsible for moderating content, you could grant them access to the restricted_html filter.
Comment #35
quicksketchI'm open to suggestions. We could use something like "Minimal HTML" or "Limited HTML". The problem we get into is confusion between "Basic" and the other adjectives. If we want to clarify the difference, we might use something like "Extended HTML" for the authenticated format and "Basic HTML" for the anonymous format (so then you would have Basic, Extended, and Full HTML formats).
I'll try to get #788114: Unprivileged users should only get one text format by default rerolled independently, as it seems that @sun's preference is to keep the two issues separate.
Here's a reroll patch without #788114: Unprivileged users should only get one text format by default and without renaming a lot of the test filters. This leaves the format "filtered_html" in a lot of our tests, but as @sun pointed out, we can name these whatever we like as long as they're individually created as part a test the uses the minimal profile.
Comment #37
wwalc CreditAttribution: wwalc commentedconfig.allowedContent
totrue
. So if CKEditor was enabled in Full HTML with this option, user could enter whatever he wants in "Source" mode and it would remain there. Handling script tag in 4.1 is investigated in http://dev.ckeditor.com/ticket/10089 I do agree however, that offering exactly the same interface in Full HTML and Basic/Filtered HTML would be a bit pointless.Comment #38
webchickI agree with #27; this isn't a feature. This is a straight-up follow-up task that was punted from the original issue. It's probably even critical, but leaving priority alone for the moment.
Comment #39
quicksketch@wwalc: Thanks for this comment. I think enabling this option may make sense for all formats for the time being. We get into a nasty problem with CKEditor's filtering if you're editing existing content. Say you're in a situation like this:
- You create a new post tediously using Full HTML, typing out the HTML by hand. It includes all kinds of stuff: tables, script tags, style tags, whatever.
- Thinking, "gee, using a WYSIWYG sure would help previewing this content", you decide to change the format to "Basic HTML" just to preview/update the content in CKEditor.
- BAM! CKEditor throws away all manually entered, non-allowed tags. Changing back to Full HTML, you see with horror that all your tediously typed HTML has been deleted.
This allowedContent option is very new I realize, but is there a way that we can have allowedContent work *only* on paste? Destroying existing content in the textarea seems like too dangerous an option to keep enabled by default.
I'm working on rerolling this patch. Regarding the current names ("Restricted HTML" and "Basic HTML"), I'm inclined to keep the unfriendly term "Restricted" for the anonymous format. Why? Because you *don't* want to assign this format to authenticated users. The term "Restricted" gives it a label that indicates "give this format to people you don't trust". And because it's the *only* format that these untrusted people should have, they won't be shown the name of the format anyway (again, assuming #788114: Unprivileged users should only get one text format by default is completed).
Comment #40
quicksketchMinor update fixes the remaining tests.
Long-term, we'll need to enable the auto-p filter on the Basic HTML format. Right now if you write a post in Full HTML (with no p-tags), then switch your format to "Basic HTML" (or decide to add CKEditor to Full HTML), all of your whitespace in between paragraphs is lost and your post gets all mashed together once its loaded in the WYSIWYG. Note this is a problem with all WYSIWYGs, because the DOM doesn't provide an ability to maintain white-space. In order to really make formats interchangeable without damaging your content, we need a JS-side filter that auto-p's the content before it's loaded into the WYSIWYG. Fortunately since our auto-p filter on the PHP side originally came from WordPress, we can simply copy the JS equivalent auto-p filter that they have on their WYSIWYG. All of that is beyond this particular task however, so I'm simply making a note of it that we need to be aware of the problem and followup on it.
I think this patch is ready for any further human/conceptual review. On the technical side everything should be working.
Comment #41
wwalc CreditAttribution: wwalc commentedThis is a very nice scenario to find out what is the correct way to handle changing input formats on the fly.
Okay, but if I change format to "Basic HTML" I should see what I will get in Basic HTML. That's the point of WYSIWYG ;)
If we go ahead an implement this:
Than again, using the scenario above will lead to unexpected results. Let's say that I just edited an article using Full HTML. Then I switch to "Basic HTML" to see how it will look like. If CKEditor will not filter contents instantly, then once the user will start cutting and pasting parts of the text to format the article better, he'll notice that suddenly the content started disappearing, because the filter started working on pasted content.
It is doable to change CKEditor to make it possible to disable filtering on startup and then enable it, but I do not see any valid use case for that at this moment, because of the cutting/pasting issue that I mentioned above?
What if Drupal could simply remember how the the content looked like in previous format (js variable, web storage etc.)? In case of changing the format accidentally, there could be a way to revert things back, by going back to the correct format and loading "saved" content.
The proper solution for this particular case (being able to occasionally see Full HTML content in WYSIWYG while editing) would be to:
a) either manually create the same format, but with CKEditor enabled
b) or eventually be able to select more than one editor for particular text format and introduce the possibility to switch editors on the fly... but that would be imho a way too complex I guess.
Comment #42
quicksketchYeah cut/copy/paste from within the same document would be a tricky scenario. Perhaps CKEditor could know if the contents being pasted were originally cut from the same editor and not strip HTML? Technically I'm not even sure if it's possible to do this, nor am I sure that would be the ideal solution. Maybe striping out things on paste would be acceptable if the user was notified that tags had been stripped out on paste.
True, but no user would expect a convenience to delete/mangle their content. Perhaps we could place an alert above/below the textarea notifying the user that some displayed tags won't reflect the true output; listing these tags that will be stripped? Again, not sure how we should handle this. I just don't think the immediate filtering option is safe enough to leave on by default.
Maybe this whole issue should be discussed separately. The current patch doesn't actually disable allowedContent feature, but the patch in #1879120: Use Drupal-specific image and link plugins — use core dialogs rather than CKEditor dialogs, containing alterable Drupal forms does, since I couldn't get it to allow IMG tags for some reason. Any ideas you can provide on why that is would be helpful in that issue.
Comment #43
wwalc CreditAttribution: wwalc commentedWe could investigate it, but still the original case is that user switched to "Basic HTML" because he knows that WYSIWYG is there. That's not the purpose of format selection. Imho if one switch to Basic HTML he'd rather expect to see the result of Basic HTML format applied to the content, just because its only purpose is format selection, it's not a toggle to switch wysiwyg on/off.
If the main reason to change the input format is that CKEditor is enabled there, then having CKEditor enabled in all places where it makes sense (including Full HMTL, with allowedContent = true in this particular case) would reduce the risk of such actions.
I agree that here we're entering an interesting part. Everyone got used to the fact that changing the format to more restricted and saving the content does not result in the content being really removed. Still, when one upgrades to the next major version, he should be prepared for various (documented) changes in the behavior of an application.
Some kind of a warning might be handy, however I'd warn user in case of changing the format and knowing some elements are going to be removed there.
Comment #44
Wim LeersI rerolled #788114: Unprivileged users should only get one text format by default + fixed its tests, to get that moving forward, since that is more or less a dependency for this one.
Which text formats (+ editor) combinations should ship with the Standard install profile?
And further in #29, responding to criticism in #28 (which claimed that "Restricted HTML" and "Basic HTML" were basically the same, the only real difference being CKEditor disabled or enabled):
I think this is a very fair/balanced proposition. It's very likely acceptable to most. So we can build on this assumption.
Finally, thanks to #788114: Unprivileged users should only get one text format by default, no text format name will be displayed anymore for users with only access to a single text format, so any concerns about text format names being too scary are not quite as far-reaching. Nevertheless, the filter tips are still present, so it is clear what the allowed syntax is.
<script>
tags"filter_html
filter (or, more generally, any text format without anyFILTER_TYPE_HTML_RESTRICTOR
type filters):<pre>
)allowedContent=true
for formats withoutfilter_html
(see above). More importantly, "switching text formats to do previews" really isn't a valid use case. It's a hack. Like wwalc says:
#9, #12 indicate this is less important. I agree. Let's omit that from this issue.filter_url
filter_autop
<p>
tags, you can just leave a blank line in between two paragraphs (similar to how it will be rendered), andfilter_autop
will automatically insert<p>
tags.\n
s doesn't mean anything in HTML and hence to a browser. So if we'd showline 1\n\nline 2
via a WYSIWYG editor, we'd actually seeline 1line 2
instead of<p>line1</p><p>line 2</p>
.filter_autop
play nice with WYSIWYG editors?filter_autop
, we need to:<p>
tags whenever starting to use a WYSIWYG editor (e.g. atnode/1/edit
, just before CKEditor is initialized, the<textarea>
's contents blank lines need to be replaced with<p>
tags)<p>
tags back to blank lines whenever stopping to WYSIWYG editor (e.g. when saving the node edit form)<p>
tags, so the WYSIWYG-created<p>
tags would be stripped, and depending on how the WYSIWYG editor generates whitespace between these tags (which according to the HTML spec should be ignored), no or just some<p>
tags may be generated on viewing the content. So, this effectively means that if we do what quicksketch proposed in #10: Then the above problem is solved. (After a long discussion on IRC, quicksketch had me convinced of this.) But…<p>
tag would still be disallowed by thefilter_html
filter. Hence CKEditor would operate under the assumption this tag were disallowed, which is not true. Really, "<p>
is not allowed" is a lie, because it is generated automatically.<p>
tags!filter_autop
because it actually needs two, and the two latter are only using it to facilitate switching text formats… should we really keep it enabled at all for the latter two? It doesn't serve any other purpose besides the ability to switch to and from the anonymous users text format. I think it is only sensible that switching to a different text format implies the potential for data loss unless you know what you're doing. And that's why anon and auth users would each only have access to a single text format, which prevents any potential problems out-of-the-box. Only admin users would need to be careful when switching text formats. And they'd have very few reasons to do so in the first place.filter_autop
to no longer relying on it?_filter_autop()
and save it. Now the content is "proper HTML"; WYSIWYG editors can deal with this directly, no problems exist.The patch in #42 already does most things "right" in my POV, so this is my assessment of the current patch:
filter_autop
for "Basic HTML"filter_autop
for "Full HTML" (probably because CKEditor is not yet enabled, in which case it makes sense)config.allowedContent = true
should be set whenever a text format does not have any filter of the typeFILTER_TYPE_HTML_RESTRICTOR
Sorry for the long post, I tried to take everything into account to hopefully achieve consensus sooner rather than later. I guess this could serve as a starting point for an updated issue summary as well.
Comment #45
David_Rothstein CreditAttribution: David_Rothstein commentedSince I just left some critical comments on #788114: Unprivileged users should only get one text format by default, I want to mention here that I don't see why it would be a blocker for this one at all.
Having the WYSIWYG show a modal like you describe (presumably it would only show when a WYSIWYG is being used) makes sense, but it doesn't have to be scary or advanced. Try going into Gmail and creating a new e-mail message and then switching back and forth between WYSIWYG and plain text - it pops up a similar modal and it's not hard to understand at all.
In any case, plenty of Drupal sites will have not-very-tech-savvy users with access to more than one text format (whether it's because those users are non-technical site administrators or because the site is configured differently than envisioned here) so the messaging in such a modal dialog would have to be "non-advanced" anyway.
Comment #46
quicksketchIt sounds like everything done so far is in the good column (I agree), but more work is needed after this (not surprising). I think it makes sense to bundle together changes to config.allowedContent and the Full HTML format at the same time. CKEditor being able to not strip out script tags indeed makes enabling an editor on Full HTML more feasible.
I'm also content with the warning if you switch text formats; hopefully we'll be able to make this contextually displayed. Our situation is a bit more complicated than Gmail, since we don't really know when you're moving to a less privileged format or if changing the format is actually going to cause a loss of tags.
Comment #47
jessebeach CreditAttribution: jessebeach commentedquicksketch, it should be possible to compare the set of allow tags in the switched-from format to the set of allowed tags in the switch-to format and at least know in a crude way that some tags are not present in the switch-to format and throw up a warning. It won't contain details about what information will be lost, but at least the potential for loss would be known.
Comment #48
David_Rothstein CreditAttribution: David_Rothstein commentedHm, somehow on my initial read-through I think I assumed that you were planning to enable the WYSIWYG for anonymous users.
Given that's not the case, I have to ask, what's the actual purpose of "Restricted HTML"?
My assumptions are:
If those are true, why introduce Restricted HTML vs. Basic HTML at all? Why not just do this:
To do that and meet the other goals above might require changes to allow the (Plain Text) URL filter to add "rel=nofollow" on its own, but that would really be a good idea anyway...
Comment #49
sunI wholeheartedly agree with this part of @wwalc's comment in #43:
Exactly. Editor-selection is not the purpose of the text format selector.
Associating editors with text formats was a major milestone for WYSIWYG editor integration in Drupal's history. However, the idea of a 1:1 relationship dates back to Wysiwyg module's inception in 2007 (and actually a patch by @nedjo for TinyMCE module #118747: Upgrade to jquery, improve obsolete js approaches, which didn't see the attention it should have deserved).
We've taken over that approach for the Editor module in D8 core now — even though we already identified in 2010 that there is a strong need for #748980: Allow multiple editor profiles for the same format, exactly because of what @wwalc mentions above: There is a 1:n relationship between text formats and editors. Not respecting that leads to the confusing and bogus situation that users need to switch the text format in order to toggle an editor, whereas the only difference between the formats may be the associated editor.
Before we shoehorn our configuration into that misconception, let's rather implement support for multiple editors per format. (which does not seem to be very hard to do)
I also disagree with @David_Rothstein in #48 — the plain_text format was only ever meant as (literal) fallback format and caused by technical edge-case scenarios only. It wasn't actually meant to be exposed in the visible way it is today.
Specifically,
1) when all existing formats are accessible to certain user roles only, and
2) the currently logged-in user does not belong to any of those roles, but
3) a text-format-enabled text field happens to be exposed to that user, then
4) Filter module would logically have no idea what to do, because it has been told to accept user input that needs to be filtered, but there is no format the user is allowed to use.
That's the one and only reason for why the fallback format exists. There are multiple possible ways to reach that situation, so we were not able to say "Stuff that, that edge-case is caused by a bogus configuration, fix your configuration instead." We introduced the fallback format to resolve that problem.
FWIW, that is also the reason for why the plain_text format is super-restrictive and escapes all HTML. It is only used in situations when the current user technically does not have access to any format. That is also why the fallback format ID is not configurable from the UI. It was a pretty big mistake to expose it as a configurable format in the administrative UI and I believe we need to fix that.
That is not the format that anonymous users should use.
Anonymous users should still get a format that is like filtered_html. They should be allowed to enter basic HTML. They should only get one format and should not see a format selector.
I hope this helps to clarify the concepts and situation a bit.
Comment #50
quicksketchI've been commenting in #788114: Unprivileged users should only get one text format by default also, but I'll reiterate it here. The only problem with that approach is that authenticated users would still get the text format dropdown with no way of disabling it (other than to install Better Formats).
Comment #51
Wim Leers#46:
So… are you saying we should work on Full HTML &
allowedContent
*after* this issue, or as part of this issue? I'd incline to the latter.#49 part 2:
+1
#49 part 1:
Absolutely!
However:
No!
First of all, "a strong need for #748980: Allow multiple editor profiles for the same format" seems severely exaggerated. That issue has four followers. It hasn't received a comment since July 2011. If anything, that indicates exactly the opposite of your point.
Secondly, I think you're misinterpreting @wwalc's words. AFAICT, he was arguing it's simply wrong to switch to a different text format to get a preview. That does not imply he thinks it's a good idea (at the bottom of #41 he specifically calls out that idea and indicates he considers it "way too complex", which I agree with).
He said this, after the part that you cited:
This I agree with also.
So, IMHO, what this all really boils down to is this:
So: what prevents us from using CKEditor on all text formats (Restricted, Basic and Full HTML)?
(And note that it's still possible to write/paste manual HTML in CKEditor by using its source mode.)
Conclusion
So it sounds everybody generally +1s (#46 appears to do so) with the conclusion at the end of #44.
The only contentious/tricky aspect is the "switching of text formats" aspect, which is de facto solved by enabling CKEditor for all text formats (Restricted, Basic & Full HTML), assuming we can get consensus that the fallback format really is a last resort that typically (w|sh)ouldn't be exposed to end users.
Comment #52
Wim LeersRetrying to add tag (somehow failed in #51).
Comment #53
quicksketchI think the current patch is good as-is. It accomplishes the original goal and gets a WYSIWYG on by default for all authenticated users.
I think we could indeed enable CKEditor on Full HTML in the event that it's not going to garble script/iframe tags, but I would like to make that issue separate from this one. I'm not sure that a WYSIWYG for anonymous users is really necessary since they're typically only allowed a minimal number of tags. And compared to content creation, comments tend to contain much less formatting/images where a WYSIWYG provides value. I think there may be push back based on page-weight concerns, both in bandwidth and client-side performance, so let's make a dedicated issue to discuss this option.
Regarding the destruction of content by switching text formats, assuming end-user administrators don't grant authenticated users access to Restricted HTML AND Basic HTML (and why would they?),
CKEditor filtering concerns may be mitigated in most situations if Full HTML gets a WYSIWYG also.EDIT: I realized this isn't actually the case, switching from Full HTML to Basic HTML would still potentially destroy content (i.e. if you had a scriptor style tag in your Full HTML content).Anyway, this patch is still ready to go. This patch isn't necessarily dependent upon #788114: Unprivileged users should only get one text format by default, but the Drupal experience is significantly better with both patches applied.
Comment #54
sun#1932544: Remove all traces of fallback format concept from the (admin) UI
Comment #55
Wim LeersStraight reroll of #40 now that #788114: Unprivileged users should only get one text format by default has landed.
Comment #56
Wim Leers(Changing component to ckeditor.module; that component was not yet available when this issue was created.)
Final review
rel=nofollow
setting to enabled.I removed the
filter_url
filter from the "Basic HTML" text format.Follow-ups
Enabling CKEditor for Full HTML is likely going to result in a bikeshed over which buttons should be enabled by default, hence deferring to #1933914: Enable CKEditor in the Standard install profile's "Full HTML" text format.
Related follow-up issues that I just created:
Comment #57
mgiffordLooks good in SimplyTest.me - thanks everyone for getting this in. Will be such a great intro to folks using D8.
Comment #58
Wim LeersNo further feedback for days, marking RTBC to get this on the fasttrack. It would be great to get this committed tonight, because then the worldwide D8 sprints will see CKEditor in action and hopefully start providing feedback for other CKEditor issues. Potentially even report bugs :)
Comment #59
webchickAwesome!! this is *really* going to be helpful at Drupal Global Sprint Weekend for people new to Drupal 8 to be able to test it out!
HOWEVER... it needs a re-roll for standard.info.yml. :(
Comment #60
webchickActually, nevermind. I am living dangerously and hand-editing. ;)
Committed and pushed to 8.x. (I hope ;)). Seems like the text format changes introduced here though could use a change notice.
Comment #61
quicksketchReroll for YAML-style standard.info file.
Comment #63
quicksketchOh, well FINE THEN. ;)
Comment #64
quicksketchp.s. Thank you. :)
Comment #65
tstoecklerLooking at http://drupalcode.org/project/drupal.git/commit/f47ff3e it seems as though the basic_html config file was left out of the commit. Tests apparently pass, but I don't know if that is a good or bad thing :-/
Comment #66
Wim LeersWorse, restricted_html was *also* left out :(
I have also no idea how it is possible that tests still pass, TBH. I'm quite shocked myself…
Comment #67
Wim LeersThese are the three files (all config files) that webchick forgot to commit. They're the same as in the last patch, hence RTBC'ing directly to rectify this ASAP.
Comment #68
xjmFamous last words! :D
Comment #69
Wim LeersOkay, so it's good to see that the tests actually *are* working. #67 contains the fix.
Comment #70
webchickDamn it all. :P This is what I get for trying to be fancy. :P Thanks, Wim.
Committed and pushed to 8.x.
Comment #71
Wim LeersComment #72
Barnettech CreditAttribution: Barnettech commentedI've read through the issue above and would like to help writing up the change notice. But I'm unclear as to what exactly should be in this change notice. Are we enabling ckeditor by default for full html? Looks like the issue mostly fixed some problems with the editor and conflicts with filters. If you give me your thoughts I will write up the change notice.
Comment #73
webchickCKEditor being enabled by default is less important than the fact that core now ships with a new text format by default: Basic HTML, and Filtered HTML was renamed to Restricted HTML. So we should add a change notice about that, and also explain the difference between the two.
Comment #74
Barnettech CreditAttribution: Barnettech commentedHere's a first crack at the change record
SUMMARY:
BEFORE:
In drupal 7 the available text formats were: plain text, filtered HTML, and full HTML.
After:
In Drupal 8 the text formats are:
The first is "Restricted HTML" for use, by default, by anonymous users (renamed from "Filtered HTML")
The second is "Basic HTML" for use, by default, by authenticated users
The 3rd is still "Full HTML", which is the same as existed previously
Comment #75
benjifisher@barnettech:
You left off "Plain text", which is still available. It is now described on admin/config/content/formats as This format is shown when no other formats are available.
Judging from the comments on this issue, I think we should be clear that this change applies to the Standard installation profile.
I suggest adding information about the filters applied by each format. Even though @webchick says it is of secondary importance, I suggest mentioning that "Restricted HTML" has CKEditor enabled.
Here is my suggestion:
Summary
The text formats provided in the Standard installation profile have changed.
Formats by role
By default, the following roles have access to these text formats. The one listed first will be the default. For Drupal 8, Anonymous and Authenticated users have different default formats; and the default for Authenticated users has CKEditor enabled.
Drupal 7
Anonymous: Filtered HTML, Plain text
Authenticated: Filtered HTML, Plain text
Adminstrator: Filtered HTML, Full HTML, Plain text
Drupal 8
Anonymous: Restricted HTML, Plain text
Authenticated: Basic HTML, Plain text
Adminstrator: Basic HTML, Restricted HTML, Full HTML, Plain text
Filters by format
These are the filters enabled in each format.
Drupal 7
Drupal 8
Comment #76
benjifisherForgot to update the status.
Comment #77
Wim LeersThanks, barnettech & benjifisher!
I started from the skeleton that you've provided and improved it in a few locations. I credited you in the revision log message, not in the change notice, because change notices never contain credit.
http://drupal.org/node/1941642
Changes:
filter_html
filter is configured exactly.Comment #78.0
(not verified) CreditAttribution: commentedMinor clarification.