Problem/Motivation
For the Configurable Help Topic system (see #2351991: Add a config entity for a configurable, topic-based help system and my sandbox project https://www.drupal.org/sandbox/jhodgdon/2369943 which is developing it), I need a way for a config entity to have a "Body" field containing formatted text.
The obvious way to do this is just to put the "body" into one big HTML string, but this is problematic. The main reason is that strings on config entities get translated (on localize.drupal.org if they are provided in Core), and having large translated strings with HTML tags in them poses problems:
- Translations get invalidated if any part is changed. So it is better if the strings are shorter rather than the entire text of the "body", so that if one sentence changes, only one shorter string is invalidated.
- Long strings take longer to translate and review for translators, and are harder to get right (one error in one place means it can get rejected).
- It is difficult for the translators to get all the HTML tags right. They can get corrupted or become unmatched.
So, rather than having one big string with HTML embedded in it, it is preferable to have shorter strings that are about a paragraph in length or so, and preferably without the HTML tags.
Besides config entities, it would also be desirable to use this for hook_help() implementations, which currently do things like:
$output = '';
$output .= '<p>' . t('First paragraph of help text.') . '</p>';
return $output;
This keeps the HTML tags out of the translatable strings and shortens them, but it is ugly. You cannot even do that in a config entity, though -- you need some kind of system for splitting up the string and abstracting out the HTML tags.
Proposed resolution
Related issue #2697791: Add Plugin system for abstracted HTML-formatted text is exploring a plugin system to do this.
On this issue, the idea is to use some type of markdown format. We would need to do the following to get this done:
a) Choose a particular flavor of markdown. On #2697791-3: Add Plugin system for abstracted HTML-formatted text, bojanz suggested CommonMark; however, it currently lacks a syntax for definition lists (DL/DT/DD in HTML terms). You could embed the HTML tags directly, but having to include HTML tags defeats the purpose of this whole idea, and these lists are very common in hook_help() implementations and help topics in general.
b) Add a PHP library that can parse/render this markdown flavor to the vendor area of Core (as a composer dependency).
c) Document a way to separate a long string of (text + markdown) formatting into smaller strings for translatability. The idea would be similar to what is used in #2697791: Add Plugin system for abstracted HTML-formatted text : each "paragraph" or "section" of markdown-formatted text would be a separate string. Developers or config entity editors would choose how to split it up.
d) Figure out how to express this in a render array (define a render element etc.).
e) Figure out how to express this in a config entity schema/renderer (should be fairly simple).
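As a rough illustration of items c) and d), here is a minimal sketch in plain PHP. Everything in it is an assumption made for illustration: translate_chunk() is a stand-in for a real translation lookup, and the 'markdown' element type and its '#chunks' property are hypothetical names, not something that exists in Core.

```php
<?php
// Hypothetical help body stored as separate markdown chunks, one per
// translatable string. None of these names are existing Drupal APIs.
$chunks = [
  '## About',
  'The Example module lets you configure *widgets*.',
  '- Create a widget on the Widgets page.',
];

// Stand-in for a translation lookup (assumption): returns the translation
// if one exists, otherwise falls back to the source string.
function translate_chunk(string $source, array $translations): string {
  return $translations[$source] ?? $source;
}

// Only one chunk has been translated so far.
$translations = [
  '## About' => '## À propos',
];

// Translate chunk by chunk, then join with blank lines so that a markdown
// parser would treat each chunk as its own block.
$translated = array_map(
  fn (string $chunk): string => translate_chunk($chunk, $translations),
  $chunks
);
$markdown = implode("\n\n", $translated);

// Hypothetical render array; the element type and property are assumptions.
$build = [
  '#type' => 'markdown',
  '#chunks' => $translated,
];
```

Because each chunk is its own source string, changing the wording of one paragraph would invalidate only that paragraph's translation.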
Remaining tasks
a) Figure out all of the unknowns in the Proposed Resolution.
b) Convert a hook_help() implementation in Core to use this, as an illustration of how much nicer it is than returning HTML strings.
c) Make a patch.
d) Try using this for the Configurable Help config entity and verify it is workable for the UI and config schema, as a viable replacement for the "text sections" plugin system and element of #2697791: Add Plugin system for abstracted HTML-formatted text.
e) Write a change notice for the new stuff.
f) Decide on #2697791: Add Plugin system for abstracted HTML-formatted text vs. this issue (which one is better), and get one of them committed to Core.
User interface changes
None.
API changes
API addition: a new markdown render element, plus a markdown library added to the vendor area. No changes to existing APIs.
Data model changes
No.
Comments
Comment #2
jhodgdon: Whoops, I guess you get all the issue metadata when you clone an issue. ;)
Comment #3
bojanz (Centarro): Same way as in the plugin approach, by having the config entity field have multiple values/paragraphs.
Comment #4
jhodgdon: Ah, so you mean you would have the admin/user edit the "body" as one big text field, and then upon save, separate at paragraph breaks (probably defined as two newlines in a row), or something like that?
Or would you give them separate "body paragraph" items, with an "Add new paragraph" button, the way I did in Configurable Help with the "sections"? (This is working in my sandbox module referenced in the issue summary -- for each section, the admin/editor chooses the section type, like header, paragraph, or bullet item, and then has a text field to put the text in, with a standard Drupal text format.)
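The first option (one big text field, separated at paragraph breaks on save) could look roughly like the sketch below. The function name split_body_into_chunks() is hypothetical, and treating whitespace-only lines as blank is an assumption rather than anything decided in this issue.

```php
<?php
// Hypothetical helper: split one big body string into paragraph chunks at
// blank lines (two or more newlines in a row).
function split_body_into_chunks(string $body): array {
  // \R matches any newline sequence (\n, \r\n, \r); [ \t]* lets lines that
  // contain only spaces or tabs count as blank.
  $parts = preg_split('/\R(?:[ \t]*\R)+/', trim($body), -1, PREG_SPLIT_NO_EMPTY);
  if ($parts === false) {
    // preg_split() returns false on a regex error; treat that as no chunks.
    return [];
  }
  // Trim stray whitespace around each chunk and reindex the array.
  return array_values(array_map('trim', $parts));
}

$chunks = split_body_into_chunks(
  "First paragraph.\n\nSecond paragraph\nstill second.\r\n\r\nThird."
);
```

A single newline inside a paragraph does not start a new chunk, so hard-wrapped text stays together as one translatable string.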
Comment #5
bojanz (Centarro): This, individual sections, except you don't need a section type cause it's all markdown.
Comment #6
jhodgdon: OK, good. Adding this idea to the issue summary.
Comment #7
jhodgdon: Whoops, wrong issue links in several places in the summary.
Comment #8
jhodgdon: Any comments on the merits of this approach vs. #2697791: Add Plugin system for abstracted HTML-formatted text?
Comment #9
jhodgdon: So one thing that occurs to me with this approach... Let's say I'm editing a help page using this, and I am defining a bullet list. My "chunks" might look like:
a)
b)
etc.
Then you might have a numbered list, which depending on the flavor of markdown being used, might look like:
c)
or
or whatever the formatting code for numbered list is.
So... The problem is that the translator has to get the formatting code right. Depending on the markdown flavor, this might include a character that isn't recognized unless it is in the correct position in the line (all the way left, indented, etc.). Or they might need to have the correct number of === signs in a row to get the header level right. Etc. So, it puts the responsibility back on the translator to get the formatting stuff right. They aren't doing HTML tags at least, and Markdown formatting stuff is probably easier, but it still could be a problem.
With the plugin approach being discussed on #2697791: Add Plugin system for abstracted HTML-formatted text, the "what type of chunk is this" is taken out of the chunk itself. So that might be one point in favor of the other approach.
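To make the translator concern concrete, here is a minimal sketch of a check that a translated chunk still starts with the same formatting marker as its source. The helper names are hypothetical, and the markers it recognizes (ATX-style "#" headings, "-", "*", "+" bullets, and "1." or "1)" numbers at the very start of the chunk) are a simplifying assumption; the markdown flavor has not been chosen, and real flavors also allow things like indented markers and "===" underlines.

```php
<?php
// Hypothetical helper: return the leading block marker of a chunk
// (heading hashes, a bullet character, or an ordered-list number),
// or '' if the chunk does not start with one.
function leading_marker(string $chunk): string {
  // The marker must sit at the very start and be followed by whitespace.
  if (preg_match('/^(#{1,6}|[-*+]|\d+[.)])(?=\s)/', $chunk, $matches)) {
    return $matches[1];
  }
  return '';
}

// Hypothetical check: does the translation keep the source's block type?
function translation_keeps_marker(string $source, string $translation): bool {
  return leading_marker($source) === leading_marker($translation);
}

$ok = translation_keeps_marker('- Create a widget', '- Créer un widget');
$broken = translation_keeps_marker('- Create a widget', 'Créer un widget');
```

In this sketch $ok is true and $broken is false: dropping the bullet marker silently turns the list item into a paragraph, which is the kind of error the plugin approach avoids by keeping the chunk type outside the string.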