OK, here's my first patch. I tried to follow all the docs and ask around to learn how best to submit this. I asked for discussion in http://drupal.org/node/14481 but didn't get any. Is the README.txt in the contributions CVS module more correct (I checked into my sandbox this patch) or is http://drupal.org/patch more up to date? Am I over analyzing? :-)

/* $Id: README.txt,v 1.2 2004/12/17 04:00:06 grantbow Exp $ */

This is a patch to filter.modules that adds "long" "tips" so they appear on the ?q=filter/tips page. This same approach is code-wise identical to how the existing PHP filter is implemented in the same file. The text explains default HTML tags. A sample is at http://softwaremanagers.org/filter-tips-sample.html Discussion in 'Usability and accessibility' at http://drupal.org/node/14481

Applying is simple: 'patch filter.module filter.module.patch'.

I would also appreciate any constructive feedback and hints as this is my first submitted patch to the project.

Comments welcome.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Steven’s picture

1) Using sandboxes for patches is deprecated. Attach the file(s) here instead.

2) It is useless to provide a static piece of help for the default tags, as a site maintainer can change the allowed tags easily. At the very least the code should check if the tags it discusses are actually allowed.

3) There are several spelling errors in here as well as missing semi-colons at the end of entities. IE will render these fine, proper browsers won't.

4) Using UTF-8 means usage of HTML entities is not required for non-ASCII characters, but only for ampersands and angles, as well as quotes in HTML attributes. Users should enter the characters literally rather than using entity escapes. Using entities for them is wasteful and deprecated.

5) The help seems to be structured rather chaotically. Wouldn't it be more useful to provide a nice, structured list of markup examples with the HTML snippets escaped in a pre block?

6) I would much rather not link to unofficial sites for reference. "Builder.com.com" is not an official site like w3.org and has ads on it.

7) Proper typography dictates only one space after a period. Two spaces is wrong, it is an artifact of the typewriter age propagated by bad dactylography teachers. On top of that, it is wasteful as HTML compresses series of whitespace into one space anyway.

grantbow@civicspacelabs.org’s picture

Thanks for taking the time to take a look and comment, Steven. I appreciate your feedback. My intent was to provide "kinder, gentler" documentation for those that may be less familiar with HTML than you and I. Of course I need something similar to this patch for my own site's users but I think addressing something so simple as help text would help address usability comments. Dan Gilmore is a reasonably well respected journalist.

1) The reason I checked in my patch to my sandbox is that the README.txt file is one of two files specifically mentioned in my new CVS account email from Dries, . Thanks for confirming that using sandboxes for patches is deprecated. Can you or someone else please update the README.txt file?

2) I'll work on a new patch that has help text for each specific tag and shows them if they are configured for use. I'm thinking that a system similar in nature of how the filters show their own text automatically should fit the need. My short term goal was getting something into Drupal that gave more feedback than is currently there and to begin the discussion about the contents.

3) After checking each entity I found exactly one that's missing a semicolon. My aspell also found I had mispelled one word: hexadecimal. Do you see others? Thanks for catching those mistakes I made.

4) Regarding the use of entities, my goal was to provide simple instructions that any drupal user can use to input text they would like. I would be happy to use any alternative text you or anyone else can provide. Since using entities is fully supported by current web specifications and Drupal end users who know HTML might want to use them I thought that including them would be best. Additional text about directly entering UTF text is not something I've extensively tested across browsers and Drupal versions. I feel that whatever works should be included in these instructions somehow, allowing the users looking for help in entering text to understand their options.

5) I chose the help structure to exactly match the order of tags used by my default installation. I agree that a better structure and <pre> blocks would work better and be more clear.

Other similar types of projects have attempted to teach their users end HTML (or other markups) such as TWiki.org's descrption and table. Layout wise (not content since it's TWiki specific), do you think either of these examples is helpful?

6) The reason I chose to propose linking to builder.com is because it had useful beginners guides to HTML. "Unofficial" or not it has helpful information. Terse w3.org specs are less helpful to people that may potentially be less familiar with standard HTML tags and their use, but of course not including them would be a gross omission for those wanting to understand the details and for more experienced Drupal end-users. If we write other guides or have alternative sites that help beginners to learn HTML then we should, of course, include those instead.

7) The next version of this patch will include only one space between sentences.

Thanks again, I'll work on the code necessary for a dynamic help system for each tag in the list. I would appreciate feedback from anyone who proposes specific text for each tag. I would also appreciate some feedback from those that will make the decision on whether to apply a future version of this patch.

grantbow@civicspacelabs.org’s picture

FileSize
13.71 KB

Here's a new patch against the latest CVS that should address the concerns raised previously. I've also updated the HTML demonstration page. The alternating light and dark classes for the table rows doesn't show for the pushbutton theme. Comments welcome.

Steven’s picture

I like the improvements, but this patch is still not up to snuff:

- You use (explode("> <", $allowed_html) for parsing the tags supplied by the user. This is bad as the space between tags is optional. It is much better to use preg_match_all('/<([a-z]+)[^a-z]/i', ...), as this matches the exact parsing that strip_tags() applies itself.

- The order of the tips depends on what order the admin typed in. It is probably preferable to present them in a fixed order, going from simple to complex tags, grouping related tags together (e.g. table/tr/th/td as "tables", "ol/ul/li" as "lists").

- The tables should be presented using the standard table theme functions (don't worry about the alternating colors, none of the core themes have that as far as I know).

- There is an unnecessary space around the table headers and the capitalization is inconsistent (only first letter of the line vs. all first letters). There are also still some double spaces in the text. Also beware the usage of spaces in the code. Drupal code uses "string". $var ."string", not "string".$var."string".

- Instead of the repeated if ().... else if()... statements, just put all your help in a giant array which is indented properly:

array('br' => t('help text for br'),
          'h1' => t('help text for h1'), 
           ...);

Then you make a simple loop which outputs the specific help. This is much easier to read and much easier to modify later on.

- It is preferable to use drupal_specialchars() rather than htmlspecialchars(), so we can easily affect all entity parsing across drupal by modifying just one function.

- I still don't see the point of the entities table except for < and &. Most people have all the symbols they need on their keyboard, and most operating systems provide a character map to enter any Unicode character. There are tons of IME's and IME-like utilities for using quick key combinations to enter rare characters too.

- You should avoid the transitional valign="top" attribute and instead use CSS. For example, give the table(s) a class of "filter-html-tips" and have a table.filter-html-tips td { vertical-align: top; } CSS rule.

Sorry if I sound like a nitpick, but Drupal has high standards when it comes to code. I still want this patch to go in once improved, as it showcases the power of dynamic filter tips.

Steven’s picture

My main problem with the entities table is that it will never encompass every character. Sites about copyrights might need to use ©®™ a lot (all entered without typing any character codes), sites about currencies might need those characters a lot.

The solution to Unicode usage is not in teaching people a long list of HTML entities, as this is useless in any other context, but making characters easier to enter directly. Accent combination keys, tertiary symbols on a key, IMEs, inline dynamic character maps, etc. Those enable people to type symbols rapidly and easily. This is something which needs to be done on the client OS side, and is not really our concern.

grantbow@civicspacelabs.org’s picture

FileSize
10.68 KB

OK, this patch is coming along. Comments welcome.

- Thanks for the regular expression, it works great.

- The order of the tips is purposely in the order provided in $allowed_html. Consistency of order is a good thing for those not very familiar with HTML. It allows anyone to know to find a particular tag near the top or near the bottom of the table based on the order displayed by the string. A fixed order would take away flexibility easily afforded to site administrators and impose truly arbitrary ordering with no way for a site administrator to change it.

- The table theme functions are extremely nice. I used the two examples in the forum.module as guides.

- Thank you again for pointing out two spacing and a couple capitalization errors. I fixed as many as I could find. I am sure you can fix any I miss if you choose to accept some version of this patch.

- The array arrangement is much nicer and is required for using the table theme functions.

- I snuck in one more drupal_specialchars() call lower down. I know it's preferable to keep patches of different functionality separate but it was just one. I originally used the existing coding convention used in the file.

- I reworded the entity description. I hope it makese more sense. The point of help instructions is to provide as much concise information as possible about functionality that works in practice regardless of what should work or ought to work. I would also love to see every system on the planet using UTF-8 as a standard character set out of the box, though I have heard on good authority that even UTF-8 doesn't cover everything yet. I think reality will take awhile to catch up; lowest common denominators, like good old 7 bit us-ascii that's identical in most character sets, will need to be allowed until it does.

- Moving to the table functions eliminated the need for playing with CSS.

I understand all about coding standards and preferences. Each project has their own standards for very good reasons. I'll keep submitting my first patch until I learn the ropes. I greatly appreciate your last comment, giving me some indication that this patch will be useful at some point.

Steven’s picture

The problem with Unicode is purely on the input side. Every browser out there supports UTF-8 perfectly, and UTF-8 covers the entire Unicode Character Set, which in turn can represent text in any other encoding. In fact, unlike UTF-16 (the encoding used in Windows and Java), UTF-8 can expand to the full 31-bit space while UTF-16 is limited to 21-bit. Though I don't think either limit will ever be reached in any meaningful timespan ;).

In any case, your 'good authority' needs to check his facts.

The two main roadblocks are:
1) getting good font coverage (not so bad given that dynamic font mapping is built-in to browsers and some OSes)
2) providing a way to input the characters. Most systems have a character map, some offer more advanced options. But most people are not aware of them and don't use them. I do not think teaching people a list of entity codes to use is good. They are not converted into characters until the output is displayed, so they are user-hostile during editing.

You might be interested in a little app I wrote, I've considered making a Javascript version of it:
http://www.acko.net/blog/sprankle

Steven’s picture

I like the current incarnation of the patch already though. Some ideas:

- Instead of having an escaped and non-escaped version of each code snippet, just store the unescaped version and run it through drupal_specialchars when needed?

- You can define arrays implicitly with numerical indexes. You don't need to specify them. array('foo', 'bar') is equivalent to array(0 => 'foo', 1 => 'bar'). You use this already in some places, you should do it in every situation where the array keys are not important.

- Instead of this:
+ array('data' => t('No help provided for tag '. drupal_specialchars($tag)), 'class' => 'description'),
+ array('data' => t(''), 'class' => 'type'),
+ array('data' => t(''), 'class' => 'get')
You should define one column with colspan 3.

- There is a commented out statement in there.

- Instead of $s consider naming the variable $output or something. It's more readable. Also, statements like this should be avoided:
+ $s .= t('
');
Spacing and margins are up to the theme and should be handled through CSS, if needed, in drupal.css or a specific style.css.

Bèr Kessels’s picture

I am working on a way to allow theming and/or removing the filter tips under the textareas. Should I slip that in here, or do you prefer that to become a new issue?

grantbow@civicspacelabs.org’s picture

FileSize
10.48 KB

v4. Happy Holidays.

Dries’s picture

I'm not sure about this patch (regardless the remaining coding style issues).

The help texts shouldn't mention 'Drupal': the average visitor does not now what Drupal is. Words like 'UTF-8' and 'character encoding', and the links to the raw HTML specs can be debated too. It reads like a help text for power users or web developers (who probably don't need any in first place).

Do we need examples for each tag? Maybe it's better to use simple <acronym>s?

grantbow@civicspacelabs.org’s picture

The current code provides users with only the tag list. This is not very helpful to those that don't understand how to use tags. I'm trying to a) improve my own site and b) improve the Drupal end-user experience for ALL users. Online help text is a key way to provide support for users. The tips are specifically focused on HTML tags that are available. Tips for other filters should be added to the other filters. My first patch was specifically similar to the existing PHP filter example. Providing additional links for those that want to dive in won't hurt the user experience of most and will aid those that would like additional detail of what they can type in their own text entry box.

I believe examples, given in the context of the way readers of the page want to use them, are the clearest way to explain any new concept, especially to those without prior knowledge.

I hope people can weigh in sooner (rather than later) with specific alternate examples of what they would prefer to see so I can respond. Changing the link from Drupal to the site's own URL is simple enough after I find the right variables. Rewriting to remove mention of Drupal and use "this site" should not be hard. More text about the concept of character encoding seems appropriate. Adding the use of drupal_specialchars() instead of duplicated data in the array is easy enough.

I'll keep submitting my patches (after Christmas) until this goal is reached, balancing all input until there is agreement among all interested parties. Short term I will (of course) incorporate the outstanding items mentioned and we will see how close we are to an acceptable result. Comments encouraged.

grantbow@civicspacelabs.org’s picture

FileSize
8.87 KB

For review.

grantbow@civicspacelabs.org’s picture

FileSize
8.44 KB

Thanks to Dries and Steven for the IRC discussion today. I think this patch addresses most of the issues. We also discussed adding a toggle to function _filter_html_settings to disable the help text, useful when the HTML filter is used along with other filters and modules like htmlarea, textile, etc.

One concern about this patch was about bloat. Are there others that share this concern about this patch? Comments welcome.

grantbow@civicspacelabs.org’s picture

FileSize
8.45 KB

I realize end-user text like this is rather dry, but for end users that need it the value is fairly significant. This version works with the &lt;h1&gt; type tags, changing from [a-z]+ to [a-z0-9]+ in the regex.

Obviously I do not feel this patch is bloat. I don't think it is extraneous to help users (of various levels) understand how tags look in whatever theme they are using, but I do understand the concern. Changes to the text could be might make this long filter tip work better in more complex scenarios where this is not the only filter in use. This might also minimize the need for (and would be easier than) a special toggle.

grantbow@civicspacelabs.org’s picture

FileSize
8.63 KB

More additional requirements met (like proper use of t()) after IRC discussions. I've also updated the sample at http://softwaremanagers.org/filter-tips-sample.html Comments welcome. Happy New Year!

grantbow@civicspacelabs.org’s picture

Steven’s picture

I added a toggle to make the help optional as discussed, shortened some of the texts, tweaked the t() usage here and there (trying to find a balance between avoiding HTML inside t() and keeping text that belongs together in one piece) and tweaked the coding style a bit.

Committed to CVS/HEAD.

Anonymous’s picture