In #414424: Introduce Form API #type 'text_format' the text format handling in D7 was fixed and greatly improved. However, one feature of the new system is that the default behavior is to always put the submitted text format in a standalone 'format' property within $form_state['values']. In many cases, this winds up at the top level where it can conflict with other elements on the form.

The main benefit of this approach is that it's the place where a majority of use cases want it to be. Thus, it's more likely that form submit handlers can continue to do things like this when they have text formats in them:

drupal_write_record('some_table', $form_state['values']);

without having to massage $form_state['values'] beforehand.

However, the disadvantage is that anyone who uses hook_form_alter() to add a text-format-enabled textarea onto an existing form has to find, read, and completely understand this code comment in the PHP Doc of filter_process_format():

+ * If multiple text format-enabled elements are required on the same level of
+ * the form structure, modules can set custom #parents on the original element.
+ * Alternatively, the #after_build may be unset through a subsequent #process
+ * callback.

This is really easy to get wrong, even in very simple uses of hook_form_alter(). For example, code like this:

/**
 * Implements hook_form_FORM_ID_alter().
 */
function mymodule_form_block_add_block_form_alter(&$form, &$form_state) {
  $form['mymodule_extra_field'] = array(
    '#type' => 'text_format',
    '#title' => 'Whatever',
  );
  $form['#submit'][] = 'mymodule_custom_submit_function';
}

looks fine, and from the perspective of the module author will work fine; on the form they will get a new textarea with text format selector attached, and in their submit handler will receive $form_state['values']['mymodule_extra_field'] storing the entered text, and $form_state['values']['format'] storing the chosen text format.

However, this code actually leads to a major problem. The selected text format for this new element will additionally clobber and replace the one selected for the custom block body (which also appears on that form). When the form is saved, the custom block body will therefore be silently assigned the incorrect text format, which is a critical bug and in some cases a security issue. I think it is very difficult for the module author to track this down unless they know about it.

It's true that Drupal 6 had a similar issue where the default behavior was to put 'format' on the top level of the form. However, (a) this is still bad, and (b) I think the difference is that there you were a fair amount less likely to introduce this bug on your own because in that case, you had to be more explicit and your code wouldn't work if you did hook_form_alter() the naive way. For example, in D6:

  $form['mymodule_extra_field'] = array(
    '#type' => 'textarea',
    '#title' => 'Whatever',
  );
  $form['mymodule_extra_field_format'] = filter_form();

would simply never pass anything at all into $form_state['values']['mymodule_extra_field_format'], so your code would be broken if you left it like that.

To solve this, I would suggest:

  1. Going back to the earlier proposal discussed in #414424: Introduce Form API #type 'text_format', which would always put the text format somewhere in the array structure where it would not interfere with other text formats; e.g., in the above D7 example, you would get $form_state['values']['mymodule_extra_field']['value'] and $form_state['values']['mymodule_extra_field']['format'] in your form submit handler.
  2. If that is not possible, at least documenting much more clearly and prominently that this is a gotcha and simple ways to get around it. For example, I think the simplest way to get around it might be to encourage anyone using hook_form_alter() to add something with a text format to always put it in a wrapper. Code like this does not experience the bug I described above:
    /**
     * Implements hook_form_FORM_ID_alter().
     */
    function mymodule_form_block_add_block_form_alter(&$form, &$form_state) {
      $form['mymodule_wrapper'] = array(
        '#tree' => TRUE,
      );
      $form['mymodule_wrapper']['mymodule_extra_field'] = array(
        '#type' => 'text_format',
        '#title' => 'Whatever',
      );
      $form['#submit'][] = 'mymodule_custom_submit_function';
    }
    

    i.e., because it results in $form_state['values']['mymodule_wrapper']['mymodule_extra_field'] and $form_state['values']['mymodule_wrapper']['format'] in the submit handler, it will not interfere with other form elements by default.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

David_Rothstein’s picture

Priority: Normal » Critical

Tentatively putting this at critical based on the potential for it leading to critical bugs and security problems, but I would understand if someone put it back.

webchick’s picture

Subscribing. Thanks so much for the thorough explanation. I'll need to re-read it a couple of times to get it, but will do so after some sleep. ;)

andypost’s picture

$form_state['values']['mymodule_extra_field']['value'] and $form_state['values']['mymodule_extra_field']['format']

As for me it seems more intuitive

sun’s picture

Still not sure whether this is critical.

...would simply never pass anything at all into $form_state['values']['mymodule_extra_field_format'], so your code would be broken if you left it like that.

And since that would pass into $form_state['values']['format'], you would also immediately see the bug, because none of your blocks would use the format you selected.

In both cases, your altered-in field won't have a format assigned. In HEAD, you would see a PHP notice, because you try to access a form value that does not exist.

catch’s picture

Priority: Critical » Normal

If you get a notice from the broken code, then this isn't critical IMO, and there's only the potential of a bug here, not an actual bug. A notice should be enough to let you know you're doing something wrong and fix the issue before anything worse happens,

David_Rothstein’s picture

Priority: Normal » Critical

As I said above, I'm OK with moving this out of critical - however, it has to be for the right reasons :) The scenario I described for Drupal 7 does not generate a PHP notice at all, nor any problem at all that the person writing the module would ever easily notice. All it does is (silently) replace the chosen text format for some other module's textarea with one the user did not select.

Gábor Hojtsy’s picture

I'd agree with @andypost from #3, the purpose of adding the #text_format type was to remove magic, and looks like the magic it replaced the previous magic with causes more security issues, so why not retract to an array structure by default then? It is already different to the non-format field, and then it will be different based on parenting in the current state, so you could have 3 different data formats based on the use of formats. Why not reduce that to two?

sun’s picture

Issue tags: +Security improvements

The purpose of #type text_format was not to remove magic, but to remove an invalid and impossible simple conversion from plain-text to filtered input and vice-versa. The previous #text_format property suggested that things could be changed that easily.

I'm not strictly opposed to do 1), but as mentioned in the original issue, most of the code in Drupal core and contributed modules assumes 'format' next to the original form element's value and would partially require some heavy lifting of form values to get the values back (and forth) into the locations where they would be with this change.

Security-wise, technically, and logically, there is no difference to how this worked in Drupal 6 and below. And I think we survived.

But again, if someone is willing to work on this (monster) patch before we reach beta stage, I'll happily review it.

duellj’s picture

Status: Active » Needs work
FileSize
4.45 KB

Unless I'm mistaken, this seems like it's not quite a monster patch, if we do go with solution #1.

filter_form_after_build() moves the children of the 'text_format' element only if element['#columns'] isn't present. Fields sets the #columns attribute, so fields are already expecting values in the format of $form_state['values']['text_format']['value'] and $form_state['values']['text_format']['format']. So only direct form calls to the text_format element are expecting the side-by-side values. It looks like only taxonomy and block call text_format directly from forms.

So would it be as simple as doing away with filter_form_after_build and changing the way taxonomy and block handle their respective form_values? Here's the start of a patch that does that. If this is the right direction, then the documentation of filter_process_format() needs to be updated as well.

tstoeckler’s picture

Status: Needs work » Needs review
FileSize
4.44 KB

Reuploading without whitespace and setting 'needs review' to see what the bot says.

David_Rothstein’s picture

Neat! It seems to work based on light testing, and the explanation in #9 makes sense...

In addition to filter_process_format(), it looks like block_custom_block_save() would need some documentation updates as well.

I also wonder if taxonomy_form_term_validate() should make sure to convert the data into the correct format, to be safe. I don't think it matters for core, but other modules that attach field validation functions onto this might theoretically care.

Dries’s picture

Status: Needs review » Reviewed & tested by the community

I'm OK with this patch. I'm not convinced this is critical though -- certainly is a major problem but let's leave it at critical for now. I'm going to mark this RTBC and hopefully there will be some additional reviews.

duellj’s picture

Status: Reviewed & tested by the community » Needs review
FileSize
6.51 KB

Here's some documentation updates for filter_process_format and block_custom_block_custom_block_save.

I also wonder if taxonomy_form_term_validate() should make sure to convert the data into the correct format, to be safe. I don't think it matters for core, but other modules that attach field validation functions onto this might theoretically care.

Good point, it might be confusing if other modules are expecting form_values to be the same as the $term data. This patch doesn't include this change, just want to make sure this is something that's desirable.

Dries’s picture

Status: Needs review » Fixed

Thanks for the documentation improvements. I'm going to commit this as it fixes a critical bug -- I had previously marked it RTBC. We can come back to the finer points in a follow-up, if necessary. Thanks all.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.

sun’s picture