Refactor the core filter.module to provide a more consistent internal API for formats and filters management, remove duplicated code and clean up current code.

Currently there are several calls to hook_filter and to hook_filter_tips withing the same function, this is because of the way that filters are defined and provided by modules through two separated hooks, hook_filter and hook_filter_tips. Multiple invokes to the hook_filter is required to obtain different properties of the filters.

Code in nested switch for $op and $delta results in hard to maintain and debug. Would be good to have filters definitions similar to menu definitions, defined in an associative array of filter properties.

Proposition

hook_input_filter
Introduce a new hook_input_filter to allow modules to provide input filters. The inputs filters would be declared using an associative array (similar to hook_menu).

Remove the concept of numeric deltas and use unique filters' names to define and identify filters.

The hook_input_filter may looks like this:

/**
 * Implementation of hook_input_filters(). Contains a basic set of essential filters.
 * - HTML filter:
 *     Validates user-supplied HTML, transforming it as necessary.
 * - Line break converter:
 *     Converts newlines into paragraph and break tags.
 * - URL and e-mail address filter:
 *     Converts newlines into paragraph and break tags.
 */
function filter_input_filters() {
	$filters['filter_html'] = array(
	  'name' => t('Limit allowed HTML tags'), 
    'description' => t('Allows you to restrict the HTML tags the user can use. It will also remove harmful content such as JavaScript events, JavaScript URLs and CSS styles from those tags that are not removed.'),
	  'process callback' => '_filter_html',
	  'settings callback' => '_filter_html_settings', 
	  'tips callback'  => 'filter_html_tips'
	);
  $filters['filter_autop'] = array(
    'name' => t('Convert line breaks'), 
    'description' => t('Converts line breaks into HTML (i.e. <br> and <p>) tags.'),
    'process callback' => '_filter_autop',
    'tips callback' => 'filter_autop_tips'
  );
  $filters['filter_url'] = array(
    'name' => t('Convert URLs into links'),
    'description' => t('Turns web and e-mail addresses into clickable links.'),
    'process callback' => '_filter_url',
    'settings callback' => '_filter_url_settings',
    'tips callback' => 'filter_url_tips'
  );  
  $filters['filter_htmlcorrector'] = array(
    'name' =>  t('Correct broken HTML'), 
    'description' => t('Corrects faulty and chopped off HTML in postings.'),
    'process callback' => '_filter_htmlcorrector',
    'tips callback' => 'filter_htmlcorrector_tips'  
  );   
  $filters['filter_html_escape'] = array(
    'name' => t('Escape all HTML'), 
    'description' => t('Escapes all HTML tags, so they will be visible instead of being effective.'),
    'process callback' => '_filter_html_escape',
  );  
  return $filters;
}

$filter array key: unique identifier of the filter
'name': Human readable name of the filter
'description': Filter description as provided by hook_filter('description')
'settings callback': Callback function that provides settings for the filter, provided by the current hook_filter('settings', $delta, ...)
'process callback': Callback function that processes (filter) the text to be filtered current hook_filter('process', $delta, ...)
'tips callback': Callback function that provides tips for the filter

For example, the tips for the filter_autop filter would look like this:

function filter_autop_tips($format) {
	return array(
    'short' => t('Lines and paragraphs break automatically.'),
    'long' =>t('Lines and paragraphs are automatically recognized. The <br /> line break, <p> paragraph and </p> close paragraph tags are inserted automatically. If paragraphs are not recognized simply add a couple blank lines.'), 
	);
}

With the new hook_input_filters, hook_filter_tips won't be required.

The database schema should be refactored as well to get rid of numeric deltas.

I am working on the patch to implement the required modifications. Feedback please!

Comments

dropcube’s picture

A discussion about a similar refactoring to the blocks system is in this thread: http://drupal.org/node/257032

dropcube’s picture

For the PHP filter, the hook_input_filters would look like this:

/**
 * Implementation of hook_input_filters(). Contains a basic PHP evaluator.
 *
 * Executes PHP code. Use with care.
 */
function php_input_filters() {
  $filters['php'] = array(
    'name' => t('PHP evaluator'), 
    'description' => t('Executes a piece of PHP code. The usage of this filter should be restricted to administrators only!'),
    'process callback' => 'drupal_eval',
    'tips callback' => '_php_filter_tips', 
    'cache' => FALSE 
  ); 
}
dave reid’s picture

This looks much cleaner. +1 from me.

catch’s picture

+1 from me too, encountered the mess which is filter_filter() when reviewing #299050: Help System - Master Patch.

xano’s picture

Subscribing. I think we should keep and refactor hook_filter() and possibly move hook_filter_tips() to the help system since in essence, the filter tips are help. Although this last task isn't yet possible (one of the major flaws in the old _and_ new help system) I do think we should move to that point.

kika’s picture

- As we changed "input filters" to "text filters" in UI strings, does it make sense to introduce that in this cleanup effort? Or is it killing small furry creatures?
- can *_autop_* be changed into something more descriptive? Cryptic shortnames are not Drupal-way.

sun’s picture

I'm not sure whether I like this proposal. Yes, repeatedly invoked switches are bad, but then again, I'm not convinced that a hook_menu()-alike registry structure is really better. After all, 99% of all filters need all available operations, so you have to define the obvious. Defining the obvious is cruft.

dropcube’s picture

Title: Filter module refactoring » hook_filter_info() and filter DB revamp
Category: feature » task
Priority: Normal » Critical
Status: Active » Needs review
Issue tags: +FilterSystemRevamp
StatusFileSize
new21.81 KB

The patch introduces hook_filter_info(), a hook_menu()-like for defining filters. It also revamps the underlying DB structure, removes the concept of numeric deltas and uses unique filters' names to define and identify filters.

The DB structure looks like this:

Table format: Stores text formats: custom groupings of filters, such as Filtered HTML, defined by admins and configured in filter.mdoule

Field Type Null Default
==============================================
fid int(11) No Primary Key: Unique ID for format.
name varchar(255) No Name of the text format (Filtered HTML).
roles varchar(255) No A comma-separated string of roles; references role.rid.
cache tinyint(4) No 0 Flag to indicate whether format is cacheable. (1 = cacheable, 0 = not cacheable)
weight tinyint(4) No 0 Weight of text format to use when listing.

Indexes:
==============================================
Keyname Type Cardinality Field
PRIMARY PRIMARY 2 fid
name UNIQUE 2 name

Table format_filter: Stores the filters that contain each text format.
Field Type Null Default
==============================================
fid int(11) No 0 Foreign key: The format.fid
module varchar(64) No The origin module of the filter.
filter_id varchar(64) No The filter referenced unique ID within the module.
weight tinyint(4) No 0 Weight of filter within format.

Indexes:
==============================================
Keyname Type Cardinality Field
PRIMARY PRIMARY 7 fid
module
filter_id

Filters are identified by the unique filter_id, which is the key of the definition array in hook_filter_info(). The filter origin module is stored in the DB to perform cleanups when modules are uninstalled, see filter_modules_uninstalled().

IMO, it's not required to store filters definition in DB, may be all the filters structure in cache, we should evaluate what performs better, storing the structure in cache, or running module_invole_all('filter_info') once per request, if it's needed.

It's only a initial patch to get feedback, there are still several issues.

Status: Needs review » Needs work

The last submitted patch failed testing.

damien tournoud’s picture

Collapsing format into fid while there is a filter_id makes the table structure harder to understand. Why not simply renaming format to format_id?

Crell’s picture

Cleaning up the filter API to make it into an info hook gets a big +1 from me!

I would definitely not call module_invoke_all('filter_info') every time. The two main benefits of registry-style info hooks is that they do NOT get re-called every time (which is a performance win and means more code that we can push off to a lazy-include file using the registry and then never call again) and that it's dead-simple to have an alter hook. You should do both, so that this hook behaves the way one would expect.

Remember that Schema API now supports foreign keys, so you can specify those in the schema definition. We don't do much with that information yet, but it should be specified just the same.

I would name the process callback key "process callback", not just "process", to match the standard used by the menu API. It's clearer that you're talking about a callback function. I'm tempted to ask if we could work callback arguments in here, but I'm not entirely sure how. :-)

I just gave it a quick once-over, not an in depth review, but I like the general direction a lot.

dropcube’s picture

Status: Needs work » Needs review
StatusFileSize
new39.41 KB

Attached an updated patch with the following changes:

- DB tables are prefixed with filter_, so filter_format stores formats, and filter_format_filter stores formats-filters relationship.
- Callback definitions keys are 'process callback', 'settings callback', and 'tips callback', as noted by @Crell.
- filter_format table primary key is format_id, consequently, $format_id references a format id, while filter_id references a filter id.
- Revamped and cleaned code in filter.admin.inc, required to work with the new DB structure..

dropcube’s picture

Title: hook_filter_info() and filter DB revamp » hook_filter_info() and filter module revamp

Status: Needs review » Needs work

The last submitted patch failed testing.

dropcube’s picture

Status: Needs work » Needs review
StatusFileSize
new73 KB

Updated patch with some fixes and tests.

Status: Needs review » Needs work

The last submitted patch failed testing.

dropcube’s picture

Status: Needs work » Needs review
StatusFileSize
new91.4 KB

Updated patch, removing hook_fiter_info and fixing tests that fail.

dropcube’s picture

StatusFileSize
new91.99 KB

updated patch

dropcube’s picture

Title: hook_filter_info() and filter module revamp » Filter system revamp
Status: Needs review » Needs work

The last submitted patch failed testing.

sun’s picture

Status: Needs review » Closed (duplicate)