I realized while trying to figure out the right way to properly abstract the .tpl.php files that I had not gone quite far enough with the theme registry. This follow up patch rectifies this oversight:

It adds One More _variables_* function, the 'theme_variables_HOOK' function that a module can implement. This serves as the default place to transform business logic into presentable data. It also moves the discovery of these functions to cache time, saving a few cycles during the actual rendering of the template. And it creates a simple variable that a theme can use to tell Drupal to *not* use the ancestor _variables_ functions (in the case that it is doing something totally different, and instead of replacing inherent functionality it wishes to completely override. There are some good use cases for this: in particular, dvessel mentioned that if his theme wants to add a stylesheet, he doesn't want to have to run drupal_get_css() twice. Likewise, if a theme wants to do something with theme('status_messages') it can't, because theme('status_messages') has the 'side effect' of clearing the status messages when it retrieves them. Most of the use cases for this are purely for performance: It allows the theme to skip running code that won't be used anyhow).

Most of the remaining code in phptemplate.engine is then moved into theme.inc (except for the comment related stuff which is in comment.module). The remaining templates are put in the system module folder (or comment.module folder).

phptemplate.engine is now truly tiny, and it is now a little bit easier to abstract theme functions into .tpl.php files.

Applying this patch will require a patch to the hook_theme() documentation in contrib.

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

Steven’s picture

Let me just say: HALLELUJAH!

We've long needed a split between the code needed to transform data into output, and the code that transforms output into themed HTML. Not just to make life easiers on designers, but for security reasons.

I've longed for the situation where no original data would be passed into the theme functions, only safe-for-output escaped text.

The problem with that until now has been two-fold:

  • Theme functions in modules as well as themers often need complicated logic (sometimes with good right), which means they need access to actual data (e.g. the raw $node object).
  • Doing escaping in the theme layer (where it belongs) puts the burden on the themer, who is generally less knowledgeable about security than a module coder and knows less about where the data comes from.

So in theory, before this patch, a module should have been passing in safe-for-output values into the theme functions, as well as any raw data that may be needed. Obviously this is not the case and would result in bloat.

However, this has not been a problem for most of core, exactly because we had the phptemplate default handlers for nodes, comments and such which pre-processed the original data into safe output for the templates, while the _variables hooks let you inject custom logic and queries relatively cleanly.

For every themable function that is to be 'tpl-ified', all the logic that transforms/escapes data would need to be separated from the logic that renders output and tags. The only bridge between the two would be a fresh array of values. In the case of e.g. node bodies and CCK, this means providing an array of the field values, pre-escaped/filtered. Doing any sort of actual data logic in the tpl.php becomes impossible: you need a custom theme_variables_HOOK() for that.

In this system, implementing theme_variables_hook() would also not pass any original arguments into the tpl.php. The hook acts like a filter (in the non-drupal sense), and it creates an 'upper class' of safe theme functions that are easily audited for proper escaping.

This is why the idea of "escape on input" has come up so often when talking about XSS problems: for input, there was already a centralized location to patch. For output, there was only a bunch of module code. With the new policy, a module's output would consist only of calls to either API functions (like l()) or themeable functions, whether directly or through e.g. Form API. All escaping takes place in theme_variables_HOOK(), which is relatively safe to audit.

We should also look at some of the logic that surrounds themable functions and see if some of it isn't display logic that can be moved down the chain.

Obviously a module is still free to write unsafe code, just like they are free to put user data into db_query(). And a themer can still use the entire Drupal API inside a tpl.php. But it gives us a badly needed 'official coding pattern' for output escaping. This even lets e.g. a designer stick only to the tpl.php while a coder does the theme_variables_HOOK() work (think flatforum-level enhancements). It also means drop-in snippets for core themeable functions can be a lot simpler and safer to write and share.

We could phase it in progressively after this patch, converting core as we go. The downside is that this radical split (i.e. not just a 'clean up' by taste) is sure to cause some extra slowdown—the question is how much.

Regarding that 'override variables' option: if you instead always ignore the ancestor's variables, the hook function itself is free to first call theme_variables_HOOK() directly to inherit them, no? This is how normal theme overrides have worked until now: they replace, not extend. It would also be a bit more direct and not require a phptemplate_theme() implementation in the template.php. Themers who need to do advanced overrides are going to need to read some docs or look at existing examples any way.

Dries’s picture

I agree with Steven -- this has a lot of potential, to the point it makes me all gibberish (in a good way).

The security aspect of this is key, so let's think about this a bit more so we can get most out of it.

merlinofchaos’s picture

Regarding that 'override variables' option: if you instead always ignore the ancestor's variables, the hook function itself is free to first call theme_variables_HOOK() directly to inherit them, no? This is how normal theme overrides have worked until now: they replace, not extend. It would also be a bit more direct and not require a phptemplate_theme() implementation in the template.php. Themers who need to do advanced overrides are going to need to read some docs or look at existing examples any way.

Because variables tend to be additive, as near as I could guess the majority of the time override will be an exception. Thus, I think it would be clumsy to have to call back to the original, especially if someone in the middle just wants to add one variable and stuff that comes later isn't necessarily aware of which ancestors are doing what -- it just wants the ancestors to do their thing.

I predict that the override would be used very very infrequently; only when the functionality provided by the module is either insufficient or heavy (i.e, the examples for theme_page), but otherwise this will be a rarely used flag, and instead the phptemplate_variables_HOOK() function will more than likely be used to either change one or two small aspects or add other aspects.

webchick’s picture

sub.. will try and test soon.

adrian’s picture

Just going to put my 2c in here.

When i originally designed the theme system there were 3 new namespaces.

Apart from the existing theme_ namespace, which was for the default templates.

The new ones were :
1. engine_ namespace. Which was all the template engine related stuff, including finding the templates, and passing them into the theme system. It originally also used something like .info files to find templates / themes. Most of the functions which ended up being phptemplate_$hook were meant to be in this namespace.
2. the $engine_ namespace, which consisted entirely of one or 2 functions in the engines/$engine/$engine.engine directory. This was only for actually calling the template library to return the themed output.
3. the template_ namespace. This completely replaced the $theme_ namespace, and made all standard php themes completely copyable. template.php actually took it's name from this namespace, in that only functions in the template_ namespace was meant to be included in it.
Since only one template is loaded at the time , this meant that there would be no namespace conflicts.

I like the patch, i'm just not sure i like the idea of polluting the theme_ namespace with more functions that aren't theme functions.

merlinofchaos’s picture

Adrian;

I'd originally written this patch to use the module name, not theme_ -- and then I switched it over to theme_ for the sake of consistency (and because I didn't like system_variables_page much).

It doesn't have to be theme_variables_* by any means; but it did seem obvious to use it. The theme_ namespace is pretty iffy right now. It's used for theme functions; it's used for theme helper functions. And as this patch gets more followup patches, it'll probably be used for fewer and fewer true theme functions -- hopefully the bulk of those will be translated to templates.

I'll happily take some suggestions, I'm not sure which direction to go here.

adrian’s picture

how about a variables_ namespace, which overrides / extends an engine_ namespace.

engine_ is the variables the modules provides, variables_ is the override / extension of that.

and apart from theme_X perhaps we should rename any theme helper functions that aren't actual theme functions to drupal_theme_X

That might make it more transparent exactly what we are talking about, and probably easier to document, since you could look at the function name and be 100% sure of what it's meant to return / do.

adrian’s picture

Would also stop people from being able to do theme('variables_page').

merlinofchaos’s picture

People can't do theme('variables_page') anymore because theme functions are now specifically registered. So that isn't really an issue.

The overloaded namespace is more an issue from an understandability perspective.

We now have a phptemplate_engine_* namespace. I don't think the 'variables_' namespace works out either. It ends up looking like this:

module: variables_variables_HOOK(); (or just variables_HOOK());
phptemplate.engine: phptemplate_engine_variables_HOOK();
garland: phptemplate_variables_HOOK();
chameleon: chameleon_variables_HOOK();

is variables_variables_HOOK() actually better than theme_variables_HOOK()? Imagine this in code:


function theme_variables_forum_page(&$variables) {
  ...
}

function theme_forum_page($forum, $other, $data) {
  ...
}

Part of me likes the theme_ appearing on both. Part of me is skeptical. This definitely requires some more opinion.
Note that with this patch, it's not clear to me if we should retain the phptemplate_engine_variables namespace above, in any case, since this patch allows the engine to stay out. The thing is, I'm not sure if other, less PHPy engines will still need such a thing, so I am very hesitant to just say "We don't need it". Maybe smarty or phpTal will need it.

moshe weitzman’s picture

sbscrb

Crell’s picture

For my part, I'd prefer to keep the theme_ namespace "clean" for overridable functions, even if they do require registry. Could we repurpose the template_ namespace, perhaps, if theme overrides are going to be more nestable anyway?

merlinofchaos’s picture

Keep in mind there isn't a 'template_' namespace at the moment. Certainly we could utilize that.

I'm not sure I consider theme_variables_* unclean, but I can see the reasoning.

chx’s picture

Title: Improve _variables_ functions » Provide a way to secure the theme variables system
Assigned: Unassigned » chx
Category: feature » task
Priority: Normal » Critical
merlinofchaos’s picture

FileSize
38.41 KB

Based on feedback, this patch is exactly the same as the last patch I posed, except the prefix for module-defined functions are now 'template_variables_HOOK' instead of 'theme_variables_HOOK'.

It is my belief that this is ready to go; Steven has done a code review, but it could use some testing reports. (And hey, I tested this and everything appears to work +1 would make me happy).

eaton’s picture

I'll be testing tonight/tomorrow after chx and I roll up some of the form related stuff we're working on, but I am absolutely in favor of the use of template_variables_HOOK() for this code. At a purely syntactic level, it also helps clarify the difference between just a 'theme function' and what gets handed off to an actual template file.

Good stuff, good stuff all around.

merlinofchaos’s picture

FileSize
36.64 KB

The attached patch changes it from _variables_ to _preprocess_ which is much more descriptive of what is *really* happening here.

Also, this one will correctly delete the tpl.php files in phptemplate -- I had the syntax for faking a cvs rm wrong.

Dries’s picture

theme_ vs template_ -- aren't we introducing an inconsistency here? If we use template here, we probably want to use it at a couple other places as well. The difference between a template and a theme is currently not well-defined, IMO and I think we need to be careful about this.

kika’s picture

Can we just move on - go with drupal_theme_preprocess() or something, get it into core and then readjust if better naming convention comes? This interesting patch seems to float too long because of a tiny roadblock.

merlinofchaos’s picture

Dries: I thought about this for quite some time before I finally agreed to make the switch, and I decided template_* made a lot of sense:

1) The difference between a template and a theme is well defined. A template is a single instance of a .tpl.php file whereas a theme is a collection of of theme implementations that may be templates or functions.
2) When I implement theme('foobar') I will either implement it as:

theme_foobar(...args...);

OR

template_preprocess_foobar(&$variables);

And provide foobar.tpl.php in my modules directory to go with it.

This actually makes sense, at least to me; that particular signature makes it clear that I'm working on a template, not a theme function.

eaton’s picture

As one of the people who argued in favor of template_variables_HOOK() in #drupal, I think the difference is pretty explicit right now. theme functions can be almost ANY HTML generation, anywhere in the system -- from theme_user() to theme_placeholder() and so on. You override those in your theme's code, or your template.php file, and there you have it. Templates -- designer-editable files on disk -- are a very clear line of demarcation. Previously, you could only get to them by using _phptemplate_callback() and _phptemplate_variable(). This simplifies things a lot, but the 'theme functions' versus 'template files' distinction is still a big one. The naming convention makes sense, IMO. ;-)

Dries’s picture

The difference between a theme and a template is well-defined, but not in the code.

I'd define a theme as a set of templates + CSS + images.

The point is that it is not well-defined in the code/APIs.

Wouldn't it be more logical to call it template_foobar(...args...) ?

Imagine that John and Matt had to explain this in their book -- would it feel natural and intuitive? If you honestly think the answer is yes, I'd be happy to commit this patch as is.

merlinofchaos’s picture

I think the answer is yes, now that it's been renamed as 'template_preprocess_' from theme_variables. The _variables name was historical and I stuck to it longer than I ought to have because it was familiar.

To me, though, 'template_preprocess_foobar' is pretty intuitive. If I explain to you that "this function does preprocessing on the variables passed to the foobar template before it is invoked", I think that sentence says a lot about what is really going on here, and the function name includes the relevant words from the sentence, AND satisfies the clean namespace requirement. The only difficult part will be understanding that several of these preprocess functions may get run, and understanding the order will be important, but even that can be explained in a way that I believe is intuitive.

merlinofchaos’s picture

Oh, let me add: If it's template_foobar(), then what is the version that goes in template.php -- phptemplate_foobar() -- that doesn't work, it restricts theme_foobar.

Here are the definitions I propose:

A theme is a complete look of a site. It contains templates, theme functions, CSS and images.

A template is an implementation of a specific theme hook/function; it can easily be defined to themers as "a single .tpl.php file".

Crell’s picture

Not to muddy the waters even more, but looking at the patch in #16 and the template_preprocess_* functions, isn't it essentially an alter hook? It at least looks like one at the implementation level. It gets inserted between the definition and execution of some process and mutilates some big definition array to suit its needs, and stacks. Is there some way we could leverage the growing _alter pattern, either in code or terminology, to save code space or brain space? Just thinking aloud; feel free to tell me it's too different. :-)

As for theme vs. template, how would one describe the following:

theme_foo(): A theme function?
phptemplate_foo(): A theme function override? A template function?
foo.tpl.php: A template file?

I like "a theme is a collection of templates, CSS, and images"; I'm just checking the edge cases.

webchick’s picture

Earl and I talked a bit about the theme vs. template terminology on IRC. At the end of the day, I can't think of any better terms. I don't think it will be too confusing. Smarty calls its index.tpl files and whatnot "template files" as well, so there is transferrable knowledge that themers might already have (and if not, a quick Google will clear things up).

Ultimately, I think this is going to come down to us documenting this well in the handbook. Merlin's two-sentence description is really all that's probably required for most people... we just have to make sure we have some good examples to go along with that.

merlinofchaos’s picture

We've been discussing this in IRC, and 'template_prepare' has come up as an alternative to 'template_preprocess' -- which is fine.

'template_alter' IMO is bad because it doesn't follow the HOOK_alter form and would actually be more confusing, I think.

webchick’s picture

-1 to alter. we're moving to a consistent array structure for all _alter() functions, and this doesn't (can't) use that same thing. Let's not bring about inconsistency.

On IRC, Chris Johnson brought up 'prepare' rather than 'preprocess' and imo that's the way to go.

Dries’s picture

To me, 'prepare' sounds as something that is optional, and 'preprocess' sounds as something that is always required. I might be tainted by my C/C++ background.

At the end of the day, both work for me. I think we have a couple of _prepare hooks in Drupal already. It might be more consistent.

We might also want to keep future directions into mind (i.e. security system improvements).

I'm happy with the patch, but I think it's great to have us double-check the API and terminology. It should be intuitive to explain and use, and if it prevents that we have to break backward compatibility later on, it might safe us time down the road. Sometimes, it helps to write the documentation -- it can give you a fresh perspective on the API.

I'll leave it up to Merlin to mark this RTBC. I'm happy when Earl's happy. ;-)

merlinofchaos’s picture

Status: Needs review » Reviewed & tested by the community

Ok, I ran over the logic of this with a less technical type (i.e, my wife) and here is where I ended up:

_preprocess_ is better for developers. It is more technically correct and has more out-and-out meaning than prepare.
_prepare_ is marginally better for themers. It has some consistency with what exists right now; primarily with node_prepare. The thing there is, it would actually conflict with node_prepare, because the node would get prepared for display...and then a prepare woudl be run again. I think that would actually end up *more* confusing. Now, down the road when the node output stuff is re-tooled, node_prepare might just disappear, but it's not certain that set of patches will make it into Drupal 6, and it's not certain exactly how it would be retooled.

I think, based on that, the scale is slightly tipped in favor of sticking with preprocess. So I will RTBC this patch as it is.

Dries’s picture

 includes/common.inc                           |    4 
 includes/theme.inc                            |  387 +++++++++++++++-----------
 modules/comment/comment.module                |   27 +
 themes/engines/phptemplate/block.tpl.php      |    8 
 themes/engines/phptemplate/box.tpl.php        |    8 
 themes/engines/phptemplate/comment.tpl.php    |   25 -
 themes/engines/phptemplate/default.tpl.php    |    1 
 themes/engines/phptemplate/node.tpl.php       |   29 -
 themes/engines/phptemplate/phptemplate.engine |  221 --------------
 themes/garland/template.php                   |    2 
 10 files changed, 257 insertions(+), 455 deletions(-)

Are you sure that's correct? The patch deletes a ton of code ...

merlinofchaos’s picture

FileSize
36.64 KB

Damn, I fixed the part that deletes the .tpl.php files from phptemplate/ and then broke the part that adds the files back in the modules directory. This patch restores that.

merlinofchaos’s picture

FileSize
38.27 KB

Ugh, last patch uploaded out of wrong directory. Thanks RicardoR for noticing.

Dries’s picture

* One more question: shouldn't we put node.tpl.php in modules/node instead of modules/system?

* "Prepare the variables passed to the page.tpl.php template Uses the arg()" -- lacks a point.

merlinofchaos’s picture

Status: Reviewed & tested by the community » Needs work

Theoretically we should put node.tpl.php, but theme_node historically existed in theme.inc not node.module. To be honest, I'm not sure why that is, but I was a little bit afraid to move it. I should probably just do that.

chx’s picture

Status: Needs work » Reviewed & tested by the community
FileSize
38.22 KB

Seems an HTTP error eaten the patch I wanted to upload yesterday. GRR!

Dries’s picture

Status: Reviewed & tested by the community » Needs work

I've committed this to CVS HEAD. Thanks for the hard work. I'm marking this 'code needs work' until the documentation has been updated.

merlinofchaos’s picture

The *.tpl.php files in engines/phptemplate did not get cvs rm'd in the patch; those files are no longer necessary and need to be removed.

moshe weitzman’s picture

Status: Needs work » Reviewed & tested by the community

lets RTBC so a committed notices that those files need to be removed from CVS. after doing so, please set this to 'needs work' for docs.

Dries’s picture

Status: Reviewed & tested by the community » Fixed

Committed. Thanks.

dww’s picture

Status: Fixed » Needs work

docs...

dvessel’s picture

Just keeping tabs on this.

moshe weitzman’s picture

have docs been provided? if so, please set to fixed.

morphir’s picture

Category: task » feature

I would appreciate docs within the code freeze, thanks!

chx’s picture

Title: Provide a way to secure the theme variables system » theme preprocess documentation does not exist
Assigned: chx » Unassigned
Category: feature » bug
dvessel’s picture

Component: theme system » documentation
Assigned: Unassigned » dvessel
Category: bug » task
Status: Needs work » Active

I plan on writing about it in the new theming handbook but been busy lately. I'll try and get around to it before RC1 at the latest.

http://drupal.org/node/171179

moshe weitzman’s picture

has this documentation been posted? if not, any volunteers?

merlinofchaos’s picture

dvessel's handbook aimed at the themer is in a partially usable state, but it is not yet linked into the main documentation: http://drupal.org/node/171179

My much shorter guide aimed at module developers is here: http://drupal.org/node/165706

dvessel’s picture

Status: Active » Fixed

http://drupal.org/node/173880

I updated it some.. I still plan on improving it further so it's easier to grasp for new users but the docs are there. So marking as fixed.

Anonymous’s picture

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for two weeks with no activity.