When I first enabled this feature, it just causes a PHP error, which I found might have been solved by installing a copy of CSSTidy on the root...

Well, I didn't want to do that, and looking into the HTML Purifier module code, it didn't look like there was any easy way to get the Filter.ExtractStyleBlocks option to actually do something, even if I could get it to find CSSTidy...

So I added a few (several) lines to _hmlpurifier_process to support the Filter.ExtractStyleBlocks option...
This probably isn't the best way to do this, particularly since it doesn't provide any help in setting Filter.ExtractStyleBlocks.Scope, which would, ideally, be set to '#node-[nid]'
(but since the $node context isn't available in a filter, I couldn't think of an easy way of doing this - I used the RepTags module)

Support from Acquia helps fund testing for Drupal Acquia logo

Comments

ezyang’s picture

Before I start reading this bug, I'd like to mention that no engineering effort was made to integrate this configuration option with Drupal, so it's not surprising that it doesn't work.

ezyang’s picture

Ok, a few questions to be asked:

* Easy thing to fix: we should give a nice error when someone tries to enable ExtractStyleBlocks when csstidy is not available.

* I really don't like putting the style tags inline with the output; it's not standards compliant and feels dirty anyway. Unfortunately, we can't call the Drupal function to add styling either, since the filter isn't going to get called on every request. So somehow, when Drupal renders a node, it needs to know to add some styling. That seems hard w/o patching.

* Scope should get set programmatically by the HTML Purifier module. Are we sure the context doesn't get given in any way at all?

Vector-’s picture

I agree that putting the style tags inline is not very desirable at all - everything I've done regarding this feels very hackish, to be honest :p

As per: http://drupal.org/node/226963#comment-2686074

context aware filters are, essentially, not possible - the alternative is, apparently, to use hook_nodeapi rather than hook_filter

I would imagine this would be the appropriate way to both get the styles in the header where they belong, as well as being able to set the scope appropriately?

Without converting the entirety of the HTML Purifier module to go through hook_nodeapi, something along this lines would work, I think?
- if Filter.ExtractStyleBlocks cache the style blocks created by CSSTidy in _htmlpurifier_process
- in htmlpurifier_nodeapi load the style blocks from the cache, replace a helper string used in the ExtractStyleBlocks.Scope with $node->nid, and add the style blocks

Am I on the right track, at least? :p

ezyang’s picture

I don't know the extent of what nodeapi gives you, but it seems plausible. You have to be careful about sticking information in the cache, since it's never guaranteed to be there; it might be better to actually stick the style in the HTML, and then break it out again when nodeapi does it. Helper string is also pretty risky, since it should ideally be unforgeable.

Vector-’s picture

Here's an attempt to do this a little bit better...

- StyleBlocks created by CSSTidy are cached when _htmlpurifier_process runs, and an argument comment is added
- argument comment is processed in hook_nodeapi and used to find the appropriate cached StyleBlocks
- cached StyleBlocks checked for the [%HTMLPURIFIER:NID%] - ! need to set Filter.ExtractStyleBlocks.Scope to #node-[%HTMLPURIFIER:NID%]
- parsed StyleBlocks included in the head of the document via a drupal_set_html_head
- cache is updated once [%HTMLPURIFIER:NID%] tokens are replaced

Faults:
- [%HTMLPURIFIER:NID%] is unlikely to get reused, but might not be secure - it's inserted by CSSTidy, so it's not terrible - couldn't really come up with a better way to do this
- Only stores and loads StyleBlocks when caching is on (can $cache even be FALSE without custom PHP code?)

Vector-’s picture

Should probably wrap the $style elements in HTML comments before printing them, for old browsers, or somethin...

Vector-’s picture

Erm... sorry for the repeat posts - better HTML commenting of the $style elements, with "\n"s included

ezyang’s picture

The usual technique for making a token unforgeable is to 1. automatically generate it, and 2. not tell the user what it is until after they've submitted their text.

The cache can certainly be disabled; and at the very least, flushed, so you really do need to find a better spot to stick the CSS. I think inline is actually a quite workable solution, as long as you don't mind globally gunking up the internal representation of HTML.

Vector-’s picture

Unless I've missed something, as long caching is turned on, this should work fine when the cache is flushed...
And without caching on, the style blocks will be lost, but considering they'd be lost anyhow without the Filter.ExtractStyleBlocks option, not the end of the world, is it?

I suppose the other option would be to save the CSSTidy output to a file, and link it?
That would probably get it cached by the CSS Aggregator too... not sure if that's a good thing, or a bad thing here :p

As for any problems with losing cached values...
Since the CSS cache is generated along with the normal HTML Purifier cache, and then updated at node view,
Well, I've used both cache flushing and node edits to double check the functionality and I haven't had any trouble so far...

I'm not quite sure where the best place to add an override for Filter.ExtractStyleBlocks.Scope would be...
I'm guessing _htmlpurifier_get_config ? But I can't think of an easy way to add a unique token without making it take another parameter...

I'm also somewhat curious... considering CSSTidy is prepending #node-[%HTMLPURIFIER:NID%] to every CSS selector in the style blocks...
How much harm could a user using this token cause? They could certainly get $node->nid inserted into their style block with it, but the style blocks have been cleaned reasonably thoroughly at that point?
And I don't think the token is ever displayed to the user at all? Though the hash code used for caching ends up in the source, but this could be removed in the hook_nodeapi after it's been processed...

Vector-’s picture

A few more improvements:
- Removes the hash key left by hook_filter to find the CSS cache from the node content variable when we're done with it
- Moved most of the logic into a helper function - made the above easier, but mostly to help ease CCK integration

A few major issues still:
- Still no secure token - it's never exposed to the user, and I can't see it causing any trouble
- Still relies on the cache system to function, otherwise the extracted style blocks will never be saved, they will just disappear
- Still no CCK support, only extracted stylesheet links found in node body / teaser are parsed

I'm sure, somehow (though I can't imagine how), the unsecure token will come back to bite me, and eventually I'll probably try to add some stuff into _htmlpurifier_get_config to create a more secure token to replace [%HTMLPURIFIER:NID%]. This really shouldn't be a big deal once I've run through the code enough to figure out where the finalization happens on the config variables - chances are pretty good we're going to want Filter.ExtractStyleBlocks.Scope to be #node-[nid] ... or some similar variant for other themes, I suppose? Simple tokens could be used here and replaced with a hash code before the config is finalized, which would then get replaced in hook_nodeapi.

I'm not much inclined to care about when caching is turned off, either, but I suppose the appropriate way to do this would be to write the CSS out to a file with a name based on the hash key... personally I was unsure of how to make sure, with css aggregation on, we loaded the correct CSS after altering it moments before...

The CCK support is somewhat disappointing, though I'm honestly not using this filtering on any CCK fields (can you even do so without another module?) so I'm not too worried about this at the moment either...

Ohh, and I was too lazy to move things around to where they belong too... still, pretty workable for the moment, probably going to leave things like this for awhile :p

ezyang’s picture

Status: Needs work » Patch (to be ported)

Upon further consideration, I think storing the CSS in the htmlpurifier cache table will be ok, because if the entirety of the HTML Purifier cache gets flushed, we'll automatically repopulate it. I’ll test and merge this patch in.

ezyang’s picture

Status: Patch (to be ported) » Needs work

Patch doesn't give a nice error message if csstidy is not installed.

ezyang’s picture

Status: Needs work » Fixed

Committed to CVS.

Status: Fixed » Closed (fixed)

Automatically closed -- issue fixed for 2 weeks with no activity.