Support for Drupal 7 is ending on 5 January 2025—it’s time to migrate to Drupal 10! Learn about the many benefits of Drupal 10 and find migration tools in our resource center.
In node.tokens.inc, I found a few errors with references to $node->body.
The lines in question are 139-148
case 'body':
if (!empty($node->body)) {
$replacements[$original] = $sanitize ? $node->body[0]['safe'] : $node->body[0]['value'];
}
break;
case 'summary':
if (!empty($node->body)) {
$replacements[$original] = $sanitize ? $node->body[0]['safe_summary'] : $node->body[0]['summary'];
}
break;
There are two problems in the above code. First, $node->body is keyed by language code, then field instance. Second, 'safe' and 'safe_summary' do not exists in the object.
Comment | File | Size | Author |
---|---|---|---|
#27 | node_body_token-609118-27.patch | 1.21 KB | yched |
#19 | node-609118-19.patch | 1.13 KB | plach |
#17 | node-609118-17.patch | 1.13 KB | plach |
#2 | node-609118-2.patch | 1.08 KB | David_Rothstein |
#1 | node-609118-1.patch | 1.24 KB | te-brian |
Comments
Comment #1
te-brian CreditAttribution: te-brian commentedOk, I'm not 100% sure about this one, but here's a patch.
I keyed the $node->body by FIELD_LANGUAGE_NONE and changed the 'safe' and 'safe_summary' to actual calls to check_markup().
Patch Attached.
Comment #2
David_Rothstein CreditAttribution: David_Rothstein commentedAdding FIELD_LANGUAGE_NONE definitely makes sense, but I looked into it a bit, and I'm not sure why we would need to call check_markup() directly?
The Field API already takes care of sanitizing the text correctly and putting it in 'safe' and 'safe_summary', so in general it seems like it would be wasteful to run it through check_markup() again. What is the use case where you found that 'safe' and 'safe_summary' don't exist? I just tried it now via a direct call to node_tokens() passing in the result of node_load() for the $node object, and the 'safe' and 'safe_summary' keys were there.
The attached patch is what I used for that. I think this is the correct patch - and anything that doesn't have 'safe' or 'safe_summary' indicates a bug in the calling code rather than in this function - but I could be totally wrong about that :)
Comment #3
David_Rothstein CreditAttribution: David_Rothstein commentedAlso, it is an interesting question whether it is even the responsibility of node.module to define these tokens, or rather they should all be defined by the Field module (since the node body is essentially just another field). However, that's probably not for this issue to solve :)
Comment #4
Dries CreditAttribution: Dries commentedCommitted to CVS HEAD. Thanks.
Comment #5
plachBy using
FIELD_LANGUAGE_NONE
we are skipping language negotiation/fallback here. This problem is still being discussed in #571654: Revert node titles as fields.Comment #6
yched CreditAttribution: yched commented- Plach is right. + the question of how tokens should behave wrt field values languages doesn't seem clear to me.
- 'safe' and 'safe_sumary' can indeed not be expected to be found in a loaded $node. Text module precomputes them and stores them in field cache when a text field is loaded - text_field_load() -, but *not* if the input format is not cacheable, in which case they are only added at node display time - text_field_sanitize().
Comment #7
te-brian CreditAttribution: te-brian commentedYeah, @yched, this is what I experienced ... I think. After reading David's review I just assumed the particular version of HEAD I was using at the time had something else going on, because I can not reproduce safe and safe_summary not loading now.
Would your recommendation be to check for their existence and then conditionally sanitize the output? Or is this a bigger issue than this one instance?
Comment #8
plachComment #9
fagohm, I stumbled over the same problem when coding entity metadata integration. Previously I used hook_sanitize() to sanitize the field values if they are not present, but that's not possible any more since #658302: repurpose hook_field_sanitize() into hook_field_prepare_view(). I wonder how a fields can be sanitized now? Is there an API function for that?
My current hack works for text fields only - however a generic fied-type agnostic approach would be certainly better.
Comment #10
yched CreditAttribution: yched commentedhook_field_sanitize() was a flawed concept. There is no such thing as generic field data sanitation, even inside a given field type - it depends on what you want to do with the data and which formatter runs. All formatters don't have the same sanitation requirements.
The fact that formatter functions, hook_field_formatter_view(), now return render arrays instead of being theme functions mean they can do their own sanitation trhemselves.
field_format() should take care of running a given formatter on a given field value - that is, once it gets fixed in #612894: field_format() is just a pile of code that doesn't work :-(.
I don't think, however, that tokens should use the 'formatters' system. We don't want to have tokens like [node:field_foo|formatter_name], do we ? Plus, formatters can have settings, and this becomes unmanageable.
Each field type should expose its own tokens, and the code expanding its tokens should take care of doing its own sanitation if needed.
In the specific case of text fields - or in the more specific case of title and body fields (since currently we hardcode tokens for just those fields), the
_text_sanitize($instance, $langcode, $item, $column)
helper function can be used to return the sanitized 'value' or 'summary'.It takes into account the fact that sanitized data is pre-computed at load time when possible, and runs the right function (check_plain or check_markup) according to the instance configuration.
Comment #11
fago>There is no such thing as generic field data sanitation, even inside a given field type - it depends on what you want to do with the data and which formatter runs. All formatters don't have the same sanitation requirements.
Really? I can't think of a case where sanitation requirements are different when the formatter changes? Could you give me an example?
@entity-metadata:
What I want here is to get the actual, useful data stored in the field, e.g. for exporting to XML. In case of text fields having input formats the interesting data is the already processed data as the text input is written with certain text processing in mind. Thus it needs to be applied to get the desired output. However processing the data is tied to sanitizing the data - not a problem, but that's the cause why I ask for such an API function.
@token: Well I think it would be useful to have different formats available in tokens, but yes I see the problem with unmanageable settings. Using build modes would work, but it isn't the best option as different token formats might be desired for different uses. So ideally I think one could define multiple formatting presets depending on the field/data type which then can be applied using tokens [node:field_foo|preset_name].
Comment #12
yched CreditAttribution: yched commented> "I can't think of a case where sanitation requirements are different when the formatter changes? Could you give me an example?"
text_default and text_trimmed do not need to sanitize the summary
text_plain does not need to run check_markup()
As been pointed in the 'token in core' thread, sanitation depends on the target and context.
Formatters are targeted to produce HTML output. We would need to hear from the token folks here, but IMO formatters are not appropriate for token generation.
Comment #13
fago>text_default and text_trimmed do not need to sanitize the summary text_plain does not need to run check_markup()
I see, I missed text_plain. However I don't think it does well for processed text fields. If the field is configured to have text processing, the text plain formatter will bypass it - thus it won't respect the setting. I think it should run check_markup() to do text processing and then strip the tags. Actually check_markup() is a bit misnamed, as it doesn't only checks markup, it applies the filters which might do much more than just checking markup.
So well, stripping tags is also a kind of sanitation, so I have to agree with you that sanitation depends on the formatter. But still (text) processing depends only on the field instance - we already have the text_processing flag in the instance settings. So a "give me the processed data" API function would imho make much sense.
>As been pointed in the 'token in core' thread, sanitation depends on the target and context.
Whether sanitation is necessary - yes. But how the data needs to be sanitized can only know the token provider. Still for some certain use cases you need extra sanitizing steps, e.g. for use in URLs. Those then can be applied by the token consumer.
Comment #14
mcarbone CreditAttribution: mcarbone commentedFYI, this will get test coverage: #701818: Several tokens are broken, especially file tokens; test coverage of every core token
Although those tests will have to wait until the issue here is resolved, as it's not very clear how to test the body and summary tokens at the moment.
Also, should this perhaps be critical? Otherwise we are potentially shipping with some broken tokens. Not quite certain of myself to flip the switch, tho.
Comment #15
plachI think this could be solved by #657972: Make field fallback logic actually reusable. Another possible solution, having monolingual tokens, might be relying on #632172: Node language and field languages may differ, which should allow to use
$node->language
to select an always-valid field value.Not sure about the critical status, though.
Comment #16
mcarbone CreditAttribution: mcarbone commentedRe: the summary and safe_summary keys -- I'm finding that node_load loads the right value for them only if a summary was added separately when creating the node. If the summary wasn't expanded and added separately, node_load has empty values for these keys (although the summary renders properly on the front page, which means that the rendering process doesn't use the summary and safe_summary values directly).
Re: the token tests -- I was able to test the body and summary fields by making sure to explicitly add a summary, so that issue no longer relies on this. However, if those tests are committed first, and this issue ends up changing the body and summary structure, those will need to be fixed in the tests.
Comment #17
plachThis should work also with locale enabled, let's leave true multilingual handling to contrib modules.
Comment #18
plachComment #19
plachActually the previous patch was only partially correct.
Comment #20
plachAt the moment the body token simply does not work for nodes having language different from LANGUAGE_NONE.
Comment #21
catchLooks good.
Comment #22
webchickLooks like a straight-forward fix. Committed to HEAD. Thanks!
Comment #23
David_Rothstein CreditAttribution: David_Rothstein commentedIs the issue with 'safe'/'safe_summary' not always being available in this token being handled in some other issue? As far as I can tell from above, it isn't, so this is not fixed yet.
Comment #24
catch@David - has anyone reproduced that bug yet? As far as I can see it's not been.
Comment #25
plachuntagging
Comment #26
yched CreditAttribution: yched commentedsee #6
Comment #27
yched CreditAttribution: yched commentedAttached patch should fix :
- the fact that a loaded $node does not contain precomputed sanitized strings if the input format is not cacheable (PHP filter...) (see #23 and #6)
- the fact that the key for the sanitized full body is 'safe_value', not 'safe'
Comment #28
catchCould we check if value and safe_value are set before calling text_sanitize? Seems like that'd save a check_markup().
I'm wondering a little if this is a duplicate of #691078: Field tokens, although that's pushed to contrib now, but still.
Comment #29
yched CreditAttribution: yched commentedthat check is done within _text_sanitize() - the function exists because text.module's own formatters needed that same extra "sometimes it's already there, sometimes not" logic.
Comment #30
catchDoh, good point.
Comment #31
Dave ReidWe don't have tests to back this up?
Comment #32
webchickLooks like tests are covered in #701818: Several tokens are broken, especially file tokens; test coverage of every core token. So I think this is good to go, providing that one makes it in. I just bumped it to critical.
Committed to HEAD. Thanks!
Comment #33
catchTests are added in #701818: Several tokens are broken, especially file tokens; test coverage of every core token, however that patch doesn't have the correct fix for this particular issue. I've marked this as duplicate and iinked from that issue.
Comment #34
catchOr that's another option :)