node.tokens.inc: $node->body issues [#609118]

Comment	File	Size	Author
#27	node_body_token-609118-27.patch	1.21 KB	yched

#19	node-609118-19.patch	1.13 KB	plach

#17	node-609118-17.patch	1.13 KB	plach

#2	node-609118-2.patch	1.08 KB	David_Rothstein

#1	node-609118-1.patch	1.24 KB	te-brian

Comment #1

te-brian CreditAttribution: te-brian commented 20 October 2009 at 03:49

File	Size
node-609118-1.patch	1.24 KB

Ok, I'm not 100% sure about this one, but here's a patch.

I keyed the $node->body by FIELD_LANGUAGE_NONE and changed the 'safe' and 'safe_summary' to actual calls to check_markup().

Patch Attached.

Log in or register to post comments

Comment #2

David_Rothstein CreditAttribution: David_Rothstein commented 4 November 2009 at 23:52

File	Size
node-609118-2.patch	1.08 KB

Adding FIELD_LANGUAGE_NONE definitely makes sense, but I looked into it a bit, and I'm not sure why we would need to call check_markup() directly?

The Field API already takes care of sanitizing the text correctly and putting it in 'safe' and 'safe_summary', so in general it seems like it would be wasteful to run it through check_markup() again. What is the use case where you found that 'safe' and 'safe_summary' don't exist? I just tried it now via a direct call to node_tokens() passing in the result of node_load() for the $node object, and the 'safe' and 'safe_summary' keys were there.

The attached patch is what I used for that. I think this is the correct patch - and anything that doesn't have 'safe' or 'safe_summary' indicates a bug in the calling code rather than in this function - but I could be totally wrong about that :)

Log in or register to post comments

Comment #3

David_Rothstein CreditAttribution: David_Rothstein commented 4 November 2009 at 23:54

Also, it is an interesting question whether it is even the responsibility of node.module to define these tokens, or rather they should all be defined by the Field module (since the node body is essentially just another field). However, that's probably not for this issue to solve :)

Log in or register to post comments

Comment #4

Dries CreditAttribution: Dries commented 5 November 2009 at 03:35

Status:

Needs review

» Fixed

Committed to CVS HEAD. Thanks.

Log in or register to post comments

Comment #5

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 5 November 2009 at 09:40

Status:

Fixed

» Needs work

By using FIELD_LANGUAGE_NONE we are skipping language negotiation/fallback here. This problem is still being discussed in #571654: Revert node titles as fields.

Log in or register to post comments

Comment #6

yched CreditAttribution: yched commented 5 November 2009 at 11:04

- Plach is right. + the question of how tokens should behave wrt field values languages doesn't seem clear to me.

- 'safe' and 'safe_sumary' can indeed not be expected to be found in a loaded $node. Text module precomputes them and stores them in field cache when a text field is loaded - text_field_load() -, but *not* if the input format is not cacheable, in which case they are only added at node display time - text_field_sanitize().

Log in or register to post comments

Comment #7

te-brian CreditAttribution: te-brian commented 5 November 2009 at 18:34

Yeah, @yched, this is what I experienced ... I think. After reading David's review I just assumed the particular version of HEAD I was using at the time had something else going on, because I can not reproduce safe and safe_summary not loading now.

Would your recommendation be to check for their existence and then conditionally sanitize the output? Or is this a bigger issue than this one instance?

Log in or register to post comments

Comment #8

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 2 December 2009 at 14:36

Issue tags:

+translatable fields

Log in or register to post comments

Comment #9

fago

German

Vienna

CreditAttribution: fago commented 29 December 2009 at 23:00

hm, I stumbled over the same problem when coding entity metadata integration. Previously I used hook_sanitize() to sanitize the field values if they are not present, but that's not possible any more since #658302: repurpose hook_field_sanitize() into hook_field_prepare_view(). I wonder how a fields can be sanitized now? Is there an API function for that?

My current hack works for text fields only - however a generic fied-type agnostic approach would be certainly better.

Log in or register to post comments

Comment #10

yched CreditAttribution: yched commented 30 December 2009 at 00:59

hook_field_sanitize() was a flawed concept. There is no such thing as generic field data sanitation, even inside a given field type - it depends on what you want to do with the data and which formatter runs. All formatters don't have the same sanitation requirements.

The fact that formatter functions, hook_field_formatter_view(), now return render arrays instead of being theme functions mean they can do their own sanitation trhemselves.
field_format() should take care of running a given formatter on a given field value - that is, once it gets fixed in #612894: field_format() is just a pile of code that doesn't work :-(.

I don't think, however, that tokens should use the 'formatters' system. We don't want to have tokens like [node:field_foo|formatter_name], do we ? Plus, formatters can have settings, and this becomes unmanageable.

Each field type should expose its own tokens, and the code expanding its tokens should take care of doing its own sanitation if needed.

In the specific case of text fields - or in the more specific case of title and body fields (since currently we hardcode tokens for just those fields), the _text_sanitize($instance, $langcode, $item, $column) helper function can be used to return the sanitized 'value' or 'summary'.
It takes into account the fact that sanitized data is pre-computed at load time when possible, and runs the right function (check_plain or check_markup) according to the instance configuration.

Log in or register to post comments

Comment #11

fago

German

Vienna

CreditAttribution: fago commented 30 December 2009 at 23:21

>There is no such thing as generic field data sanitation, even inside a given field type - it depends on what you want to do with the data and which formatter runs. All formatters don't have the same sanitation requirements.

Really? I can't think of a case where sanitation requirements are different when the formatter changes? Could you give me an example?

@entity-metadata:
What I want here is to get the actual, useful data stored in the field, e.g. for exporting to XML. In case of text fields having input formats the interesting data is the already processed data as the text input is written with certain text processing in mind. Thus it needs to be applied to get the desired output. However processing the data is tied to sanitizing the data - not a problem, but that's the cause why I ask for such an API function.

@token: Well I think it would be useful to have different formats available in tokens, but yes I see the problem with unmanageable settings. Using build modes would work, but it isn't the best option as different token formats might be desired for different uses. So ideally I think one could define multiple formatting presets depending on the field/data type which then can be applied using tokens [node:field_foo|preset_name].

Log in or register to post comments

Comment #12

yched CreditAttribution: yched commented 30 December 2009 at 23:51

> "I can't think of a case where sanitation requirements are different when the formatter changes? Could you give me an example?"
text_default and text_trimmed do not need to sanitize the summary
text_plain does not need to run check_markup()
As been pointed in the 'token in core' thread, sanitation depends on the target and context.

Formatters are targeted to produce HTML output. We would need to hear from the token folks here, but IMO formatters are not appropriate for token generation.

Log in or register to post comments

Comment #13

fago

German

Vienna

CreditAttribution: fago commented 31 December 2009 at 09:33

>text_default and text_trimmed do not need to sanitize the summary text_plain does not need to run check_markup()
I see, I missed text_plain. However I don't think it does well for processed text fields. If the field is configured to have text processing, the text plain formatter will bypass it - thus it won't respect the setting. I think it should run check_markup() to do text processing and then strip the tags. Actually check_markup() is a bit misnamed, as it doesn't only checks markup, it applies the filters which might do much more than just checking markup.

So well, stripping tags is also a kind of sanitation, so I have to agree with you that sanitation depends on the formatter. But still (text) processing depends only on the field instance - we already have the text_processing flag in the instance settings. So a "give me the processed data" API function would imho make much sense.

>As been pointed in the 'token in core' thread, sanitation depends on the target and context.
Whether sanitation is necessary - yes. But how the data needs to be sanitized can only know the token provider. Still for some certain use cases you need extra sanitizing steps, e.g. for use in URLs. Those then can be applied by the token consumer.

Log in or register to post comments

Comment #14

mcarbone CreditAttribution: mcarbone commented 1 February 2010 at 23:26

FYI, this will get test coverage: #701818: Several tokens are broken, especially file tokens; test coverage of every core token

Although those tests will have to wait until the issue here is resolved, as it's not very clear how to test the body and summary tokens at the moment.

Also, should this perhaps be critical? Otherwise we are potentially shipping with some broken tokens. Not quite certain of myself to flip the switch, tho.

Log in or register to post comments

Comment #15

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 2 February 2010 at 09:45

I think this could be solved by #657972: Make field fallback logic actually reusable. Another possible solution, having monolingual tokens, might be relying on #632172: Node language and field languages may differ, which should allow to use $node->language to select an always-valid field value.

Not sure about the critical status, though.

Log in or register to post comments

Comment #16

mcarbone CreditAttribution: mcarbone commented 26 February 2010 at 01:46

Re: the summary and safe_summary keys -- I'm finding that node_load loads the right value for them only if a summary was added separately when creating the node. If the summary wasn't expanded and added separately, node_load has empty values for these keys (although the summary renders properly on the front page, which means that the rendering process doesn't use the summary and safe_summary values directly).

Re: the token tests -- I was able to test the body and summary fields by making sure to explicitly add a summary, so that issue no longer relies on this. However, if those tests are committed first, and this issue ends up changing the body and summary structure, those will need to be fixed in the tests.

Log in or register to post comments

Comment #17

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 8 April 2010 at 22:05

Status:

Needs work

» Needs review

File	Size
node-609118-17.patch	1.13 KB

This should work also with locale enabled, let's leave true multilingual handling to contrib modules.

Log in or register to post comments

Comment #18

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 9 April 2010 at 00:50

Issue tags:

+Quick fix

Log in or register to post comments

Comment #19

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 9 April 2010 at 00:54

File	Size
node-609118-19.patch	1.13 KB

Actually the previous patch was only partially correct.

Log in or register to post comments

Comment #20

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 14 April 2010 at 10:35

Priority:

Normal

» Critical

At the moment the body token simply does not work for nodes having language different from LANGUAGE_NONE.

Log in or register to post comments

Comment #21

catch

he/him

English

CreditAttribution: catch commented 14 April 2010 at 12:08

Status:

Needs review

» Reviewed & tested by the community

Looks good.

Log in or register to post comments

Comment #22

webchick

she/they

English

Vancouver 🇨🇦

CreditAttribution: webchick commented 15 April 2010 at 02:42

Status:

Reviewed & tested by the community

» Fixed

Looks like a straight-forward fix. Committed to HEAD. Thanks!

Log in or register to post comments

Comment #23

David_Rothstein CreditAttribution: David_Rothstein commented 15 April 2010 at 04:46

Status:

Fixed

» Needs work

Is the issue with 'safe'/'safe_summary' not always being available in this token being handled in some other issue? As far as I can tell from above, it isn't, so this is not fixed yet.

Log in or register to post comments

Comment #24

catch

he/him

English

CreditAttribution: catch commented 15 April 2010 at 04:53

@David - has anyone reproduced that bug yet? As far as I can see it's not been.

Log in or register to post comments

Comment #25

plach

he/him

Italian

Venezia

CreditAttribution: plach commented 15 April 2010 at 09:15

Issue tags:

-Quick fix, -translatable fields

untagging

Log in or register to post comments

Comment #26

yched CreditAttribution: yched commented 15 April 2010 at 09:43

see #6

Log in or register to post comments

Comment #27

yched CreditAttribution: yched commented 15 April 2010 at 12:52

Status:

Needs work

» Needs review

File	Size
node_body_token-609118-27.patch	1.21 KB

Attached patch should fix :
- the fact that a loaded $node does not contain precomputed sanitized strings if the input format is not cacheable (PHP filter...) (see #23 and #6)
- the fact that the key for the sanitized full body is 'safe_value', not 'safe'

Log in or register to post comments

Comment #28

catch

he/him

English

CreditAttribution: catch commented 15 April 2010 at 13:52

Could we check if value and safe_value are set before calling text_sanitize? Seems like that'd save a check_markup().

I'm wondering a little if this is a duplicate of #691078: Field tokens, although that's pushed to contrib now, but still.

Log in or register to post comments

Comment #29

yched CreditAttribution: yched commented 15 April 2010 at 14:01

Could we check if value and safe_value are set before calling text_sanitize? Seems like that'd save a check_markup().

that check is done within _text_sanitize() - the function exists because text.module's own formatters needed that same extra "sometimes it's already there, sometimes not" logic.

Log in or register to post comments

Comment #30

catch

he/him

English

CreditAttribution: catch commented 15 April 2010 at 14:28

Status:

Needs review

» Reviewed & tested by the community

Doh, good point.

Log in or register to post comments

Comment #31

Dave Reid

he/him

English

Nebraska USA

CreditAttribution: Dave Reid commented 20 April 2010 at 07:49

We don't have tests to back this up?

Log in or register to post comments

Comment #32

webchick

she/they

English

Vancouver 🇨🇦

CreditAttribution: webchick commented 20 April 2010 at 08:00

Status:

Reviewed & tested by the community

» Fixed

Looks like tests are covered in #701818: Several tokens are broken, especially file tokens; test coverage of every core token. So I think this is good to go, providing that one makes it in. I just bumped it to critical.

Committed to HEAD. Thanks!

Log in or register to post comments

Comment #33

catch

he/him

English

CreditAttribution: catch commented 20 April 2010 at 08:01

Status:

Fixed

» Closed (duplicate)

Tests are added in #701818: Several tokens are broken, especially file tokens; test coverage of every core token, however that patch doesn't have the correct fix for this particular issue. I've marked this as duplicate and iinked from that issue.