Cannot submit any special character in translation [#97640]

Comment	File	Size	Author
#30	locale_xss_1.patch.txt	5.34 KB	Heine
#28	locale_xss_0.patch	5.34 KB	Steven

#19	locale_xss.patch	877 bytes	Steven

#12	patch0.txt	1.43 KB	Takafumi
#8	locale_diff1.txt	907 bytes	Takafumi
#3	bootstrap_fix_diff1.txt	602 bytes	Takafumi

Comment #1

chx CreditAttribution: chx commented 17 November 2006 at 21:36

Status:

Active

» Closed (won't fix)

I am not inclined to fix this the language string interface is for quick & dirty fixes, if you want anything fancy export, edit, import.

Log in or register to post comments

Comment #2

Jax CreditAttribution: Jax commented 18 November 2006 at 15:54

Status:

Closed (won't fix)

» Active

I'm sorry but the language interface is used often on multi-lingual sites. When you create menu links and they have to be translated my customers use this to enter the menu strings in the other languages. For example I have a site where I needed to add the menu entry "Purchase & Sale". When the customer translated it, it didn't look ok. So I had to edit it manually in the DB.

Maybe you are nog inclined to fix it but at least give me the opportunity to provide a patch.

Log in or register to post comments

Comment #3

Takafumi CreditAttribution: Takafumi commented 19 November 2006 at 12:03

File	Size
bootstrap_fix_diff1.txt	602 bytes

I think that this issue is bug.
So, accept this patch to fix it.

Log in or register to post comments

Comment #4

Takafumi CreditAttribution: Takafumi commented 19 November 2006 at 12:04

Status:

Active

» Needs review

Log in or register to post comments

Comment #5

Jax CreditAttribution: Jax commented 19 November 2006 at 13:00

The issue here is that the & is saved as & in the table. Modifiing check_plain so that the & is turned back into a & at display time isn't a good idea IMO.

Log in or register to post comments

Comment #6

Takafumi CreditAttribution: Takafumi commented 19 November 2006 at 15:18

I think that it's processing necessary to prevent XSS(or other attack) that encodes “&" to the character entity before storing to DB.

Log in or register to post comments

Comment #7

Jax CreditAttribution: Jax commented 19 November 2006 at 16:06

Look at how nodes are saved in the db table. There the & is saved as a &. It's not good practive to save altered data in the db. You should escape html entities at display time only.

Log in or register to post comments

Comment #8

Takafumi CreditAttribution: Takafumi commented 19 November 2006 at 17:55

File	Size
locale_diff1.txt	907 bytes

Try this patch if you don't want to encode '&'.

Log in or register to post comments

Comment #9

drumm

he/him

NY, US

CreditAttribution: drumm commented 27 November 2006 at 01:17

Status:

Needs review

» Fixed

Committed to HEAD.

Log in or register to post comments

Comment #10

Takafumi CreditAttribution: Takafumi commented 28 November 2006 at 04:52

Status:

Fixed

» Closed (fixed)

thanks.

Log in or register to post comments

Comment #11

Jax CreditAttribution: Jax commented 28 November 2006 at 07:47

Title:	Cannot submit "&" in translation	» Cannot submit any special character in translation
Status:	Closed (fixed)	» Active

All this patch does is reconvert the & to & which was converted by filter_xss() in the first place. It does nothing to fix the cause.

Try to translate with any other 'special' character like < or >, they are converted to their html entity (by filter_xss() and saved as an entity in the DB. Then the & from the entity is escaped again at display time and you end up with < or > in the menu.

If we wanted to solve those as well in the same manner as this patch all we're implementing is filter_xss()^-1.

Changed title to indicate we're not only talking about &.

Log in or register to post comments

Comment #12

Takafumi CreditAttribution: Takafumi commented 28 November 2006 at 10:06

Status:

Active

» Needs review

File	Size
patch0.txt	1.43 KB

The root of the issue is in check_plain().
So, accept this patch to fix it again.

Log in or register to post comments

Comment #13

Jax CreditAttribution: Jax commented 28 November 2006 at 12:45

Status:

Needs review

» Needs work

IMO, the root of the issue is that the value to be saved goes through filter_xss_admin before it is saved to the database. This happens in locale.inc on line 425.

There the & is converted to a & and afterwards, that value is saved in the table. The value that should be saved in the database is the original & NOT the &.

Your patch reconverts the saved & to & before it is converted to & again. Not the root cause of this issue.

Log in or register to post comments

Comment #14

Takafumi CreditAttribution: Takafumi commented 28 November 2006 at 14:03

Status:

Needs work

» Needs review

Storing it after the special character is converted to character entity is not a issue.
However, character entity can't be used in the text as long as the patch is not applied.

Log in or register to post comments

Comment #15

Jax CreditAttribution: Jax commented 28 November 2006 at 16:09

Imagine, like I said before, it is bad practice to store altered characters in the database. You want to submit the data the user submitted. Maybe stripped from some characters, but the characters you're going to keep should be unaltered.

Because if you change a & to a & before storing, how will you & when a user submits it? Also if your display policies change you have no way of knowing what the user actually submitted as data the first time.

I'm done explaining this. This patch breaks other behavior that is there by design (eg. http://drupal.org/node/97642) and is just a bad idea. I'm not putting this back in 'code needs work' since imagine will probably just set it back to review.

Log in or register to post comments

Comment #16

drumm

he/him

NY, US

CreditAttribution: drumm commented 29 November 2006 at 19:33

Status:

Needs review

» Needs work

Please provide a few specific test cases and what is expected to be stored in the database and output to the screen.

Log in or register to post comments

Comment #17

killes@www.drop.org CreditAttribution: killes@www.drop.org commented 1 December 2006 at 10:03

I've backported the patch that drumm committed.

Log in or register to post comments

Comment #18

Takafumi CreditAttribution: Takafumi commented 1 December 2006 at 10:29

Jax's opinion are "how to store the entered strings in DB", but the issue that I judged is "how to output the entered character entity".
I decided to report as another issue.
Because, I think that "how to store the entered strings in DB" is not a problem, and it seems to differ from **Jax's** issue.

Log in or register to post comments

Comment #19

Steven CreditAttribution: Steven commented 1 December 2006 at 18:48

Status:

Needs work

» Needs review

File	Size
locale_xss.patch	877 bytes

All these solutions are essentially horrible. Patching check_plain() is bad because it applies globally, and doesn't give you the benefit of using HTML, but still just forces you to double escape HTML entities when talking about them.

The root of the problem is that the text passed into t() can be both plain-text as well as HTML. Because there is no reliable way to distinguish the two, we should just remove the call to filter_xss_admin() on locale import. It is obviously causing problems.

We really can't reliably clean up imported locale strings. At most we can come up with hacks that will work in most cases. But I think it's better to just remove it from core.

Perhaps a better approach is simply to make a locale checker script that throws warnings? We could use the following snippet to reliably detect possibly bad strings, without false negatives:

if (decode_entities($value) != decode_entities(filter_xss_admin($value))) { /* possibly bad value */ }

This would not cause problems when a lone ampersand is used in a string, but still throw errors when bad stuff is included. We can't use a similar snippet on import though, because it does not preserve HTML validity of the value.

Log in or register to post comments

Comment #20

Takafumi CreditAttribution: Takafumi commented 2 December 2006 at 14:30

This issue doesn't exist only in t(). It's necessary to recognize exist in also all functions through the check_plain().

Log in or register to post comments

Comment #21

Takafumi CreditAttribution: Takafumi commented 2 December 2006 at 15:08

~~This issue doesn't exist only in t().~~
It's not only t () this issue exists.

Log in or register to post comments

Comment #22

Jax CreditAttribution: Jax commented 3 December 2006 at 10:58

Imagine, there is nothing wrong with check_plain(). When your text contains a special character, like & (either in title or body) and you submit it. The & is saved in the table. At display time the & is converted to & by check_plain to make sure it displayed correctly. check_plain() is nothing more than a call to htmlspecialchars with an extra option.

Log in or register to post comments

Comment #23

Takafumi CreditAttribution: Takafumi commented 3 December 2006 at 11:45

No, it is wrong recognition.
Well, there is no problem if '&' is used alone. However, '&' will be used as a part of '&gt' and '&lt'(and much more) of character entity.
Therefore, the problem occurs if '&' is doubly encoded in check_plain().

Log in or register to post comments

Comment #24

Steven CreditAttribution: Steven commented 4 December 2006 at 18:59

There is no need to use HTML entities inside check_plain()'d text. You can already use any unicode character directly.

The actual problem here has already been described. Locale assumes its strings are all HTML, which is not true.

Log in or register to post comments

Comment #25

Takafumi CreditAttribution: Takafumi commented 5 December 2006 at 17:29

If you want to keep existing bug, I stay out of this issue.

Log in or register to post comments

Comment #26

Jax CreditAttribution: Jax commented 6 December 2006 at 08:05

Anyway, I confirm that patch #19 solves the issue I was experiencing which was that it was not possible to add characters like & > etc.... to a translated string.

Log in or register to post comments

Comment #27

Takafumi CreditAttribution: Takafumi commented 6 December 2006 at 11:48

It's effective only at first time post, even if the patch of #19 is applied.
It's converted into character entity when editing again and enters the text area.
Therefore, patch of #19 is not too important.

Log in or register to post comments

Comment #28

Steven CreditAttribution: Steven commented 12 December 2006 at 21:56

File	Size
locale_xss_0.patch	5.34 KB

There were indeed more calls to filter_xss_admin() elsewhere. Attached a patch to remove them all, including the str_replace.

Log in or register to post comments

Comment #29

drumm

he/him

NY, US

CreditAttribution: drumm commented 19 December 2006 at 09:50

Priority:

Normal

» Critical

I think this should be fixed and looks okay. Any objections? Security review?

Log in or register to post comments

Comment #30

Heine CreditAttribution: Heine commented 19 December 2006 at 09:57

File	Size
locale_xss_1.patch.txt	5.34 KB

Rerolled without parse error. Will review.

Log in or register to post comments

Comment #31

Heine CreditAttribution: Heine commented 19 December 2006 at 18:24

Right now we have two permissions that enable a user to gain more privileges:

1. administer users
2. administer access control

That these two permissions can be used to escalate someone's access is quite obvious. That is not true for the administer locales permission, however.

Now, I know that the administer locales permission will not be handed out that often, but if you have a permission it should sharply define what is possible.

I see several options:

1. Won't fix the current double escape problem.
2. Remove string editing from core all together.
3. Make it obvious that administer locales is a 'dangerous' permission.
4. Tie administer locales to another permission.
5. Patch goes in as is.
6. ?

Log in or register to post comments

Comment #32

Heine CreditAttribution: Heine commented 19 December 2006 at 18:55

Status:

Needs review

» Needs work

Killes alerted me to the fact that this patch combined with our CVS translation policies allow malicious CVS account holders to execute XSS attacks via our .po files:

I had added this filtering during March when I noticed that imported
files could be used to insert xss. Our PO files are not reviewed or
protected in any way and malicious cvs account owners could add evil XSS
stuff to any PO file, even core PO files.

I realize malicious CVS account holders can do harm via their own modules, but never they cannot taint core or other modules atm.

So a -1 on this patch *until* we fix our CVS policies and work out the issues I mentioned above.

Log in or register to post comments

Comment #33

Steven CreditAttribution: Steven commented 19 December 2006 at 23:09

Where translations come from is a matter of trust... holding this patch because the necessary CVS scanning infrastructure is in place is kind of silly. That's like saying we should keep module installation broken because someone might post an insecure one.

Whether or not a module can 'taint core' is stupid. A module could easily leave behind PHP code in a node or block to do its bidding after it's been uninstalled too, or a module could even insert malicious translation strings itself on install.

As I already said before, the following test can be used to detect possibly bad translation strings. It should not have any false positives. But it is only a test, not a transformation, as it destroys valid HTML.

if (decode_entities($value) != decode_entities(filter_xss_admin($value))) { /* possibly bad value */ }

Either we build a scanning script for cvs, or we add this check to Drupal core when importing. I cannot guarantee that this will never give any false positives, although for the typical cases of translatable strings, it should not be a problem. It should certainly not let any XSS through.

However, I believe we should treat this as a separate issue. This patch itself fixes a bug when importing and editing locales, namely that some strings are irreversibly corrupted.

Log in or register to post comments

Comment #34

killes@www.drop.org CreditAttribution: killes@www.drop.org commented 21 December 2006 at 20:33

IMO this is a "won't fix". We do not have any strings on core that have & for the menu.

Log in or register to post comments

Comment #35

drumm

he/him

NY, US

CreditAttribution: drumm commented 21 December 2006 at 21:05

Should I revert the previously applied patch?

Log in or register to post comments

Comment #36

Anonymous (not verified) CreditAttribution: Anonymous commented 24 December 2006 at 00:58

not sure where to say this but with using the t() function, there are places where for example in formsapi where there is something like..

// taken from imagefield.module
$form['image_path'] = array(
        '#description' => t('Optional subdirectory within the "%dir" directory where images will be stored. Do not include trailing slash.', array('%dir' => theme('placeholder', variable_get('file_directory_path', 'files')))), 
      );

now with having the theme('placeholder'.. that adds in , it's converting the tags out so the style isn't applying and you just see the tag.

Log in or register to post comments

Comment #37

Steven CreditAttribution: Steven commented 29 December 2006 at 08:31

Status:

Needs work

» Fixed

Steve: you're not making much sense. If %value placeholders are used, the output is HTML and should be printed as such. Forms API just passes through the #description strings, it does not escape them.

Anyway. Discussed with Dries: the filter_xss calls are removed. We will look at an xss safeguard for .po files for 6.0 or for contrib, but this is a highly theoretical attack vector. There are much easier ways to screw up Drupal sites, and we have our hands full already.