Split from the original issue, as the problem is much less severe on D7 and nobody gives a damn there.
The problem: our JSON is not RFC 4627 compliant, which makes it unparsable by a large number of JSON parsers (.NET, iPhone, browsers and newer jQuery versions). The most common problem is that we escape characters with \xNN instead of \uNNNN.
We are dealing with a number of circumstances:
- At least two different parsers may process the JSON (drupal_add_js setting):
  - HTML parser
  - JSON parser
- Clients we have no control over need to interpret the JSON: jQuery versions, iPhone apps, .NET classes, world+dog
This means we need to produce RFC4627 compliant JSON that, at the same time, will not have certain 'special characters' such as ', ", <, > and & be interpreted by an HTML parser.
As RFC4627-2.5 clearly states that "Any character may be escaped", we can avoid special treatment of characters ', ", <, > and & by an HTML parser through simple substitution with a Unicode escape sequence (\uXXXX).
A number of characters MUST be escaped for the JSON parser (RFC4627-2.5). These are:
- " (quotation mark, U+0022)
- \ (reverse solidus, U+005C)
- U+0000 - U+001F (control characters)
For a number of characters (eg "), it is possible to choose the escape form and either precede them with a backslash (JSON escape sequence, eg \"), or use the \uXXXX form (Unicode escape sequence, eg \u0022). HOWEVER, because we also need to deal with the HTML parser, _it_ may interpret the quote before the JSON parser even has the chance to run. Because of this, it is advisable to use the Unicode escape sequence (\uXXXX) here as well.
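To illustrate the escaping strategy described above, here is a minimal sketch in Python (not the actual patch, which is PHP). The function name `encode_string` and the constant `HTML_SPECIAL` are made up for this example. It only handles strings, and it leaves non-ASCII characters untouched, which RFC 4627 allows.

```python
import json

# Characters an HTML parser may treat specially (hypothetical constant name).
HTML_SPECIAL = "<>&'\""

def encode_string(s: str) -> str:
    """Encode a string as a JSON string literal that is also HTML-safe."""
    out = ['"']
    for ch in s:
        cp = ord(ch)
        if ch == "\\":
            # The backslash is harmless to an HTML parser, so the short
            # JSON escape is fine here.
            out.append("\\\\")
        elif cp < 0x20 or ch in HTML_SPECIAL:
            # Control characters MUST be escaped; the HTML-special ones are
            # escaped by choice. Both use the \uXXXX form.
            out.append("\\u%04x" % cp)
        else:
            out.append(ch)
    out.append('"')
    return "".join(out)

encoded = encode_string("</script> \"hi\" & 'bye'\x00")
print(encoded)
# A strict parser gets the original string back.
assert json.loads(encoded) == "</script> \"hi\" & 'bye'\x00"
```

The only quote characters left in the output are the two delimiters, so an HTML parser that sees the value before the JSON parser does has nothing special to interpret inside the string.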
Right now, drupal_to_js is not RFC4627 compliant. This causes problems with a number of non-core JSON consumers.
- Control characters such as U+001F are not escaped
- It uses the undefined escape sequence \xXX, which only works on certain JSON implementations (likely, only JS eval)
- It improperly uses the nonexistent JSON escape sequence \0 for the NUL byte instead of the Unicode escape sequence \u0000
- It risks interpretation of certain characters as special by the HTML parser due to its use of addslashes (\")
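The first three points are easy to reproduce with any strict parser. As a rough illustration, the sketch below uses Python's standard `json` module as a stand-in for the non-core consumers mentioned above; the helper name `parses` is made up for this example.

```python
import json

def parses(doc: str) -> bool:
    """Return True if a strict JSON parser accepts the document."""
    try:
        json.loads(doc)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

print(parses(r'"\x3c"'))    # \xNN escape: False
print(parses(r'"\0"'))      # \0 escape: False
print(parses('"\x1f"'))     # raw, unescaped U+001F: False
print(parses(r'"\u003c"'))  # \uNNNN escape: True
```

Only the \uNNNN form is accepted; the other three are rejected as invalid JSON.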
Now, enough about the problems of the current implementation, on to the patch.
It correctly escapes U+0000 - U+001F (ASCII 0 - 31) and the potential special HTML characters.
AFAIK we do not have to escape /, U+2028 and U+2029 according to the specs, but naive JSON parsers (eval) may have trouble with them: JavaScript has traditionally treated U+2028 and U+2029 as line terminators, which are not allowed inside a string literal.
As there's no harm in escaping these characters, I've kept them in the revised patch. All this reroll did was to expand the U+0000 - U+001F range and add comments.
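For the optional characters, a post-processing pass over already valid JSON is enough, because /, U+2028 and U+2029 can only ever occur inside string values. A minimal Python sketch of that idea follows; the function name `escape_optional` is an assumption for this example and the patch itself may do this differently.

```python
import json

def escape_optional(doc: str) -> str:
    """Escape /, U+2028 and U+2029 in an already serialized JSON document."""
    return (doc.replace("/", "\\u002f")
               .replace("\u2028", "\\u2028")
               .replace("\u2029", "\\u2029"))

value = {"path": "a/b", "text": "line\u2028break\u2029end"}
# ensure_ascii=False keeps U+2028 and U+2029 raw, so the replacement has work to do.
doc = escape_optional(json.dumps(value, ensure_ascii=False))
print(doc)
# The escaped document still decodes to the same value.
assert json.loads(doc) == value
```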
FAILED: [[SimpleTest]]: [MySQL] Invalid patch format in drupal-json_encode_fix-1086098-62.patch.