Updated: Comment #N
Problem/Motivation
Moving this into a public issue form the security team tracker, since the attack is a general one on web servers, and not specific to Drupal, and the information is already public.
Django made a security announcement regarding the BREACH attack:
https://www.djangoproject.com/weblog/2013/aug/06/breach-and-django/
Their approach of suggesting you turn off gzip/deflate seems infeasible for most people.
There's more information here: http://breachattack.com/
Drupal is less vulnerable than many web applications, since the CSRF token is usually different per form.
However, if a sensitive or administrative form is on the same page with a form without CSRF protection (e.g. a search form that takes values from the query string), it's possible to be vulnerable to this attack. You could also be vulnerable if there is other sensitive data on a page that includes such a form.
Also, this attack is more effective against a stream cipher like rc4, which many people witched to using due to the BEAST attack.
Proposed resolution
Possible suggested BREACH mitigations:
- Disabling HTTP compression
- Separating secrets from user input
- Randomizing secrets per request
- Masking secrets (effectively randomizing by XORing with a random secret per request)
- Protecting vulnerable pages with CSRF
- Length hiding (by adding random number of bytes to the responses)
- Rate-limiting the requests
#1, #2, and #3 don't seem very feasible.
#4 could be pretty trivially implemented by essentially splitting the form token into 2 parts - a secret that's constant (essentially what we send as the token today), and a XOR mask that varies on every request. The the valid token check would XOR the 2 parts and compare to the expected secret.
#5 We already protect most forms and actions against CSRF, so Drupal start out from a relatively strong position against this attack already.
#6 should be easy to implement as well:
At least for BREACH and possibly for the vulnerabilities described at http://www.isg.rhul.ac.uk/tls/ for rc4, a possible counter-measure would be injecting a random number of random bytes as a header, and also injecting a random number of random bytes into the HTML content (e.g. in the HEAD section)
#7 seems more suitable to a WAF or load balancer outside Drupal.
Remaining tasks
pick 1 or more aproaches and implement
User interface changes
N/A
API changes
Possible changes to token generation/checking, but these could be transparent at the API level
Comment | File | Size | Author |
---|---|---|---|
#46 | 2234243-nr-bot.txt | 151 bytes | needs-review-queue-bot |
#41 | drupal-2234243-41.patch | 1.62 KB | prudloff |
#27 | breach.png | 328.28 KB | grendzy |
#17 | interdiff.txt | 2.02 KB | alansaviolobo |
#17 | include_defenses-2234243-17.patch | 1.2 KB | alansaviolobo |
Comments
Comment #1
pwolanin CreditAttribution: pwolanin commentedHere's totally untested code, to show what I have in mind re: #4 above.
Comment #2
pwolanin CreditAttribution: pwolanin commentedA few tweak - this seems to work to allow me to submit a form with proper token checking. Let's see if it breaks any core tests.
Comment #3
pwolanin CreditAttribution: pwolanin commentedComment #5
catchhttp://drupal.og/project/cacheable_csrf does #2 to an extent. The token isn't included in the HTML markup, but is injected via JavaScript based on the value of a cookie. So there's no reflection which a quick read of that doc suggests is what this attack relies on. Obviously the cookie is there, and the server still has to set it when it's not available.
Comment #6
pwolanin CreditAttribution: pwolanin commentedI'm not sure if we are worried about the cost of potentially multiple random bytes calls per page? I'm not sure we've profiled that with the openssl default.
If it is a worry we could probably make do with something much less random but that still changed every page, like a hash of mt_rand() and microtime().
Comment #7
grendzy CreditAttribution: grendzy commentedHere are the possible resolutions as I see it. Since I somehow managed to reach opposite conclusions from the current issue summary, I'll leave this chart in a comment until we are closer to consensus.
Cache-control: no-transform
header. Honoring no-transform is a MUST in the HTTP RFC, but it's real-world effectiveness needs more research. Sadly, the first program checked (Apache httpd/mod_deflate) fails.HTTP GET https://victim.com/user/1/edit?chosen-plaintext=aaaa@gmail.com
Repeated requests to this URL (forged via <img> tag on 3rd-party site) would result in the attacker stealing the admin's e-mail address. Clearly we cannot require CSRF tokens on simple GET requests as this would break basic hyperlinking on a the web.
Attached is a patch for the first option with no-transform.
Comment #8
grendzy CreditAttribution: grendzy commentedComment #10
pwolanin CreditAttribution: pwolanin commented@grendzy - so the original document discusses CRSF tokens as the sort of common that this attack would target.
Do you think the approach in the patch for masking the token is infeasible or wrong, or you just don't think it's that useful?
Comment #11
grendzy CreditAttribution: grendzy commentedBoth I guess - not useful because CSRF tokens are only a small fraction of the secrets that need protecting. UID 1's e-mail is one example above, another example would be private user profile fields. On top of that prior to #1798296: Integrate CSRF link token directly into routing system we don't even have a central way to protect CSRF tokens. Discussing tokens is a good way to illustrate the BREACH vulnerability (they are nearly universal to all web apps, and secondly it's a very high-value secret).
Also wrong, in the sense that if the transport security is broken, then a fix must address it at that layer, rather than accepting broken transport encryption and masking a single high-value secret.
I admit the
no-transform
header idea may turn out to be weak, if it's not well-supported by CDNs and load balancers in the wild (this still needs research). Another layer we could add would be a status report check inhook_requirements()
, or https://drupal.org/project/security_review (the check would generate a warning if an intermediary was detected gzipping SSL pages).Comment #12
pwolanin CreditAttribution: pwolanin commentedI had thought of the no-transform header as applying to encoding rather than compression.
If you don't think it's worth specifically masking CSRF tokens, then i think length hiding by adding random byes to both the header and body are worth doing.
They are low cost, and making any such attack 100x harder (or some significant factor) is useful.
Comment #13
grendzy CreditAttribution: grendzy commentedCompression is a a kind of encoding, yes? (Compressed responses are labeled with
Content-Encoding: gzip
) And I think the RFC is pretty clear thatno-transform
prohibits changes to the body, and the Content-Encoding header:For length-hiding, I'm inclined to agree that the cost / benefit ratio is pretty good. My stats are terribly rusty, so take this with a grain of salt, but here's a estimate of sample size needed given 0-128 of padding bytes, and a 10-byte secret:
n = 2 ((sigma erf^(-1)(c))/M)^2 | sigma = (standard deviation UniformDistribution[{0, 128}]), c = 0.5**(1/10), M = 0.5
18,327 samples are predicted per length observation, assuming 32 observations per byte, and 10 bytes in the secret means almost 6 million observations needed to "breach" the secret. Depending on how fast your Drupal backend is, this could take days or weeks (and proportionally more for longer secrets). This isn't a crypto-scale amount of time, but better than nothing, especially considering the victim needs to have a malicious site open for the duration.
Can padding be confined to the header? Padding an HTML body is pretty safe, but not so with all media types (XML, JSON, etc). I would expect an attacker can only see the overall message size, and not the boundary between header and body, but I'm not sure.
Last, we can document on https://drupal.org/https-information that site operators who want the strongest protection from BREACH should refrain from enabling gzip on SSL pages.
Comment #14
catchThis looks like something we could add to 8.0.1 - is there a particular reason it should block the release?
Comment #15
grendzy CreditAttribution: grendzy commentedI agree it should not be a release blocker.
Comment #16
pwolanin CreditAttribution: pwolanin commentedNot a release blocker, but nice to have.
Comment #17
alansaviolobo CreditAttribution: alansaviolobo commentedComment #18
pwolanin CreditAttribution: pwolanin commentedGiven that there are pushes to move all traffic to https, I don't think disabling compression for https is a good move.
Comment #19
pwolanin CreditAttribution: pwolanin commentedSince this attack is on the body, I'm not sure variable content in the header would be effective - it might? not clear.
Given that the attack requires getting user input into the response, it seems like JSON and XML responses are a lower priority.
I suggest random padding of the header and random padding of html bodies when the request is SSL would be easy to implement, and would probably mitigate a range of yet-undiscovered attacks in addition.
Comment #20
grendzy CreditAttribution: grendzy commentedpwolanin: "pushes to move all traffic to https" - to me, this is an argument in favor of disabling compression; as TLS becomes more widely used, it becomes even more important to have strong assurances of it's safety.
Interestingly EFF mentions CRIME, but not BREACH: https://www.eff.org/https-everywhere/deploying-https
Anyway, I would be happy to review a length-hiding patch. To the best of my knowledge, TLS operates 3 layers below HTTP, so it's not even aware of the header / body boundary, which is why I think header padding would have the intended effect, with a lower risk of side-effects.
Comment #21
pwolanin CreditAttribution: pwolanin commentedThe only reason to suggest adding it to the body also is that the content of the padding (as well as its size) would potentially affect the gzip routine (if used) and further thwart this attack.
Also - we should probably always pad (not try to check for SSL) in case Drupal is being served from a back-end behind a SSL reverse proxy?
Comment #22
grendzy CreditAttribution: grendzy commentedI think it's OK to rely on
\Drupal::request()->isSecure()
, proxy users still have to configure this accurately (via X-Forwarded-Proto or otherwise). If this isn't set, the session handler won't function correctly anyway.Comment #24
pwolanin CreditAttribution: pwolanin as a volunteer commentedComment #25
pwolanin CreditAttribution: pwolanin as a volunteer commentedI wasn't sure if Drupal meets the requirement of reflecting user input in the response, but it seems every FORM tag does via the action attribute like:
<form action="/node/1/edit?canary=1"
Comment #26
pwolanin CreditAttribution: pwolanin as a volunteer commentedAh, however - see: http://www.breachattack.com/resources/BREACH%20-%20SSL,%20gone%20in%2030...
Because Drupal FAPI CSRF tokens are in value attributes as shown in 2.4.7 those at least are not really vulnerable.
So the the risk is more to CSRF tokens in GET requests rendered as links.
Comment #27
grendzy CreditAttribution: grendzy at Metal Toad commentedWe do have modules like https://www.drupal.org/project/prepopulate that allow filling the value attribute. Even without a way to force a value attribute, at that point you're relying on a single quote mark, and the attacker's inability to guess the first three characters of whatever data is being attacked, which seems pretty flimsy. (Lots of private data have predictable prefixes, for example: Social Security Number Prefixes, Area codes, Health insurance account numbers, etc.
I also think focusing on the form tokens is a way to narrow view. In principle any private data on the page could be attacked. (For another example, see my comment in #7 -- the admin's email address could be leaked with e.g. https://victim.com/user/1/edit?chosen-plaintext=aaaa@gmail.com).
Below is a screenshot from a real Drupal site, showing how a little OSINT could be combined with BREACH to steal health insurance information.
Comment #28
grendzy CreditAttribution: grendzy at Metal Toad commentedBecause Drupal core cannot anticipate the nature of the data present on a page, the only effective solution I can see is to disable compression for non-cached/non-public pages over HTTPS. (Which was already the case in core last I checked, though we might be able to take steps to prevent proxies & CDNs from adding compression, or possibly a system status warning if compression is detected).
Comment #32
rreiss CreditAttribution: rreiss at Dofinity commentedHi,
Any thoughts about the same BREACH vulnerability with theme_token on D7?
Disabling HTTP compression will be a performance hit, so this isn't the best solution IMHO.
We ended up removing (unset) theme_token from our default theme, since we have no AJAX requests
with Drupal AJAX APIs.
That token seems a little bit odd to me, why not just checking a permission to the administrative theme / add per theme permission and check that permission instead? Weird solution, isn't it?
Comment #33
John Morahan CreditAttribution: John Morahan as a volunteer commentedIvan Ristić's book suggests another possible mitigation that I don't think has been considered: conditionally disable compression when the request doesn't have a same-origin Referer. It's similar to the CSRF token idea mentioned in #7, but without the tokens - and the only penalty for not sending a Referer at all is that you get an uncompressed response.
Comment #39
semiaddict CreditAttribution: semiaddict commentedThe solution indicated in #33 is also suggested and detailed on https://wiki.crashtest-security.com/prevent-ssl-breach
Comment #41
prudloff CreditAttribution: prudloff at Insite commentedThe attached patch returns a
Cache-Control: no-transform
header if the request uses HTTPS and does not have a trusted Referer header.We are using Varnish and this feels like the best compromise between security and performances.
Comment #46
needs-review-queue-bot CreditAttribution: needs-review-queue-bot as a volunteer commentedThe Needs Review Queue Bot tested this issue. It either no longer applies to Drupal core, or fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.