Sunday, December 11, 2011

RFC 6454 and RFC 6455

Today, the IETF published two documents: RFC 6454, The Web Origin Concept, and RFC 6455, The WebSocket Protocol.  Both of these documents started out as sections in the HTML5 specification, which has been a hotbed of standards activity over the past few years, but they took somewhat different paths through the standards process.

RFC 6454's path through the IETF process was mostly smooth sailing.  The document defines the same-origin policy, which is widely implemented and fairly cut-and-dried.  In addition to the comparison and serialization algorithms we inherited from the WHATWG, the websec working group added a definition of the Origin HTTP header, which is used by CORS, and a broad description of the principles behind the same-origin policy.

RFC 6455's path was less smooth.  The protocol underwent several major revisions in the WHATWG before reaching the IETF.  By the time it reached the hybi working group, the protocol was fairly mature and had already been implemented in WebKit and Firefox.  Unfortunately, some details of the protocol offended HTTP purists, who wanted the protocol handshake to comply with HTTP.  The working group polished up those details, leading to churn in the protocol.

Around this time, some colleagues and I were studying the interaction between DNS rebinding and transparent proxies.  It occurred to us that folks had analyzed the end-to-end security properties of WebSockets, but less effort had been expended analyzing the interaction between WebSockets and transparent proxies.  We studied these issues and found an interesting vulnerability.  We presented our findings to the working group, which updated the protocol to fix the issue.

One perspective on these events is that they are a success.  We found and fixed a protocol-level vulnerability before the protocol was widely deployed.  Another perspective is that we annoyed early adopters by polishing unimportant protocol details.  My view is that this debate boils down to whether you really believe that worse is better.  For my part, I believe we had a net positive impact, but I hope we can be less disruptive to early adopters when we improve security in the future.

Saturday, December 3, 2011

Timing Attacks on CSS Shaders

CSS Shaders is a new feature that folks from Adobe, Apple, and Opera have proposed to the W3C CSS-SVG Effects Task Force.  Rather than being limited to pre-canned effects, such as gradients and drop shadows, CSS Shaders would let web developers apply arbitrary OpenGL shaders to their content.  That makes for some really impressive demos.  Unfortunately, CSS Shaders has a security problem.

To understand the security problem with CSS Shaders, it's helpful to recall a recent security issue with WebGL.  Similar to CSS Shaders, WebGL lets developers use OpenGL shaders in their web applications.  Originally, WebGL let these shaders operate on arbitrary textures, including textures fetched from other origins.  Unfortunately, this design was vulnerable to a timing attack because the runtime of OpenGL shaders can depend on their inputs.

Using the shader code below, James Forshaw built a compelling proof-of-concept attack that extracted pixel values from a cross-origin image using WebGL:
for (int i = 0; i <= 1024; i += 1) {
  // Exit loop early depending on pixel brightness
  currCol.r -= 1.0;
  if (currCol.r <= 0.0) {
    currCol.r = 0.0;
    break;
  }
}
Timing attacks are difficult to mitigate because, once the sensitive data is present in the timing channel, it's very difficult to remove.  Using techniques like bucketing, we can limit the number of bits an attacker can extract per second, but, given enough time, the attacker can still steal the sensitive data.  The best solution is the one WebGL adopted: prevent sensitive data from entering the timing channel in the first place.  WebGL accomplished this by requiring cross-origin textures to be authorized via Cross-Origin Resource Sharing.
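
As a rough sketch of what that fix looks like in practice (the image URL is hypothetical, and gl is assumed to be an existing WebGL context), the page must request the image with CORS and the image's server must grant access before the image may be used as a texture:
var img = new Image();
img.crossOrigin = "anonymous";  // ask the server for CORS authorization
img.onload = function () {
  var texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  // Without the CORS grant, uploading a cross-origin image as a texture fails.
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, img);
};
img.src = "https://other-origin.example/photo.png";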

There's a direct application of this attack to CSS Shaders.  Because web sites are allowed to display content that they are not allowed to read, an attacker can use a Forshaw-style CSS shader to read confidential information via the timing channel.  For example, a web site could use CSS shaders to extract your identity from an embedded Facebook Like button.  More subtly, a web site could extract your browsing history, bypassing David Baron's defense against history sniffing.

The authors of the CSS Shaders proposal are aware of these issues.  In the Security Considerations section of their proposal, they write:
However, it seems difficult to mount such an attack with CSS shaders because the means to measure the time taken by a cross-domain shader are limited.
Now, I don't have a proof-of-concept attack, but this claim is fairly dubious.  The history of timing attacks, including other web timing attacks, teaches us that even subtle leaks in the timing channel can lead to practical attacks.  Given that we've seen practical applications of the WebGL version of this attack, it seems quite likely that CSS Shaders are vulnerable to timing attacks.

Specifically, there are a number of mechanisms for timing rendering.  For example, MozBeforePaint and MozAfterPaint provide a mechanism for measuring paint times directly.  Also, the behavior of requestAnimationFrame contains information about rendering times.
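
As a rough sketch of how an attacker might use requestAnimationFrame for this purpose (the function was vendor-prefixed in some 2011-era browsers, and recordSample is a hypothetical helper), a page can simply watch how long each frame takes while a suspect shader is applied to the target element:
var last = Date.now();
function probe() {
  var now = Date.now();
  recordSample(now - last);  // frame time spikes when the shader runs slowly
  last = now;
  requestAnimationFrame(probe);
}
requestAnimationFrame(probe);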

Without a proof-of-concept attack we cannot be completely certain that these attacks on CSS Shaders are practical, but waiting for proof-of-concept attacks before addressing security concerns isn't a path that leads to security.

Sunday, November 20, 2011

Referer (sic)

One of the more astonishing facets of the web platform is the Referer header.  Whenever you click a link from one web site to another, the request that fetches the web page from the second web site contains the URL of the first web site.  This behavior causes both security and privacy problems:
  1. Security.  Despite copious warnings, developers often include secrets in URLs.  For example, to prevent Cross-Site Request Forgery (CSRF), developers often use secret tokens, which have a nasty habit of leaking into URLs and then into Referer headers sent to other sites.
  2. Privacy.  The Referer header is even worse for privacy, for example, leaking search terms from your favorite search engine to the web sites you visit.  As another example, Facebook accidentally leaked user identities to advertisers via the Referer header.
The principle of least astonishment tells us we should remove this "feature", but, unfortunately, we can't just remove the Referer header from the platform.  Too many people rely on the Referer header for too many different purposes.  For example, bloggers rely on the Referer header to generate trackback links, but that's just the tip of the iceberg.

As a first step, I wrote up a short proposal for a mechanism web sites can use to suppress or truncate the Referer header:
<meta name="referrer" content="never">
<meta name="referrer" content="origin">
One subtlety in the design is the inclusion of an "always" option.  The main reason to include this option is to make it easier for us to later block the Referer header by default.  The always option gives sites an escape valve to turn the Referer header back on, if needed.
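
Following the pattern of the markup above, that escape valve would presumably look like this:
<meta name="referrer" content="always">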

This mechanism is now implemented in WebKit and will hopefully be implemented in Firefox and other browsers soon.

Monday, November 7, 2011

How I learned to stop worrying and embrace Content-Security-Policy

This week, the W3C Web Application Security working group held its first face-to-face meeting at TPAC, the W3C's annual technical meeting.  The main topic of discussion was moving Content-Security-Policy (CSP) from an unofficial draft onto the standards track.  I'm actually pretty excited about CSP now, but I wasn't always its biggest fan.

CSP has a bunch of features, but the core value proposition (in my view) is that web applications can whitelist where their scripts come from.  The main drawback of this approach is that authors need to remove all inline script from their web application because the browser doesn't know whether an inline script is part of the application or whether it was injected as part of a cross-site scripting (XSS) attack.
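
As a concrete sketch of that core use case (the exact header name varied across early implementations, e.g. X-Content-Security-Policy, and the CDN host here is hypothetical), a policy that whitelists scripts from the site itself and a single CDN might look like:
Content-Security-Policy: script-src 'self' https://cdn.example.com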

Initially, I was skeptical about CSP for two reasons:
  1. CSP has a bunch of functionality unrelated to mitigating XSS, which makes the feature more complicated than necessary.  Because I'm a big believer in minimum viable products, my initial reaction was to remove all the extra functionality and focus on the core use case for the first iteration.
  2. I was worried that removing all the inline scripts from a web application would be too hard because web applications use inline scripts frequently.  Joel Weinberger, Dawn Song, and I even wrote a paper exploring that issue.
I still view those criticisms as fair, but I worry less about those issues.  By and large, the early adopters I've worked with have been able to use CSP effectively, which is some evidence that the extra complexity isn't the end of the world.  I've also now built and retrofitted a number of non-trivial web applications to avoid inline script.  It's a fair amount of work, certainly, but not an insurmountable task.

Thanks to some great work by Thomas Sepez, Chrome now uses CSP in the vast majority of its HTML-based user interfaces.  Over the years, there has been a steady trickle of XSS vulnerabilities in these interfaces, which is problematic because these interfaces (such as the browser's settings interface) have powerful privileges.  Now that Chrome uses CSP extensively, we can be confident that we've mitigated this entire class of vulnerabilities.

My perspective on CSP now is that it makes a serious dent in one of the biggest problems in web security.  It's definitely not a silver bullet (nothing ever is), but we should invest in CSP because we should be working on the big problems.  If we're not willing to dream big, we should just pack up our tent and go home.

Sunday, October 30, 2011

The Priority of Constituencies

Lawrence Lessig wrote in Code is Law that the choices we make in writing code embody our values.  This observation is especially true when building a browser because the browser mediates interactions between many distinct entities.  Because the browser's security policy is at the heart of mediating those interactions, we should ask ourselves what values the browser's security policy embodies.

One key value is the priority of constituencies, which is enshrined in the HTML Design Principles:
In case of conflict, consider users over authors over implementors over specifiers over theoretical purity.
To better understand this principle, let's consider a specific example: whether the browser's password manager should be enabled for a given web site.

The password manager is a source of conflict for these competing interests.  Implementors (myself included) believe that password managers improve security by reducing the cost of using a larger number of more complex passwords.  Many banks, however, disagree.  They believe that password managers reduce security because passwords stored in password managers can be stolen by miscreants.

How do browser vendors resolve this conflict?  By default, we enable the password manager.  Because users have a higher priority than implementors (i.e., browser vendors), browsers let users turn the password manager off.  Because authors (i.e., site operators) also have a higher priority than browser vendors, browsers let authors disable the password manager on their own web sites by setting autocomplete=off.
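
For example, a bank that wants to opt out might mark up its password field along these lines:
<input type="password" name="password" autocomplete="off">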

The careful reader will have noticed that the scheme above violates the priority of constituencies in one case.  What if the user wants to use the password manager on a web site that sets autocomplete=off?  Because users have a higher priority than authors, the browser should resolve this conflict in favor of the user.  Typically, browsers handle this case via their extension system.  For example, the autocomplete=on extension lets users override authors who want to disable the password manager.

How, then, should we respond to web site operators who wish to block or override these sorts of extensions?  As long as we believe that these extensions faithfully enact the user's will, we're hard-pressed to let authors block these extensions because that would violate the priority of constituencies.  Instead, we ask authors to be humble and accept the user as sovereign.

Saturday, October 22, 2011

X-Script-Origin, we hardly knew ye

On Thursday, Robert Kieffer filed an interesting bug in both the WebKit and Mozilla bug trackers:
WebKit and Mozilla browsers redact the information passed to window.onerror for exceptions that occur in scripts that originate from external domains. Unfortunately this means that for large institutions (like us here at Facebook) that use CDNs to host static script resources, we are unable to collect useful information about errors that occur in production.
Why do browsers redact this information in the first place?  The answer is actually a combination of two factors:
  1. Although browsers generally prevent one origin from reading information from another origin, the script element, like the image element, is a bit of a loophole: an origin is allowed to execute a script from any other origin.  (This exception has wide-ranging implications on both security and commerce on the web.)
  2. The script element ignores the MIME type of resources it loads.  That means if a web page tries to load an HTML document or an image with the script element, the browser will happily request the resource and attempt to execute it as a script.
At first blush, these two facts would seem to imply a serious security vulnerability.  Certainly, executing a script leaks a great deal of information about the script, and ignoring the MIME type means a malicious web site can cause the browser to execute any resource, regardless of the sensitivity of the resource (e.g., an attacker can execute the HTML that represents your email inbox as if it were JavaScript).

Fortunately, we're able to snatch security from the jaws of vulnerability because of a happy coincidence: resources that contain sensitive information happen to fail to parse as valid JavaScript (at least usually).  For example, your email inbox probably consists of HTML that quickly throws a SyntaxError exception when executed as JavaScript.  (The consequences of expanding JavaScript to include HTML-like syntax are left as an exercise for the reader.)
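
As a small illustration (the markup is made up), feeding HTML to the JavaScript parser fails almost immediately:
try {
  eval("<!DOCTYPE html><p>Your inbox: 3 unread messages</p>");
} catch (e) {
  // The parser rejects the leading "<", so the attacker never gets to
  // run the "script" -- all that escapes is an exception.
  alert(e.name);  // "SyntaxError"
}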

Returning to our original question, we now understand that (in an attack scenario) sensitive information actually flows through the JavaScript virtual machine, where it generates an exception.  That exception is then processed by window.onerror!  If browsers did not redact the information they give to window.onerror, they would potentially leak sensitive information to malicious web sites.
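
Concretely, a site's error-reporting hook might look like the sketch below (reportToServer is a hypothetical helper); for exceptions raised by cross-origin scripts, browsers redact these arguments rather than hand over the details:
window.onerror = function (message, url, line) {
  // For cross-origin scripts, message is reduced to something like
  // "Script error." and the url and line number are withheld.
  reportToServer({ message: message, url: url, line: line });
  return true;  // suppress the browser's default error reporting
};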

How, then, can we address Robert's use case?  Certainly we would like web sites like Facebook to be able to diagnose errors in their scripts.  Robert suggests an "X-Script-Origin" HTTP header attached to the script that would indicate which origins are authorized to see exceptions generated by the script.  Although that would work, that solution seems overly specific to the problem at hand.

A more general solution is for the server hosting the script to inform the browser which origins are authorized to learn sensitive information contained in the script.  (Typically servers would authorize every origin because scripts are usually the same for every user).  We already have a general mechanism for servers to make such assertions: Cross-Origin Resource Sharing.  We can address Robert's use case by adding a crossorigin attribute to the script element that functions similarly to the crossorigin attribute on the image element.  Once the embedding origin is authorized to read the contents of the script, there's no longer any need to redact the exceptions delivered to window.onerror.
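
Under that proposal, the opt-in would look roughly like the markup below (hypothetical URL); the CDN would also need to send an appropriate Access-Control-Allow-Origin header for the load to be authorized:
<script src="https://cdn.example.com/app.js" crossorigin></script>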

Saturday, October 15, 2011

Local URIs are more equal than others (Part 1)

On Wednesday, Cedric Sodhi asked the WebKit development mailing list why WebKit restricts access to local URIs.  This post describes one of the reasons why local URIs are more equal than other URIs.  In a future post, we'll revisit this issue when we discuss how local URIs (e.g., file:///Users/abarth/tax2010.pdf) don't really fit cleanly into the web security model.

Although the web platform largely isolates different origins from each other, there are a number of "leaks" whereby one origin can extract information from another origin.  For example, browsers let one origin embed images from another origin, leaking information such as the height and width of the images across origins.  These leaks are often at the core of security vulnerabilities in the platform.

This same leak exists, of course, between local origins (e.g., those with file URIs) and non-local origins (e.g., those with http or https URIs).  What kind of information could a web site extract from your local system using this leak?

On my laptop, I have Skype installed, which means that, on my laptop, the URI below resolves to a PNG image with a particular height and width:
file:///Applications/Skype.app/Contents/Resources/SmallBlackDot.png
If I visit a web site and the browser doesn't address this leak, the web site could determine whether I have Skype installed by attempting to load that URI as an image.  On my laptop, the image element would have a certain well-known height and width, but on a laptop without Skype installed, the browser would fire the error event.
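
If the browser allowed such loads, the probe would be just a few lines of script (report is a hypothetical helper):
var probe = new Image();
probe.onload = function () {
  report("Skype installed (" + probe.width + "x" + probe.height + ")");
};
probe.onerror = function () {
  report("Skype not installed");
};
probe.src = "file:///Applications/Skype.app/Contents/Resources/SmallBlackDot.png";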

Returning to Cedric's question, why do browser vendors restrict access to local URIs but not to non-local URIs if both have the same information leak?  I would prefer to close this leak in both cases, but many web sites embed cross-origin images, e.g., from content delivery networks.  If we were adding the <img> tag today, we would probably require servers to opt in to cross-origin embedding using the Cross-Origin Resource Sharing protocol.

Fortunately, very few web sites include images (or other resources) from local URIs (especially after we removed the full path from <input type="file">, but that's a story for another time).  That means browsers can block all loads of local resources by non-local origins without making users sad, preventing web sites from snooping on your local file system.