SRI against the Great Cannon?

September 19th, 2015
One way to make things load faster is to put them closer to your users. Instead of serving everything from your main data center, you take copies of your resources and you distribute them all around the world. When someone in New York visits your page and loads your scripts they should be served from New York, while when someone does the same from Hong Kong they should be served from Hong Kong.

This is a CDN, and they're great. The problem is, they're expensive to build. You need to put servers all over the world, on lots of different networks, if you want to be as close to people as possible. Which means that unless you're a huge company you pay an external CDN to distribute your resources for you.

The problem now is that if there's a security breach at the CDN someone malicious could make changes to the javascript on the CDN and take control over your site. By using the CDN you make your site dependent on the security at your CDN company.

There's a new HTML feature, however, that means you can fix this problem for new browsers. It's called Subresource Integrity (SRI) and lets you specify hashes for resources. Traditionally, when you include a script from your CDN you'd write html like:
     <script src="//"></script>
With SRI, you can add an integrity attribute that lets you specify the hash that the correct version of foo.js has:
     <script src="//"
After downloading the script, if the hash doesn't match the expected one then your browser knows something is wrong and doesn't run it.
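To make the hash check concrete, here is a minimal Python sketch of how an integrity value can be computed and compared. The function names and the sample script bytes are made up for illustration. Real browsers also accept several hashes in one attribute, which this sketch ignores.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches_integrity(content: bytes, integrity: str) -> bool:
    """Roughly what the browser checks: does the downloaded content hash to the expected value?"""
    algorithm, _, _ = integrity.partition("-")
    return sri_hash(content, algorithm) == integrity


# Hypothetical script contents standing in for the file on the CDN.
script = b"console.log('hello');\n"
integrity = sri_hash(script)

print(integrity)                                           # value for the integrity attribute
print(matches_integrity(script, integrity))                # unmodified script passes
print(matches_integrity(script + b"evil();", integrity))   # tampered script fails
```

Any change to the served bytes, even a single character, produces a different digest, so the tampered copy fails the check.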

This is great, and any time you're putting a script on someone else's CDN you should use it. And people are starting to use it: GitHub just announced they've started using it on their CDN. Unfortunately, their announcement says something that implies SRI can do more than it actually can:

"Widespread adoption of Subresource Integrity could have largely prevented the Great Cannon attack earlier this year."
They're referring to an attack against them and GreatFire via the Chinese government's "Great Cannon". The attack worked because lots of sites include analytics or ads from the Chinese search engine Baidu. The include looks vaguely like:
     <script src=""></script>
The Great Cannon would intercept these requests to ads.js and for ~2% of them respond with an attack script instead of the expected ad-loading script. The attack script would load lots of resources from GitHub and GreatFire in the background. So millions of browsers belonging to people outside of China, who had just happened to load a page with Baidu ads on it, were now sending substantial unwanted traffic to the victim sites, enough to overload (DDoS) them.

GitHub is suggesting that people could instead have written:
     <script src=""
Then when the Great Cannon intercepted the request and substituted something malicious, the hash wouldn't have matched and the browser wouldn't have run the attack script.

One objection to this is that if you're an active attacker who can change scripts as they pass, then you could just as easily change the html as well so its hash matches your substituted script. First, this is really hard, with failures leading to broken scripts on pages. But more importantly, it ignores that the Chinese government can modify Chinese traffic but not all traffic. By rewriting Baidu's ad-loader they were able to affect any page with Baidu ads, not just ones served from China.

The real problem with using SRI on this sort of ad-loader or analytics snippet is that Baidu etc need to be able to make changes to their scripts. They don't just release one version and expect everyone to run exactly that script forever: bugs, browser changes, etc mean they need updates. When they do make an update, though, the hash changes. How do you make sure the integrity hashes in all these html snippets stay in sync with the version changes? This is not practical: people need their snippets to be simple static html.
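The sync problem is easy to see in a small sketch. The two loader versions below are invented stand-ins, not Baidu's actual scripts, and `browser_would_run` is a hypothetical helper that only approximates the browser's check.

```python
import base64
import hashlib


def integrity_for(script: bytes) -> str:
    """Compute a sha384 integrity value for the given script bytes."""
    digest = hashlib.sha384(script).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def browser_would_run(served: bytes, pinned_integrity: str) -> bool:
    """Approximate the browser's check: run the script only if its hash matches."""
    return integrity_for(served) == pinned_integrity


# Hypothetical loader versions: v2 is a legitimate bug-fix release by the vendor.
loader_v1 = b"loadAds();\n"
loader_v2 = b"loadAds({fixBug: true});\n"

# The hash publishers pasted into their static html when v1 was current.
pinned = integrity_for(loader_v1)

print(browser_would_run(loader_v1, pinned))  # the original version still runs
print(browser_would_run(loader_v2, pinned))  # the legitimate update is blocked
```

From the browser's point of view a legitimate update and an attacker's substitution look the same: both are bytes that don't match the pinned hash.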

The fix in this case is to use HTTPS for everything. The Great Cannon was able to modify the script because it was just loaded over HTTP. Use HTTPS for ads instead and it can't make these modifications anymore.

(There's a simple rule for when to use SRI: if the same entity controls the html and also chooses what version of script to run, then SRI is a good fit. Otherwise you don't have a way to prevent hash mismatches.)
