September 19th, 2015
One way to make things load faster is to put them closer to your users. Instead of serving everything from your main data center, you take copies of your resources and you distribute them all around the world. When someone in New York visits your page and loads your scripts they should be served from New York, while when someone does the same from Hong Kong they should be served from Hong Kong.
This is a CDN, and they're great. The problem is, they're expensive to build. You need to put servers all over the world, on lots of different networks, if you want to be as close to people as possible. Which means that unless you're a huge company you pay an external CDN to distribute your resources for you.
The catch is that you then have to trust the CDN to serve your scripts unmodified. There's a new HTML feature, however, that fixes this problem in new browsers. It's called Subresource Integrity (SRI) and lets you specify hashes for resources. Traditionally, when you include a script from your CDN you'd write html like:
    www.jefftk.com/foo.html:
      ...
      <script src="//cdn.jefftk.com/foo.js"></script>
      ...

With SRI, you can add an integrity attribute that lets you specify the hash that the correct version of the script should have:

    www.jefftk.com/foo.html:
      ...
      <script src="//cdn.jefftk.com/foo.js" integrity="sha256-..."></script>
      ...

After downloading the script, if the hash doesn't match the expected one then your browser knows something is wrong and doesn't run it.
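To make the hash concrete: the value after the algorithm prefix is the base64 encoding of the script's raw digest. Here's a minimal sketch of computing it in Python. The function name `sri_value` and the contents of `foo_js` are made up for illustration, and this only handles the three SHA-2 variants.

```python
import base64
import hashlib


def sri_value(script_bytes: bytes, algorithm: str = "sha256") -> str:
    """Return an integrity attribute value of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, script_bytes).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


# Hypothetical contents of foo.js, used only for illustration.
foo_js = b"console.log('hello');\n"
print(sri_value(foo_js))
```

The hash covers the exact bytes served, so even a whitespace change to the script gives a different value.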
This is great, and any time you're putting a script on someone else's CDN you should use it. And people are starting to use it: GitHub just announced they've started with it on their CDN. Unfortunately, their announcement says something that implies SRI will go farther than it actually can:
    "Widespread adoption of Subresource Integrity could have largely prevented the Great Cannon attack earlier this year."

They're referring to an attack against them and GreatFire.org via the Chinese government's "Great Cannon". The way this attack worked is that lots of sites include analytics or ads from the Chinese search engine Baidu. This looks vaguely like:

    www.jefftk.com/foo.html:
      ...
      <script src="http://baidu.com/ads.js"></script>
      ...

The Great Cannon would intercept these requests to ads.js and for ~2% of them respond with an attack script instead of the expected ad-loading script. The attack script would load lots of resources from GitHub and GreatFire in the background. So the browsers of millions of people outside of China who had just happened to load a page with Baidu ads on it were now sending substantial unwanted traffic to the victim sites, enough to overload (DDoS) them.
GitHub is suggesting that people could instead have written:
    www.jefftk.com/foo.html:
      ...
      <script src="http://baidu.com/ads.js" integrity="sha256-..."></script>
      ...

Then when the Great Cannon intercepted the request for baidu.com/ads.js and substituted something malicious the hash wouldn't have matched and the browser wouldn't have run the attack script.
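Here's roughly what that check amounts to, as a sketch rather than what any browser literally does. The script bodies and the `integrity_matches` helper are invented for illustration, and as a simplification it rejects anything that isn't sha256.

```python
import base64
import hashlib


def integrity_matches(body: bytes, integrity: str) -> bool:
    """Compare a downloaded script body against a pinned 'sha256-<base64>' value."""
    algorithm, _, expected_b64 = integrity.partition("-")
    if algorithm != "sha256":
        return False  # Simplification: this sketch only understands sha256.
    actual_b64 = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    return actual_b64 == expected_b64


# Hypothetical script bodies, for illustration only.
expected_ads_js = b"/* ad loader */ loadAds();\n"
attack_js = b"/* attack script */ floodVictims();\n"

# The hash the page author would have pinned in the integrity attribute.
pinned = "sha256-" + base64.b64encode(hashlib.sha256(expected_ads_js).digest()).decode("ascii")

print(integrity_matches(expected_ads_js, pinned))  # True: run the script
print(integrity_matches(attack_js, pinned))        # False: refuse to run it
```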
One objection to this is that if you're an active attacker that can change scripts as they pass then you could just as easily change the html as well to have the correct hash. First, this is really hard, with failures leading to broken scripts on pages. But more importantly, it ignores that the Chinese government can modify Chinese traffic but not all traffic. By rewriting Baidu's ad-loader they were able to affect any page with Baidu ads, not just ones served from China.
The real problem with using SRI on this sort of ad-loader or analytics snippet is that Baidu etc. need to be able to make changes to their scripts. They don't just release one version and expect everyone to run exactly that script forever: bugs, browser changes, etc. mean they need updates. When they do make an update, though, the hash changes, and to the browser a legitimate update with a stale hash looks exactly like an attack. How do you make sure the integrity hashes in all these html snippets stay in sync with the version changes? In practice you can't: people need their snippets to be simple static html.
The fix in this case is to use HTTPS for everything. The Great Cannon was able to modify the script because it was just loaded over HTTP. Use HTTPS for ads instead and it can't make these modifications anymore.
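If you want to find the scripts on a page that are exposed this way, a quick scan is enough. This is a minimal sketch using Python's standard `html.parser`; the class and function names are my own, and the https URL in the sample page is made up for contrast.

```python
from html.parser import HTMLParser


class InsecureScriptFinder(HTMLParser):
    """Collect <script src=...> URLs that are explicitly loaded over plain HTTP."""

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src") or ""
        # Protocol-relative URLs ("//host/path") inherit the page's scheme,
        # so only explicit http:// URLs are flagged here.
        if src.lower().startswith("http://"):
            self.insecure.append(src)


def find_insecure_scripts(page_html):
    finder = InsecureScriptFinder()
    finder.feed(page_html)
    finder.close()
    return finder.insecure


page = """
<script src="http://baidu.com/ads.js"></script>
<script src="https://cdn.jefftk.com/foo.js"></script>
<script src="//cdn.jefftk.com/foo.js"></script>
"""
print(find_insecure_scripts(page))  # ['http://baidu.com/ads.js']
```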
(There's a simple rule for when to use SRI: if the same entity controls the html and also chooses what version of script to run, then SRI is a good fit. Otherwise you don't have a way to prevent hash mismatches.)
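When you do control both sides, one way to keep the two in sync is to generate the tag from the script's bytes every time you release. A minimal sketch, assuming a hypothetical build step; `script_tag` and the two release bodies are invented for illustration.

```python
import base64
import hashlib
import html


def script_tag(src: str, script_bytes: bytes) -> str:
    """Render a script tag whose integrity value is derived from the bytes being shipped."""
    digest = hashlib.sha256(script_bytes).digest()
    integrity = "sha256-" + base64.b64encode(digest).decode("ascii")
    return '<script src="{}" integrity="{}"></script>'.format(
        html.escape(src, quote=True), integrity
    )


# Two hypothetical releases of foo.js.
foo_v1 = b"console.log('v1');\n"
foo_v2 = b"console.log('v2');\n"

# Regenerating the tag on each release keeps the html and the script in step.
print(script_tag("//cdn.jefftk.com/foo.js", foo_v1))
print(script_tag("//cdn.jefftk.com/foo.js", foo_v2))
```

This only works because the same build produces both the html and the script. A third party pasting a static snippet has no such step. Also note that for scripts on another origin, browsers additionally expect a crossorigin attribute and CORS headers before they'll apply the check, which this sketch leaves out.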