Two weeks ago, Amazon S3 went down for a few hours. The outage in turn caused problems for thousands of sites that host files on S3. Many Americans could not view their Instagram photos, their Twitch streams, their Netflix movies. But the outage also affected services that US companies rely on daily: Atlassian, Slack, Autodesk, GitHub, Trello, Twilio, Zendesk, and more. Even the SEC’s website was affected.
When we rely on centralized cloud services, like AWS, we open ourselves to the possibility of widespread outages. Luckily, these rarely occur. The recent S3 outage was caused by a typo, and by now Amazon has likely made changes to prevent this type of outage from recurring.
But let’s be clear — centralized cloud services will always come with great risk. Greater centralization = greater risk. As more companies shut down their in-house servers and move to Amazon, Google, and Microsoft services, the ramifications of any outage will be far more pronounced. There is no such thing as 100% uptime.
Centralized cloud services are good when times are good. We are not currently at war. We have not had an attack on US soil since 9/11. We are not under any imminent threats.
But it is not enough to simply build for good times. When we build a house, a bridge, or any other physical infrastructure, we are required to follow building codes. These codes anticipate possible scenarios — like hurricanes, earthquakes, or bombings — and require us to build in greater redundancy.
We need to treat digital infrastructure like physical infrastructure. And I believe our digital infrastructure is not prepared for the worst-case scenarios: natural disaster or war. Our data centers are large targets for malicious actors. To cripple our economy, a malicious actor needs only to cripple a few Amazon data centers through cyberattacks or bombings.