Between 12:45 PM EST and 5:08 PM EST, Amazon S3 experienced a significant outage in us-east-1. API calls to S3 for reads and writes failed. As a result, many AWS services were affected.
Some other AWS services that were affected:
We first encountered issues reading from and writing to S3. In addition, our internal API load balancers, which use Amazon Elastic Load Balancing, dropped HTTPS requests. Together, these problems affected our front-end UI.
The front-end UI issues were short-lived, and the UI recovered quickly.
However, the S3 issues continued to affect action executions. Logs from action executions are stored in Amazon S3. Because API calls to S3 failed or timed out, action executions were delayed. At its peak, the delay reached 91 minutes.
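To illustrate how failing log writes can turn into long execution delays, here is a minimal simulation. It is only a sketch and not a description of Skeddly's actual implementation: the serial queue, the 30-second timeout, the three attempts per action, and the names `SimClock`, `write_log`, and `run_queue` are all assumptions made for the example. No real S3 calls are made; a simulated clock stands in for elapsed time.

```python
from dataclasses import dataclass


@dataclass
class SimClock:
    """Simulated clock so the example runs instantly."""
    now_s: float = 0.0

    def advance(self, seconds: float) -> None:
        self.now_s += seconds


def write_log(clock: SimClock, store_available: bool, timeout_s: float) -> bool:
    """Hypothetical log write: quick when the store is up, otherwise blocks until the timeout."""
    if store_available:
        clock.advance(0.1)  # assumed latency of a healthy write
        return True
    clock.advance(timeout_s)  # the caller waits out the full timeout
    return False


def run_queue(num_actions: int, store_available: bool,
              timeout_s: float = 30.0, max_attempts: int = 3) -> list[float]:
    """Run actions one at a time and return how long each waited before starting."""
    clock = SimClock()
    start_delays = []
    for _ in range(num_actions):
        start_delays.append(clock.now_s)
        # Each action retries its log write until it succeeds or runs out of attempts.
        for _attempt in range(max_attempts):
            if write_log(clock, store_available, timeout_s):
                break
    return start_delays


healthy = run_queue(num_actions=10, store_available=True)
degraded = run_queue(num_actions=10, store_available=False)
print(f"last action waited {healthy[-1]:.1f} s with a healthy log store")
print(f"last action waited {degraded[-1] / 60:.1f} min when log writes time out")
```

In this model every queued action inherits the accumulated timeouts of the actions ahead of it, so the delay grows with queue depth even though each individual timeout is short.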
The net result of this AWS outage is as follows:
For many, this outage, like all outages, is a learning opportunity. We will apply what we learned so that Skeddly weathers future issues and outages even better.