Human Error Caused Recent Amazon AWS Outage

Amazon's web servers were recently down for more than four hours, and since many sites and services rely on Amazon hosting, a lot of them were unreachable. The problem has since been resolved, but many were left wondering what went wrong with AWS.
Amazon has now released a statement in a blog post saying that the big AWS outage, which took down a number of large internet sites for several hours, was caused by human error. Apparently one of its employees was debugging an issue with the billing system and accidentally took more servers offline than intended, which started a domino effect that took down two other server subsystems and more.
"We have not completely restarted the index subsystem or the placement subsystem in our larger regions for many years. S3 has experienced massive growth over the last several years and the process of restarting these services and running the necessary safety checks to validate the integrity of the metadata took longer than expected."
