LONDON (AP) — Amazon says a massive outage of its cloud computing service was resolved as of Monday evening, after the problem disrupted internet use around the world and took down a wide range of online services, including social media, gaming, food delivery, live streaming and financial platforms.
The daylong disruption and ensuing outrage served as the latest reminder that 21st-century society is increasingly dependent on a few companies for much of its Internet technology, which seems to work reliably until it suddenly collapses.
About three hours after the outage began early Monday morning, Amazon Web Services said it had begun to recover. But services did not return to normal operations until 6 p.m. ET, Amazon said on its AWS health website, where it tracks outages.
AWS provides behind-the-scenes cloud computing infrastructure for some of the world’s largest organizations. Its clients include government departments, universities and corporations, including The Associated Press.
Cybersecurity expert Mike Chapple said the “slow and complicated recovery process” is “completely normal.”
As engineers roll out fixes across cloud computing infrastructure, the process can lead to smaller disruptions, he said.
“It’s similar to what happens after a widespread power outage: As power returns to the city, neighborhoods may see intermittent glitches while crews finish repairs,” said Chapple, a professor of information technology at the University of Notre Dame’s Mendoza College of Business.
Amazon blames DNS
Amazon attributed the outage to problems with its domain name system, or DNS, which converts web addresses into IP addresses, the numerical designations that identify locations on the internet. Those addresses allow websites and apps to load on internet-connected devices.
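For readers who want to see that lookup step in action, here is a minimal Python sketch. It is only an illustration of a generic name lookup through the operating system's resolver, not a description of how AWS's internal DNS works; the function name `resolve` is a made-up helper for this example.

```python
import socket


def resolve(hostname: str) -> list[str]:
    """Return the unique IP addresses the system resolver reports for hostname.

    Hypothetical helper for illustration only.
    """
    # getaddrinfo asks the operating system's resolver, which typically
    # checks local configuration first and then queries DNS servers.
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses: list[str] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first element is the address for IPv4 and IPv6
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # "localhost" resolves without any network access on most systems.
    print(resolve("localhost"))
```

If the lookup fails, `socket.getaddrinfo` raises `socket.gaierror`. That is roughly what an app runs into when the name of a service it depends on cannot be translated into an address: the service may still be running, but the app cannot find it.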
DownDetector, a website that tracks internet outages, said in a Facebook post that it had received more than 11 million user reports of problems at more than 2,500 companies. Users reported trouble with social media site Snapchat, video games Roblox and Fortnite, online broker Robinhood and the McDonald’s app, as well as Netflix, Disney+ and several other services.
Cryptocurrency exchange Coinbase and chat app Signal said on X that they were experiencing problems related to the outage.
Amazon’s own services were also affected. Users of the company’s Ring doorbell cameras and Alexa-powered smart speakers reported that the devices were not working, while others said they were unable to access Amazon’s website or download books to their Kindle.
Many college and K-12 students were unable to submit or access their homework or course materials on Monday because the AWS outage disrupted Canvas, a widely used educational platform.
“I currently cannot grade any assignments online, and my students cannot access their materials online” because of the outage’s impact on learning management systems, said Damian P. Williams, a professor of philosophy and data science at the University of North Carolina at Charlotte.
The exact number of schools affected was not immediately known, but Canvas says on its website that 50% of college and university students in North America use it, including all Ivy League schools in the United States.
The UC Riverside campus said students were unable to submit assignments, take tests or access course materials, and online instruction was limited.
Ohio State University informed its 70,000 students across all six campuses via email Monday morning that online course materials may not be accessible due to the outage and that “students should contact their instructors for any alternative plans.” The university told students that as of 7:10 p.m. ET, access had been restored.
A record of past outages
This isn’t the first time issues with Amazon’s cloud services have caused widespread disruption.
Several popular internet services were affected by a brief outage in 2023. AWS’s longest outage in recent history occurred in late 2021, when a wide range of businesses, from airlines and car dealerships to payment apps and video streaming services, were affected for more than five hours. Outages also occurred in 2020 and 2017.
The first signs of Monday’s trouble appeared around 3:11 a.m. ET, when AWS reported on its Health Dashboard that it was “investigating increased error rates and latency for multiple AWS services in the US-East-1 region.” Later, the company stated that there were “high error rates” and that engineers were “actively working” on the problem.
At around 6 a.m. ET, the company reported a recovery in most affected services and said it was seeking a “complete resolution.” As of midday, AWS was still working on the issue.
The company said that 64 internal AWS services were affected.
Only a few companies provide most of the Internet infrastructure
Because much of the world now relies on three or four companies to provide basic internet infrastructure, “when there is an issue like this, it can have a real impact” across many online services, said Patrick Burgess, a cybersecurity expert at the U.K.-based BCS, The Chartered Institute for IT.
“The world now runs on the cloud,” Burgess said.
And because so much of the plumbing in the online world is powered by so few companies, when something goes wrong, “it’s very difficult for users to pinpoint what’s going on because we don’t see Amazon, we just see Snapchat or Roblox,” Burgess said.
“The good news is that this type of issue is usually relatively quick” to resolve, and there is no indication it was caused by a cyberattack, Burgess said.
“This appears to be a good old tech issue. Something went wrong, and Amazon will fix it,” he said.
Burgess said there are “well-established processes” for handling outages at AWS, as well as at its competitors Google and Microsoft, adding that such outages usually last “hours, not days.”
___
Ortutay reported from San Francisco. Associated Press video journalists Mostakim Hasnath in London and Jocelyn Gecker in San Francisco contributed to this report.