Everyone knows who Amazon is — they’re massive in cloud computing, hosting services for countless organizations globally, including schools. So when a company that big encounters a service disruption, it resonates widely. Here’s how the recent Amazon Web Services (AWS) outage was resolved:

Late on October 19, 2025, AWS began reporting elevated error rates and latency across multiple services in the US-East-1 region. The trouble started around 11:49 PM PDT.
By 12:26 AM PDT the next day, the root cause had been traced to a faulty DNS update, which left applications unable to resolve the IP addresses of the servers they needed to reach. In effect, the internet's phonebook was broken for those services.
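To make that concrete, here's a tiny Python sketch of what "resolving" looks like from a client's point of view. The hostname shown follows the pattern of the regional DynamoDB endpoint, but the snippet itself is purely illustrative and not AWS tooling:

```python
import socket

def resolve_endpoint(hostname: str) -> list[str]:
    """Return the IP addresses a hostname resolves to, or raise if DNS fails."""
    try:
        # getaddrinfo performs the DNS lookup -- the "phonebook" step.
        results = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
        return sorted({entry[4][0] for entry in results})
    except socket.gaierror as exc:
        # When resolution breaks, clients never learn which servers to contact.
        raise RuntimeError(f"DNS resolution failed for {hostname}: {exc}") from exc

if __name__ == "__main__":
    print(resolve_endpoint("dynamodb.us-east-1.amazonaws.com"))
```

When that lookup fails, everything built on top of it fails too, no matter how healthy the servers behind the name actually are.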
Because of that, more than 100 AWS services were affected. The failure centered on DynamoDB, a core database service, and cascaded into everything that depends on it: EC2 instance launches stalled, Lambda functions ran into errors, and load balancer health checks failed.
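As a rough illustration of that cascading effect, here's a hedged sketch using boto3: a service health check that quietly depends on DynamoDB. The timeouts, retry settings, and the "orders" table are assumptions for the example, not anything AWS published:

```python
import boto3
from botocore.config import Config
from botocore.exceptions import BotoCoreError, ClientError

# Illustrative timeouts and retry settings; real services tune these per workload.
config = Config(
    connect_timeout=2,
    read_timeout=2,
    retries={"max_attempts": 2, "mode": "standard"},
)
dynamodb = boto3.client("dynamodb", region_name="us-east-1", config=config)

def health_check(table_name: str) -> bool:
    """A service health check that depends on DynamoDB being reachable.

    If DNS for the DynamoDB endpoint breaks, this call fails, the check
    reports unhealthy, and the load balancer pulls the instance -- one
    small example of how a single dependency cascades.
    """
    try:
        dynamodb.describe_table(TableName=table_name)
        return True
    except (BotoCoreError, ClientError):
        return False

if __name__ == "__main__":
    print(health_check("orders"))  # "orders" is a hypothetical table name
```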
For users and system administrators alike, the ripple effects were visible everywhere: gaming platforms went offline, financial apps had login failures, and even Amazon's own systems (like Prime Video and e-commerce checkout) saw disruption.
How It Was Fixed
Here’s what AWS did to bring the systems back online:
- They flushed DNS caches and applied the fix for the core DynamoDB DNS issue by about 2:24 AM PDT.
- They temporarily throttled some operations (for example, asynchronous Lambda invocations and new EC2 instance launches) to stabilize dependent subsystems while they recovered; a client-side view of coping with that throttling is sketched after this list.
- By around 3:01 PM PDT, AWS had confirmed that all services were fully restored, though some data-processing backlogs (for example in Redshift and Connect) remained to be cleared.
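To see why that throttling matters on the client side, here's a minimal Python sketch of exponential backoff with jitter, assuming a generic `call` that raises a stand-in `ThrottledError` when the provider pushes back:

```python
import random
import time

class ThrottledError(Exception):
    """Stand-in for a provider throttling response (e.g. HTTP 429 / ThrottlingException)."""

def call_with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0):
    """Retry a throttled call with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: sleep a random amount up to the exponential cap,
            # so retries from many clients don't all land at the same moment.
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))

# Usage (hypothetical): call_with_backoff(lambda: client.invoke(FunctionName="my-fn"))
```

Clients that retry in tight loops during an incident only make the recovery slower, which is exactly why the provider throttles in the first place.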
Final Thoughts
Contrary to what many people thought, this outage wasn’t caused by a cyberattack — rather, it appears to have been an internal update gone wrong.
Still, it’s a vivid reminder: even the biggest cloud provider can experience a disruption, and when it does, many of us feel it. Thinking proactively about architectural resilience and dependent-service risk is more important than ever.
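One concrete way to think about dependent-service risk is to spell the fallback out in code. The sketch below assumes a hypothetical table replicated to a second region (for example via DynamoDB global tables); it's an illustration of the idea, not a complete failover design:

```python
import boto3
from botocore.exceptions import BotoCoreError, ClientError

# Hypothetical setup: the table is replicated to a second region,
# so reads can fall back when the primary region misbehaves.
REGIONS = ["us-east-1", "us-west-2"]

def get_item_with_fallback(table_name: str, key: dict):
    """Try the primary region first; if the call fails, try the replica."""
    last_error = None
    for region in REGIONS:
        client = boto3.client("dynamodb", region_name=region)
        try:
            return client.get_item(TableName=table_name, Key=key).get("Item")
        except (BotoCoreError, ClientError) as exc:
            last_error = exc  # remember the failure and move on to the next region
    raise RuntimeError("All configured regions failed") from last_error

# Usage (hypothetical table and key):
# get_item_with_fallback("orders", {"order_id": {"S": "12345"}})
```

It won't save you from every failure mode, but deciding in advance what happens when a core dependency disappears is far better than discovering it at 11:49 PM.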