RightScale Blog

Cloud Management Blog

Outage

Lessons Learned from Recent Cloud Outages

Posted by Uri Budnik | February 27, 2013 | 9 comments
Outages happen, and they happen everywhere. Whether you leverage a public cloud, a hosting provider, or your own data center, infrastructure downtime is inevitable. Equipment breaks or does not function as expected, software bugs slip by, natural disasters occur, and unforeseen situations lead to unexpected consequences. Sometimes services are degraded and sometimes complete data centers go dark...

AWS Outage Lessons Learned: If Netflix Can Suffer, So Can You

Posted by Brian Adler | January 04, 2013 | 11 comments
On Christmas Eve and continuing into Christmas Day, AWS had a “Service Event” centered on the ELB (Elastic Load Balancing) service in the US-East region. Although only a small percentage of ELBs were functionally disabled and unable to route traffic to their backend servers, all ELBs in the region experienced a time interval in which they could not scale, nor could changes be made to their...

Cloud Architecture 2013: Top 9 Fine-Tuning Tips

Posted by Brian Adler | December 18, 2012 | 3 comments
Before you congratulate yourself on crossing off those final to-dos for 2012, don’t forget this critical one: fine-tuning your applications and cloud architecture. And while it’s unlikely that you’ll accomplish this task by end of year, here are nine tips to help you optimally manage your cloud applications and architecture in 2013.

AWS Outage Follow-Up

A week after the April 21, 2011, outage, AWS posted a detailed postmortem explaining what happened. It'll be interesting to see how everyone digests the very detailed account. Since AWS did not provide an executive summary, I'll try my hand at one: The outage was triggered by an operator error during a router upgrade, which funneled very high-volume network traffic into a low-bandwidth control...

Amazon EC2 Outage: Summary and Lessons Learned

Last Thursday's Amazon EC2 outage was the worst in cloud computing's history. It made the front page of many news sites, including the New York Times, probably because many people were shocked by how many web sites and services rely on EC2. Seeing so much affected was a very graphic illustration of how pervasive cloud computing has become. I will try to summarize what happened, what worked and...