The AWS Outage: The Situation Is Catastrophic But Not Serious

The benefits of on-demand cloud infrastructure remain strong cloud drivers today, just as they were last week

The recent outage of the Amazon Web Services (AWS) east region cloud has earned dramatic monikers such as "cloudgate" and "cloudburst," and has even triggered a creative commiserative competition (http://lat.ms/g7aTDC). Most of us, though, are not surprised that an outage occurred; what remains puzzling is how long it has taken the engineers to right the situation. We look forward to post-mortem reports from AWS that will hopefully help us understand what actually happened. Was there an elusive heisenbug that sprinkled some corrosive pixie dust on the block storage devices? Or was it simply a case of someone making like an air traffic controller and falling asleep at the switch? In any case, full transparency should be the modus operandi here.

Two main themes, though, quickly emerge from this episode.

The first is that a heck of a lot of enterprises are using the public cloud today, and they have selected the AWS cloud to run their applications. These companies include not only the usual social | local | mobile suspects but also companies across the media, technology and government sectors. This clear and vigorous adoption of cloud computing now seems to justify the buzz and hype that "cloud" has garnered over the last few years. How else to account for a failure of block storage devices in one of the clouds of one of the cloud providers yielding coverage on CNN, in the Wall Street Journal and in hundreds of other media outlets?

The second theme that sadly emerges is that while a huge number of companies have adopted the public cloud paradigm, the thought processes behind the design and deployment of their applications on public clouds still seem to follow the traditional datacenter deployment model.

The tremendous ease and benefits of "programmable cloud infrastructure," in which a call to an API can set up infrastructure, configure firewalls, provision storage, enable backups and deploy applications in the cloud, are not being used to automate recovery from such catastrophic failures. This becomes all the more painful when you realize that there is minimal incremental cost to having these automations in place. In the public cloud model, companies do not incur reservation costs for their entire recovery infrastructure.
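To make the idea of recovery driven by API calls concrete, here is a minimal Python sketch. It is not tied to any real cloud SDK: `FakeCloud`, `RECOVERY_STEPS` and `run_recovery` are hypothetical names invented for illustration, and the stand-in client only records the calls it receives. The point is simply that the steps listed above can live in code and be replayed against a standby region on demand.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class FakeCloud:
    """Hypothetical stand-in for a cloud API client; it only records calls."""
    region: str
    calls: List[str] = field(default_factory=list)

    def call(self, action: str) -> None:
        # A real client would issue an API request here.
        self.calls.append(f"{self.region}:{action}")


# The recovery steps, in the order the paragraph above lists them.
RECOVERY_STEPS = (
    "set_up_infrastructure",
    "configure_firewalls",
    "provision_storage",
    "enable_backups",
    "deploy_application",
)


def run_recovery(cloud: FakeCloud, steps: Sequence[str] = RECOVERY_STEPS) -> List[str]:
    """Replay every recovery step, in order, against the given region."""
    for step in steps:
        cloud.call(step)
    return list(cloud.calls)
```

Calling `run_recovery(FakeCloud(region="standby"))` returns the five recorded steps in order. Because nothing is provisioned until the function runs, the standby region costs nothing while it sits idle, which is the economic argument the paragraph makes.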

Organizations that leverage native AWS capabilities, such as creating Amazon Machine Images (AMIs) for all applications, utilizing snapshots and leveraging one or more of the other four geographically isolated AWS regions, can successfully weather these outages. Sure, there will be nuances across the application set: some applications may not recover gracefully with pure automation and will require manual recovery steps.
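The planning logic implied above can be sketched in a few lines of Python. This is an illustrative model only, not AWS tooling: `App`, `pick_recovery_region`, `plan_recovery` and the region names used with them are hypothetical. It assumes an application can be recovered automatically in a region only if that region already holds both a machine image and a snapshot copy; anything else falls back to manual recovery.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Optional

MANUAL = "manual recovery"


@dataclass(frozen=True)
class App:
    """Hypothetical record of where an application's recovery assets live."""
    name: str
    image_regions: FrozenSet[str]     # regions holding a machine image
    snapshot_regions: FrozenSet[str]  # regions holding a data snapshot copy


def pick_recovery_region(app: App, healthy_regions: List[str]) -> Optional[str]:
    """Return the first healthy region with both an image and a snapshot."""
    for region in healthy_regions:
        if region in app.image_regions and region in app.snapshot_regions:
            return region
    return None


def plan_recovery(apps: Iterable[App], healthy_regions: List[str]) -> Dict[str, str]:
    """Map each application to a recovery region, or flag it for manual steps."""
    plan = {}
    for app in apps:
        region = pick_recovery_region(app, healthy_regions)
        plan[app.name] = region if region is not None else MANUAL
    return plan
```

An application whose image and snapshot exist only in the failed region ends up marked `manual recovery`, which mirrors the caveat that not every application will recover with pure automation.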

Netflix, a large AWS user, has institutionalized this in its deployment model. In fact, it frequently lets loose its Chaos Monkey (http://techblog.netflix.com/2010/12/5-lessons-weve-learned-using-aws.html), which constantly forces random failures of even stable AWS instances to ensure that recovery works. Unlike Foursquare, Quora and Hootsuite, Netflix did not report any failures during the current AWS east region outage. Recovery.gov, a prominent federal government website running on AWS, also recovered quickly and gracefully in another AWS region.
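The principle behind this kind of failure drill can be shown with a toy simulation in Python. This is emphatically not Netflix's implementation: `InstancePool`, `unleash_monkey` and `chaos_drill` are hypothetical names, the "instances" are just strings in memory, and the pool is assumed to have a desired size of at least one. The sketch kills a random instance, lets the pool heal itself, and checks that capacity returned.

```python
import random
from typing import List


class InstancePool:
    """Hypothetical self-healing pool that keeps `desired` instances running."""

    def __init__(self, desired: int) -> None:
        self.desired = desired
        self._next_id = 0
        self.instances: List[str] = []
        self.heal()

    def heal(self) -> int:
        """Launch replacements until the pool is at its desired size."""
        launched = 0
        while len(self.instances) < self.desired:
            self.instances.append(f"i-{self._next_id}")
            self._next_id += 1
            launched += 1
        return launched


def unleash_monkey(pool: InstancePool, rng: random.Random) -> str:
    """Terminate one randomly chosen instance and return its id."""
    victim = rng.choice(pool.instances)
    pool.instances.remove(victim)
    return victim


def chaos_drill(pool: InstancePool, rounds: int, seed: int = 0) -> bool:
    """Kill and heal repeatedly; True if the pool recovered every round."""
    rng = random.Random(seed)
    for _ in range(rounds):
        victim = unleash_monkey(pool, rng)
        pool.heal()
        if victim in pool.instances or len(pool.instances) != pool.desired:
            return False
    return True
```

Running `chaos_drill(InstancePool(3), rounds=10)` exercises the recovery path ten times. The value of the drill is that recovery is tested continuously rather than discovered to be broken during a real outage.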

While the failures have been catastrophic, perhaps embarrassing, and will hopefully prompt a review of application deployment and recovery strategies, they are not serious enough to change the dynamics of cloud adoption in the short or long term. The benefits of on-demand cloud infrastructure - such as rapid cycle time, lower capital costs and utility pricing models - remain strong cloud drivers today, just as they were last week.

•   •   •

Originally published on ITworld.

More Stories By Ahmar Abbas

Ahmar Abbas is Sr. VP of Infrastructure Management at San Jose, CA-based CSS Corp. Prior to that, he was VP of Managed Hosting Services at Blackboard Inc. He is the founder of Grid Technology Partners, a consulting firm. A veteran of leading companies on the East Coast and in Silicon Valley, Abbas has also held senior management positions at ONI Systems, Zaffire Inc and UUNET Inc. He started his career at Salomon Brothers. He is the author/editor of Grid Computing: A Practical Guide to Technology and Applications (Delmar Thomson, 2004).
