|By Greg Schulz||
|January 9, 2013 12:00 PM EST||
This is the second in a two-part industry trends and perspective looking at learning from cloud incidents, view part I here.
There is good information, insight and lessons to be learned from cloud outages and other incidents.
Sorry cynics no that does not mean an end to clouds, as they are here to stay. However when and where to use them, along with what best practices, how to be ready and configure for use are part of the discussion. This means that clouds may not be for everybody or all applications, or at least today. For those who are into clouds for the long haul (either all in or partially) including current skeptics, there are many lessons to be learned and leveraged.
In order to gain confidence in clouds, some questions that I routinely am asked include are clouds more or less reliable than what you are doing? Depends on what you are doing, and how you will be using the cloud services. If you are applying HA and other BC or resiliency best practices, you may be able to configure and isolate from the more common situations. On the other hand, if you are simply using the cloud services as a low-cost alternative selecting the lowest price and service class (SLAs and SLOs), you might get what you paid for. Thus, clouds are a shared responsibility, the service provider has things they need to do, and the user or person designing how the service will be used have some decisions making responsibilities.
Keep in mind that high availability (HA), resiliency, business continuance (BC) along with disaster recovery (DR) are the sum of several pieces. This includes people, best practices, processes including change management, good design eliminating points of failure and isolating or containing faults, along with how the components or technology used (e.g. hardware, software, networks, services, tools). Good technology used in goods ways can be part of a highly resilient flexible and scalable data infrastructure. Good technology used in the wrong ways may not leverage the solutions to their full potential.
While it is easy to focus on the physical technologies (servers, storage, networks, software, facilities), many of the cloud services incidents or outages have involved people, process and best practices so those need to be considered.
These incidents or outages bring awareness, a level set, that this is still early in the cloud evolution lifecycle and to move beyond seeing clouds as just a way to cut cost, and seeing the importance and value HA, resiliency, BC and DR. This means learning from mistakes, taking action to correct or fix errors, find and cut points of failure are part of a technology maturing or the use of it. These all tie into having services with service level agreements (SLAs) with service level objectives (SLOs) for availability, reliability, durability, accessibility, performance and security among others to protect against mayhem or other things that can and do happen.
Images licensed for use by StorageIO via Atomazul / Shutterstock.com
The reason I mentioned earlier that AWS had another incident is that like their peers or competitors who have incidents in the past, AWS appears to be going through some growing, maturing, evolution related activities. During summer 2012 there was an AWS incident that affected Netflix (read more here: AWS and the Netflix Fix?). It should also be noted that there were earlier AWS outages where Netflix (read about Netflix architecture here) leveraged resiliency designs to try and prevent mayhem when others were impacted.
Is AWS a lightning rod for things to happen, a point of attraction for Mayhem and others?
Granted given their size, scope of services and how being used on a global basis AWS is blazing new territory and experiences, similar to what other information services delivery platforms did in the past. What I mean is that while taken for granted today, open systems Unix, Linux, Windows-based along with client-server, midrange or distributed systems, not to mention mainframe hardware, software, networks, processes, procedures, best practices all went through growing pains.
There are a couple of interesting threads going on over in various LinkedIn Groups based on some reporters stories including on speculation of what happened, followed with some good discussions of what actually happened and how to prevent recurrence of them in the future.
Over in the Cloud Computing, SaaS & Virtualization group forum, this thread is based on a Forbes article (Amazon AWS Takes Down Netflix on Christmas Eve) and involves conversations about SLAs, best practices, HA and related themes. Have a look at the story the thread is based on and some of the assertions being made, and ensuing discussions.
Also over at LinkedIn, in the Cloud Hosting & Service Providers group forum, this thread is based on a story titled Why Netflix' Christmas Eve Crash Was Its Own Fault with a good discussion on clouds, HA, BC, DR, resiliency and related themes.
Over at the Virtualization Practice, there is a piece titled Is Amazon Ruining Public Cloud Computing? with comments from me and Adrian Cockcroft (@Adrianco) a Netflix Architect (you can read his blog here). You can also view some presentations about the Netflix architecture here.
What this all means
Saying you get what you pay for would be too easy and perhaps not applicable.
There are good services free, or low-cost, just like good free content and other things, however vice versa, just because something costs more, does not make it better.
Otoh, there are services that charge a premium however may have no better if not worse reliability, same with content for fee or perceived value that is no better than what you get free.
Additional related material
- Cloud conversations: confidence, certainty and confidentiality
- Only you can prevent cloud data loss (shared responsibility)
- The blame game: Does cloud storage result in data loss?
- Amazon Web Services (AWS) and the Netflix Fix?
- Cloud conversations: AWS Government Cloud (GovCloud)
- Everything Is Not Equal in the Data center
- Cloud and Virtual Data Storage Networking (CRC) - Intel Recommended Reading List
Some closing thoughts:
- Clouds are real and can be used safely; however, they are a shared responsibility.
- Only you can prevent cloud data loss, which means do your homework, be ready.
- If something can go wrong, it probably will, particularly if humans are involved.
- Prepare for the unexpected and clarify assumptions vs. realities of service capabilities.
- Leverage fault isolation and containment to prevent rolling or spreading disasters.
- Look at cloud services beyond lowest cost or for cost avoidance.
- What is your organizations culture for learning from mistakes vs. fixing blame?
- Ask yourself if you, your applications and organization are ready for clouds.
- Ask your cloud providers if they are ready for you and your applications.
- Identify what your cloud concerns are to decide what can be done about them.
- Do a proof of concept to decide what types of clouds and services are best for you.
Do not be scared of clouds, however be ready, do your homework, learn from the mistakes, misfortune and errors of others. Establish and leverage known best practices while creating new ones. Look at the past for guidance to the future, however avoid clinging to, and bringing the baggage of the past to the future. Use new technologies, tools and techniques in new ways vs. using them in old ways.
Ok, nuff said.
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2013 StorageIO All Rights Reserved
The web app is agile. The REST API is agile. The testing and planning are agile. But alas, data infrastructures certainly are not. Once an application matures, changing the shape or indexing scheme of data often forces at best a top down planning exercise and at worst includes schema changes that force downtime. The time has come for a new approach that fundamentally advances the agility of distributed data infrastructures. Come learn about a new solution to the problems faced by software organ...
Sep. 2, 2015 05:45 PM EDT Reads: 101
In 2014, the market witnessed a massive migration to the cloud as enterprises finally overcame their fears of the cloud’s viability, security, etc. Over the past 18 months, AWS, Google and Microsoft have waged an ongoing battle through a wave of price cuts and new features. For IT executives, sorting through all the noise to make the best cloud investment decisions has become daunting. Enterprises can and are moving away from a "one size fits all" cloud approach. The new competitive field has ...
Sep. 2, 2015 05:45 PM EDT Reads: 112
SYS-CON Events announced today that Micron Technology, Inc., a global leader in advanced semiconductor systems, will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Micron’s broad portfolio of high-performance memory technologies – including DRAM, NAND and NOR Flash – is the basis for solid state drives, modules, multichip packages and other system solutions. Backed by more than 35 years of tech...
Sep. 2, 2015 05:30 PM EDT Reads: 258
DevOps Summit, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development...
Sep. 2, 2015 04:45 PM EDT Reads: 1,573
Introducing Containers & Microservices Bootcamp at @CloudExpo Silicon Valley | #Containers #Microservices
SYS-CON Events announced today the Containers & Microservices Bootcamp, being held November 3-4, 2015, in conjunction with 17th Cloud Expo, @ThingsExpo, and @DevOpsSummit at the Santa Clara Convention Center in Santa Clara, CA. This is your chance to get started with the latest technology in the industry. Combined with real-world scenarios and use cases, the Containers and Microservices Bootcamp, led by Janakiram MSV, a Microsoft Regional Director, will include presentations as well as hands-on...
Sep. 2, 2015 04:45 PM EDT Reads: 387
Advances in technology and ubiquitous connectivity have made the utilization of a dispersed workforce more common. Whether that remote team is located across the street or country, management styles/ approaches will have to be adjusted to accommodate this new dynamic. In his session at 17th Cloud Expo, Sagi Brody, Chief Technology Officer at Webair Internet Development Inc., will focus on the challenges of managing remote teams, providing real-world examples that demonstrate what works and what...
Sep. 2, 2015 04:37 PM EDT
SYS-CON Events announced today that Pythian, a global IT services company specializing in helping companies leverage disruptive technologies to optimize revenue-generating systems, has been named “Bronze Sponsor” of SYS-CON's 17th Cloud Expo, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Founded in 1997, Pythian is a global IT services company that helps companies compete by adopting disruptive technologies such as cloud, Big Data, advance...
Sep. 2, 2015 04:15 PM EDT Reads: 347
In his session at @ThingsExpo, Lee Williams, a producer of the first smartphones and tablets, will talk about how he is now applying his experience in mobile technology to the design and development of the next generation of Environmental and Sustainability Services at ETwater. He will explain how M2M controllers work through wirelessly connected remote controls; and specifically delve into a retrofit option that reverse-engineers control codes of existing conventional controller systems so the...
Sep. 2, 2015 04:15 PM EDT Reads: 207
API-Driven Digital Healthcare Solution By @AkanaInc | @DevOpsSummit #API #IoT #DevOps #Microservices
Akana has announced the availability of the new Akana Healthcare Solution. The API-driven solution helps healthcare organizations accelerate their transition to being secure, digitally interoperable businesses. It leverages the Health Level Seven International Fast Healthcare Interoperability Resources (HL7 FHIR) standard to enable broader business use of medical data. Akana developed the Healthcare Solution in response to healthcare businesses that want to increase electronic, multi-device acce...
Sep. 2, 2015 04:00 PM EDT Reads: 276
17th Cloud Expo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises ar...
Sep. 2, 2015 04:00 PM EDT Reads: 1,577
The 3rd International WebRTC Summit, to be held Nov. 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA, announces that its Call for Papers is now open. Topics include all aspects of improving IT delivery by eliminating waste through automated business models leveraging cloud technologies. WebRTC Summit is co-located with 15th International Cloud Expo, 6th International Big Data Expo, 3rd International DevOps Summit and 2nd Internet of @ThingsExpo. WebRTC (Web-based Real-Time Com...
Sep. 2, 2015 03:45 PM EDT Reads: 1,559
Moving an existing on-premise infrastructure into the cloud can be a complex and daunting proposition. It is critical to understand the benefits as well as the challenges associated with either a full or hybrid approach. In his session at 17th Cloud Expo, Richard Weiss, Principal Consultant at Pythian, will present a roadmap that can be leveraged by any organization to plan, analyze, evaluate and execute on a cloud migration solution. He will review the five major cloud transformation phases a...
Sep. 2, 2015 03:30 PM EDT
The 17th International Cloud Expo has announced that its Call for Papers is open. 17th International Cloud Expo, to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, APM, APIs, Microservices, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding bu...
Sep. 2, 2015 03:30 PM EDT Reads: 1,640
The 5th International DevOps Summit, co-located with 17th International Cloud Expo – being held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the ...
Sep. 2, 2015 03:15 PM EDT Reads: 1,621
SYS-CON Events announced today that the "Second Containers & Microservices Expo" will take place November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities.
Sep. 2, 2015 03:15 PM EDT Reads: 623
Culture is the most important ingredient of DevOps. The challenge for most organizations is defining and communicating a vision of beneficial DevOps culture for their organizations, and then facilitating the changes needed to achieve that. Often this comes down to an ability to provide true leadership. As a CIO, are your direct reports IT managers or are they IT leaders? The hard truth is that many IT managers have risen through the ranks based on their technical skills, not their leadership ab...
Sep. 2, 2015 03:00 PM EDT Reads: 436
IBM’s Blue Box Cloud, powered by OpenStack, is now available in any of IBM’s globally integrated cloud data centers running SoftLayer infrastructure. Less than 90 days after its acquisition of Blue Box, IBM has integrated its Blue Box Cloud Dedicated private-cloud-as-a-service into its broader portfolio of OpenStack® based solutions. The announcement, made today at the OpenStack Silicon Valley event, further highlights IBM’s continued support to deliver OpenStack solutions across all cloud depl...
Sep. 2, 2015 03:00 PM EDT Reads: 289
Mobile, social, Big Data, and cloud have fundamentally changed the way we live. “Anytime, anywhere” access to data and information is no longer a luxury; it’s a requirement, in both our personal and professional lives. For IT organizations, this means pressure has never been greater to deliver meaningful services to the business and customers.
Sep. 2, 2015 02:00 PM EDT Reads: 821
SYS-CON Events announced today that HPM Networks will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. For 20 years, HPM Networks has been integrating technology solutions that solve complex business challenges. HPM Networks has designed solutions for both SMB and enterprise customers throughout the San Francisco Bay Area.
Sep. 2, 2015 01:30 PM EDT Reads: 939
Too often with compelling new technologies market participants become overly enamored with that attractiveness of the technology and neglect underlying business drivers. This tendency, what some call the “newest shiny object syndrome,” is understandable given that virtually all of us are heavily engaged in technology. But it is also mistaken. Without concrete business cases driving its deployment, IoT, like many other technologies before it, will fade into obscurity.
Sep. 2, 2015 12:15 PM EDT Reads: 414