@CloudExpo: Blog Feed Post

Evaluating Cloud Computing Uptime SLAs

How many of us evaluate the vendor's Service Level Agreement (SLA) before we decide to deploy workloads in the cloud?

Last week's Windows Azure Storage outage made me wonder how many of us evaluate the vendor's Service Level Agreement (SLA) before we decide to deploy workloads in the cloud. I bet many of us think about it only when it is too late.



Let's take the Windows Azure SLA and see how we, as consumers of cloud services, are protected in case of downtime. First, though, I would like to point out that it is in the nature of any service (public or private) to experience an outage once in a while - think about the power outages that we hear about or live through every winter. It is important to understand that this will happen, and as users of cloud services we need to be prepared for it. In this post I will use Windows Azure as an example, not because its services are better or worse than those of other cloud vendors, but to illustrate how SLAs impact us and how they differ from vendor to vendor.

Each SLA (or at least the ones that the bigger cloud vendors offer) contains a few main sections:

  • Definitions - defining the terms used in the document
  • Claims - describing how and under what terms one can submit a claim for incidents as well as how much you will be credited
  • Exclusions - describing in what cases the vendor is not liable for the outage
  • The actual SLAs - those can be of two types:
    • Guaranteed performance characteristics of the service
    • Uptime for the service

Looking at the Windows Azure SLAs web page, the first thing you will notice is that there is a different SLA for each service. You don't need to read all of them unless you use all of the services the vendor offers. The main point here is that you need to read the SLAs for the services you use. If, for example, you use Windows Azure Storage and Windows Azure Compute, you will notice that their uptime guarantees differ by 0.05 percentage points (Compute has an uptime guarantee of 99.95% while Storage has an uptime guarantee of 99.90%). Although this difference looks negligible at first sight, an SLA calculator will show you that the acceptable downtime for Storage is twice the acceptable downtime for Compute. The closer the guaranteed uptime is to 100%, the less downtime the vendor allows itself.
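To make the Compute versus Storage comparison concrete, here is a minimal sketch of what an SLA calculator does. The function name allowed_downtime and the 30-day month are assumptions made for illustration; they are not taken from any vendor's SLA, which defines its own measurement period and exclusions.

```python
from datetime import timedelta


def allowed_downtime(uptime_percent: float, period: timedelta) -> timedelta:
    """Return the downtime an uptime guarantee tolerates over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The tolerated downtime is simply the non-guaranteed fraction of the period.
    return period * ((100 - uptime_percent) / 100)


# Assumption: a 30-day billing month; real SLAs define their own period.
MONTH = timedelta(days=30)

compute_budget = allowed_downtime(99.95, MONTH)  # guarantee quoted for Compute
storage_budget = allowed_downtime(99.90, MONTH)  # guarantee quoted for Storage

print("Compute:", compute_budget)
print("Storage:", storage_budget)
```

Under these assumptions the Compute budget comes to about 21.6 minutes per month and the Storage budget to about 43.2 minutes, which is where the "twice as much" observation comes from.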

The next thing that you need to keep in mind is the timeframe over which the uptime is calculated. In the case of Windows Azure, the uptime is guaranteed on a monthly basis (for both Storage and Compute). In comparison, Amazon's EC2 has an annual uptime guarantee. Monthly SLA guarantees are preferable because they cover the case where the service experiences a severe outage in one particular month and stays up for the rest of the year. To illustrate this point, imagine that EC2 experiences an outage of 3 hours in a particular month and stays up for the other 11 months. This outage is within the roughly 4 hours and 23 minutes of downtime per year that a 99.95% guarantee allows, so you will not be eligible for credit for it. On the other hand, if the SLA guarantee is on a monthly basis, you will be eligible for the maximum credit because the outage far exceeds the roughly 21 minutes of acceptable downtime per month.
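The 3-hour example can be checked with a short sketch. The helper breaches_sla is hypothetical, and the 365-day year and 30-day month are assumptions; an actual SLA computes availability from measured service minutes and applies its own exclusions, so treat this only as an illustration of why the measurement window matters.

```python
from datetime import timedelta


def breaches_sla(outage: timedelta, uptime_percent: float, period: timedelta) -> bool:
    """Return True if the outage exceeds the downtime the guarantee tolerates."""
    budget = period * ((100 - uptime_percent) / 100)
    return outage > budget


outage = timedelta(hours=3)

# Assumption: a 365-day year and a 30-day month as the measurement windows.
annual_breach = breaches_sla(outage, 99.95, timedelta(days=365))
monthly_breach = breaches_sla(outage, 99.95, timedelta(days=30))

print("Breaches annual guarantee:", annual_breach)
print("Breaches monthly guarantee:", monthly_breach)
```

The same outage stays inside the annual budget but breaks the monthly one, so only the monthly guarantee would make it eligible for credit.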

One note about the acceptable downtime. In reality, hardware in cloud data centers fails all the time, which may result in downtime for your particular service without impacting other services or workloads. Such outages are normally covered by the exclusion clause of the SLA and are your own responsibility. You should follow the standard architectural practices for cloud applications and always make your services redundant in order to avoid this. The acceptable downtime metric is calculated for outages that impact a large number of services or customers. Surprisingly, though, nowhere in the SLAs is it mentioned how many customers need to be impacted in order for the vendor to report the outage. It may happen that a rack of servers in the data center goes down and a few dozen customers are impacted for some amount of time. If you are one of those, do not expect to see an official statement from the cloud vendor about the outage. As a rule of thumb, if the outage doesn't show up in the news, you may have a hard time proving that you deserve credit.

The last thing to keep in mind when evaluating SLAs from big cloud providers is Beta and trial services. It is simple - there are no SLAs for services released in Beta. You are free to use them at your own risk, but don't expect any uptime guarantees from the vendor.

Where the so-called secondary cloud providers are concerned, you need to be much more careful. Those providers (and there are a lot of them) build their services on top of the bigger cloud vendors and thus are very dependent on the uptime of the big guys. Hence they don't publish standard SLAs but negotiate contracts on a customer-by-customer basis. Most of the time the terms depend on the amount of business you bring them, and you can count on good terms if you are a big customer. Of course, they put a lot of effort into helping you design your application for redundancy so that you avoid having to invoke the SLA because of a primary vendor outage. In the opposite case, where you are a single developer, you may end up without any uptime guarantees from smaller cloud vendors.

More Stories By Toddy Mladenov

Toddy Mladenov has more than 15 years of experience in software development and technology consulting at companies like Microsoft, SAP and 3Com. Currently he is the CTO of Agitare Technologies, Inc. - a boutique consulting company that specializes in Cloud Computing and Big Data Solutions. Before Agitare Tech, Toddy spent a few years with PaaS startup Apprenda and more than six years working on Microsoft's cloud computing platform Windows Azure, Windows Client and MSN/Windows Live. During his career at Microsoft he managed different aspects of the software development process for Windows Azure and Windows Services. He also evangelized Microsoft cloud services among open source communities like PHP and Java. In the past he developed enterprise software for German software giant SAP and several startups in Europe, and managed the technical sales for 3Com in the Balkan region.

With his broad industry experience, international background and end-user point of view, Toddy has a unique approach towards technology. He believes that technology should be developed to improve people's lives and is eager to share his knowledge on topics like cloud computing, mobile and web development.
