Welcome!

@CloudExpo Authors: Elizabeth White, Carmen Gonzalez, Aruna Ravichandran, Liz McMillan, Patrick Hubbard

Related Topics: @CloudExpo

@CloudExpo: Blog Feed Post

Evaluating Cloud Computing Uptime SLAs

How many of us evaluate the vendor's Service Level Agreement (SLA) before they decide to deploy workloads in the cloud?

Last week's Windows Azure Storage outage made me thinking how many of us evaluate the vendor's Service Level Agreement (SLA) before they decide to deploy workloads in the cloud. I bet many think about it only when it is too late.



Let's take Windows Azure SLA and see how we as consumers of the cloud services are protected in case of downtime. Before all though I would like to point out that it is in the nature of any service (public or private) to experience outage once in a while - think about power outages that we hear about or live through every winter. It is important to understand that this will happen and as users of cloud services we need to be prepared for it. In this post I will use Windows Azure as example not because their services are better or worse than the other cloud vendors but to illustrate how the SLAs impact us and how they differ from vendor to vendor.

Each SLA (or at least the ones that bigger cloud vendors offer) contains few main sections:

  • Definitions - defining the terms used in the document
  • Claims - describing how and under what terms one can submit a claim for incidents as well as how much you will be credited
  • Exclusions - describing in what cases the vendor is not liable for the outage
  • The actual SLAs - those can be two types:
    • Guaranteed performance characteristics of the service
    • Uptime for the service

Looking at Windows Azure SLAs web page the first thing you will notice is that there are different SLAs for each service. You don't need to read all of them unless you utilize all of the services the vendor offer. The main point here is that you need to read the SLAs for the services you use. If, for example you use Windows Azure Storage and Windows Azure Compute you will notice that the uptime for those differ by 0.05% (Compute has uptime guarantee of 99.95% while Storage has uptime guarantee of 99.90%). Although this number is negligible at first sight using an SLA calculator you will notice that the expected downtime for Storage is twice as much as the expected downtime for Compute. It is obvious that the closer the uptime is to 100% the better the service is.

The next thing that you need to keep in mind is the timeframe for which the uptime is calculated for. In the case of Windows Azure the uptime is guaranteed on a monthly basis (for both Storage and Compute). In comparison Amazon's EC2 has annual uptime guarantee. Monthly SLA guarantees are preferable because you will avoid the case where the service experiences severe outage in particular month and stays up the rest of the year. Just to illustrate the last point imagine that EC2 experiences outage of 3h in particular month and stays up for the next 11 months. This outage is less than the 99.95% guarantee or 4:22:47.99 hours acceptable downtime per year and you will not be eligible for credit for it. On the other side if the SLA guarantee is on a monthly basis you will be eligible for the maximum credit for it because it severely exceeds the 21 minutes acceptable downtime per month.

One note about the acceptable downtime. In reality hardware in cloud data-centers fails all the time, which may result in downtime for your particular service but will not impact other services or workloads. Such outages are normally covered by the exclusion clause of the SLA and are your own responsibility. You should follow the standard architectural practices for cloud application and always make your services redundant in order to avoid this. The acceptable downtime metric is calculated for outages that impact vast amount of services or customers. Surprisingly though nowhere in the SLAs is mentioned how many customers need to be impacted in order for the vendor to report the outage. It may happen that a rack of servers in the datacenter goes down and few tens of customers are impacted for some amount of time. If you are one of those do not expect to see official statement from the cloud vendor about the outage. As a rule of thumb if the outage doesn't show up in the news you may have hard time proving that you deserve credit.

The last thing to keep in mind when evaluating SLAs from big cloud providers is the Beta and trial services. It is simple - there are no SLAs for services released in Beta functionality. You are free to use them at your own risk but don't expect any guarantees for uptime from the vendor.

When the so called secondary cloud providers are concerned you need to be much more careful. Those providers (and there are a lot of them) build their services on top of the bigger cloud vendors and thus are very dependent on the uptimes from the big guys. Hence they don't publish standard SLAs but negotiate the contracts on customer-by-customer basis. Most of the time this is based on the size of business you create for them and you can rely on good terms if you are big customer. Of course they put a lot of effort in helping you design your application for redundancy and avoid the risk of executing the SLA because of primary vendor outage. In the opposite case where you are a single developer you may end up without any guarantees for uptime from smaller cloud vendors.

More Stories By Toddy Mladenov

Toddy Mladenov has more than 15 years experience in software development and technology consulting at companies like Microsoft, SAP and 3Com. Currently he is a CTO of Agitare Technologies, Inc. - a boutique consulting company that specializes in Cloud Computing and Big Data Solutions. Before Agitare Tech Toddy spent few years with PaaS startup Apprenda and more than six years working on Microsft's cloud computing platform Windows Azure, Windows Client and MSN/Windows Live. During his career at Microsoft he managed different aspects of the software development process for Windows Azure and Windows Services. He also evangelized Microsoft cloud services among open source communities like PHP and Java. In the past he developed enterprise software for German's software giant SAP and several startups in Europe, and managed the technical sales for 3Com in the Balkan region.

With his broad industry experience, international background and end-user point of view Toddy has an unique approach towards technology. He believes that technology should be develop to improve people's lives and is eager to share his knowledge in topics like cloud computing, mobile and web development.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
"We're bringing out a new application monitoring system to the DevOps space. It manages large enterprise applications that are distributed throughout a node in many enterprises and we manage them as one collective," explained Kevin Barnes, President of eCube Systems, in this SYS-CON.tv interview at DevOps at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...
@DevOpsSummit at Cloud taking place June 6-8, 2017, at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long developm...
Updating DevOps to the latest production data slows down your development cycle. Probably it is due to slow, inefficient conventional storage and associated copy data management practices. In his session at @DevOpsSummit at 20th Cloud Expo, Dhiraj Sehgal, in Product and Solution at Tintri, will talk about DevOps and cloud-focused storage to update hundreds of child VMs (different flavors) with updates from a master VM in minutes, saving hours or even days in each development cycle. He will also...
A look across the tech landscape at the disruptive technologies that are increasing in prominence and speculate as to which will be most impactful for communications – namely, AI and Cloud Computing. In his session at 20th Cloud Expo, Curtis Peterson, VP of Operations at RingCentral, will highlight the current challenges of these transformative technologies and share strategies for preparing your organization for these changes. This “view from the top” will outline the latest trends and developm...
"There's a growing demand from users for things to be faster. When you think about all the transactions or interactions users will have with your product and everything that is between those transactions and interactions - what drives us at Catchpoint Systems is the idea to measure that and to analyze it," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York Ci...
The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal ...
@DevOpsSummit taking place June 6-8, 2017 at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. @DevOpsSummit at Cloud Expo New York Call for Papers is now open.
SYS-CON Events announced today that Dataloop.IO, an innovator in cloud IT-monitoring whose products help organizations save time and money, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Dataloop.IO is an emerging software company on the cutting edge of major IT-infrastructure trends including cloud computing and microservices. The company, founded in the UK but now based in San Fran...
Discover top technologies and tools all under one roof at April 24–28, 2017, at the Westin San Diego in San Diego, CA. Explore the Mobile Dev + Test and IoT Dev + Test Expo and enjoy all of these unique opportunities: The latest solutions, technologies, and tools in mobile or IoT software development and testing. Meet one-on-one with representatives from some of today's most innovative organizations
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in Embedded and IoT solutions, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 7-9, 2017, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and E...
SYS-CON Events announced today that Linux Academy, the foremost online Linux and cloud training platform and community, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Linux Academy was founded on the belief that providing high-quality, in-depth training should be available at an affordable price. Industry leaders in quality training, provided services, and student certification passes, its goal is to c...
20th Cloud Expo, taking place June 6-8, 2017, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy.
IoT is at the core or many Digital Transformation initiatives with the goal of re-inventing a company's business model. We all agree that collecting relevant IoT data will result in massive amounts of data needing to be stored. However, with the rapid development of IoT devices and ongoing business model transformation, we are not able to predict the volume and growth of IoT data. And with the lack of IoT history, traditional methods of IT and infrastructure planning based on the past do not app...
The unique combination of Amazon Web Services and Cloud Raxak, a Gartner Cool Vendor in IT Automation, provides a seamless and cost-effective way of securely moving on-premise IT workloads to Amazon Web Services. Any enterprise can now leverage the cloud, manage risk, and maintain continuous security compliance. Forrester's analysis shows that enterprises need automated security to lower security risk and decrease IT operational costs. Through the seamless integration into Amazon Web Services, ...
Due of the rise of Hadoop, many enterprises are now deploying their first small clusters of 10 to 20 servers. At this small scale, the complexity of operating the cluster looks and feels like general data center servers. It is not until the clusters scale, as they inevitably do, when the pain caused by the exponential complexity becomes apparent. We've seen this problem occur time and time again. In his session at Big Data Expo, Greg Bruno, Vice President of Engineering and co-founder of StackIQ...
Containers have changed the mind of IT in DevOps. They enable developers to work with dev, test, stage and production environments identically. Containers provide the right abstraction for microservices and many cloud platforms have integrated them into deployment pipelines. DevOps and containers together help companies achieve their business goals faster and more effectively. In his session at DevOps Summit, Ruslan Synytsky, CEO and Co-founder of Jelastic, reviewed the current landscape of Dev...
WebRTC is the future of browser-to-browser communications, and continues to make inroads into the traditional, difficult, plug-in web communications world. The 6th WebRTC Summit continues our tradition of delivering the latest and greatest presentations within the world of WebRTC. Topics include voice calling, video chat, P2P file sharing, and use cases that have already leveraged the power and convenience of WebRTC.
"A lot of times people will come to us and have a very diverse set of requirements or very customized need and we'll help them to implement it in a fashion that you can't just buy off of the shelf," explained Nick Rose, CTO of Enzu, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
WebRTC sits at the intersection between VoIP and the Web. As such, it poses some interesting challenges for those developing services on top of it, but also for those who need to test and monitor these services. In his session at WebRTC Summit, Tsahi Levent-Levi, co-founder of testRTC, reviewed the various challenges posed by WebRTC when it comes to testing and monitoring and on ways to overcome them.