Storage Can Be the Key to Aggressively Priced Cloud Offerings

When selling cloud-based services, the set of performance, scalability, and availability requirements is relatively clear

One of the key reasons for moving to a cloud-based infrastructure is to lower overall infrastructure costs. This is true whether you are building an in-house (private) cloud or are a public cloud provider looking to price your offerings competitively. Because storage comprises such a large part of the outlay for any cloud-based infrastructure, it's an obvious place to look for optimizations that lower overall costs. A lower-cost virtual infrastructure gives cloud providers pricing leeway, which they can use either to out-price competitors or to increase margins. Storage technologies introduced within the last five years provide significant opportunities to lower storage costs while still creating the high-performance, scalable, and highly available infrastructures that cloud providers need to meet their customers' business requirements.

Candidate Storage Technologies
The critical technologies in building a cost-effective storage infrastructure include the following:

Scalable, resilient networked storage subsystems. Ensure that the storage you choose is modularly expandable and will scale to meet your business requirements. Networked storage architectures offer better opportunities not only for expansion but also for redundancy and storage sharing, which is critical to support the live migration of virtual machines (VMs) needed to meet uptime requirements. Storage layouts should use RAID for redundancy, provide multiple paths to each storage device for high availability, and support online expansion and maintenance.

Thin provisioning. Historically, storage has been significantly over-provisioned to accommodate growth. Allocated but unused storage is an expensive waste of space, and thin provisioning is a storage technology that effectively addresses this. By transparently allocating storage on demand as environments grow, it frees administrators from having to over-provision. When thin provisioning is initially deployed in an environment, it's not uncommon for it to decrease storage capacity consumption by 70% or more. It allows for higher utilization of existing storage assets, reducing not only hardware infrastructure costs but also energy and floor-space costs.
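The allocate-on-first-write behavior described above can be sketched in a few lines. This is a hypothetical model, not any vendor's API: `ThinPool` and `ThinVolume` are illustrative names, and the "data" is never actually stored.

```python
# Minimal sketch of thin provisioning: a volume advertises far more capacity
# than the pool physically has, and physical blocks are consumed only on the
# first write to each virtual block. Hypothetical illustration only.

class ThinPool:
    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks  # real capacity backing the pool
        self.allocated = 0                      # physical blocks consumed so far

class ThinVolume:
    def __init__(self, pool, virtual_blocks):
        self.pool = pool
        self.virtual_blocks = virtual_blocks    # advertised (virtual) size
        self.mapped = {}                        # virtual block -> physical block

    def write(self, block, data):
        if block not in self.mapped:
            # Allocate-on-first-write: this is where thin differs from thick
            # provisioning, which would reserve every block up front.
            if self.pool.allocated >= self.pool.physical_blocks:
                raise RuntimeError("thin pool exhausted: capacity planning failed")
            self.mapped[block] = self.pool.allocated
            self.pool.allocated += 1
        # (actual data placement elided)

pool = ThinPool(physical_blocks=100)
vol = ThinVolume(pool, virtual_blocks=1000)  # 10:1 oversubscription
vol.write(7, b"data")
vol.write(7, b"data2")                       # rewrite: no new allocation
print(pool.allocated)                        # 1 physical block consumed
```

The oversubscription ratio (10:1 here) is exactly what makes the capacity savings possible, and also what makes the pool-exhaustion case in the sketch a real operational risk.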

However, thin provisioning must be carefully watched when it's deployed in virtual environments. It can be difficult to stay on top of capacity planning, and unexpectedly running out of capacity shuts down VMs, so thin provisioning must be managed to ensure this never occurs. The savings, however, are significant enough to make the effort worthwhile.

On a related topic, pay attention to how storage space reclamation occurs in your virtual infrastructure. When files are deleted, is the storage space that is freed up immediately returned to the storage pool, or does that only occur when the VMs that owned that data are rebooted? Both storage space reclamation and thin provisioning pose additional management challenges when multiple layers of virtualization exist, as is the case in hypervisor-based environments where an array that uses virtual storage technology is in use.
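The reclamation question above can be made concrete with a small model. The names here are hypothetical; the `discard` call stands in for an explicit UNMAP/TRIM-style notification, under the assumption that without it the array cannot tell deleted blocks from live ones.

```python
# Sketch of space reclamation in a thin pool: a guest-level delete frees
# nothing by itself; the pool gets space back only when a discard is issued.
# Hypothetical model, not a real storage API.

class Pool:
    def __init__(self, capacity):
        self.free = set(range(capacity))  # unallocated physical blocks

class Volume:
    def __init__(self, pool):
        self.pool = pool
        self.mapped = {}      # virtual block -> physical block
        self.deleted = set()  # freed by the guest, not yet reclaimed

    def write(self, block):
        self.mapped[block] = self.pool.free.pop()

    def delete(self, block):
        # Without a discard, the array has no idea the block is free:
        # the space stays consumed even though the file is gone.
        self.deleted.add(block)

    def discard(self):
        # Explicit reclamation returns the space to the shared pool.
        for block in self.deleted:
            self.pool.free.add(self.mapped.pop(block))
        self.deleted.clear()

pool = Pool(capacity=10)
vol = Volume(pool)
vol.write(1); vol.write(2)
vol.delete(1)
print(len(pool.free))  # 8 -- the delete alone reclaimed nothing
vol.discard()
print(len(pool.free))  # 9 -- space returns only after the discard
```

With multiple virtualization layers, each layer has to pass the discard down, which is why reclamation behavior deserves scrutiny in hypervisor-plus-virtualized-array stacks.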

Scalable snapshot technologies. Snapshots have all sorts of uses - working from VM templates, providing a safety net during software updates, creating copies for test/dev environments, cloning desktops in VDI environments, and so on - all of which have significant operational value. If you've worked with snapshots in the past, you probably already know that they can impose significant performance penalties. In fact, the impact can be so severe that administrators consciously limit their use of snapshots in some situations. In others, the value snapshots provide has helped drive the purchase of very high-end, very expensive storage arrays that overcome snapshot performance issues. In virtual computing environments, hypervisor-based snapshots generally impose these same types of performance penalties.
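One common source of the snapshot penalty mentioned above is copy-on-write: the first write to any block after a snapshot pays extra I/O to preserve the old data. The sketch below is a simplified, hypothetical model of that mechanism (real arrays also use redirect-on-write and other designs to avoid it).

```python
# Sketch of a copy-on-write snapshot. Taking the snapshot is cheap (it only
# freezes the block map), but the first post-snapshot write to each shared
# block incurs an extra copy - the classic snapshot performance penalty.

class CowVolume:
    def __init__(self, blocks):
        self.blocks = dict(blocks)    # live block map: block number -> data
        self.snapshot_blocks = None   # frozen map captured at snapshot time
        self.shared = set()           # blocks still shared with the snapshot

    def take_snapshot(self):
        self.snapshot_blocks = dict(self.blocks)
        self.shared = set(self.blocks)

    def write(self, block, data):
        # First write to a shared block must copy the old contents aside
        # before overwriting; later writes to the same block are free.
        penalty = block in self.shared
        self.shared.discard(block)
        self.blocks[block] = data
        return penalty                # True -> this write cost extra I/O

vol = CowVolume({0: b"a", 1: b"b"})
vol.take_snapshot()
print(vol.write(0, b"x"))       # True: extra I/O on first post-snapshot write
print(vol.write(0, b"y"))       # False: block already diverged
print(vol.snapshot_blocks[0])   # b'a' -- the snapshot still sees old data
```

The penalty is paid exactly once per shared block, which is why write-heavy workloads under fresh snapshots feel the impact most.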

Snapshots can also be very valuable when used for disk-based backup. Your customers will expect you to protect their data and provide fast recovery with minimal data loss. To provide the best service, data protection operations should be as transparent as possible. The best way to meet these requirements is to use snapshot backups, working with well-defined APIs like the Windows Volume Shadow Copy Service (VSS) to ensure that you can create application-consistent backups for fast, reliable recovery.

The use of disk for backups also allows you to leverage storage capacity optimization (SCO) technologies like data deduplication to minimize the secondary storage capacity needed for data protection operations. Disk also makes it easier to leverage replication in creating disaster recovery (DR) plans for those customers that need them, and asynchronous replication products that use IP networks and support heterogeneous storage offer cost-effective DR options for virtual environments.
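The deduplication idea referenced above can be sketched as content-addressed chunk storage: identical chunks are stored once and referenced by hash. This is a hypothetical illustration; real products also handle variable chunk boundaries, compression, and hash-collision safeguards.

```python
# Sketch of data deduplication for disk-based backup: data is split into
# fixed-size chunks, each chunk is keyed by its SHA-256 digest, and a chunk
# that has been seen before is stored only once.
import hashlib

class DedupStore:
    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes, stored exactly once

    def put(self, data, chunk_size=4096):
        refs = []
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if new
            refs.append(digest)
        return refs  # the "backup" is now just a list of chunk references

store = DedupStore()
backup1 = b"A" * 8192 + b"B" * 4096
backup2 = b"A" * 8192 + b"C" * 4096  # mostly identical to backup1
store.put(backup1)
store.put(backup2)
print(len(store.chunks))  # 3 unique chunks stored instead of 6
```

Because successive backups of the same VMs are overwhelmingly similar, the ratio of logical chunks to stored chunks grows with every backup cycle, which is where the secondary-storage savings come from.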

For cloud computing environments, the ability to use high performance, scalable snapshot technology has real operational value. Each cloud provider will need to evaluate how best to meet this need while still staying within budgetary constraints.

Primary storage optimization. SCO technology is not limited to use with secondary storage, and a number of large storage vendors offer what is called "primary storage optimization" in their product portfolios today. Similar in concept to deduplication (but not in implementation), these products effectively reduce the amount of primary storage capacity required to store a given amount of information. Because of their high performance requirements, primary data stores pose an additional challenge that does not exist for secondary storage: whatever optimization work is done must not impact production performance. Describing the different approaches for achieving primary storage optimization is beyond the scope of this article, but suffice it to say that they can generally reduce the amount of primary storage required for many environments by 70% or more, reducing not only primary storage costs but also secondary storage costs (since less primary storage is being backed up).

Note that certain primary and secondary SCO technologies can be used simultaneously against the same data stores, but care should be taken to ensure that they are complementary. Because there is so much duplicate data in virtualized environments (e.g., many VMs run the same operating systems and applications, etc.), SCO technologies are an excellent fit and can generate significant savings.

Storage Cost Challenges in Cloud Computing Environments
To meet performance, scalability, and availability requirements, cloud providers often invest in high-end, enterprise-class storage to support their virtual infrastructure. Higher per-terabyte costs can be understood upfront, but there is another issue that can hit cloud providers unexpectedly. Server virtualization is critical to cloud computing, but it poses cost challenges for legacy storage architectures. Because many VMs, each with its own independent workload, are placed on each host, the I/O patterns these hosts generate are much more random and much more write-intensive than those generated by physical servers running dedicated applications. This randomness lowers storage performance, driving the purchase of more spindles or premium storage technologies like solid-state disk (SSD) to meet performance requirements. This poses a conundrum for those building cloud infrastructures: how do you create a cost-effective platform when server virtualization requires more storage, actually driving storage costs up?

Cloud providers often consider the judicious use of SSD to reduce spindle count while maintaining high performance. SSD offers excellent read performance and good sequential write performance, but relatively poor random write performance. The challenge in virtual computing environments is managing random, write-intensive workloads, so SSD by itself is only a partial solution.

Interestingly, if there were a way to turn all those random writes into sequential writes, it could deliver a significant performance improvement without requiring any other infrastructure changes. Enterprise databases have used a logging architecture to do just that for decades. All writes go first to a persistent log, which generates the write acknowledgement back to the database; this takes the randomness out of the I/O stream, so the performance of the environment is determined by the sequential, not random, write performance of the device hosting the log. The logged writes are later asynchronously de-staged to primary storage, an operation that has no performance impact on the database. This technique increases the IOPS per spindle that any given storage technology can sustain, a speedup that varies between 3x and 10x depending on the storage technology in use.
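The log-then-destage flow described above can be sketched as follows. This is a toy model under stated assumptions, not any vendor's implementation: `LoggedDevice`, its acknowledgement, and the in-memory structures are all illustrative.

```python
# Sketch of write logging: random writes are acknowledged as soon as they
# are appended sequentially to a log, then asynchronously de-staged to
# their scattered primary-storage locations. Hypothetical illustration.

class LoggedDevice:
    def __init__(self):
        self.log = []      # append-only log: the fast, sequential path
        self.primary = {}  # block -> data on primary storage

    def write(self, block, data):
        # Fast path: the device hosting the log only ever sees a
        # sequential append, regardless of how random 'block' is.
        self.log.append((block, data))
        return "ack"       # acknowledged before primary storage is touched

    def destage(self):
        # Slow path, run asynchronously: apply logged writes to their
        # real locations without blocking the writer.
        for block, data in self.log:
            self.primary[block] = data
        self.log.clear()

dev = LoggedDevice()
for block in (907, 12, 5530, 3):   # random block addresses from many VMs
    dev.write(block, b"x")
dev.destage()
print(sorted(dev.primary))         # [3, 12, 907, 5530]
```

The key property is visible in `write`: latency is governed by the sequential append, while the random placement is deferred to `destage`, off the critical path.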

What makes this particularly relevant for virtual computing environments is that several vendors have implemented it in software at the storage layer, not the application layer. Implemented there, the speedups it produces are available to all applications, not just a single database. For any given storage configuration, it reduces the number of spindles required to meet a given performance requirement, regardless of the storage technology in use - generally by at least 30%. It even speeds up SSD, since it allows SSD to operate at sequential rather than random write speeds.

This capability goes by the name of virtual storage optimization technology. It can be used in a complementary manner with the other storage technologies mentioned, is transparent to applications, and can be used with any heterogeneous, block-based storage. Much like the way server virtualization technology allowed organizations to get higher utilization out of their existing server hardware, virtual storage optimization technology does the same thing for storage hardware.

For Cloud Providers, Cost Is Critical
When selling cloud-based services, the set of performance, scalability, and availability requirements is relatively clear, and building the storage infrastructure to meet those needs will likely account for at least 40% of the overall cost of your virtual infrastructure. But there is a big difference in how each cloud provider chooses to get there, and in how each leverages available storage technologies to meet those requirements. The functionality of cloud service offerings for specific markets may be the same across the providers that address them, but the one who meets those requirements with the most cost-effective virtual infrastructure has a significant leg up on the competition. The storage technologies available in today's market give the savvy cloud provider the tools to achieve this advantage.

More Stories By Eric Burgener

Eric Burgener is vice president product management at Virsto Software. He has worked on emerging technologies for almost his entire career, with early stints at pioneering companies such as Tandem, Pyramid, Sun, Veritas, ConvergeNet, Mendocino, and Topio, among others, on fault tolerance and high availability, replication, backup, continuous data protection, and server virtualization technologies.

Over the last 25 years Eric has worked across a variety of functional areas, including sales, product management, marketing, business development, and technical support, and also spent time as an Executive in Residence with Mayfield and a storage industry analyst at Taneja Group. Before joining Virsto, he was VP of Marketing at InMage.
