What Is a Backup?

A backup must be made by copying the source data image when it is in a consistent state

The word "backup" gets thrown around so much that folks tuned in to the world of enterprise storage can start getting surly. One of the best ways to annoy a backup administrator is to start talking about how well the backup will facilitate disaster recovery, e-discovery, and compliance! So what is a backup anyway? Is it different from an archive?

SNIA defines a backup as follows:

"A collection of data stored on (usually removable) non-volatile storage media for purposes of recovery in case the original copy of data is lost or becomes inaccessible; also called a backup copy. To be useful for recovery, a backup must be made by copying the source data image when it is in a consistent state."

This description does not strike me as all that useful, so I put this simple question to a number of folks on Twitter and through direct discussion.
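
Terse as it is, the definition's last clause, that the copy must capture the source in a consistent state, is easy to illustrate in code. The following is a minimal sketch, assuming the source data happens to be a SQLite database; the function name `backup_sqlite` and the file paths are hypothetical. It uses the `Connection.backup` method from Python's standard `sqlite3` module instead of copying the database file while it may be mid-write.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(source_path: Path, backup_path: Path) -> None:
    """Copy a SQLite database to backup_path as a consistent snapshot.

    Illustrative only: the function name and paths are assumptions,
    not part of any product discussed in the article.
    """
    source = sqlite3.connect(source_path)
    target = sqlite3.connect(backup_path)
    try:
        # Connection.backup copies the database through SQLite itself,
        # so the copy reflects one consistent state of the source rather
        # than a file caught halfway through a write.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

Other data sources need their own mechanism for reaching a consistent state (a snapshot, a quiesced application, and so on); the point of the sketch is only that "copy the file" and "copy a consistent image" are not the same operation.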

I contacted W. Curtis Preston, "Mr. Backup", for his opinion. He pointed out that just about any copy of data can be used as a backup, but not all are equally effective. A simple file copy routine might suffice, but managing this might prove troublesome. Preston also warned about relying on backups for more than simple restore: "using a backup as an archive, for example, doesn't make it an archive!"
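
To show what a "simple file copy routine" might look like, here is a minimal sketch in Python. The function name `copy_as_backup` and the timestamped directory layout are assumptions made for illustration, not anything Preston described.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def copy_as_backup(source_dir: Path, backup_root: Path) -> Path:
    """Copy source_dir into a new timestamped directory under backup_root.

    Hypothetical helper: the naming scheme is an assumption.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    target = backup_root / f"{source_dir.name}-{stamp}"
    # copytree creates the target (and any missing parents) and copies
    # the whole directory tree; it fails if the target already exists.
    shutil.copytree(source_dir, target)
    return target
```

The copy itself is a few lines. What the sketch leaves out is the management Preston warns about: nothing here records what was copied, checks that the copy is intact, expires old copies, or helps find the right copy at restore time.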

EMC's Scott Waterhouse also helpfully chimed in, noting that backup data is typically managed independently from production data as well.

"I set the following as a criteria for a backup:
  1. It resides on a piece of storage on a different array and/or in a different location than the source data;
  2. Its creation, aging, and disposition is managed by a backup and restore application that will store the data in a format that is different than the source format (meaning either a different type of file system than the source and/or a different disk format and/or the source is encapsulated in a package as is the case with virtual tape), and with access permissions that are a subset of the permissions associated with the source data.
  3. At some point in its lifecycle, the backup must move offsite"

Waterhouse recently wrestled with the issue of differentiating backups from mere copies on his own backup blog, and I urge you to take a look at what he wrote there as well.

It seems we all agree on a number of essential elements that define a backup:

  1. A backup is a copy of a set of data. It must be logically distinct from the primary data set.
  2. Backups themselves should be protected or offline so they are not affected as use of the primary data set continues.
  3. The sole purpose of a backup is to allow for restore or recovery of data in whole or part. It is not appropriate to rely on a backup for other purposes.
  4. The backup process should be managed, with metrics, logging, and indexes to facilitate efficient recovery.
  5. Recoveries normally seek a coherent point-in-time representation, even if the backup system copies data more frequently or uses incremental or differential techniques.
  6. The existence of the backup should not affect the performance or usability of the primary data set.
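
Several of these elements (a logically distinct copy, an index to support recovery, and a point-in-time restore) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real backup product: the function names, the `data/` plus `manifest.json` layout, and the use of SHA-256 checksums are all hypothetical choices.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_backup(source_dir: Path, backup_root: Path) -> Path:
    """Copy source_dir into a timestamped backup with a manifest (index)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    backup_dir = backup_root / stamp
    data_dir = backup_dir / "data"
    shutil.copytree(source_dir, data_dir)
    # The manifest indexes the copied files and records their checksums
    # as they exist in the backup at creation time.
    files = {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }
    manifest = {"created_utc": stamp, "source": str(source_dir), "files": files}
    (backup_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return backup_dir


def verify_backup(backup_dir: Path) -> List[str]:
    """Return the relative paths that are missing or fail their checksum."""
    manifest = json.loads((backup_dir / "manifest.json").read_text())
    data_dir = backup_dir / "data"
    problems = []
    for rel_path, expected in manifest["files"].items():
        candidate = data_dir / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(rel_path)
    return problems


def restore_backup(backup_dir: Path, restore_dir: Path) -> None:
    """Restore the backup's point-in-time image into a new directory."""
    problems = verify_backup(backup_dir)
    if problems:
        raise ValueError(f"backup failed verification: {problems}")
    shutil.copytree(backup_dir / "data", restore_dir)
```

Because each backup lives in its own directory, later changes to the primary data do not touch it, and `restore_backup` returns the data as it stood when the copy was made. The sketch does not cover offsite storage, access permissions, retention, or incremental copies, all of which a real backup application would have to manage.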

So a backup is something special. It exists outside the realm of production, waiting to present a set of data on demand.

Of course, there are other kinds of backup as well. Most data centers include redundant power supplies, fire suppression, networking, and servers. These are all backups, too. And in every case the same rules apply: they are independent of the main system and kept available in case they are needed. Perhaps the most crucial element is their independence: they are not affected by, and do not affect, the primary system.

Special thanks to the following who provided input (and jokes!) on Twitter. Follow Nirvanix or Stephen Foskett on Twitter to become part of the Enterprise Storage Strategies conversation!

More Stories By Stephen Foskett

Stephen Foskett has provided vendor-independent end user consulting on storage topics for over 10 years. He has been a storage columnist and has authored numerous articles for industry publications. Stephen is a popular presenter at industry events and recently received Microsoft’s MVP award for contributions to the enterprise storage community. As the director of consulting for Nirvanix, Foskett provides strategic consulting to assist Fortune 500 companies in developing strategies for service-based tiered and cloud storage. He holds a bachelor of science in Society/Technology Studies from Worcester Polytechnic Institute.
