@CloudExpo: Article

Stop Buying Database Licenses: You Have All the Capacity You Need

Ensure Big Data is getting to the right place at the right time and is managed responsibly

Any organization that has deployed a business application has experienced the joy of procuring database licenses. Most database software licensing models are based on the quantity and type of compute processing cores in the underlying database server - the more cores in the processor and the more processors in the server box, the higher the cost of the database software license. Depending on the application and the business's expectations, the tolerance threshold for performance can vary. This is typically considered during the design and testing phases of an application deployment life cycle.

Once the application goes into production and data accumulates in key areas supporting mission-critical business processes, performance starts to take a hit. Performance tuning is an art, requiring the skill and experience of the highly coveted and highly paid performance database administrator (DBA) - an employee who, incidentally, has been identified by industry research as difficult to retain. [1] That performance guru will add database indexes, rearrange queries, or add other database objects solely to improve application performance. At some point, though, tuning only gets you so far and returns start to diminish. DBAs will then request more processing power to meet SLAs, and more processing power turns into more database licenses, not only in production but for every copy of the data. When data is copied to a data warehouse for reporting, to a test or development environment for product support, or to a disaster recovery site, and each of those environments carries its own performance expectation, the production server upgrade turns into several server upgrades - each with a corresponding increase in database license costs.
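To make the multiplication concrete, here is a minimal sketch of how a single production upgrade can ripple through every copy of the data. It assumes a simplified per-core license model and that each environment mirrors production's core count; the price, core counts, and the `license_cost` helper are hypothetical, and real vendor pricing is usually more involved.

```python
def license_cost(cores_per_server: int, price_per_core: float, environments: int) -> float:
    """Total license cost when every environment mirrors production's core count."""
    return cores_per_server * price_per_core * environments


# Hypothetical figures, for illustration only.
PRICE_PER_CORE = 10_000.0
ENVIRONMENTS = 4  # production, data warehouse, test/development, disaster recovery

before = license_cost(16, PRICE_PER_CORE, ENVIRONMENTS)  # 16 cores per server
after = license_cost(32, PRICE_PER_CORE, ENVIRONMENTS)   # upgraded to 32 cores

print(f"Before upgrade: ${before:,.0f}")
print(f"After upgrade:  ${after:,.0f}")
print(f"Increase:       ${after - before:,.0f}")
```

Under these assumptions, doubling the cores in production doubles the license bill in all four environments, not just one.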

In many cases, data volumes are growing astronomically, requiring next-generation analytical platforms - more fondly referred to as "big data" systems - to keep up with the quest for knowledge. While these database systems offer an incredible opportunity to change the way organizations find value in their information assets, they too have an incremental cost associated with the size and volume of data.

The fact that database volume and its corresponding costs are growing exponentially is not the big insight. Anyone working in IT gets this, and analysts indicate that "on average, data repositories for large applications grow annually at 65%."[2] What is revealing is that the vast majority of the data in these systems is dormant: industry analysts estimate as much as 80%. These are closed transactions and infrequently queried data, often retained only for compliance purposes. If you knew you could keep all this data online, reduce its size by 90% or more, eliminate growth in your database licenses, and still be able to restore, manage retention, or directly report on the data, why wouldn't you? Why keep this dormant data inside your most expensive applications, riding on your most expensive infrastructure, being maintained by your most expensive personnel? Stop the madness. Take a good hard look at who is accessing what data over time, and there is a good chance you will find that after some inflection point, data is rarely accessed - if it is accessed at all. Why keep buying database licenses for data that doesn't justify the need?
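The figures quoted above can be combined into a rough back-of-the-envelope projection. The sketch below takes the 65% annual growth, 80% dormant share, and 90% size reduction from the text; the 10 TB starting size, the three-year horizon, and the function names are assumptions chosen purely for illustration.

```python
ANNUAL_GROWTH = 0.65      # "grow annually at 65%" (figure quoted in the article)
DORMANT_SHARE = 0.80      # "as much as 80%" dormant (figure quoted in the article)
ARCHIVE_REDUCTION = 0.90  # "reduce its size by 90%" once archived (figure quoted in the article)


def project(start_tb: float, years: int) -> float:
    """Projected database size after compounding annual growth."""
    return start_tb * (1 + ANNUAL_GROWTH) ** years


def split_footprint(total_tb: float) -> tuple[float, float]:
    """Return (active TB left in the database, TB occupied by the compressed archive)."""
    active = total_tb * (1 - DORMANT_SHARE)
    archived = total_tb * DORMANT_SHARE * (1 - ARCHIVE_REDUCTION)
    return active, archived


# Hypothetical starting point: a 10 TB production database, three years out.
total = project(10.0, 3)
active, archived = split_footprint(total)

print(f"Unmanaged database after 3 years: {total:.1f} TB")
print(f"Active data left in the database: {active:.1f} TB")
print(f"Archived data after size reduction: {archived:.1f} TB")
```

The point of the exercise is the ratio rather than the absolute numbers: if the estimates hold, only about a fifth of the projected volume needs to sit on licensed database infrastructure.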

While the concept is not new, information life-cycle management (ILM) has traditionally been associated with tiering infrastructure and archiving documents or email. Application ILM takes this a step further, applying tiered services and archiving to databases. The idea is to inventory applications and data warehouses along with the value of their data and how end users access it. Mapping the business's response-time expectations for each process to the underlying infrastructure optimizes operations for performance and cost. It is a philosophical shift: from blindly scaling up or out by adding capacity or compute power in reaction to missed SLAs, to asking the business what it really needs and whether that need changes over time.

If the business doesn't know the answer to that question, maybe a good look at overall business process efficiency is in order. If there is an answer to that question, capitalize on it. Take stock of the assets used to support that business process and quantify the following: What percentage of data stored online in the production database could just be deleted? What percentage of data has retention requirements - either legal or operational - but doesn't have performance requirements? That's the opportunity to assess what database licenses are costing your organization and whether you really need them.
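One way to put numbers on those two questions is to tag each chunk of data with how recently it was accessed and whether a retention requirement applies, then total up the shares. The sketch below is only an illustration: the `Segment` fields, the one-year dormancy threshold, and the sample figures are all hypothetical, and a real assessment would draw on actual access logs and retention policies.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    size_gb: float
    days_since_last_access: int
    under_retention: bool  # legal or operational retention requirement


def classify(seg: Segment, dormant_after_days: int = 365) -> str:
    """Recently used data stays; dormant data is archived if it must be retained, else deleted."""
    if seg.days_since_last_access < dormant_after_days:
        return "keep"
    return "archive" if seg.under_retention else "delete"


def summarize(segments: list[Segment], dormant_after_days: int = 365) -> dict[str, float]:
    """Percentage of total volume falling into each category."""
    totals = {"keep": 0.0, "archive": 0.0, "delete": 0.0}
    for seg in segments:
        totals[classify(seg, dormant_after_days)] += seg.size_gb
    grand_total = sum(totals.values())
    if grand_total == 0:
        return totals
    return {label: 100 * size / grand_total for label, size in totals.items()}


# Hypothetical inventory of one production database.
inventory = [
    Segment("open_orders", 200.0, 3, True),
    Segment("closed_orders_2008", 500.0, 900, True),
    Segment("stale_staging_tables", 300.0, 1200, False),
]

for label, pct in summarize(inventory).items():
    print(f"{label}: {pct:.0f}% of stored volume")
```

The "archive" and "delete" shares together are the portion of licensed capacity that is not earning its keep.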

Let's face it. Managing databases and procuring database licenses is expensive, and there are no signs that data growth is slowing. With so much high-performance compute power and storage capacity wasted on dormant data, there is a big opportunity to control big data growth. By deleting what you don't need and archiving what you must keep for longer retention periods, you release latent database capacity, ultimately improving application performance and operational efficiency. Taking control of data growth not only avoids the cost of additional database licenses but also offers further advantages, such as:

  • Improved database query performance
  • IT's ability to meet or exceed application SLAs
  • Shorter application upgrade cycles
  • Reduced backup, recovery, refresh and batch windows
  • Controlled data sprawl, ultimately improving eDiscovery efforts
  • Budgets focused on managing data based on its value, as determined by its age and access frequency

Whether your data has accumulated over time or overnight, IT has always been in the business of supporting Big Data initiatives. By focusing on the benefits of Application ILM, organizations are better positioned to ensure not only that Big Data gets to the right place at the right time, but also that it is managed responsibly.


  1. ESG, 2011 IT Spending Intentions Survey, which identified database administrators as a top area of problematic skill shortages for IT.
  2. Forrester Research, Inc., TechRadar: Enterprise Data Integration, February 2010.

More Stories By Adam Wilson

Adam Wilson is the General Manager for Informatica’s Information Lifecycle Management Business Unit. Prior to assuming this role, he was in charge of product definition and go-to-market strategy for Informatica’s award-winning enterprise data integration platform. Mr. Wilson holds an MBA from the Kellogg School of Management and an engineering degree from Northwestern University. He can be reached at [email protected] or follow him on Twitter @ a_adam_wilson
