Welcome!

@CloudExpo Authors: Pat Romanski, Yeshim Deniz, Elizabeth White, Liz McMillan, Matt Brickey

Related Topics: @CloudExpo, Microservices Expo, Agile Computing

@CloudExpo: Article

NoSQL – The Trend for Databases in the Cloud?

Why is NoSQL more responsive than SQL-based systems?

SQL seems to be somewhat old fashioned when it comes to scalable databases in the cloud. Non-relational databases (also called NoSQL) seem to take over in most data storage fields. But why do those databases seem to be more popular than the "classic" relational databases? Is it due to the fact that professors at universities "tortured" us with relational databases and therefore reduced our interest - the interest of the "new" generation for relational databases? Or are there some hard facts that tell us why relational databases are somewhat out of date?

I was at a user group meeting in Austria, Vienna, one month ago where I talked about NoSQL databases. The topic seemed to be of interest to a lot of people. However, we sat together for about four hours (my talk was planned for one hour only) discussing NoSQL versus SQL. I decided to summarize some of the ideas in a short article as this is useful for cloud computing.

If we look at what NoSQL offers, we'll find a numerous offers on NoSQL databases. Some of the most popular ones are MongoDB, Amazon Dynamo (Amazon SimpleDB), CouchDB, and Cassandra. Some people might think that non-relational databases might be for those people who are too "lazy" to do their complex business logic in the database. In fact, this logic reduces the performance of a system. If there is a need for a high-responsive and available system, SQL Databases might not be your best choice. But why is NoSQL more responsive than SQL-based systems? And why is there this saying that NoSQL allows better scalability than SQL-based systems? To understand this topic, we need to go back 10 years.

Dr. Eric A Brewer in his keynote "Symposium on Principles of Distributed Computing 2000" (Towards Robust Distributed Systems, 2000) addressed a problem that arises when we need high availability and scalability. This was the birth of the so-called "CAP Theorem." The CAP Theorem says that a distributed system can only achieve two out of the three states: "Consistency, Availability and Partition tolerance." This means:

  • That every node in a distributed system should see the same data as all other nodes at the same time (consistency)
  • That the failure of a node must not affect the availability of the system (availability)
  • That the system stays tolerant to the loss of some messages

Nowadays when talking about databases we often use the term "ACID," but NoSQL is related to another term: BASE. Base stands for "Basically Available, Soft state, eventually consistent." If you want to go deeper into eventually consistent, read the post by Werner Vogels - Eventually Consistent revisited. BASE states that all updates that occur to a distributed system will be eventually consistent after a period of no updates. For distributed systems such as cloud-based systems, it is simply not possible to keep a system consistent at all times. This results in bad availability.

To understand eventually consistent, it might be helpful to look at how Facebook is handling their data. Facebook uses MySQL, which is a relational (SQL) database. However, they simply don't use such features as joins that MySQL offers them; Facebook joins data on the web server. You might think "What, are they crazy?" However, the problem is that the joins Facebook needs will sooner or later result in a very slow system. David Recordon, Manager at Facebook, stated that joins are better performing on the web server [1]. Facebook must know what is good performance or not as they will store some 50 petabytes of data by the end of 2010. Twitter, another social platform that needs to scale their platform, should also think about switching to NoSQL platforms. This will hopefully reduce the "fail whale" to a minimum [2].

Summing it up, NoSQL is relevant for applications that are in the need of large-scale global Internet applications. But are there any other benefits for NoSQL databases? Another benefit is that there are often no schemas associated with a table. This allows the database to adopt new business requirements. I've seen a lot of projects where the requirements changed over the years. As this is rather hard to handle with traditional databases, NoSQL allows easy adoption of such requirements. A good example of this is Amazon. Amazon stores a lot of data on their products. As they offer products of different types - such as personal computers, smartphones, music, home entertainment systems and books - they need a flexible database. This is a challenge for traditional databases. With NoSQL databases it's easy to implement some kind of inheritance hierarchy - just by calling the table "product" and letting every product have its own fields. Databases such as Amazon Dynamo handle this with key/value storage. If you want to dig deeper into Amazon Dynamo, read Eventually Consistent [3] by Werner Vogels.

Will there be some sort of "war" between NoSQL and SQL supporters like the one of REST versus SOAP? The answer is maybe. Who will win this case? As with SOAP versus REST, there won't be a winner or a loser. We will have more opportunities to choose our database systems in the future. For data warehousing and systems that require business intelligence to be in the database, SQL databases might be your choice. If you need high-responsive, scalable and flexible databases, NoSQL might be better for you.

Resources

  1. Facebook infrastructure
  2. Twitters switches to NoSQL
  3. Eventually Consistent

More Stories By Mario Meir-Huber

Mario Meir-Huber studied Information Systems at the University of Linz. He worked in the IT sector for some years before founding CodeForce, an IT consulting and services company together with Andreas Aschauer. Since the advent of Cloud Computing, he has been passionate about this technology. He talks about Cloud Computing at various international events and conferences and writes for industry-leading magazines on cloud computing. He is Cloud Computing expert in various independent IT organizations and wrote a book on Cloud Computing covering all topics of the Cloud. You can follow Mario on Twitter (@mario_mh) or read his Blog at http://cloudvane.wordpress.com.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devic...
"The Striim platform is a full end-to-end streaming integration and analytics platform that is middleware that covers a lot of different use cases," explained Steve Wilkes, Founder and CTO at Striim, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that Calligo, an innovative cloud service provider offering mid-sized companies the highest levels of data privacy and security, has been named "Bronze Sponsor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Calligo offers unparalleled application performance guarantees, commercial flexibility and a personalised support service from its globally located cloud plat...
"With Digital Experience Monitoring what used to be a simple visit to a web page has exploded into app on phones, data from social media feeds, competitive benchmarking - these are all components that are only available because of some type of digital asset," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
21st International Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Me...
SYS-CON Events announced today that DXWorldExpo has been named “Global Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Digital Transformation is the key issue driving the global enterprise IT business. Digital Transformation is most prominent among Global 2000 enterprises and government institutions.
SYS-CON Events announced today that Datera, that offers a radically new data management architecture, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Datera is transforming the traditional datacenter model through modern cloud simplicity. The technology industry is at another major inflection point. The rise of mobile, the Internet of Things, data storage and Big...
Kubernetes is an open source system for automating deployment, scaling, and management of containerized applications. Kubernetes was originally built by Google, leveraging years of experience with managing container workloads, and is now a Cloud Native Compute Foundation (CNCF) project. Kubernetes has been widely adopted by the community, supported on all major public and private cloud providers, and is gaining rapid adoption in enterprises. However, Kubernetes may seem intimidating and complex ...
"Outscale was founded in 2010, is based in France, is a strategic partner to Dassault Systémes and has done quite a bit of work with divisions of Dassault," explained Jackie Funk, Digital Marketing exec at Outscale, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We focus on SAP workloads because they are among the most powerful but somewhat challenging workloads out there to take into public cloud," explained Swen Conrad, CEO of Ocean9, Inc., in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"DivvyCloud as a company set out to help customers automate solutions to the most common cloud problems," noted Jeremy Snyder, VP of Business Development at DivvyCloud, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We are still a relatively small software house and we are focusing on certain industries like FinTech, med tech, energy and utilities. We help our customers with their digital transformation," noted Piotr Stawinski, Founder and CEO of EARP Integration, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"I think DevOps is now a rambunctious teenager – it’s starting to get a mind of its own, wanting to get its own things but it still needs some adult supervision," explained Thomas Hooker, VP of marketing at CollabNet, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We've been engaging with a lot of customers including Panasonic, we've been involved with Cisco and now we're working with the U.S. government - the Department of Homeland Security," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We're here to tell the world about our cloud-scale infrastructure that we have at Juniper combined with the world-class security that we put into the cloud," explained Lisa Guess, VP of Systems Engineering at Juniper Networks, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
Your homes and cars can be automated and self-serviced. Why can't your storage? From simply asking questions to analyze and troubleshoot your infrastructure, to provisioning storage with snapshots, recovery and replication, your wildest sci-fi dream has come true. In his session at @DevOpsSummit at 20th Cloud Expo, Dan Florea, Director of Product Management at Tintri, provided a ChatOps demo where you can talk to your storage and manage it from anywhere, through Slack and similar services with...
"We were founded in 2003 and the way we were founded was about good backup and good disaster recovery for our clients, and for the last 20 years we've been pretty consistent with that," noted Marc Malafronte, Territory Manager at StorageCraft, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We are an IT services solution provider and we sell software to support those solutions. Our focus and key areas are around security, enterprise monitoring, and continuous delivery optimization," noted John Balsavage, President of A&I Solutions, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We want to show that our solution is far less expensive with a much better total cost of ownership so we announced several key features. One is called geo-distributed erasure coding, another is support for KVM and we introduced a new capability called Multi-Part," explained Tim Desai, Senior Product Marketing Manager at Hitachi Data Systems, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
There is a huge demand for responsive, real-time mobile and web experiences, but current architectural patterns do not easily accommodate applications that respond to events in real time. Common solutions using message queues or HTTP long-polling quickly lead to resiliency, scalability and development velocity challenges. In his session at 21st Cloud Expo, Ryland Degnan, a Senior Software Engineer on the Netflix Edge Platform team, will discuss how by leveraging a reactive stream-based protocol,...