Click here to close now.

Welcome!

CloudExpo® Blog Authors: Lori MacVittie, Hovhannes Avoyan, Liz McMillan, Ian Goldsmith, Pat Romanski

Related Topics: CloudExpo® Blog

CloudExpo® Blog: Blog Feed Post

An Introduction to the Data Cloud

Running a Data Cloud in production presents a new set of challenges for DevOps

As data has grown exponentially at many sites, companies have been forced to horizontally scale their data.  Some have turned to sharding of databases like Postrres or MySQL , while others have switched to newer NoSQL data systems.  There have been many debates in the last few years about SQL vs. NoSQL data management systems and which is better.  What many have failed to grasp, though, is how similar these systems are and how complex they both are to run in production in high scale.

Both of these systems represent what I call a Data Cloud. This Data Cloud is logical data set spread across many nodes.  While developers have heated debates about which system is better and how to design code around it, those in DevOps usually struggle with very similar issues because the two systems are mostly the same.  Both systems

  • Run across many nodes with large amounts of data flowing between them and from/to the application
  • Strain both the hardware of all nodes, and the network connecting them
  • Maintain duplicate data across nodes for fault tolerance, and must have failover ability
  • Must be tuned on a per node and cluster-wide bases
  • Must allow for growth by adding additional nodes.

Running this Data Cloud in production presents a new set of challenges for DevOps, many of which are not well understood or addressed.  One of the main challenges is the management and monitoring of these systems, for which few (if any) tools or products exist at this time.

When systems were smaller and you ran a single Database in production, you probably had all the necessary systems in place.  With a plethora of products for Management, monitoring, visualizing data, and backups, it was not hard to be successful and meet your SLAs.

But now all this is much more complex once you move into the world of the Data Cloud.  Now you have a large number of nodes, all representing the same system and still needing to meet the same SLAs as the old simple DB from before.  Let us look at the challenges for running a production Data Cloud successfully.

Capacity Planning

Do you know how many nodes you need?  How many nodes do you put in each replica set?  How much latency and throughput do you need in your network for the nodes to communicate fast enough?  What is the ideal hardware to use for each node to balance performance with costs?

Monitoring

How do you monitor dozens, hundreds or even thousands of nodes all at once?   How do you get a unified view of your data cloud, and then drill down to the problem nodes?   Are there even any off-the-shelf monitoring tools that can help?  Your old monitoring tool won’t be very useful anymore unless you are willing to look at every node one by one to see what is going on there.

Alerting

How do you set up a common set of alerts across all nodes?  And how do you keep your alert thresholds in sync as you add nodes and remove them?   More importantly, even assuming you have alerting in place,  once staff receives critical alerts, how will they know where to find the troubled node in the massive cloud, or whether it’s a node level  issue or more global in nature?  This must be done quickly during critical outages.

Data Visualization

How does your staff view the data when it is distributed?  In case of data inaccuracy, how can they quickly identify the faulty nodes and fix up the data?

Performance Tuning

As performance degrades, how do you troubleshoot and identify the bottlenecks?  How do you find which nodes by be the cause of the problem?  How do you improve performance across all the nodes.

Data Cloud Management

How do you back up all the data while consistently tracking which nodes were backed up successfully and when?  How do you make schema changes across all the nodes in one consistent step without breaking your app? And how do you make configuration changes on various nodes or across all nodes?  And how do you track the configurations of each node and keep them consistent across your system?

By now you should see that there is a lot to think about before endeavoring to launch a production Data Cloud.  Too many companies focus all their energies on deciding which DB or NoSQL system to use and developing their apps for it.  But that might turn out to be the lesser of your challenges once you struggle to put the system into production.  Be sure you can answer all the questions I have listed above before your launch.

Boris.

Read the original blog entry...

More Stories By AppDynamics Blog

In high-production environments where release cycles are measured in hours or minutes — not days or weeks — there's little room for mistakes and no room for confusion. Everyone has to understand what's happening, in real time, and have the means to do whatever is necessary to keep applications up and running optimally.

DevOps is a high-stakes world, but done well, it delivers the agility and performance to significantly impact business competitiveness.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
T-Mobile has been transforming the wireless industry with its “Uncarrier” initiatives. Today as T-Mobile’s IT organization works to transform itself in a like manner, technical foundations built over the last couple of years are now key to their drive for more Agile delivery practices. In his session at DevOps Summit, Martin Krienke, Sr Development Manager at T-Mobile, will discuss where they started their Continuous Delivery journey, where they are today, and where they are going in an effort ...
The Internet of Things promises to transform businesses (and lives), but navigating the business and technical path to success can be difficult to understand. In his session at @ThingsExpo, Sean Lorenz, Technical Product Manager for Xively at LogMeIn, demonstrated how to approach creating broadly successful connected customer solutions using real world business transformation studies including New England BioLabs and more.
There are 182 billion emails sent every day, generating a lot of data about how recipients and ISPs respond. Many marketers take a more-is-better approach to stats, preferring to have the ability to slice and dice their email lists based numerous arbitrary stats. However, fundamentally what really matters is whether or not sending an email to a particular recipient will generate value. Data Scientists can design high-level insights such as engagement prediction models and content clusters that a...
Containers Expo Blog covers the world of containers, as this lightweight alternative to virtual machines enables developers to work with identical dev environments and stacks. Containers Expo Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. Bookmark Containers Expo Blog ▸ Here Follow new article posts on Twitter at @ContainersExpo
It's time to face reality: "Americans are from Mars, Europeans are from Venus," and in today's increasingly connected world, understanding "inter-planetary" alignments and deviations is mission-critical for cloud. In her session at 15th Cloud Expo, Evelyn de Souza, Data Privacy and Compliance Strategy Leader at Cisco Systems, discussed cultural expectations of privacy based on new research across these elements
In today's application economy, enterprise organizations realize that it's their applications that are the heart and soul of their business. If their application users have a bad experience, their revenue and reputation are at stake. In his session at 15th Cloud Expo, Anand Akela, Senior Director of Product Marketing for Application Performance Management at CA Technologies, discussed how a user-centric Application Performance Management solution can help inspire your users with every applicati...
As enterprises engage with Big Data technologies to develop applications needed to meet operational demands, new computation fabrics are continually being introduced. To leverage these new innovations, organizations are sacrificing market opportunities to gain expertise in learning new systems. In his session at Big Data Expo, Supreet Oberoi, Vice President of Field Engineering at Concurrent, Inc., discussed how to leverage existing infrastructure and investments and future-proof them against e...
The consumption economy is here and so are cloud applications and solutions that offer more than subscription and flat fee models and at the same time are available on a pure consumption model, which not only reduces IT spend but also lowers infrastructure costs, and offers ease of use and availability. In their session at 15th Cloud Expo, Ermanno Bonifazi, CEO & Founder of Solgenia, and Ian Khan, Global Strategic Positioning & Brand Manager at Solgenia, discussed this shifting dynamic with an ...
Due of the rise of Hadoop, many enterprises are now deploying their first small clusters of 10 to 20 servers. At this small scale, the complexity of operating the cluster looks and feels like general data center servers. It is not until the clusters scale, as they inevitably do, when the pain caused by the exponential complexity becomes apparent. We've seen this problem occur time and time again. In his session at Big Data Expo, Greg Bruno, Vice President of Engineering and co-founder of StackI...
Once the decision has been made to move part or all of a workload to the cloud, a methodology for selecting that workload needs to be established. How do you move to the cloud? What does the discovery, assessment and planning look like? What workloads make sense? Which cloud model makes sense for each workload? What are the considerations for how to select the right cloud model? And how does that fit in with the overall IT transformation?
You use an agile process; your goal is to make your organization more agile. But what about your data infrastructure? The truth is, today's databases are anything but agile - they are effectively static repositories that are cumbersome to work with, difficult to change, and cannot keep pace with application demands. Performance suffers as a result, and it takes far longer than it should to deliver new features and capabilities needed to make your organization competitive. As your application an...
17th Cloud Expo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises a...
The recent trends like cloud computing, social, mobile and Internet of Things are forcing enterprises to modernize in order to compete in the competitive globalized markets. However, enterprises are approaching newer technologies with a more silo-ed way, gaining only sub optimal benefits. The Modern Enterprise model is presented as a newer way to think of enterprise IT, which takes a more holistic approach to embracing modern technologies.
SYS-CON Events announced today that SUSE, a pioneer in open source software, will exhibit at SYS-CON's DevOps Summit 2015 New York, which will take place June 9-11, 2015, at the Javits Center in New York City, NY. SUSE provides reliable, interoperable Linux, cloud infrastructure and storage solutions that give enterprises greater control and flexibility. More than 20 years of engineering excellence, exceptional service and an unrivaled partner ecosystem power the products and support that help ...
Move from reactive to proactive cloud management in a heterogeneous cloud infrastructure. In his session at 16th Cloud Expo, Manoj Khabe, Innovative Solution-Focused Transformation Leader at Vicom Computer Services, Inc., will show how to replace a help desk-centric approach with an ITIL-based service model and service-centric CMDB that’s tightly integrated with an event and incident management platform. Learn how to expand the scope of operations management to service management. He will al...
There's no doubt that the Internet of Things is driving the next wave of innovation. Google has spent billions over the past few months vacuuming up companies that specialize in smart appliances and machine learning. Already, Philips light bulbs, Audi automobiles, and Samsung washers and dryers can communicate with and be controlled from mobile devices. To take advantage of the opportunities the Internet of Things brings to your business, you'll want to start preparing now.
In a world of ever-accelerating business cycles and fast-changing client expectations, the cloud increasingly serves as a growth engine and a path to new business models. Dynamic clouds enable businesses to continuously reinvent themselves, adapting their business processes, their service and software delivery and their operations to achieve speed-to-market and quick response to customer feedback. As the cloud evolves, the industry has multiple competing cloud technologies, offering on-premises ...
As the world moves from DevOps to NoOps, application deployment to the cloud ought to become a lot simpler. However, applications have been architected with a much tighter coupling than it needs to be which makes deployment in different environments and migration between them harder. The microservices architecture, which is the basis of many new age distributed systems such as OpenStack, Netflix and so on is at the heart of CloudFoundry – a complete developer-oriented Platform as a Service (PaaS...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, discussed how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP HANA...
There is no question that the cloud is where businesses want to host data. Until recently hypervisor virtualization was the most widely used method in cloud computing. Recently virtual containers have been gaining in popularity, and for good reason. In the debate between virtual machines and containers, the latter have been seen as the new kid on the block – and like other emerging technology have had some initial shortcomings. However, the container space has evolved drastically since coming on...