@CloudExpo: Blog Post

Fast Data Hits the Big Data Fast Lane

Fast Data is not a single technology, but a spectrum of approaches

Of the three “V’s” of Big Data – volume, variety, and velocity (we’d add “Value” as the fourth V) – velocity has been the unsung “V.” With the spotlight on Hadoop, the popular image of Big Data is of petabyte-scale stores of unstructured data (which covers the first two V’s). While Big Data has been thought of as large stores of data at rest, it can also be about data in motion.

“Fast Data” refers to processes that require lower latencies than would otherwise be possible with optimized disk-based storage. Fast Data is not a single technology, but a spectrum of approaches that process data that might or might not be stored. It could encompass event processing, in-memory databases, or hybrid data stores that optimize cache with disk.

Fast Data is nothing new, but because of the cost of memory, it was traditionally restricted to a handful of extremely high-value use cases. For instance:

  • Wall Street firms routinely analyze live market feeds, and in many cases, run sophisticated complex event processing (CEP) programs on event streams (often in real time) to make operational decisions.
  • Telcos have handled such data to optimize network operations, while leading logistics firms have used CEP to optimize their transport networks.

  • In-memory databases, used as a faster alternative to disk, have similarly been around for well over a decade, having been employed for program stock trading, telecommunications equipment, airline schedulers, and large destination online retail (e.g., Amazon).
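
As a rough illustration of the event stream processing mentioned in the first bullet, the Python sketch below keeps a sliding window of recent ticks in memory and flags prices that jump above the window's average. The class and function names, the 60-second window, and the 5 percent threshold are illustrative assumptions and are not taken from any CEP product.

```python
from collections import deque


class SlidingWindow:
    """Hypothetical in-memory window over (timestamp, value) events."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # oldest event on the left
        self.total = 0.0       # running sum of the values in the window

    def evict(self, now):
        # Drop events that are at least `window_seconds` older than `now`.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def add(self, timestamp, value):
        self.events.append((timestamp, value))
        self.total += value

    def average(self):
        # None signals an empty window (no baseline yet).
        return self.total / len(self.events) if self.events else None


def detect_spikes(ticks, window_seconds=60, threshold=0.05):
    """Return ticks priced more than `threshold` above the average of the preceding window."""
    window = SlidingWindow(window_seconds)
    alerts = []
    for timestamp, price in ticks:
        window.evict(timestamp)
        baseline = window.average()
        if baseline is not None and price > baseline * (1 + threshold):
            alerts.append((timestamp, price))
        window.add(timestamp, price)
    return alerts
```

Each tick is handled as it arrives and nothing is written to disk, which is the sense in which such processing works on data that "might or might not be stored."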

Hybrid in-memory and disk systems have also become commonplace, especially among data warehousing systems (e.g., Teradata, Kognitio), and more recently among the emergent class of advanced SQL analytic platforms (e.g., Greenplum, Teradata Aster, IBM Netezza, HP Vertica, ParAccel). These platforms employ smart caching in conjunction with a number of other bells and whistles to juice SQL performance and scaling (e.g., flatter indexes, extensive use of various data compression schemes, and columnar table structures).

Many of these systems are in turn packaged as appliances that come with specially tuned, high-performance backplanes and direct attached disk.

Finally, caching is hardly unknown to the database world. Hot spots of data that are frequently accessed are often placed in cache, as are snapshots of database configurations that are often stored to support restore processes, and so on.
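
To make the hot-spot idea concrete, here is a minimal sketch of a least-recently-used (LRU) cache in Python. LRU is only one common eviction policy, chosen here for illustration; the `LRUCache` class and the `load_from_disk` callback are hypothetical names standing in for whatever slow lookup a real database would perform.

```python
from collections import OrderedDict


class LRUCache:
    """Keep the most recently used entries in memory; fall back to a slow loader on a miss."""

    def __init__(self, capacity, load_from_disk):
        self.capacity = capacity
        self.load_from_disk = load_from_disk  # assumed slow, disk-backed lookup
        self.entries = OrderedDict()          # least recently used entry first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Hot data: serve from memory and mark it as most recently used.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Cold data: pay the disk cost once, then keep it in memory.
        self.misses += 1
        value = self.load_from_disk(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry to stay within capacity.
            self.entries.popitem(last=False)
        return value
```

Frequently accessed keys stay near the "recent" end of the ordering and so survive eviction, while rarely touched keys fall out first. The `hits` and `misses` counters give a simple way to see how well a given capacity fits a workload.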

So what’s changed?


The usual factors: the same data explosion that created the urgency for Big Data is also generating demand for making the data instantly actionable.

Bandwidth, commodity hardware and, of course, declining memory prices, are further forcing the issue: Fast Data is no longer limited to specialized, premium use cases for enterprises with infinite budgets.

Not surprisingly, pure in-memory databases are now going mainstream. Oracle and SAP have both chosen in-memory as one of the next places to establish competitive stakes: SAP HANA vs. Oracle Exalytics.

Both Oracle and SAP for now are targeting analytic processing, including OLAP (by raising the size limits on OLAP cubes) and more complex, multi-stage analytic problems that traditionally would have required batch runs (such as multivariate pricing) or would not have been run at all (too complex, too much delay).

More to the point, SAP is counting on HANA as a major pillar of its stretch goal to become the #2 database player by 2015, which means expanding HANA’s target to include next generation enterprise transactional applications with embedded analytics.

Potential use cases for Fast Data could encompass:

  • A homeland security agency monitoring the borders requiring the ability to parse, decipher, and act on complex occurrences in real time to prevent suspicious people from entering the country
  • Capital markets trading firms requiring real-time analytics and sophisticated event processing to conduct algorithmic or high-frequency trades
  • Entities managing smart infrastructure which must digest torrents of sensory data to make real-time decisions that optimize use of transportation or public utility infrastructure
  • B2B consumer products firms monitoring social networks and requiring real-time response to understand sudden swings in customer sentiment

For such organizations, Fast Data is no longer a luxury, but a necessity.

More specialized use cases are similarly emerging now that the core in-memory technology is becoming more affordable. YarcData, a startup from venerable HPC player Cray, is targeting graph data, which represents data with many-to-many relationships. Graph computing is extremely process-intensive and, as such, has traditionally been run in batch when involving Internet-size sets of data. YarcData adopts a classic hybrid approach that pipelines computations in memory but persists data to disk. YarcData is the tip of the iceberg – we expect to see more specialized applications that use hybrid caching to combine speed with scale.

Memory’s not the new disk

The movement – or tiering – of data to faster or slower media is also nothing new. What is new is that data in memory may no longer be such a transient thing, and if memory is relied upon for in situ processing of data in motion or rapid processing of data at rest, memory cannot simply be treated as the new disk. Excluding specialized forms of memory such as ROM, by nature anything that’s solid state is volatile: there goes your power… and there goes your data.

Not surprisingly, in-memory systems such as HANA still replicate to disk to reduce volatility. For conventional disk data stores that increasingly leverage memory, Storage Switzerland’s George Crump makes the case that caching practices must become smarter to avoid misses (where data gets mistakenly swapped out).
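
The general idea of backing an in-memory store with disk can be sketched as an append-only log that is replayed on restart. The Python example below is a generic illustration under that assumption, not a description of how HANA or any other product implements persistence; `DurableDict` and the JSON-lines log format are hypothetical choices.

```python
import json
import os
import tempfile


class DurableDict:
    """In-memory key-value store that appends every write to a log file on disk."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._replay()  # rebuild memory state from any existing log
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Write to disk first so an acknowledged write survives a power loss.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def get(self, key):
        # Reads never touch disk.
        return self.data.get(key)

    def close(self):
        self.log.close()


# Usage: simulate a restart by reopening the same log file.
log_path = os.path.join(tempfile.mkdtemp(), "store.log")
store = DurableDict(log_path)
store.put("AAPL", 101.5)
store.put("AAPL", 102.0)
store.close()

recovered = DurableDict(log_path)
latest_price = recovered.get("AAPL")  # state rebuilt from the log
recovered.close()
```

Reads are served entirely from memory, while each write pays for a disk sync. That trade-off is one reason memory cannot simply be treated as the new disk.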

There are also balance-of-system considerations: memory may be fast, but is its speed well matched with that of the processor? Solid state may overcome the I/O issues associated with disk, but the system may still be vulnerable to coupling issues if processors get bottlenecked or MapReduce jobs are not optimized.

Declining memory prices are putting Fast Data on the fast lane to the mainstream. But even as the technology becomes affordable, we’re still early in the learning curve for how to design for it.

More Stories By Tony Baer

Tony Baer is Principal Analyst with Ovum, leading Ovum’s research on the software lifecycle. Working in concert with other members of Ovum’s software group, his research covers the full lifecycle from design and development to deployment and management. Areas of focus include application lifecycle management, software development methodologies (including agile), SOA, IT service management/ITIL, and IT management/governance.

Baer has been a noted authority on software development platforms and integration architecture for nearly 20 years. Prior to joining Ovum, he was an independent analyst whose company, ‘onStrategies,’ provided software development and integration tools vendors with technology assessment and market positioning services. He also led Computerwire’s CIO Agenda and Computer Finance end-user best practices research services.

Follow him on Twitter @TonyBaer or read his blog site www.onstrategies.com/blog.
