Welcome!

Cloud Expo Authors: Ken Rutsky, Elizabeth White, Dana Gardner, Jeremy Geelan, Helen Ching

Related Topics: AJAX & REA, Cloud Expo

AJAX & REA: Blog Post

Cloud Analytics: Dataflow vs Databases

Realtime analytics drives a migration away from databases to more scalable parallel dataflow architectures.

For twenty years, analytics has been viewed as just one specific area within the broader relational database industry. So, analytics has meant databases. Today that view is changing. Over the past year or so, a new movement, the "NoSQL" movement has emerged promoting the advantages of doing a variety of kinds of analytics without using any relational database technologies at all.

Whatever one thinks of the capabilities and limitations of distributed key-value stores relative to relational databases, one thing is clear - the stranglehold that SQL has held over all aspects of data analytics since 1990 is now coming to an end. Other non-SQL approaches to analytics such as MapReduce/Hadoop, a very simple dataflow architecture for batch computing, are now gaining ground. As the need for realtime analytics grows we will continue to see a migration away from databases and towards more scalable parallel dataflow architectures for analytics.



The main differences between databases and dataflow can be summarized as follows:

Database

Dataflow

Historical

Realtime

Offline

Online

Pull Model

Push Model

High latency

Low latency

Demand-driven

Data-driven


The shift from databases to dataflow for enterprise cloud analytics mirrors what we have recently seen in another area, the "realtime web". The old demand-driven web model of polling/querying/pulling RSS feeds has proved unable to deliver the kinds of low latency required for the numerous new realtime web services being created by Twitter and others. New data-driven, realtime, push models such as PubSubHubbub and RSSCloud are now replacing the old approaches.

More Stories By Bill McColl

Bill McColl left Oxford University to found Cloudscale. At Oxford he was Professor of Computer Science, Head of the Parallel Computing Research Center, and Chairman of the Computer Science Faculty. Along with Les Valiant of Harvard, he developed the BSP approach to parallel programming. He has led research, product, and business teams, in a number of areas: massively parallel algorithms and architectures, parallel programming languages and tools, datacenter virtualization, realtime stream processing, big data analytics, and cloud computing. He lives in Palo Alto, CA.

Cloud Expo Breaking News
With Cloud Expo 2012 New York (10th Cloud Expo) now under four months away, what better time to start introducing you in greater detail to the distinguished individuals in our incredible Speaker Faculty for the technical and strategy sessions at the conference... We have technical and strategy sessions for you every day from June 11 through June 14 dealing with every nook and cranny of Cloud Computing and Big Data, but what of those who are presenting? Who are they, where do they work, what e...
2011 was a year of rapid adoption for public and private cloud services. Instant and on-demand server provisioning was the driving force behind the massive growth. On top, cloud server templates and script automation simplified application installation for simple and pre-defined application stacks, but have not targeted more complex enterprise application environments. In his session at the 10th International Cloud Expo, John Yung, CEO of Appcara, will discuss how 2012 will be the year for app...
"Having been in the IT field for many years, I believe the cloud computing chapter in the industry is an exciting one and I am proud to be a part of it," said National Reconaissance Office (NRO) Chief Information Officer Jill T. Singer Tuesday, as it was announced that she was one of 10 winners of the 2012 CloudNOW "Top Ten Women in Cloud" Awards.
As more enterprises are adopting clouds, the nature of cloud computing is changing. Previously, clouds were used to test applications or for non-mission critical applications. Today, enterprises are using clouds for cost-saving advantages and launching more mission critical applications that have defined performance needs. In his session at the 10th International Cloud Expo, Eric Shepcaro, CEO and Chairman of the Board of Telx, will discuss how distributed computing has many advantages. It wou...
With Cloud Expo 2012 New York (10th Cloud Expo) just four months away, what better time to start introducing you in greater detail to the distinguished individuals in our incredible Speaker Faculty for the technical and strategy sessions at the conference... We have technical and strategy sessions for you every day from June 11 through June 14 dealing with every nook and cranny of Cloud Computing and Big Data, but what of those who are presenting? Who are they, where do they work, what else h...
Building a cloud computing environment with on-demand access to compute, network, and storage resources requires an elastic infrastructure at multiple levels. Virtualization combined with x86 servers has transformed the way we scale out compute resources. Unfortunately, legacy Fibre Channel and iSCSI storage architectures are rooted in rigid mainframe-era designs, and are fundamentally mismatched with the dynamic, shared modern data center. In his session at the 10th International Cloud Expo, ...
With Cloud Expo 2012 New York (10th Cloud Expo) now under four months away, what better time to start introducing you in greater detail to the distinguished individuals in our incredible Speaker Faculty for the technical and strategy sessions at the conference... We have technical and strategy sessions for you every day from June 11 through June 14 dealing with every nook and cranny of Cloud Computing and Big Data, but what of those who are presenting? Who are they, where do they work, what e...
With Big Data Expo 2012 New York (co-located with 10th Cloud Expo) now under four months away, what better time to start introducing you in greater detail to the distinguished individuals in our incredible Speaker Faculty for the technical and strategy sessions at the conference... We have technical and strategy sessions for you every day from June 11 through June 14 dealing with every nook and cranny of Cloud Computing and Big Data, but what of those who are presenting? Who are they, where ...
With Big Data Expo 2012 New York (co-located with 10th Cloud Expo) just four months away, what better time to start introducing you in greater detail to the distinguished individuals in our incredible Speaker Faculty for the technical and strategy sessions at the conference...
Can you bring services from the cloud to your customers faster and have them adopt it with ease of use or bring the power of bundled services to the fingertips of your clients without creating new rigid ‘apps stove pipes'? Do you want to prevent your business running away to public and unmanageably immature cloud services? In his session at the 10th International Cloud Expo, Hans van de Koppel, Sr. Enterprise Architect at Capgemini, will take Cloud Expo delegates to the developing world of clou...