Cloud Expo: Article

The Consumerization of Big Data Analytics

Dataclouds

With Dropbox, Jive, Yammer, Chatter and a number of other new services, the modern enterprise is rapidly becoming "consumerized". And it's not just business: the same is happening in major web companies, on Wall Street, in government agencies, and in science labs. Thirty years of bad "enterprise software" experiences are making this transition happen much more quickly than anyone would have expected. The shift to cloud computing is also accelerating the trend, as is the goal of developing a much more "social" approach to business.

The other major change under way today throughout business, web, finance, government and science is that every organization now realizes that it needs to be data-driven. Big data and analytics have the potential to unleash creativity and innovation everywhere - generating new ideas and new insights continuously. To achieve and maintain competitive advantage today, it is becoming essential for everyone in an organization to have instant access to all the information they need, at all times.

Making big data analytics available to everyone in an organization means that it has to be much simpler than traditional data analytics solutions such as databases, data warehouses and Hadoop clusters. It needs to be consumerized! We need a new generation of data analytics solutions that are not just powerful and scalable, but also very easy to use.

Dataclouds
At Cloudscale we've been working on the hard problem of delivering this extreme simplicity, extreme power and extreme scale. Our "datacloud" solution combines a number of advanced technologies in a unique way to achieve these challenging goals. The patented in-memory architecture is massively parallel, cloud-based, and fault-tolerant. It runs on standard commodity hardware, either in the cloud (e.g. Amazon) or as an in-house (OpenStack) appliance.

Cloudscale lets anyone easily store, share, explore and analyze the exponentially growing volumes of data in their work and in their life. It's like a "Dropbox for Big Data Analytics". The Cloudscale data store and app store allow users to easily create, share and collaborate on all kinds of data and apps.

It’s designed for everyone - business users, data scientists, app developers, individuals - anyone, or any organization, that needs a simpler way to handle today’s explosively growing data volumes. And it’s viral - sharing data and apps creates powerful network effects within organizations, unleashing data-driven creativity and innovation everywhere.

With this new technology, anyone can now become a "big data rocket scientist". Through simple, easy-to-use interfaces, users can:

  • Work with all types of data - structured and unstructured - from any source
  • Work with live data streams and massive stored data sets
  • Quickly discover important patterns, correlations, statistics, trends, predictions,...
  • Quickly develop, deploy and scale big data apps - MapReduce, realtime analytics, statistics, pattern matching, machine learning, graph algorithms, time series,...
  • Evaluate millions of scenarios and potential opportunities and threats every second
  • Go from data to decision to action instantly

It's super-fast and super-scalable! For example, Cloudscale can be used to analyze a live stream in realtime at more than 150 MB/sec on just three 8-core AWS cluster instances. That corresponds to processing a single stream in parallel at a rate of two million rows per second, or well over one trillion events per week. To give some idea of how fast this is, the nationwide call log systems of even the biggest US telcos generate only about 50,000 rows/sec, even at peak. For processing even more data, the solution scales linearly.
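
As a rough sanity check on these figures, the short Python sketch below recomputes the weekly event count, the implied average row size, and the ratio to the quoted telco peak rate. It assumes decimal megabytes (1 MB = 10^6 bytes) and treats "more than 150 MB/sec" and "two million rows per second" as exact values; the variable names are illustrative and are not part of any Cloudscale API.

```python
# Back-of-the-envelope check of the throughput figures quoted above.
# Assumptions (not from Cloudscale): decimal megabytes, and the quoted
# lower bounds treated as exact values.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60      # 604,800 seconds

stream_bytes_per_sec = 150 * 10**6       # 150 MB/sec
rows_per_sec = 2_000_000                 # two million rows per second
telco_peak_rows_per_sec = 50_000         # quoted peak for large US telcos

# Rows processed in one week at the quoted rate.
rows_per_week = rows_per_sec * SECONDS_PER_WEEK

# Average row size implied by the byte rate and the row rate.
avg_row_bytes = stream_bytes_per_sec / rows_per_sec

# How many times the quoted telco peak rate the stream rate represents.
headroom_vs_telco = rows_per_sec / telco_peak_rows_per_sec

print(f"rows per week:      {rows_per_week:,}")
print(f"average row size:   {avg_row_bytes:.0f} bytes")
print(f"vs. telco peak:     {headroom_vs_telco:.0f}x")
```

Under these assumptions the numbers are mutually consistent: about 1.2 trillion rows per week, an average row of about 75 bytes, and roughly 40 times the quoted telco peak rate.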

The Cloudscale datacloud is more than 125x faster than Yahoo's S4 (Realtime MapReduce) system on the same hardware - about the difference in speed between walking from San Francisco to New York (4mph) and taking a plane (500mph).

These are just the first steps in the consumerization of the $30 billion+ analytics industry. As powerful analytics is democratized in this way, we can expect it to spread virally into every corner of every organization.

More Stories By Bill McColl

Bill McColl left Oxford University to found Cloudscale. At Oxford he was Professor of Computer Science, Head of the Parallel Computing Research Center, and Chairman of the Computer Science Faculty. Along with Les Valiant of Harvard, he developed the BSP approach to parallel programming. He has led research, product, and business teams in a number of areas: massively parallel algorithms and architectures, parallel programming languages and tools, datacenter virtualization, realtime stream processing, big data analytics, and cloud computing. He lives in Palo Alto, CA.
