Welcome!

Cloud Expo Authors: Jeremy Geelan, Maureen O'Gara, Xenia von Wedel, Toddy Mladenov, Kevin Benedict

Related Topics: Cloud Expo

Cloud Expo: Article

Is Big Data a Solution in Search of a Problem?

The trigger point of Big Data was when Google published its paper on the “Map-Reduce” algorithm

If you look at the predictions made for 2012, you will find a new entry which was not there last year. Be it Gartner, Forrester or McKenzie – “Big Data” finds a place in the prediction.

So, what is big data? Is it the next path breaking technology which will change everything or is it just a hype which will die down after sometime?

Let us take a realistic look at what the term big data mean and what problem it can solve.

What is "Big Data"?

(The Wikipedia page on Big Data is not that good. The clearest explanation I have found is from O’Reilly Radar – here is the link)

Here is a short explanation.

Big Data is the name given to the classes of technologies that needs to be used when your data volume becomes so much that the RDBMS technologies can no longer handle it.

Big data spans three dimensions (taken from this article of IBM):

  • Variety – Big data extends beyond structured data, including unstructured data of all varieties: text, audio, video, click streams, log files and more.
  • Velocity – Often time-sensitive, big data must be used as it is streaming in to the enterprise in order to maximize its value to the business.
  • Volume – Big data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.

In short, if your data volume can be handled efficiently by RDBMS you NEED NOT worry about Big Data.

How did it all start?

With the advent of cloud computing which provided easy access to massive amount distributed computing power there was a realization RDBMS cannot be effectively parallelized. In fact CAP theorem states that Consistency, Availability & Partition Tolerance cannot simultaneously be guaranteed. This led to a No-SQL movement and multiple non-relational databases sprang up.

Trigger Point of Big Data happened when Google published the paper on the “Map-Reduce” algorithm. It involves processing of highly distributable problems across huge datasets using a large number of computers. Map-Reduce is at the heart of Google’s search engine.

Takeoff happened when Apache open source “Hadoop” project which created its own implementation of Map-Reduce. The largest Hadoop implementation is probably at Facebook.

In short: Big Data requires large DISTRIBUTED processing power.

Why would you want to process so much data?

There are 3 basic assumptions which are driving the big data movement:

  1. Faster analysis of larger operational data will help you make better decision
  2. More in-depth analysis of customer data will guide you to better customer segmentation
  3. Insight into larger data set will help you come up with innovative product design

Companies that have successfully leveraged this are Google, Facebook, Amazon, Walmart, Yahoo etc.

In short – the ASSUMPTION is that more data and faster analytics will lead to more innovation and better decision making.

Three Prerequisites for leveraging Big Data

Let us assume that your data volume is large enough and you have access to enough distributed processing power. Will that be sufficient for you to venture into big data?

No … you need three more things.

  1. Business problem which you think that the data at your disposal can help to resolve
  2. Set of questions to be answered through data analysis
  3. Algorithm to analyze the data – this is the domain of the new field Data Science

Big Data will be useful only if you are equipped with all these.

Therefore, for most of us, Big Data is a solution which is in search of a problem.

Related Articles

More Stories By Udayan Banerjee

Udayan Banerjee is CTO at NIIT Technologies Ltd, an IT industry veteran with more than 30 years' experience. He blogs at http://setandbma.wordpress.com.
The blog focuses on emerging technologies like cloud computing, mobile computing, social media aka web 2.0 etc. It also contains stuff about agile methodology and trends in architecture. It is a world view seen through the lens of a software service provider based out of Bangalore and serving clients across the world. The focus is mostly on...

  • Keep the hype out and project a realistic picture
  • Uncover trends not very apparent
  • Draw conclusion from real life experience
  • Point out fallacy & discrepancy when I see them
  • Talk about trends which I find interesting
Google

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Cloud Expo Breaking News
“I believe it is incumbent on the Cloud Service Providers (CSPs) and/or System Integrators (SIs) to understand the regulatory and compliance-related issues that their customers face,” noted Manjula Talreja, VP of Global Cloud Business Development at Cisco, in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “Of course these issues are different in each industry and in each country.” Cloud Computing Journal: The move to cloud isn't about saving money, it is about saving time - ...
“Regulations and compliance are key trust topics with regards to cloud solutions and technology,” noted Sven Denecken, Vice President, Strategy and Co-Innovation Cloud Solutions, SAP AG, in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “But it is also more than security of access – it is portability of data and a clear definition of where the data resides.” Cloud Computing Journal: The move to cloud isn't about saving money, it is about saving time – agree or disagree? Sve...
Many organizations want to expand upon the IaaS foundation to deliver cloud services in all forms – software, mobility, infrastructure and IT. Understanding the strategy, planning process and tools for this transformation will help catalyze changes in the way the business operates and deliver real value.
WSO2 on Thursday announced that WSO2 Vice President of Technology Evangelism Chris Haddad and SUSE Business Development Manager Frank Rego will lead a joint presentation at 12 International Cloud Expo. The session, "Bridging IaaS and PaaS to Deliver the Service-Oriented Data Center," is part of the event's Enterprise Cloud Computing Track on Thursday, June 13, 2013. The Cloud Expo conference is being held June 10-13, 2013 at the Javits Center in New York City. Bridging IaaS and PaaS to Deliver ...
IT has more opportunities than ever before with the growth in users, devices, data and secure cloud services. This creates not only a more enriching experience for users, but more opportunities for businesses. The key to capitalizing on these opportunities is to have the right tools in place to help scale operations. In his Day 3 Keynote at 12th Cloud Expo | Cloud Expo New York [June 10-13, 2013], Intel's Rob Crooke will describe the range of products that Intel provides to support different usa...
Quantum Corp., a proven global expert in data protection and Big Data management, has announced that Senior Vice President of Cloud Solutions Henrik Rosendahl will present a session exploring the future of cloud data protection and the impact of data reduction technologies on cloud storage at the 12th International Cloud Expo. The conference takes place June 10-13 at the Javits Center in New York City. Rosendahl will explore trends in cloud-based backup and disaster recovery (DR) and how curre...
One of the cloud’s biggest draws is the capability to virtualize computing resources, allowing it to be consumed with the click of a mouse. But behind that simple click is an enormous infrastructure challenge that has recently been cited as a major cause for slower enterprise adoption. Enterprises can better prepare for this shift and take full advantage of future computing benefits. Between architecture design and migration planning, the road can be long, so what do you do with your talent? I...
In the old world of IT, if you didn't have hardware capacity or the budget to buy more, your project was dead in the water. Budget constraints can leave some of the best, most creative and most ingenious innovations on the cutting room floor. It’s a true dilemma for developers and innovators – why spend the time creating, when a project could be abandoned in a blink? That was the old world. In the new world of IT, developers rule. They have access to resources they can spin up instantly. A hyb...
INetU, the industry's experts in complex hosting and a global provider of business-centric managed cloud and application hosting, has announced that Cloud Architect Rich Hand will be presenting "Private Cloud, Public Cloud - Is There a Third Option?" at the 12th International Cloud Expo taking place June 10-13, 2013 in New York City. As more enterprise IT departments move into the cloud, many executives are evaluating whether to adopt a Public or Private cloud. The cost benefits of the Public ...
“I’m careful when using terms like Big Data, because it can mean so many things to different people,” explained Eric Hanselman, Chief Analyst at 451 Research, in this exclusive Q&A with Cloud Expo Conference Chair Jeremy Geelan. “There is huge value in analytics that companies can use to pull intelligence from a collection of data sources that are available in their businesses. The inexpensive storage that cloud services can offer make a great environment to pull together siloed data.” Cloud Co...