By Udayan Banerjee
February 15, 2012 05:00 AM EST
So, what is big data? Is it the next path-breaking technology that will change everything, or is it just hype that will die down after some time?
Let us take a realistic look at what the term big data means and what problems it can solve.
What is "Big Data"?
Here is a short explanation.
Big Data is the name given to the class of technologies that you need when your data volume grows so large that RDBMS technologies can no longer handle it.
Big data spans three dimensions (taken from an IBM article):
- Variety – Big data extends beyond structured data, including unstructured data of all varieties: text, audio, video, click streams, log files and more.
- Velocity – Often time-sensitive, big data must be used as it is streaming in to the enterprise in order to maximize its value to the business.
- Volume – Big data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.
In short, if your data volume can be handled efficiently by an RDBMS, you NEED NOT worry about Big Data.
How did it all start?
With the advent of cloud computing, which provided easy access to massive amounts of distributed computing power, came the realization that an RDBMS cannot be parallelized effectively across many machines. In fact, the CAP theorem states that a distributed data store cannot guarantee Consistency, Availability and Partition Tolerance all at the same time. This led to the NoSQL movement, and multiple non-relational databases sprang up.
The trigger point for Big Data came when Google published its paper on the “MapReduce” programming model in 2004. It involves processing highly distributable problems across huge datasets using a large number of computers. Google used MapReduce to build the index behind its search engine.
In short: Big Data requires large DISTRIBUTED processing power.
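To make the MapReduce idea concrete, here is a minimal single-machine sketch in Python. It is not Google's implementation: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample documents are made up for illustration, and the phases run one after another in a single process instead of across a cluster. The point is only to show the shape of the computation: independent map steps, a grouping step, and independent reduce steps.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each map_phase call depends only on its own document, so a real
    # framework could run the calls on different machines. In this sketch
    # they simply run one after another in the same process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    # Each key is reduced independently, so this step could be spread
    # across machines as well.
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two tiny "documents" standing in for a huge dataset.
documents = ["big data needs distributed processing", "big data is big"]
counts = word_count(documents)
print(counts)
```

Because no map call needs to see another document and no reduce call needs to see another key, the same program structure can be spread over as many computers as are available, which is what makes the problem "highly distributable".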
Why would you want to process so much data?
There are three basic assumptions driving the big data movement:
- Faster analysis of larger volumes of operational data will help you make better decisions
- More in-depth analysis of customer data will guide you to better customer segmentation
- Insight into larger data sets will help you come up with innovative product designs
Companies that have successfully leveraged this include Google, Facebook, Amazon, Walmart and Yahoo.
In short – the ASSUMPTION is that more data and faster analytics will lead to more innovation and better decision making.
Three Prerequisites for leveraging Big Data
Let us assume that your data volume is large enough and you have access to enough distributed processing power. Will that be sufficient for you to venture into big data?
No … you need three more things.
- A business problem that you think the data at your disposal can help resolve
- A set of questions to be answered through data analysis
- An algorithm to analyze the data – this is the domain of the new field of Data Science
Big Data will be useful only if you are equipped with all three.
Therefore, for most of us, Big Data is a solution in search of a problem.
"We build IoT infrastructure products - when you have to integrate different devices, different systems and cloud you have to build an application to do that but we eliminate the need to build an application. Our products can integrate any device, any system, any cloud regardless of protocol," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.