By Bob Gourley
January 2, 2013 08:00 AM EST
By Daniel Abadi
Editor’s note: The piece below by Daniel Abadi first appeared on the Hadapt blog and is republished with permission here. The framework presented provides insight into the very dynamic market around “Big Data Innovators” and should be of use for classifying many other firms in this interesting space. -bg
Recently InformationWeek published a piece, authored by Doug Henschen, that listed 13 innovative Big Data vendors. The complete list is reproduced below:
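1. MongoDB
2. Amazon (Redshift, EMR, DynamoDB)
3. Cloudera (CDH, Impala)
4. Couchbase
5. Datameer
6. Datastax
7. Hadapt
8. Hortonworks
9. Karmasphere
10. MapR
11. Neo Technology
12. Platfora
13. Splunk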
These 13 vendors distribute 16 unique data management products (since both Amazon and Cloudera offer multiple distinct data management/processing systems), all of which push the boundary on Big Data management.
In this post I will attempt to subcategorize these 16 products into a competitive grouping, where products placed inside the same group can be considered replacements for each other (and hence are competitive), and each group is complementary to every other group.
Before starting this classification, I will remove three products that, while potentially interesting from a Big Data perspective, are often used outside of what has become known as the “Big Data realm”, and whose primary competitors therefore did not make the InformationWeek list. These three products are Splunk (which typically competes with companies focused on the security, compliance, and IT operations management verticals), Amazon Redshift (which typically competes with traditional MPP database vendors), and Neo Technology (which is usually classified as a “NoSQL database”, but whose focus on graph data sets it apart, in both technology and use cases, from the other NoSQL databases on this list).
The remaining 13 products can be classified into four distinct groups:
1. Operational data stores that allow flexible schemas
2. Hadoop distributions
3. Real-time Hadoop-based analytical platforms
4. Hadoop-based BI solutions
Group 1 (operational data stores that allow flexible schemas)
This group is composed of database products that can be used to manage active data for dynamic applications with hard-to-define (or hard-to-predict) schemas. The database must be optimized for inserting, retrieving, updating, or deleting individual data items in real time (latencies on the order of milliseconds), but should also support some sort of interface for analyzing the data stored within. The dynamic nature of the typical use case for databases in this group implies a NoSQL interface, and either a key-value or document-store retrieval model. From the InformationWeek list, MongoDB, DynamoDB, Couchbase, and Datastax all fit in this category. Although there are some significant technical differences between these products, they can nonetheless be roughly described as potential replacements for each other in Group 1 use cases.
Group 2 (Hadoop distributions)
The products in this group are designed for very different situations than those in Group 1. Hadoop is typically used for large-scale data analysis and batch processing. Rather than inserting, retrieving, updating, or deleting individual data items, Hadoop is optimized for scanning through large swaths of data, processing and analyzing the data as it proceeds. Hadoop has become the poster child for “Big Data” due to its proven massive scalability, and its ability to handle the “variety” aspect of Big Data (since Hadoop does not require data to fit neatly into rows and columns in order to be analyzed and processed). From the InformationWeek list, Cloudera, Hortonworks, MapR, and Amazon EMR all fit in this category.
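The difference in access pattern between Group 1 and Group 2 can be illustrated with a small sketch. It is a toy, in-memory stand-in written in Python: the dictionary, the record layout, and the function names `point_operations` and `scan_and_aggregate` are assumptions made for illustration, and none of this is the API of any product on the list.

```python
from collections import defaultdict


def point_operations():
    """Group 1 style: touch one item at a time, looked up by its key."""
    store = {}  # a plain dict stands in for the operational data store
    store["user:1"] = {"name": "Ada", "plan": "free"}             # insert
    store["user:2"] = {"name": "Lin", "plan": "pro", "seats": 5}  # other fields
    store["user:1"]["plan"] = "pro"                               # update
    del store["user:2"]                                           # delete
    return store


def scan_and_aggregate(records):
    """Group 2 style: read every record once and aggregate along the way."""
    counts = defaultdict(int)
    for record in records:
        # Records need not share a schema; missing fields get a default.
        counts[record.get("plan", "unknown")] += 1
    return dict(counts)


# Hypothetical sample data, not taken from the article.
EVENTS = [
    {"user": "user:1", "plan": "pro"},
    {"user": "user:2", "plan": "free"},
    {"user": "user:3"},
    {"user": "user:4", "plan": "pro"},
]

if __name__ == "__main__":
    print(point_operations())
    print(scan_and_aggregate(EVENTS))
```

The first function only ever addresses a single key, which is the workload Group 1 products optimize for. The second never looks anything up by key and instead passes over the whole data set, which is the workload Hadoop is built around.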
Group 3 (real-time Hadoop-based analytical platforms)
Group 3 takes Hadoop to the next level, transforming it from a mere batch processing system to a full-fledged analytical platform that can answer queries in real time. Furthermore, by adding a more robust SQL interface to Hadoop (in addition to industry-standard ODBC connectors), group 3 products help to hide the complexity of Hadoop and reduce the need for Hadoop specialists, since traditional business intelligence and visualization tools are now able to interface directly with data stored inside Hadoop. From the InformationWeek list, Hadapt clearly fits in this category, and with certain caveats, so does Cloudera Impala. The caveats, as of the time of writing this blog post, are that (a) Impala is an extremely young codebase and is still only in beta, and (b) Impala supports only a small subset of SQL and does not support UDFs or other ways to combine structured and unstructured data in the same query, so calling it an “analytical platform” might be a bit of a stretch.
Group 4 (Hadoop-based BI solutions)
Group 4 products are often lumped together with group 3 products and mistaken for their competitors. However, just as business intelligence tools and analytical database solutions are highly complementary and were often packaged together in the pre-Hadoop world, the same is true in the Hadoop/Big Data world. Therefore, Datameer, Karmasphere, and Platfora, all of which function as a business intelligence layer above Hadoop, are capable of working closely with the group 3 products (with announcements along these lines already starting to appear).
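The classification described in this post can be written down as a small lookup table, which also makes it easy to check that the counts add up (13 grouped products plus 3 set aside equals 16). This is only a sketch of the grouping as stated above; the Python names `GROUPS`, `EXCLUDED`, `group_of`, and `are_competitive` are made up for illustration.

```python
# Group membership exactly as given in the post.
GROUPS = {
    "Operational data stores that allow flexible schemas": [
        "MongoDB", "DynamoDB", "Couchbase", "Datastax",
    ],
    "Hadoop distributions": [
        "Cloudera CDH", "Hortonworks", "MapR", "Amazon EMR",
    ],
    "Real-time Hadoop-based analytical platforms": [
        "Hadapt", "Cloudera Impala",
    ],
    "Hadoop-based BI solutions": [
        "Datameer", "Karmasphere", "Platfora",
    ],
}

# Products set aside before classification.
EXCLUDED = ["Splunk", "Amazon Redshift", "Neo Technology"]


def group_of(product):
    """Return the name of the group a product belongs to, or None."""
    for name, members in GROUPS.items():
        if product in members:
            return name
    return None


def are_competitive(a, b):
    """Two distinct products compete only if they share a group."""
    group_a = group_of(a)
    return a != b and group_a is not None and group_a == group_of(b)


if __name__ == "__main__":
    grouped = sum(len(members) for members in GROUPS.values())
    print(grouped, grouped + len(EXCLUDED))
    print(are_competitive("Hadapt", "Cloudera Impala"))
    print(are_competitive("Hadapt", "Datameer"))
```

Under this framing, products in the same list are potential replacements for each other, while products in different lists are complementary.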
In conclusion, although “Big Data” is an enormous and rapidly growing market, no single data management software product is going to rule it. Rather, there are four major groups of data management solutions within the Big Data space, and while there is fierce competition within each group, at the macro level these groups not only can co-exist but are highly complementary. In the long run, it is likely that 2-3 leaders will emerge in each group and share the Big Data pie.