Click here to close now.

Welcome!

Cloud Expo Authors: Liz McMillan, AppDynamics Blog, Carmen Gonzalez, Roger Strukhoff, Pat Romanski

Related Topics: Big Data Journal, .NET, Virtualization, Web 2.0, Cloud Expo, Apache

Big Data Journal: Blog Feed Post

Classifying Today’s “Big Data Innovators”

These 13 vendors distribute 16 unique data management products

By

Editor’s note: The piece below by   first appeared on the Hadapt blog and is republished with permission here. The framework presented provides insight into the very dynamic market around “Big Data Innovators” and should be of use for classifying many other firms in this interesting space. -bg

Recently InformationWeek published a piece, authored by Doug Henschen, that listed 13 innovative Big Data vendors. The complete list is reproduced below:

1.  MongoDB
2.  Amazon (Redshift, EMR, DynamoDB)
3.  Cloudera (CDH, Impala)
4.  Couchbase
5.  Datameer
6.  Datastax
7.  Hadapt
8.  Hortonworks
9.  Karmasphere
10.  MapR
11.  Neo Technology
12.  Platfora
13.  Splunk

Big-Data3These 13 vendors distribute 16 unique data management products (since both Amazon and Cloudera offer multiple distinct data management/processing systems), all of which push the boundary on Big Data management.

In this post I will attempt to subcategorize these 16 products into a competitive grouping, where products placed inside the same group can be considered replacements for each other (and hence are competitive), and each group is complementary to every other group.

Before starting this classification, I will remove three products that, while potentially being interesting from a Big Data perspective, are often used outside of what has become known as the “Big Data realm”, and therefore their primary competitors did not make it on the InformationWeek list. These three products are Splunk (which typically competes with companies focused on the security, compliance, and IT operations management verticals), Amazon Redshift (which typically completes with traditional MPP database vendors), and Neo Technology (which, although usually classified as a “NoSQL database”, its focus on graph data makes it highly unique from a technology and use case perspective relative to the other NoSQL databases on this list).

The remaining 13 products can be classified into four distinct groups:
1.  Operational data stores that allow flexible schemas
2.  Hadoop distributions
3.  Real-time Hadoop-based analytical platforms
4.  Hadoop-based BI solutions

Group 1 (operational data stores that allow flexible schemas)
This group is composed of database products that can be used to manage active data for dynamic applications with hard to define (or hard to predict) schemas. The database must be optimized for inserting, retrieving, updating, or deleting individual data items in real-time (latencies on the order of milliseconds), but should also support some sort of interface for performing analysis of the data stored within. The dynamic nature of the typical use case for databases in this group implies a NoSQL interface, and either a key-value or document-store retrieval model. From the InformationWeek list, MongoDB, DynamoDB, Couchbase, and Datastax all fit in this category. Although there are some significant technical differences between these products, they can nonetheless be roughly described as potential replacements for each other in Group 1 use cases.

Group 2 (Hadoop distributions)
The products in this group are designed for very different situations than Group 1. Hadoop is typically used for large scale data analysis and batch processing. Rather than inserting, retrieving, updating, or deleting individual data items, Hadoop is optimized for scanning through large swaths of data, processing and analyzing the data as it proceeds. Hadoop has become the poster-child for “Big Data” due to its proven massive scalability, and its ability to handle the “variety” aspect of Big Data (since Hadoop does not require data to fit neatly into rows and columns in order to be analyzed and processed). From the InformationWeek list, Cloudera, Hortonworks, MapR, and Amazon EMR all fit in this category.

Group 3 (real-time Hadoop-based analytical platforms)
Group 3 takes Hadoop to the next level, transforming it from a mere batch processing system to a full-fledged analytical platform that can answer queries in real-time. Furthermore, by adding a more robust SQL interface to Hadoop (in addition to industry-standard ODBC connectors), group 3 products help to hide the complexity of Hadoop and the need for Hadoop specialists, since traditional business intelligence and visualization tools are now able to interface directly with data stored inside Hadoop. From the InformationWeek list, Hadapt clearly fits in this category, and with certain caveats, so does Cloudera Impala (the caveats are that as of the time of writing this blog post (a) Impala is an extremely young codebase and is still only in beta (b) Impala only supports a small subset of SQL and does not support UDFs or other ways to combine structured and unstructured data in the same query, so calling it an “analytical platform” might be a bit of a stretch).

Group 4 (Hadoop-based BI solutions)
Often lumped together with group 3 products,  group 4 products are often confused as being competitive with group 3 products. However, just as business intelligence tools and analytical database solutions are highly complementary and were often packaged together in the pre-Hadoop world, the same is true in the Hadoop/Big Data world. Therefore, Datameer, Karmasphere, and Platfora, all of which function as a business intelligence layer above Hadoop, are capable of working closely with the group 3 products (with announcements along these lines already starting to begin).

In conclusion, although “Big Data” is an enormous and rapidly growing market, one single data management software product is not going to rule the market. Rather, there are four major groups of data management solutions within the Big Data space; and while there is fierce competition within each group, at the macro level these groups can not only co-exist, but are highly complementary. In the long run, it is likely that the 2-3 leaders in each group will emerge and share the Big Data pie.

Read the original blog entry...

More Stories By Bob Gourley

Bob Gourley, former CTO of the Defense Intelligence Agency (DIA), is Founder and CTO of Crucial Point LLC, a technology research and advisory firm providing fact based technology reviews in support of venture capital, private equity and emerging technology firms. He has extensive industry experience in intelligence and security and was awarded an intelligence community meritorious achievement award by AFCEA in 2008, and has also been recognized as an Infoworld Top 25 CTO and as one of the most fascinating communicators in Government IT by GovFresh.

@CloudExpo Stories
SYS-CON Events announced today the IoT Bootcamp – Jumpstart Your IoT Strategy, being held June 9–10, 2015, in conjunction with 16th Cloud Expo and Internet of @ThingsExpo at the Javits Center in New York City. This is your chance to jumpstart your IoT strategy. Combined with real-world scenarios and use cases, the IoT Bootcamp is not just based on presentations but includes hands-on demos and walkthroughs. We will introduce you to a variety of Do-It-Yourself IoT platforms including Arduino, Ras...
SYS-CON Events announced today the DevOps Foundation Certification Course, being held June ?, 2015, in conjunction with DevOps Summit and 16th Cloud Expo at the Javits Center in New York City, NY. This sixteen (16) hour course provides an introduction to DevOps – the cultural and professional movement that stresses communication, collaboration, integration and automation in order to improve the flow of work between software developers and IT operations professionals. Improved workflows will res...
The best mobile applications are augmented by dedicated servers, the Internet and Cloud services. Mobile developers should focus on one thing: writing the next socially disruptive viral app. Thanks to the cloud, they can focus on the overall solution, not the underlying plumbing. From iOS to Android and Windows, developers can leverage cloud services to create a common cross-platform backend to persist user settings, app data, broadcast notifications, run jobs, etc. This session provide...
“In the past year we've seen a lot of stabilization of WebRTC. You can now use it in production with a far greater degree of certainty. A lot of the real developments in the past year have been in things like the data channel, which will enable a whole new type of application," explained Peter Dunkley, Technical Director at Acision, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
SYS-CON Events announced today that Soha will exhibit at SYS-CON's DevOps Summit New York, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Soha delivers enterprise-grade application security, on any device, as agile as the cloud. This turnkey, cloud-based service enables customers to solve secure application access and delivery challenges that traditional or virtualized network solutions cannot solve because they are too expensive, inflexible and operational...
Containers and microservices have become topics of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 16th Cloud Expo at the Javits Center in New York June 9-11 will find fresh new content in a new track called PaaS | Containers & Microservices Containers are not being considered for the first time by the cloud community, but a current era of re-consideration has pushed them to the top of the cloud agenda. With the launch ...
Health care systems across the globe are under enormous strain, as facilities reach capacity and costs continue to rise. M2M and the Internet of Things have the potential to transform the industry through connected health solutions that can make care more efficient while reducing costs. In fact, Vodafone's annual M2M Barometer Report forecasts M2M applications rising to 57 percent in health care and life sciences by 2016. Lively is one of Vodafone's health care partners, whose solutions enable o...
SYS-CON Events announced today that Ciqada will exhibit at SYS-CON's @ThingsExpo, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Ciqada™ makes it easy to connect your products to the Internet. By integrating key components - hardware, servers, dashboards, and mobile apps - into an easy-to-use, configurable system, your products can quickly and securely join the internet of things. With remote monitoring, control, and alert messaging capability, you will mee...
of cloud, colocation, managed services and disaster recovery solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. TierPoint, LLC, is a leading national provider of information technology and data center services, including cloud, colocation, disaster recovery and managed IT services, with corporate headquarters in St. Louis, MO. TierPoint was formed through the strategic combination of some of t...
Public Cloud IaaS started it's life in the developer and startup communities and has grown rapidly to a $20B+ industry, but it still pales in comparison to how much is spent worldwide on IT: $3.6 trillion. In fact, there are 8.6 million data centers worldwide, the reality is many small and medium sized business have server closets and colocation footprints filled with servers and storage gear. While on-premise environment virtualization may have peaked at 75%, the Public Cloud has lagged in ado...
Dave will share his insights on how Internet of Things for Enterprises are transforming and making more productive and efficient operations and maintenance (O&M) procedures in the cleantech industry and beyond. Speaker Bio: Dave Landa is chief operating officer of Cybozu Corp (kintone US). Based in the San Francisco Bay Area, Dave has been on the forefront of the Cloud revolution driving strategic business development on the executive teams of multiple leading Software as a Services (SaaS) ap...
SYS-CON Media announced today that @WebRTCSummit Blog, the largest WebRTC resource in the world, has been launched. @WebRTCSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. @WebRTCSummit Blog can be bookmarked ▸ Here @WebRTCSummit conference site can be bookmarked ▸ Here
SYS-CON Events announced today that GENBAND, a leading developer of real time communications software solutions, has been named “Silver Sponsor” of SYS-CON's WebRTC Summit, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. The GENBAND team will be on hand to demonstrate their newest product, Kandy. Kandy is a communications Platform-as-a-Service (PaaS) that enables companies to seamlessly integrate more human communications into their Web and mobile applicatio...
SYS-CON Events announced today that BroadSoft, the leading global provider of Unified Communications and Collaboration (UCC) services to operators worldwide, has been named “Gold Sponsor” of SYS-CON's WebRTC Summit, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. BroadSoft is the leading provider of software and services that enable mobile, fixed-line and cable service providers to offer Unified Communications over their Internet Protocol networks. The Compa...
ProfitBricks, the provider of painless cloud infrastructure for IaaS, today announced the release of a Node.js SDK written against its recently launched REST API. This new JavaScript based library provides coverage for all existing ProfitBricks REST API functions. With additional libraries set to release this month, ProfitBricks continues to prove its dedication to the DevOps community and commitment to making cloud migrations and cloud management painless. Node.js is an open source, cross-pl...
The IoT Bootcamp is coming to Cloud Expo | @ThingsExpo on June 9-10 at the Javits Center in New York. Instructor. Registration is now available at http://iotbootcamp.sys-con.com/ Instructor Janakiram MSV previously taught the famously successful Multi-Cloud Bootcamp at Cloud Expo | @ThingsExpo in November in Santa Clara. Now he is expanding the focus to Janakiram is the founder and CTO of Get Cloud Ready Consulting, a niche Cloud Migration and Cloud Operations firm that recently got acquir...
The 17th International Cloud Expo has announced that its Call for Papers is open. 17th International Cloud Expo, to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, APM, APIs, Microservices, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding bu...
The 5th International DevOps Summit, co-located with 17th International Cloud Expo – being held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the...
ProfitBricks has launched its new DevOps Central and REST API, along with support for three multi-cloud libraries and a Python SDK. This, combined with its already existing SOAP API and its new RESTful API, moves ProfitBricks into a position to better serve the DevOps community and provide the ability to automate cloud infrastructure in a multi-cloud world. Following this momentum, ProfitBricks has also introduced several libraries that enable developers to use their favorite language to code ...
What exactly is a cognitive application? In her session at 16th Cloud Expo, Ashley Hathaway, Product Manager at IBM Watson, will look at the services being offered by the IBM Watson Developer Cloud and what that means for developers and Big Data. She'll explore how IBM Watson and its partnerships will continue to grow and help define what it means to be a cognitive service, as well as take a look at the offerings on Bluemix. She will also check out how Watson and the Alchemy API team up to off...