Welcome!

@CloudExpo Authors: Liz McMillan, Pat Romanski, Elizabeth White, Kevin Benedict, Richard Hale

Related Topics: @BigDataExpo, Open Source Cloud, @CloudExpo, Apache

@BigDataExpo: Blog Feed Post

The Future of Data Management According to Doug Cutting (Father of Hadoop)

The pedigree of Hadoop came from Google’s GFS (Google File System, now HDFS) and Map-Reduce programming

I joined 600 people last night at a session sponsored by Hive to listen to Doug Cutting, the creator of Hadoop. Currently he is the chief architect at Cloudera and a director at Apache  Software Foundation. The hall at NetApp facility was overflowing with an eager audience. Doug spoke about the future of data management.

He narrated a brief history of Hadoop, how it was founded and how far it has come. As everyone knows, the pedigree of Hadoop came from Google’s GFS (Google File System, now HDFS) and Map-Reduce programming. Here are the key predictions he made:

  • Hadoop has grown to become the de-facto standard for Big Data. He had anticipated IBM and Microsoft to come up with alternative designs to compete with Hadoop, but that never happened. Both companies plus Oracle, HP and other players have endorsed Hadoop as the platform.
  • Hadoop will become the center of data management in future. It will not be the original HDFS+MR layers, but a whole new ecosystem called “The Enterprise Data Hub”. There will be an explosion of products surrounding Hadoop (all open systems). He cited examples of Pig, Hive, Sqoop, etc. Currently many SQL implementations over HDFS are coming up.
  • Will there be OLTP (Transactional systems) on Hadoop? He said yes. Current implementation of Impala (from Cloudera) has SQL on HDFS with Map-Reduce on top is proving quite efficient in ETL workloads. Several customers have started migrating from legacy world to Impala.
  • The new project at Google called Spanner is also leading the way to a future OLTP system distributed across the globe. This work will propel future additions to the Hadoop ecosystem.
  • He explained the big advantage of Open systems architecture and why that will become the norm over proprietary systems.
  • The future Hadoop ecosystem (Enterprise Data Hub) will be a threat to the current incumbents like Oracle, MySQL, SQL server, DB2, and Vertica. Current challenges of weak security and lack of standardization will be addressed eventually.

Doug is an engaging speaker and clearly showed he knows his subject well. I have my doubts on his future predictions, as DBMS’s take a long time to mature and provide all the critical functions for mission-critical applications. We have learnt that over the last 4 decades. Hadoop is still primarily a batch system doing offline analytics. Moving from there to do real-time production workload is quite a jump and will take many years to accomplish.

Then there are the new breed of highly efficient NoSQL databases like MongoDB that are being deployed to create “systems of engagement” at large enterprises. Also, the incumbents are not sitting idle either with a total market size of $30 Billion dollars. It is funny to remember that our tax records are still managed by Model 204 at IRS, a DBMS created during the 1960s. Switching databases is extremely cumbersome and not for the faint-hearted. Doug did say that future spending will steer more towards Hadoop.

Given the challenges of Big Data and the rapid adoption of Hadoop, we will watch this space as it unfolds over next couple of years.

Read the original blog entry...

More Stories By Jnan Dash

Jnan Dash is Senior Advisor at EZShield Inc., Advisor at ScaleDB and Board Member at Compassites Software Solutions. He has lived in Silicon Valley since 1979. Formerly he was the Chief Strategy Officer (Consulting) at Curl Inc., before which he spent ten years at Oracle Corporation and was the Group Vice President, Systems Architecture and Technology till 2002. He was responsible for setting Oracle's core database and application server product directions and interacted with customers worldwide in translating future needs to product plans. Before that he spent 16 years at IBM. He blogs at http://jnandash.ulitzer.com.

@CloudExpo Stories
"Software-defined storage is a big problem in this industry because so many people have different definitions as they see fit to use it," stated Peter McCallum, VP of Datacenter Solutions at FalconStor Software, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
"Operations is sort of the maturation of cloud utilization and the move to the cloud," explained Steve Anderson, Product Manager for BMC’s Cloud Lifecycle Management, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
The cloud competition for database hosts is fierce. How do you evaluate a cloud provider for your database platform? In his session at 18th Cloud Expo, Chris Presley, a Solutions Architect at Pythian, gave users a checklist of considerations when choosing a provider. Chris Presley is a Solutions Architect at Pythian. He loves order – making him a premier Microsoft SQL Server expert. Not only has he programmed and administered SQL Server, but he has also shared his expertise and passion with b...
Unless your company can spend a lot of money on new technology, re-engineering your environment and hiring a comprehensive cybersecurity team, you will most likely move to the cloud or seek external service partnerships. In his session at 18th Cloud Expo, Darren Guccione, CEO of Keeper Security, revealed what you need to know when it comes to encryption in the cloud.
We're entering the post-smartphone era, where wearable gadgets from watches and fitness bands to glasses and health aids will power the next technological revolution. With mass adoption of wearable devices comes a new data ecosystem that must be protected. Wearables open new pathways that facilitate the tracking, sharing and storing of consumers’ personal health, location and daily activity data. Consumers have some idea of the data these devices capture, but most don’t realize how revealing and...
What are the successful IoT innovations from emerging markets? What are the unique challenges and opportunities from these markets? How did the constraints in connectivity among others lead to groundbreaking insights? In her session at @ThingsExpo, Carmen Feliciano, a Principal at AMDG, will answer all these questions and share how you can apply IoT best practices and frameworks from the emerging markets to your own business.
Ask someone to architect an Internet of Things (IoT) solution and you are guaranteed to see a reference to the cloud. This would lead you to believe that IoT requires the cloud to exist. However, there are many IoT use cases where the cloud is not feasible or desirable. In his session at @ThingsExpo, Dave McCarthy, Director of Products at Bsquare Corporation, will discuss the strategies that exist to extend intelligence directly to IoT devices and sensors, freeing them from the constraints of ...
You think you know what’s in your data. But do you? Most organizations are now aware of the business intelligence represented by their data. Data science stands to take this to a level you never thought of – literally. The techniques of data science, when used with the capabilities of Big Data technologies, can make connections you had not yet imagined, helping you discover new insights and ask new questions of your data. In his session at @ThingsExpo, Sarbjit Sarkaria, data science team lead ...
SYS-CON Events announced today the Kubernetes and Google Container Engine Workshop, being held November 3, 2016, in conjunction with @DevOpsSummit at 19th Cloud Expo at the Santa Clara Convention Center in Santa Clara, CA. This workshop led by Sebastian Scheele introduces participants to Kubernetes and Google Container Engine (GKE). Through a combination of instructor-led presentations, demonstrations, and hands-on labs, students learn the key concepts and practices for deploying and maintainin...
Extracting business value from Internet of Things (IoT) data doesn’t happen overnight. There are several requirements that must be satisfied, including IoT device enablement, data analysis, real-time detection of complex events and automated orchestration of actions. Unfortunately, too many companies fall short in achieving their business goals by implementing incomplete solutions or not focusing on tangible use cases. In his general session at @ThingsExpo, Dave McCarthy, Director of Products...
Cloud analytics is dramatically altering business intelligence. Some businesses will capitalize on these promising new technologies and gain key insights that’ll help them gain competitive advantage. And others won’t. Whether you’re a business leader, an IT manager, or an analyst, we want to help you and the people you need to influence with a free copy of “Cloud Analytics for Dummies,” the essential guide to this explosive new space for business intelligence.
Traditional IT, great for stable systems of record, is struggling to cope with newer, agile systems of engagement requirements coming straight from the business. In his session at 18th Cloud Expo, William Morrish, General Manager of Product Sales at Interoute, outlined ways of exploiting new architectures to enable both systems and building them to support your existing platforms, with an eye for the future. Technologies such as Docker and the hyper-convergence of computing, networking and sto...
With an estimated 50 billion devices connected to the Internet by 2020, several industries will begin to expand their capabilities for retaining end point data at the edge to better utilize the range of data types and sheer volume of M2M data generated by the Internet of Things. In his session at @ThingsExpo, Don DeLoach, CEO and President of Infobright, discussed the infrastructures businesses will need to implement to handle this explosion of data by providing specific use cases for filterin...
IoT generates lots of temporal data. But how do you unlock its value? You need to discover patterns that are repeatable in vast quantities of data, understand their meaning, and implement scalable monitoring across multiple data streams in order to monetize the discoveries and insights. Motif discovery and deep learning platforms are emerging to visualize sensor data, to search for patterns and to build application that can monitor real time streams efficiently. In his session at @ThingsExpo, ...
Enterprise networks are complex. Moreover, they were designed and deployed to meet a specific set of business requirements at a specific point in time. But, the adoption of cloud services, new business applications and intensifying security policies, among other factors, require IT organizations to continuously deploy configuration changes. Therefore, enterprises are looking for better ways to automate the management of their networks while still leveraging existing capabilities, optimizing perf...
Early adopters of IoT viewed it mainly as a different term for machine-to-machine connectivity or M2M. This is understandable since a prerequisite for any IoT solution is the ability to collect and aggregate device data, which is most often presented in a dashboard. The problem is that viewing data in a dashboard requires a human to interpret the results and take manual action, which doesn’t scale to the needs of IoT.
When building large, cloud-based applications that operate at a high scale, it’s important to maintain a high availability and resilience to failures. In order to do that, you must be tolerant of failures, even in light of failures in other areas of your application. “Fly two mistakes high” is an old adage in the radio control airplane hobby. It means, fly high enough so that if you make a mistake, you can continue flying with room to still make mistakes. In his session at 18th Cloud Expo, Lee...
Internet of @ThingsExpo has announced today that Chris Matthieu has been named tech chair of Internet of @ThingsExpo 2016 Silicon Valley. The 6thInternet of @ThingsExpo will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA.
Continuous testing helps bridge the gap between developing quickly and maintaining high quality products. But to implement continuous testing, CTOs must take a strategic approach to building a testing infrastructure and toolset that empowers their team to move fast. Download our guide to laying the groundwork for a scalable continuous testing strategy.
SYS-CON Events announced today the Enterprise IoT Bootcamp, being held November 1-2, 2016, in conjunction with 19th Cloud Expo | @ThingsExpo at the Santa Clara Convention Center in Santa Clara, CA. Combined with real-world scenarios and use cases, the Enterprise IoT Bootcamp is not just based on presentations but with hands-on demos and detailed walkthroughs. We will introduce you to a variety of real world use cases prototyped using Arduino, Raspberry Pi, BeagleBone, Spark, and Intel Edison. Y...