By Jnan Dash
January 16, 2014 02:15 PM EST
I joined 600 people last night at a session sponsored by Hive to listen to Doug Cutting, the creator of Hadoop. He is currently the chief architect at Cloudera and a director at the Apache Software Foundation. The hall at the NetApp facility was overflowing with an eager audience. Doug spoke about the future of data management.
He gave a brief history of Hadoop: how it was founded and how far it has come. As everyone knows, Hadoop's lineage traces back to Google's GFS (Google File System, the model for HDFS) and the Map-Reduce programming model. Here are the key predictions he made:
- Hadoop has grown to become the de facto standard for Big Data. He had expected IBM and Microsoft to come up with alternative designs to compete with Hadoop, but that never happened. Both companies, along with Oracle, HP and other players, have endorsed Hadoop as the platform.
- Hadoop will become the center of data management in the future. It will not be the original HDFS+MR layers, but a whole new ecosystem called “The Enterprise Data Hub”. There will be an explosion of products surrounding Hadoop (all open systems). He cited Pig, Hive, and Sqoop as examples. Many SQL implementations over HDFS are currently emerging.
- Will there be OLTP (transactional systems) on Hadoop? He said yes. The current implementation of Impala (from Cloudera), which provides SQL on data in HDFS, is proving quite efficient in ETL workloads. Several customers have started migrating from the legacy world to Impala.
- The new project at Google called Spanner is also leading the way to a future OLTP system distributed across the globe. This work will propel future additions to the Hadoop ecosystem.
- He explained the big advantage of open systems architecture and why it will become the norm over proprietary systems.
- The future Hadoop ecosystem (Enterprise Data Hub) will be a threat to current incumbents like Oracle, MySQL, SQL Server, DB2, and Vertica. Current challenges of weak security and lack of standardization will be addressed eventually.
Doug is an engaging speaker and clearly knows his subject well. I have my doubts about his predictions, as DBMSs take a long time to mature and provide all the critical functions that mission-critical applications need. We have learnt that over the last four decades. Hadoop is still primarily a batch system doing offline analytics. Moving from there to real-time production workloads is quite a jump and will take many years to accomplish.
Then there is the new breed of highly efficient NoSQL databases like MongoDB that are being deployed to create “systems of engagement” at large enterprises. The incumbents are not sitting idle either, with a total market size of $30 billion at stake. It is funny to remember that our tax records are still managed by Model 204 at the IRS, a DBMS created during the 1960s. Switching databases is extremely cumbersome and not for the faint-hearted. Doug did say that future spending will steer more towards Hadoop.
Given the challenges of Big Data and the rapid adoption of Hadoop, we will watch this space as it unfolds over the next couple of years.