Welcome!

@CloudExpo Authors: Elizabeth White, Liz McMillan, Yeshim Deniz, Pat Romanski, Zakia Bouachraoui

Related Topics: Open Source Cloud, @CloudExpo

Open Source Cloud: Article

NoHadoop: Big Data Requires Not Only Hadoop

The one area where MapReduce/Hadoop wins today is that it's freely available to anyone

Over the past few years, Hadoop has become something of a poster child for the NoSQL movement. Whether it's interpreted as "No SQL" or "Not Only SQL", the message has been clear, if you have big data challenges, then your programming tool of choice should be Hadoop. Sure, continue to use SQL for your ancient legacy stuff, but when you need cutting edge performance and scalability, it's time to go Hadoop.

The only problem with this story is that the people who really do have cutting edge performance and scalability requirements today have already moved on from the Hadoop model. A few have moved back to SQL, but the much more significant trend is that, having come to realize the capabilities and limitations of MapReduce and Hadoop, a whole raft of new post-Hadoop architectures are now being developed that are, in most cases, orders of magnitude faster at scale than Hadoop.



The problem with simple batch processing tools like MapReduce and Hadoop is that they are just not powerful enough in any one of the dimensions of the big data space that really matters. If you need complex joins or ACID requirements, SQL beats Hadoop easily. If you have realtime requirements, Cloudscale beats Hadoop by three or four orders of magnitude. If you have supercomputing requirements, MPI or BSP beat Hadoop easily. If you have graph computing requirements, Google's Pregel beats Hadoop by orders of magnitude. If you need interactive analysis of web-scale data sets, then Google's Dremel architecture beats Hadoop by orders of magnitude. If you need to incrementally update the analytics on a massive data set continuously, as Google now have to do on their index of the web, then an architecture like Percolator beats Hadoop easily.

The one area where MapReduce/Hadoop wins today is that it's freely available to anyone, but for those that have reasonably challenging big data requirements, that simple type of architecture is nowhere near enough.

More Stories By Bill McColl

Bill McColl left Oxford University to found Cloudscale. At Oxford he was Professor of Computer Science, Head of the Parallel Computing Research Center, and Chairman of the Computer Science Faculty. Along with Les Valiant of Harvard, he developed the BSP approach to parallel programming. He has led research, product, and business teams, in a number of areas: massively parallel algorithms and architectures, parallel programming languages and tools, datacenter virtualization, realtime stream processing, big data analytics, and cloud computing. He lives in Palo Alto, CA.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


CloudEXPO Stories
Eric Taylor, a former hacker, reveals what he's learned about cybersecurity. Taylor's life as a hacker began when he was just 12 years old and playing video games at home. Russian hackers are notorious for their hacking skills, but one American says he hacked a Russian cyber gang at just 15 years old. The government eventually caught up with Taylor and he pleaded guilty to posting the personal information on the internet, among other charges. Eric Taylor, who went by the nickname Cosmo the God, also posted personal information of celebrities and government officials, including Michelle Obama, former CIA director John Brennan, Kim Kardashian and Tiger Woods. Taylor recently became an advisor to cybersecurity start-up Path which helps companies make sure their websites are properly loading around the globe.
ClaySys Technologies is one of the leading application platform products in the ‘No-code' or ‘Metadata Driven' software business application development space. The company was founded to create a modern technology platform that addressed the core pain points related to the traditional software application development architecture. The founding team of ClaySys Technologies come from a legacy of creating and developing line of business software applications for large enterprise clients around the world.
To Really Work for Enterprises, MultiCloud Adoption Requires Far Better and Inclusive Cloud Monitoring and Cost Management … But How? Overwhelmingly, even as enterprises have adopted cloud computing and are expanding to multi-cloud computing, IT leaders remain concerned about how to monitor, manage and control costs across hybrid and multi-cloud deployments. It’s clear that traditional IT monitoring and management approaches, designed after all for on-premises data centers, are falling short in this new hybrid and dynamic environment.
Most modern computer languages embed a lot of metadata in their application. We show how this goldmine of data from a runtime environment like production or staging can be used to increase profits. Adi conceptualized the Crosscode platform after spending over 25 years working for large enterprise companies like HP, Cisco, IBM, UHG and personally experiencing the challenges that prevent companies from quickly making changes to their technology, due to the complexity of their enterprise. An accomplished expert in Enterprise Architecture, Adi has also served as CxO advisor to numerous Fortune executives.
DevOpsSUMMIT at CloudEXPO, to be held June 25-26, 2019 at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Among the proven benefits, DevOps is correlated with 20% faster time-to-market, 22% improvement in quality, and 18% reduction in dev and ops costs, according to research firm Vanson-Bourne. It is changing the way IT works, how businesses interact with customers, and how organizations are buying, building, and delivering software.