MapR Takes a Run at Taming HBase

MapR says it’s given HBase enterprise-grade reliability and performance

MapR Technologies worked for months to make the elephantine curmudgeon known as Hadoop less snappy and irascible.

It dragged it out of the elephant house, scrubbed its hands and face, gave it some manners, told it which fork to use, slicked down its hair, polished its shoes and sent it out to live in civil society.

Then, beguiled by the fact that 45% of Hadoop shops also use HBase, it turned its hand to taming the equally irritable and hard-to-manage database that lets people run NoSQL queries against Hadoop data. This week it put its civilizing influence into public beta. It calls the widgetry M7.

MapR says it's given HBase enterprise-grade reliability and performance, and has put Hadoop and NoSQL capabilities together on one platform that can handle Big Data operations ranging from batch to real-time analytics.

The widgetry, which mixes tables and files together, is supposed to have unified management, unified data protection, consistent access, more flexibility and 2x the performance of previous platforms.

Since it's effectively a complete platform, M7 is supposed to let folks do a lot more innovative things with data. The widgetry includes integrated snapshots, mirroring and instant recovery, along with consistently low latency.

HBase can now reportedly recover immediately from hardware and software failures, and is primed for disaster recovery and full data protection.

Even with multiple hardware or software outages and errors, MapR says applications are supposed to keep running without human intervention.

To get it to this yummy level, MapR threw out resource-eating compactions and capacity-limiting RegionServers. That way HBase could perform uniformly and consistently.

M7 is supposed to simplify HBase administration by ensuring there are no separate processes to monitor and manage, no pre-splitting, no manual database repair operations and no downtime for standard maintenance as well as no manual compactions and no manual region merges.

MapR marketing VP Jack Norris said that when he once told an audience compactions were gone, he got a very gratifying ovation.

MapR also minimized the read- and write-amplification factor by using newfangled data structures so inserts and updates are faster. And M7 supports in-memory columns so there are more options for improving database performance.
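The article does not say what M7's data structures are, so the following is only a generic illustration of what "write amplification" means, not MapR's or Apache HBase's implementation. It is a minimal sketch in Python of a toy store in which every flush writes a new file and, every few flushes, all data is merged and rewritten in full. The function name, its parameters and the compaction policy are all assumptions made for the example.

```python
def write_amplification(flushes: int, flush_bytes: int, compact_every: int) -> float:
    """Return bytes physically written divided by bytes the application wrote.

    Toy model (an assumption for illustration, not any real system's policy):
    each flush writes one file of flush_bytes; whenever compact_every flush
    files have piled up, they are merged with the existing compacted file
    and the merged result is rewritten in full.
    """
    logical = 0    # bytes the application asked to write
    physical = 0   # bytes actually written to disk
    compacted = 0  # current size of the single compacted file
    pending = 0    # flush files not yet compacted

    for _ in range(flushes):
        logical += flush_bytes
        physical += flush_bytes        # the flush itself
        pending += 1
        if pending == compact_every:
            compacted += pending * flush_bytes
            physical += compacted      # rewrite everything into one file
            pending = 0

    return physical / logical
```

In this model, `write_amplification(4, 100, 2)` returns 2.5: the application wrote 400 bytes, but the store wrote 1,000. Compacting less often lowers that ratio but leaves more files to consult on each read, which is the read-amplification side of the trade-off.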

MapR says HBase's scalability has been improved "dramatically." M7 users can create more than a trillion tables. With M7, HBase supports more than 20x as many column families as Apache HBase, and row and cell sizes have been increased to handle large data objects.

That should broaden HBase's use cases and applications, both on-premises and in the cloud.

M7 is binary-compatible with Apache HBase. Customers don't need to recompile or change code to take advantage of its enterprise-grade features. M7 also supports Apache HBase in the same cluster.

The widgetry should emerge from beta in Q1. By then MapR should have settled on pricing.

Registration to participate in the MapR M7 beta program is now open. See www.mapr.com/m7beta

More Stories By Maureen O'Gara

Maureen O'Gara, the most read technology reporter for the past 20 years, is the Cloud Computing and Virtualization News Desk editor of SYS-CON Media. She has been the publisher of the famous "Billygrams" and the editor-in-chief of "Client/Server News" for more than a decade. One of the most respected technology reporters in the business, Maureen can be reached by email at maureen(at)sys-con.com or paperboy(at)g2news.com, and by phone at 516 759-7025. Twitter: @MaureenOGara
