Cloud Expo: Blog Feed Post

Top Five Hosted Hadoop-Based Applications Reviewed

Amazon Elastic MapReduce; Cloudera CDH; IBM InfoSphere BigInsights; MapR M3 and M5; and Hortonworks Data Platform

It is our goal at Monitis to make the lives of web developers and system administrators easier. We have reviewed the five leading hosted Hadoop-based applications and provide a short analysis of each in this post to help you find the solution that best suits your needs.

The article covers: Amazon Elastic MapReduce; Cloudera CDH; IBM InfoSphere BigInsights; MapR M3 and M5; and Hortonworks Data Platform.

Amazon Elastic MapReduce (http://aws.amazon.com/elasticmapreduce/)

Introduced by Amazon in 2009, Elastic MapReduce automates the setup of Hadoop clusters and the transfer of data between Amazon’s EC2 and S3 products. For a minimal fee, Amazon gives its clients the ability to launch a preconfigured Hadoop cluster to run their MapReduce programs.
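To make the idea of a "MapReduce program" concrete, here is a minimal sketch of the map, shuffle and reduce steps for a word count, written in plain Python and run in a single process. It does not use Hadoop or any Elastic MapReduce API; the function names (map_phase, shuffle, reduce_phase, word_count) are our own illustrative choices, and a real job would run the mapper and reducer across the nodes of a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, standing in for the framework's shuffle step."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence (illustration only)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["hadoop runs mapreduce", "mapreduce runs on hadoop"]
    print(word_count(sample))
```

The mapper and reducer only see one line or one key at a time, which is what allows a cluster to spread the work across many machines.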

Pros

  • Very easy to set up a job flow
  • There’s an enormous amount of documentation available to help new users
  • Example applications are provided, giving an option to test drive the application before putting it to use.
  • The entire system can be driven from a command-line interface rather than only through a web-based management console.
  • Ability to run several jobs simultaneously, in parallel.
  • No hardware is needed and costs can be very limited, which is great for small businesses seeking to be more cost efficient.

Cons

  • Need an account with Amazon Web Services (AWS)
  • Service is only available in the United States
  • Requires the use of Amazon’s S3 service, which adds extra costs to an overall project (data transfer, security etc.)

Cloudera CDH (www.cloudera.com)

Founded in March 2009, Cloudera was previously considered to be the Red Hat of the Hadoop world. The company has a customer base of over 400 (including paid and free downloads), and its offerings include the Cloudera Enterprise products and Training & Support Services. Formed by a number of key executives from various technology giants (Oracle, Yahoo, Google and Facebook), Cloudera is considered the pioneer in the Hadoop community, with a head start in the industry over its competitors.

Pros

  • Free application that can be easily downloaded
  • Installed within the organization, which gives the company full control of all processes, jobs, etc.
  • Technical support is superior and the knowledge base is an essential resource for anyone starting out with Hadoop
  • Used by a large number of companies worldwide, and has been proven as a leading choice in Hadoop applications.
  • Application includes additional resources and components (e.g. Pig, Hive, Flume, HBase, ZooKeeper, Mahout, Whirr, Hue, Sqoop and Oozie)
  • Cloudera releases quarterly updates, eliminating the need for a large-scale annual upgrade.

Cons

  • Requires companies to obtain the necessary hardware in order to install the application, which adds cost.
  • Supporting and maintaining the application increases the company’s operating costs.

IBM InfoSphere BigInsights (www.ibm.com/software/data/infosphere/biginsights)

Introduced in May 2011, IBM InfoSphere BigInsights is geared towards handling extremely large volumes of streaming data using a Hadoop-based analytics framework. IBM states that InfoSphere BigInsights will be able to handle “tens-of-petabytes” of data while retaining a sub-millisecond response time. The company also plans to launch 20 new service offerings, including numerous analytical tools for business and IT.

Pros

  • Superior product support and a long-standing company reputation established over many years of servicing the IT community.
  • Comes standard with a number of essential components, including Pig programming, IBM DB2 and IBM BigSheets.
  • Offers two log-based replication models that work independently (queue-based and SQL-based).
  • Lots of documentation and step-by-step training are available from the IBM website.
  • Superior product for big data in motion that needs to be analyzed continuously in real time.

Cons

  • New to the marketplace and has not been around long enough to establish a solid reputation.
  • An expensive solution for small and medium-sized organizations seeking a more cost-effective application.

MapR M3 and M5 (www.mapr.com)

Headquartered in San Jose, CA, MapR markets its proprietary applications with a focus on providing a number of key features and capabilities for use with MapReduce and Hadoop.

Pros

  • Offers superior monitoring that can provide a better understanding of data distribution and processing – essential for achieving increased performance.
  • A free version is offered that includes everything except the management tools, which are available only in the M5 series products.
  • Excellent technical support and vast quantities of documentation available

Cons

  • New to the marketplace, so it has a limited reputation
  • An expensive solution for small and medium-sized organizations
  • 24×7 support is only available on the paid version of the application
  • Requires an enormous amount of disk space to install (25 GB) compared to similar products.

Hortonworks Data Platform (http://hortonworks.com/)

Hortonworks was formed in June 2011 by a number of key architects and Hadoop committers formerly employed within the Yahoo Hadoop Software department. The company’s offerings include HDP (Hortonworks Data Platform) and Training and Support Services. The company currently serves two customers: Yahoo and Microsoft.

Pros

  • A Yahoo spin-off, so the product has been tested in the marketplace.
  • Lots of documentation and support available from the knowledge base and community.
  • The company is continuously working with Yahoo to develop its future products.
  • Scalable to meet the demands of specific projects.
  • Offers variations and expanded product offerings from partnerships with a number of specialized companies.

Cons

  • Product is similar in nature to Cloudera, and provides similar features.

1 YEAR WEBSITE TRAFFIC COMPARISON (from Compete.com)

[Figures: Hadoop-based application website performance chart and statistics]

Hopefully our post has been of interest to web developers and system administrators.

More information on Monitis can be found on our website: www.monitis.com

More Stories By Hovhannes Avoyan

Hovhannes Avoyan is the CEO of Monitis, Inc., a provider of on-demand systems management and monitoring software to 50,000 users spanning small businesses and Fortune 500 companies.

Prior to Monitis, he served as General Manager and Director of Development at prominent web portal Lycos Europe, where he grew the Lycos Armenia group from 30 people to over 200, making it the company's largest development center. Prior to Lycos, Avoyan was VP of Technology at Brience, Inc. (based in San Francisco and acquired by Syniverse), which delivered mobile internet content solutions to companies like Cisco, Ingram Micro, Washington Mutual, Wyndham Hotels, T-Mobile, and CNN. Prior to that, he served as the founder and CEO of CEDIT ltd., which was acquired by Brience. A 24-year veteran of the software industry, he also runs Sourcio cjsc, an IT consulting company and startup incubator specializing in web 2.0 products and open-source technologies.

Hovhannes is a senior lecturer at the American University of Armenia and has been a visiting lecturer at San Francisco State University. He is a graduate of Bertelsmann University.


