By Hovhannes Avoyan
April 9, 2012 07:00 AM EDT
It is our goal at Monitis to make the lives of web developers and system administrators easier. We have reviewed five leading Hadoop-based applications and give a short analysis of each in this post to help you find the solution that best suits your needs.
The article covers Amazon Elastic MapReduce, Cloudera CDH, IBM InfoSphere BigInsights, MapR M3 and M5, and Hortonworks Data Platform.
Amazon Elastic MapReduce (http://aws.amazon.com/elasticmapreduce/)
Introduced by Amazon in 2009, Elastic MapReduce automates various Hadoop cluster processes and the data transfers between Amazon’s EC2 and S3 products. For a minimal fee, Amazon lets its clients launch a preconfigured Hadoop cluster to run their MapReduce programs.
- Very easy to set up a job flow
- There’s an enormous amount of documentation available to help new users
- Example applications are provided, giving an option to test drive the application before putting it to use.
- The entire system can be driven from a command-line interface as an alternative to the web-based management console.
- Ability to run several jobs simultaneously, in parallel.
- No hardware is needed and costs can be kept low, which is great for small businesses seeking to be more cost efficient.
- Need an account with Amazon Web Services (AWS)
- Service is only available in the United States
- Requires the use of Amazon’s S3 service, which adds extra costs to an overall project (data transfer, security etc.)
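For readers new to the model, the kind of MapReduce program that such a job flow runs can be sketched in a few lines of plain Python. This is only an illustration that runs on a single machine; it is not Amazon's API or Hadoop itself, and the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are our own hypothetical choices.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group values by key (a real framework does this between map and reduce)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read its data from storage such as S3.
    sample = ["big data big clusters", "data in motion"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run on many machines at once, which is the part a hosted service takes care of.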
Cloudera CDH (www.cloudera.com)
Founded in March 2009, Cloudera has been described as the Red Hat of the Hadoop world. With a large customer base of over 400 (including paid and free downloads), the company’s offerings include the Cloudera Enterprise products and Training & Support Services. Formed by a number of key executives from various technology giants (Oracle, Yahoo, Google and Facebook), Cloudera is considered the pioneer in the Hadoop community, with a head start in the industry over its competitors.
- Free application that can be easily downloaded
- Installed within the organization, which gives the company full control of all processes, jobs, etc.
- Technical support is superior, and the knowledge base is an essential resource for anyone starting out with Hadoop
- Used by a large number of companies worldwide and proven as a leading choice among Hadoop applications.
- Application includes additional resources and components (e.g. Pig, Hive, Flume, HBase, ZooKeeper, Mahout, Whirr, Hue, Sqoop and Oozie)
- Cloudera releases quarterly updates, eliminating the need for a large-scale annual upgrade.
- Requires companies to obtain the necessary hardware in order to install the application, which adds costs.
- Supporting and maintaining the application adds to the company’s operating costs.
IBM InfoSphere BigInsights (www.ibm.com/software/data/infosphere/biginsights)
Introduced in May 2011, this product is geared towards handling extremely large volumes of streaming data using a Hadoop-based analytics framework. IBM states that InfoSphere BigInsights will be able to handle “tens-of-petabytes” of data while retaining a sub-millisecond response time. The company also plans to launch 20 new service offerings, including numerous analytical tools for business and IT.
- Superior product support and a long-standing company reputation established over many years of servicing the IT community.
- Comes standard with a number of essential components, including Pig programming, IBM DB2 and IBM BigSheets.
- Offers two log-based replication models that work independently (queue-based and SQL-based).
- Lots of documentation and step-by-step training is available from the IBM website.
- Superior product for big data in motion that needs to be analyzed continuously in real time.
- New to the marketplace and has not been around long enough to establish a solid reputation.
- An expensive solution for small and medium-sized organizations seeking a more cost-effective application.
MapR M3 and M5 (www.mapr.com)
With headquarters in San Jose, CA, MapR markets its proprietary applications with a focus on providing a number of key features and capabilities for use with MapReduce and Hadoop.
- Offers superior monitoring that gives a better understanding of data distribution and processing, which is essential for improving performance.
- A free version (M3) is offered, which includes everything except the management tools; these are only offered in the M5 series products.
- Excellent technical support and vast quantities of documentation available
- New to the marketplace, so it has a limited reputation
- An expensive solution for small and medium-sized organizations
- 24×7 support is only available on the paid version of the application
- Requires an enormous amount of disk space to install (25 GB) compared to similar products.
Hortonworks Data Platform (http://hortonworks.com/)
Hortonworks was formed in June 2011 by a number of key architects and Hadoop committers formerly employed within the Yahoo Hadoop Software department. The company’s offerings include HDP (Hortonworks Data Platform) and Training Support Services. The company currently serves two customers: Yahoo and Microsoft.
- A Yahoo spin-off, so the product has been tested in the marketplace.
- Lots of documentation and support available from the knowledge base community.
- The company is continuously working with Yahoo to develop its future products
- Scalable to meet the demands of specific projects.
- Offers variations and expanded product offerings from partnerships with a number of specialized companies.
- Product is similar in nature to Cloudera, and provides similar features.
[Figure: 1-year website traffic comparison (from Compete.com)]
We hope this post has been useful to web developers and system administrators.
More information on Monitis can be found on our website: www.monitis.com