@BigDataExpo: Article

Happiness Is… a Handhold on Hadoop

For a Hadoop solution do we look inside or outside?

This post is sponsored by The Business Value Exchange and HP Enterprise Services

As we know, Big Data, and the ‘space race’ to produce software that lets us extract insight (and therefore value) from the Big Data mountain, remains one of the most discussed issues in information technology today.

Increasingly prevalent and popular in this arena, if not quite as ‘predominant’ as some would have us believe, is Apache Hadoop. This software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
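To make "simple programming models" concrete: MapReduce is the best known of them, and its shape can be sketched in a few lines of plain Python. This is an illustration only. It runs on a single machine, uses no Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical ones chosen for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step a framework performs between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files spread across a cluster.
    print(word_count(["big data big value", "data mountain"]))
```

In a real cluster the map and reduce steps would run in parallel on many machines and the grouping step would move data across the network; the programmer still only writes the two small functions.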

But there's a problem, because Hadoop is drastically underutilized in two respects:

  • Full-blown implementations of Hadoop are argued to be extremely technically difficult to pull off.
  • Implementations that do exist are argued to only take advantage of a fraction of what might be represented in a complete deployment in terms of data management and sheer number crunching power.

What's the answer?

Do we look inside (@ logs) or outside (@ architecture)?
That is to say, do we look inside at logs and logfiles as we tinker around to perfect our Hadoop installation? Or do we look at a higher level, at the architectural considerations that should be governing any individual instance of Hadoop, to get some greater insight into what should be working?

Looking inside means looking at logs and logfiles - the files that record "events" occurring throughout an operating system, software application or data management environment such as Apache Hadoop.

If we look at what our logs and logfiles are recording, then we can get information on hidden errors, anomalies, problems and patterns... and these are the sorts of reports that can help guide DevOps (development and operations) pros as they attempt to bring a Hadoop project online.
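As a rough illustration of the "looking inside" approach, the Python sketch below counts log entries by severity and tallies which components emit the warnings and errors. It assumes log lines shaped like "timestamp LEVEL source: message", a common log4j-style layout; the exact format depends on how logging is configured on a given cluster, and the function name summarize and the sample lines are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Assumed line layout: "2016-10-01 12:00:00,123 LEVEL some.source.Name: message".
# Adjust the pattern to match the logging configuration actually in use.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<source>\S+): "
    r"(?P<message>.*)$"
)

PROBLEM_LEVELS = {"WARN", "ERROR", "FATAL"}


def summarize(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Return (entries per level, problem entries per source)."""
    levels: Counter = Counter()
    problem_sources: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not fit, e.g. stack-trace continuations
        levels[match["level"]] += 1
        if match["level"] in PROBLEM_LEVELS:
            problem_sources[match["source"]] += 1
    return levels, problem_sources


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real cluster.
    sample = [
        "2016-10-01 12:00:00,123 INFO example.NameNode: started",
        "2016-10-01 12:00:01,456 WARN example.DataNode: slow block receiver",
        "2016-10-01 12:00:02,789 ERROR example.DataNode: disk failure",
        "    at example.DataNode.run(DataNode.java:1)",
    ]
    levels, problem_sources = summarize(sample)
    print(levels.most_common())
    print(problem_sources.most_common())
```

Commercial log-analysis products go much further than this, but even a tally of which components generate the most warnings is often enough to show where to look first.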

The HP System Management Homepage (SMH) software function provides this kind of information to users working directly with the firm's own dedicated software for particular hardware. Elsewhere there are products such as XpoLog Augmented Search 5.0, which brings XpoLog's troubleshooting capabilities to the Hadoop platform. Put simply, it's a big expanding market.

... and then outside (@ architecture)?
The converse approach (actually it should be a corollary and complementary one) is to focus more closely on the outside, i.e., the architecture inside which an instance of Hadoop is created. HP provides its own Reference Architectures for Hadoop, and these are available for each of the three leading distributions (Cloudera, Hortonworks and MapR).

This sponsored HP commentary has highlighted the firm's own product initially, but thankfully HP is big and bold enough not to shy away from letting us mention other vendors in this space (most of which will be key partners anyway) - so yes, competing products do exist from Cisco, Dell, IBM and others.

Ways to Improve the RDBMS with Hadoop
In a comprehensive sub-headed piece entitled Ten Ways To Improve the RDBMS with Hadoop, to be found on the Business Process Management (BPM) website http://www.ebizq.net/, you can read the following opinion on why a good Hadoop installation can help improve the scalability of applications:

"Very low cost commodity hardware can be used to power Hadoop clusters since redundancy and fault resistance is built into the software instead of using expensive enterprise hardware or software alternatives with proprietary solutions. This makes adding more capacity (and therefore scale) easier to achieve and Hadoop is an affordable and very granular way to scale out instead of up. While there can be cost in converting existing applications to Hadoop, for new applications it should be a standard option in the software selection decision tree."

There is much to gain from intelligent implementation of Hadoop, but it's not easy and we need to look both inside and out (and back to front) in terms of where we can get guidance on best practice and efficiency in our implementation.

More Stories By Adrian Bridgwater

Adrian Bridgwater is a freelance journalist and corporate content creation specialist focusing on cross-platform software application development as well as all related aspects of software engineering, project management and technology as a whole.
