Welcome!

@CloudExpo Authors: Christopher Harrold, Yeshim Deniz, Jyoti Bansal, Dana Gardner, Gopala Krishna Behara

Related Topics: @BigDataExpo, Java IoT, Microservices Expo, @CloudExpo, Apache, SDN Journal

@BigDataExpo: Article

Happiness Is… a Handhold on Hadoop

For a Hadoop solution do we look inside or outside?

This post is sponsored by The Business Value Exchange and HP Enterprise Services

As we know, the subject of Big Data and the ‘space race' to produce software application development functions that will enable us to extract insight and (therefore) value from the Big Data mountain remains one of the most discussed issues in information technology today.

Increasingly prevalent and popular, if not quite as ‘predominant' as some would have us believe, in this arena is Apache Hadoop. This software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

But there's a problem, because Hadoop is drastically underutilized in two respects:

  • Full-blown implementations of Hadoop are argued to be extremely technically difficult to pull off.
  • Implementations that do exist are argued to only take advantage of a fraction of what might be represented in a complete deployment in terms of data management and sheer number crunching power.

What's the answer?

Do we look inside (@ logs) or outside (@ architecture)?
For a Hadoop solution do we look inside or outside? That is to say, do we look inside at logs and logfiles as we tinker around to perfect our Hadoop installation? Or do we look at higher level and look at the architectural considerations that should be governing any individual instance of Hadoop to get some greater insight into what should be working?

Looking inside at logs and logfiles - these are files that record "events" occurring throughout an operating system or software application or data management environment such as Apache Hadoop.

If we look at how our logs and logfiles are performing, then we can get information on hidden: errors, anomalies, problems and patterns... and these are the sorts of reports that can help guide DevOps (developer-operations) pros as they attempt to being a Hadoop project online.

The HP System Management Homepage (SMH) software function provides this kind of information to users working directly with the firm's own dedicated software for particular hardware. Elsewhere there are products such as XpoLog Augmented Search 5.0, which brings XpoLog's troubleshooting capabilities to the Hadoop platform. Put simply, it's a big expanding market.

... and then outside (@ architecture)?
The converse approach (actually it should be corollary and complementary one) here is to focus more closely on the outside, i.e., the architecture inside which an instance of Hadoop is created. HP provides its own Reference Architectures for Hadoop and this is available for each of the three leading distributions (Cloudera, Hortonworks and MapR).

This sponsored HP commentary has highlighted the firm's own product initially, but thankfully HP is big and bold enough not to shirk away from us being able to mention other vendors in this space (most of which will be key partners anyway) - so yes indeed competing products do exist from Cisco, Dell, IBM and others.

Ways to Improve the RDBMS with Hadoop
In a comprehensive sub-headed piece entitled Ten Ways To Improve the RDBMS with Hadoop to be found on Business Process Management (BPM) website http://www.ebizq.net/ you can read the following opinion why a good Hadoop installation can help improve the scalability of applications:

"Very low cost commodity hardware can be used to power Hadoop clusters since redundancy and fault resistance is built into the software instead of using expensive enterprise hardware or software alternatives with proprietary solutions. This makes adding more capacity (and therefore scale) easier to achieve and Hadoop is an affordable and very granular way to scale out instead of up. While there can be cost in converting existing applications to Hadoop, for new applications it should be a standard option in the software selection decision tree."

There is much to gain from intelligent implementation of Hadoop, but it's not easy and we need to look both inside and out (and back to front) in terms of where we can get guidance on best practice and efficiency in our implementation.

More Stories By Adrian Bridgwater

Adrian Bridgwater is a freelance journalist and corporate content creation specialist focusing on cross platform software application development as well as all related aspects software engineering, project management and technology as a whole.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
SYS-CON Events announced today that Hitrons Solutions will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Hitrons Solutions Inc. is distributor in the North American market for unique products and services of small and medium-size businesses, including cloud services and solutions, SEO marketing platforms, and mobile applications.
@GonzalezCarmen has been ranked the Number One Influencer and @ThingsExpo has been named the Number One Brand in the “M2M 2016: Top 100 Influencers and Brands” by Onalytica. Onalytica analyzed tweets over the last 6 months mentioning the keywords M2M OR “Machine to Machine.” They then identified the top 100 most influential brands and individuals leading the discussion on Twitter.
In an era of historic innovation fueled by unprecedented access to data and technology, the low cost and risk of entering new markets has leveled the playing field for business. Today, any ambitious innovator can easily introduce a new application or product that can reinvent business models and transform the client experience. In their Day 2 Keynote at 19th Cloud Expo, Mercer Rowe, IBM Vice President of Strategic Alliances, and Raejeanne Skillern, Intel Vice President of Data Center Group and G...
Financial Technology has become a topic of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 20th Cloud Expo at the Javits Center in New York, June 6-8, 2017, will find fresh new content in a new track called FinTech.
20th Cloud Expo, taking place June 6-8, 2017, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy.
SYS-CON Events announced today that delaPlex will exhibit at SYS-CON's @CloudExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. delaPlex pioneered Software Development as a Service (SDaaS), which provides scalable resources to build, test, and deploy software. It’s a fast and more reliable way to develop a new product or expand your in-house team.
"Matrix is an ambitious open standard and implementation that's set up to break down the fragmentation problems that exist in IP messaging and VoIP communication," explained John Woolf, Technical Evangelist at Matrix, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Web Real-Time Communication APIs have quickly revolutionized what browsers are capable of. In addition to video and audio streams, we can now bi-directionally send arbitrary data over WebRTC's PeerConnection Data Channels. With the advent of Progressive Web Apps and new hardware APIs such as WebBluetooh and WebUSB, we can finally enable users to stitch together the Internet of Things directly from their browsers while communicating privately and securely in a decentralized way.
"We host and fully manage cloud data services, whether we store, the data, move the data, or run analytics on the data," stated Kamal Shannak, Senior Development Manager, Cloud Data Services, IBM, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Zerto exhibited at SYS-CON's 18th International Cloud Expo®, which took place at the Javits Center in New York City, NY, in June 2016. Zerto is committed to keeping enterprise and cloud IT running 24/7 by providing innovative, simple, reliable and scalable business continuity software solutions. Through the Zerto Cloud Continuity Platform™, organizations can seamlessly move and protect virtualized workloads between public, private and hybrid clouds. The company’s flagship product, Zerto Virtual...
Some people worry that OpenStack is more flash then substance; however, for many customers this could not be farther from the truth. No other technology equalizes the playing field between vendors while giving your internal teams better access than ever to infrastructure when they need it. In his session at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will talk through some real-world OpenStack deployments and look into the ways this can benefit customers of all sizes....
Extreme Computing is the ability to leverage highly performant infrastructure and software to accelerate Big Data, machine learning, HPC, and Enterprise applications. High IOPS Storage, low-latency networks, in-memory databases, GPUs and other parallel accelerators are being used to achieve faster results and help businesses make better decisions. In his session at 18th Cloud Expo, Michael O'Neill, Strategic Business Development at NVIDIA, focused on some of the unique ways extreme computing is...
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regu...
Due of the rise of Hadoop, many enterprises are now deploying their first small clusters of 10 to 20 servers. At this small scale, the complexity of operating the cluster looks and feels like general data center servers. It is not until the clusters scale, as they inevitably do, when the pain caused by the exponential complexity becomes apparent. We've seen this problem occur time and time again. In his session at Big Data Expo, Greg Bruno, Vice President of Engineering and co-founder of StackIQ...
The security needs of IoT environments require a strong, proven approach to maintain security, trust and privacy in their ecosystem. Assurance and protection of device identity, secure data encryption and authentication are the key security challenges organizations are trying to address when integrating IoT devices. This holds true for IoT applications in a wide range of industries, for example, healthcare, consumer devices, and manufacturing. In his session at @ThingsExpo, Lancen LaChance, vic...
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
FinTech is the sum of financial and technology, and it’s one of the fastest growing tech industries. Total global investments in FinTech almost reached $50 billion last year, but there is still a great deal of confusion over what it is and what it means – especially as it applies to retirement. Building financial startups is not simple, but with the right team, technology and an innovative approach it can be an extremely interesting domain to disrupt. FinTech heralds a financial revolution that...
SYS-CON Media announced today that @WebRTCSummit Blog, the largest WebRTC resource in the world, has been launched. @WebRTCSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. @WebRTCSummit Blog can be bookmarked ▸ Here @WebRTCSummit conference site can be bookmarked ▸ Here
In his session at DevOps Summit, Tapabrata Pal, Director of Enterprise Architecture at Capital One, will tell a story about how Capital One has embraced Agile and DevOps Security practices across the Enterprise – driven by Enterprise Architecture; bringing in Development, Operations and Information Security organizations together. Capital Ones DevOpsSec practice is based upon three "pillars" – Shift-Left, Automate Everything, Dashboard Everything. Within about three years, from 100% waterfall, C...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...