Click here to close now.

Welcome!

@CloudExpo Authors: Bart Copeland, Elizabeth White, Ed Featherston, Tom Lounibos, Pat Romanski

Related Topics: @CloudExpo, @BigDataExpo

@CloudExpo: Article

Hadoop Moving More Toward Real-Time

Interview with Continuent CEO Robert Hodges

No discussion of the Red Hat Summit 2014 would be complete without some discussion of Apache Hadoop. The happy elephant has now been pushing data for close to a decade, its distributed file system (HDFS) setting the tone for support of modern-day, highly distributed and very large databases in the cloud.

So I was pleased to have Robert Hodges, CEO of Hadoop-focused Continuent Tungsten, answer a few questions about his company's world.

Roger: What's the scope of the challenge you face in addressing big Hadoop deployments?

Robert: Hadoop is really very powerful as the way to concentrate and analyze information, so the key issue is how the information from existing transactional data stores gets added to Hadoop without implying additional load, application changes, or repetitive dump processes.

From our existing customer deployments, we know that the biggest challenge is getting the information into Hadoop as quickly and timely as possible from multiple different hosts simultaneously. Our customers often have many more transactional hosts running MySQL than they have Hadoop hosts, just because the scale-out and sharding required to support their transactional needs is so high.

Roger: What are the key pain points?

Robert: The key pain points are therefore the extraction of data from the transactional stores without implying additional load on these servers which are running their live customer facing website, while simultaneously loading large quantities of data that needs to be merged and analysed on the Hadoop side.

The replication solution based on Tungsten Replicator provides this very simply by placing a very low-level of load required for extraction of data, while continually streaming the changes over into Hadoop. Because this can be done on a server or cluster basis, it is easy to scale up the replication of data into Hadoop by adding more streams of replication data.

Roger: How critical is the real-time aspect of modern IT? How quickly is it growing?

Robert: It's growing very quickly, and in some cases quicker than some company IT departments and the technology they support are able to cope. Replication has for a long time been the solution for this scale-out process, but the flows of this replication data are changing.

One of the key drivers behind the adoption of Hadoop and Cassandra and similar databases is the ability to parallel process the data to get numbers in real-time. You can see this in a wide range of different markets, from banking, through to social networking and online stores.

As we get access to more information, the services supporting them need to support that an ever faster rate. We all want the lowest rate on my plane ticket purchase, while receiving the absolute best benefits and service, and all those different elements rely on real-time analysis.

Roger: What does IT think of this?

Robert: Of course, this also presents a completely different problem for the IT departments. They must deal with how to get the data into a system so that it can be analyzed quickly. The location for your active transactional dataset is not the same as your analysis tools, and may be based on completely different quantities of raw data.

Transactional databases might be conveniently sharded into 50 or 100 different RDBMS of 100GB each, but analysis needs to process all 10,000GB of data collectively to get meaningful information. That means that the IT infrastructure needs an effective way to combine and transfer this active data.

It's also clear from recent advancements in querying and processing techniques built on top of Hadoop that Hadoop itself is moving into a more real-time tool. Spark, Storm and other query engines provide very fast query and analysis on very large datasets, taking advantage of the distributed nature of Hadoop, and the increasing RAM and CPU power in evolutions of new hardware. Compatibility with Spark and similar live query mechanisms in Hadoop will form a key part of the next evolution of all Hadoop deployments.

Roger: How key is the role of Big Data in developing your solutions? How important is the term Big Data to you?

Robert: Big Data has been a significant requirement for our customers and their needs for some time, but we have definitely seen a shift recently from the scale-out, sharded nature of the typical RDBMS towards concentrating that information for analysis in Big Data stores. As that movement of data moves into the real-time it will be critical to the tools we develop to help make the transfer and management of data replication as easy as possible for our customers.

To us as the provider of the tools that enable our customers to easily share and transfer data, Big Data is therefore as important to us as it is to our customers. Of course, transactional databases are not going away, and we certainly don't expect that to change, but Hadoop and other Big Data solutions are being brought to work alongside these active data stores. Continuent will certainly be looking to expand our different solutions and techniques to bridge the gap between RDBMS and Big Data.

Contact Me on Twitter

More Stories By Roger Strukhoff

Roger Strukhoff (@IoT2040) is Executive Director of the Tau Institute for Global ICT Research, with offices in Illinois and Manila. He is Conference Chair of @CloudExpo & @ThingsExpo, and Editor of SYS-CON Media's CloudComputing BigData & IoT Journals. He holds a BA from Knox College & conducted MBA studies at CSU-East Bay.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
SYS-CON Media announced today that CloudBees, the Jenkins Enterprise company, has launched ad campaigns on SYS-CON's DevOps Journal. CloudBees' campaigns focus on the business value of Continuous Delivery and how it has been recognized as a game changer for IT and is now a top priority for organizations, and the best ways to optimize Jenkins to ensure your continuous integration environment is optimally configured.
The 4th International Internet of @ThingsExpo, co-located with the 17th International Cloud Expo - to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA - announces that its Call for Papers is open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than
SYS-CON Events announced today that ProfitBricks, the provider of painless cloud infrastructure, will exhibit at SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. ProfitBricks is the IaaS provider that offers a painless cloud experience for all IT users, with no learning curve. ProfitBricks boasts flexible cloud servers and networking, an integrated Data Center Designer tool for visual control over the...
DevOps is about increasing efficiency, but nothing is more inefficient than building the same application twice. However, this is a routine occurrence with enterprise applications that need both a rich desktop web interface and strong mobile support. With recent technological advances from Isomorphic Software and others, it is now feasible to create a rich desktop and tuned mobile experience with a single codebase, without compromising performance or usability.
The most often asked question post-DevOps introduction is: “How do I get started?” There’s plenty of information on why DevOps is valid and important, but many managers still struggle with simple basics for how to initiate a DevOps program in their business. They struggle with issues related to current organizational inertia, the lack of experience on Continuous Integration/Delivery, understanding where DevOps will affect revenue and budget, etc. In their session at DevOps Summit, JP Morgenthal...
"We provide a web application framework for building really sophisticated web applications that run on a browser without any installation need so we get used for biotech, defense, and banking applications," noted Charles Kendrick, CTO and Chief Architect at Isomorphic Software, in this SYS-CON.tv interview at @DevOpsSummit (http://DevOpsSummit.SYS-CON.com), held June 9-11, 2015, at the Javits Center in New York
"The idea of polyglot persistence is you have to apply the right database for the job - you always have to have many different databases in play. We offer that whole system as a service," explained Raj Singh, Developer Advocate for IBM Cloud Data Services, in this SYS-CON.tv interview at 16th Cloud Expo, held June 9-11, 2015, at the Javits Center in New York City.
In his session at 16th Cloud Expo, Simone Brunozzi, VP and Chief Technologist of Cloud Services at VMware, reviewed the changes that the cloud computing industry has gone through over the last five years and shared insights into what the next five will bring. He also chronicled the challenges enterprise companies are facing as they move to the public cloud. He delved into the "Hybrid Cloud" space and explained why every CIO should consider ‘hybrid cloud' as part of their future strategy to achie...
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
SYS-CON Events announced today that WHOA.com, an ISO 27001 Certified secure cloud computing company, participated as “Bronze Sponsor” of SYS-CON's 16th International Cloud Expo® New York, which took place June 9-11, 2015, at the Javits Center in New York City, NY. WHOA.com is a leader in next-generation, ISO 27001 Certified secure cloud solutions. WHOA.com offers a comprehensive portfolio of best-in-class cloud services for business including Infrastructure as a Service (IaaS), Secure Cloud Desk...
SYS-CON Events announced today that Intelligent Systems Services will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Established in 1994, Intelligent Systems Services Inc. is located near Washington, DC, with representatives and partners nationwide. ISS’s well-established track record is based on the continuous pursuit of excellence in designing, implementing and supporting nationwide clients’ ...
The Internet of Things is not only adding billions of sensors and billions of terabytes to the Internet. It is also forcing a fundamental change in the way we envision Information Technology. For the first time, more data is being created by devices at the edge of the Internet rather than from centralized systems. What does this mean for today's IT professional? In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists addressed this very serious issue of pro...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo in Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading in...
SYS-CON Events announced today that MangoApps will exhibit at SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. MangoApps provides private all-in-one social intranets allowing workers to securely collaborate from anywhere in the world and from any device. Social, mobile, and easy to use. MangoApps has been named a "Market Leader" by Ovum Research and a "Cool Vendor" by Gartner. 20,000+ business custome...
IT data is typically silo'd by the various tools in place. Unifying all the log, metric and event data in one analytics platform stops finger pointing and provides the end-to-end correlation. Logs, metrics and custom event data can be joined to tell the holistic story of your software and operations. For example, users can correlate code deploys to system performance to application error codes. In his session at DevOps Summit, Michael Demmer, VP of Engineering at Jut, will discuss how this can...
SYS-CON Events announced today that kintone has been named “Bronze Sponsor” of SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. kintone promotes cloud-based workgroup productivity, transparency and profitability with a seamless collaboration space, build your own business application (BYOA) platform, and workflow automation system.
SYS-CON Events announced today that Harbinger Systems will exhibit at SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Harbinger Systems is a global company providing software technology services. Since 1990, Harbinger has developed a strong customer base worldwide. Its customers include software product companies ranging from hi-tech start-ups in Silicon Valley to leading product companies in the US a...
The cloud has transformed how we think about software quality. Instead of preventing failures, we must focus on automatic recovery from failure. In other words, resilience trumps traditional quality measures. Continuous delivery models further squeeze traditional notions of quality. Remember the venerable project management Iron Triangle? Among time, scope, and cost, you can only fix two or quality will suffer. Only in today's DevOps world, continuous testing, integration, and deployment upend...
Live Webinar with 451 Research Analyst Peter Christy. Join us on Wednesday July 22, 2015, at 10 am PT / 1 pm ET In a world where users are on the Internet and the applications are in the cloud, how do you maintain your historic SLA with your users? Peter Christy, Research Director, Networks at 451 Research, will discuss this new network paradigm, one in which there is no LAN and no WAN, and discuss what users and network administrators gain and give up when migrating to the agile world of clo...