Cloud Expo: Article

Hadoop Moving More Toward Real-Time

Interview with Continuent CEO Robert Hodges

No discussion of the Red Hat Summit 2014 would be complete without some discussion of Apache Hadoop. The happy elephant has now been pushing data for close to a decade, its distributed file system (HDFS) setting the tone for support of modern-day, highly distributed and very large databases in the cloud.

So I was pleased to have Robert Hodges, CEO of Continuent, maker of Tungsten Replicator, answer a few questions about his company's work with Hadoop.

Roger: What's the scope of the challenge you face in addressing big Hadoop deployments?

Robert: Hadoop is a very powerful way to concentrate and analyze information, so the key issue is how to add information from existing transactional data stores to Hadoop without imposing additional load, application changes, or repetitive dump processes.

From our existing customer deployments, we know that the biggest challenge is getting the information into Hadoop as quickly as possible from multiple hosts simultaneously. Our customers often have many more transactional hosts running MySQL than they have Hadoop hosts, just because the scale-out and sharding required to support their transactional needs is so high.

Roger: What are the key pain points?

Robert: The key pain points are therefore extracting data from the transactional stores without imposing additional load on servers that are running live, customer-facing websites, while simultaneously loading large quantities of data that need to be merged and analyzed on the Hadoop side.

The replication solution based on Tungsten Replicator handles this simply: extraction places very little load on the source servers, and the changes are streamed continually into Hadoop. Because this can be done on a server or cluster basis, it is easy to scale up the replication of data into Hadoop by adding more streams of replication data.
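To make the idea of streaming changes from several sources more concrete, here is a minimal sketch in Python. It is not Tungsten Replicator's API or data format. The `Change` record, the `apply_stream` function, the host names, and the assumption that each source numbers its changes from 1 are all hypothetical, chosen only to illustrate per-source change streams feeding one analytical store.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Change:
    """One row change captured from a transactional host (hypothetical format)."""
    source: str            # name of the originating host or cluster
    seqno: int             # position in that source's change log, starting at 1
    op: str                # "upsert" or "delete"
    key: str               # primary key of the affected row
    row: Optional[dict] = None


def apply_stream(
    store: Dict[Tuple[str, str], dict],
    positions: Dict[str, int],
    changes: Iterable[Change],
) -> int:
    """Apply changes in order, skipping any already applied; return the count applied."""
    applied = 0
    for change in changes:
        if change.seqno <= positions.get(change.source, 0):
            continue  # replayed change: already reflected in the store
        target = (change.source, change.key)
        if change.op == "upsert":
            store[target] = dict(change.row or {})
        elif change.op == "delete":
            store.pop(target, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
        positions[change.source] = change.seqno
        applied += 1
    return applied


# Two independent streams; adding a third source means adding a third stream.
store: Dict[Tuple[str, str], dict] = {}
positions: Dict[str, int] = {}
stream_a = [
    Change("mysql-a", 1, "upsert", "42", {"status": "new"}),
    Change("mysql-a", 2, "upsert", "42", {"status": "paid"}),
]
stream_b = [
    Change("mysql-b", 1, "upsert", "7", {"status": "new"}),
    Change("mysql-b", 2, "delete", "7"),
]
applied = apply_stream(store, positions, stream_a) + apply_stream(store, positions, stream_b)
replayed = apply_stream(store, positions, stream_a)  # nothing new, so nothing applied
```

Because each source keeps its own position, streams can be added or replayed independently, which is the property that makes this kind of design scale by adding streams.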

Roger: How critical is the real-time aspect of modern IT? How quickly is it growing?

Robert: It's growing very quickly, and in some cases more quickly than company IT departments and the technology they support can cope with. Replication has for a long time been the solution for this scale-out process, but the flows of this replication data are changing.

One of the key drivers behind the adoption of Hadoop, Cassandra and similar databases is the ability to process the data in parallel to get numbers in real time. You can see this in a wide range of markets, from banking to social networking and online stores.

As we get access to more information, the services built on it need to deliver it at an ever faster rate. We all want the lowest price on a plane ticket while receiving the absolute best benefits and service, and all those different elements rely on real-time analysis.

Roger: What does IT think of this?

Robert: Of course, this also presents a completely different problem for the IT departments. They must deal with how to get the data into a system so that it can be analyzed quickly. Your active transactional dataset does not live in the same place as your analysis tools, and the two may work with completely different quantities of raw data.

Transactional databases might be conveniently sharded into 50 or 100 different RDBMS instances of 100GB each, but analysis needs to process all 5,000GB or 10,000GB of data collectively to get meaningful information. That means the IT infrastructure needs an effective way to combine and transfer this active data.
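A small, hypothetical Python sketch shows why analysis has to look at the shards collectively. The shard contents and the function names `shard_partial` and `combined_average` are invented for illustration. The point is only that a figure such as the average order value comes out wrong if each shard is analyzed in isolation and the per-shard results are then averaged.

```python
from typing import Iterable, List, Tuple


def shard_partial(order_totals: Iterable[float]) -> Tuple[int, float]:
    """Summarize one shard as (row count, sum of order totals)."""
    count, total = 0, 0.0
    for value in order_totals:
        count += 1
        total += value
    return count, total


def combined_average(partials: List[Tuple[int, float]]) -> float:
    """Average over all rows in all shards, computed from per-shard summaries."""
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    if count == 0:
        raise ValueError("no rows in any shard")
    return total / count


# Two unevenly loaded shards (made-up numbers).
shards = [[10.0, 20.0, 30.0], [100.0]]
partials = [shard_partial(s) for s in shards]

overall = combined_average(partials)                          # uses all four rows
naive = sum(t / c for c, t in partials) / len(partials)       # averages the shard averages
```

Here `overall` is 40.0 while `naive` is 60.0, because the naive version gives the one-row shard the same weight as the three-row shard.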

It's also clear from recent advancements in querying and processing techniques built on top of Hadoop that Hadoop itself is becoming a more real-time tool. Spark, Storm and other processing engines provide very fast query and analysis on very large datasets, taking advantage of the distributed nature of Hadoop and the increasing RAM and CPU power of each new hardware generation. Compatibility with Spark and similar live query mechanisms in Hadoop will form a key part of the next evolution of all Hadoop deployments.

Roger: How key is the role of Big Data in developing your solutions? How important is the term Big Data to you?

Robert: Big Data has been a significant requirement for our customers for some time, but we have definitely seen a shift recently from the scale-out, sharded nature of the typical RDBMS towards concentrating that information for analysis in Big Data stores. As that movement of data approaches real time, it will be critical that the tools we develop make the transfer and management of replicated data as easy as possible for our customers.

As the provider of the tools that enable our customers to easily share and transfer data, we find Big Data as important to us as it is to our customers. Of course, transactional databases are not going away, and we certainly don't expect that to change, but Hadoop and other Big Data solutions are being brought in to work alongside these active data stores. Continuent will certainly be looking to expand our different solutions and techniques to bridge the gap between RDBMS and Big Data.

More Stories By Roger Strukhoff

Roger Strukhoff (@IoT2040) is Executive Director of the Tau Institute for Global ICT Research, with offices in Illinois and Manila. He is Conference Chair of @CloudExpo & @ThingsExpo, and Editor of SYS-CON Media's CloudComputing BigData & IoT Journals. He holds a BA from Knox College & conducted MBA studies at CSU-East Bay.
