Hadoop Moving More Toward Real-Time

Interview with Continuent CEO Robert Hodges

No account of Red Hat Summit 2014 would be complete without some discussion of Apache Hadoop. The happy elephant has now been pushing data for close to a decade, and its distributed file system (HDFS) has set the tone for supporting modern, highly distributed and very large databases in the cloud.

So I was pleased to have Robert Hodges, CEO of Hadoop-focused Continuent Tungsten, answer a few questions about his company's world.

Roger: What's the scope of the challenge you face in addressing big Hadoop deployments?

Robert: Hadoop is a very powerful way to concentrate and analyze information, so the key issue is how the information from existing transactional data stores gets added to Hadoop without imposing additional load, application changes, or repetitive dump processes.

From our existing customer deployments, we know that the biggest challenge is getting the information into Hadoop as quickly as possible from many different hosts simultaneously. Our customers often have many more transactional hosts running MySQL than they have Hadoop hosts, simply because their transactional needs require so much scale-out and sharding.

Roger: What are the key pain points?

Robert: The key pain points are therefore extracting data from the transactional stores without imposing additional load on these servers, which are running live customer-facing websites, while simultaneously loading large quantities of data that need to be merged and analyzed on the Hadoop side.

The replication solution based on Tungsten Replicator addresses this very simply: extraction places only a very low level of load on the source servers, while the changes are continually streamed over into Hadoop. Because this can be done on a per-server or per-cluster basis, it is easy to scale up the replication of data into Hadoop by adding more streams of replication data.
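
The idea of continuously streaming changes from several transactional sources into one analytical store can be sketched in a few lines of Python. This is a toy illustration only, not Tungsten Replicator's API: the `ChangeEvent` type, the `apply_stream` function, and the shard names are all hypothetical, and a plain dictionary stands in for the Hadoop-side target.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change captured from a source database (hypothetical shape)."""
    source: str                 # which transactional host produced the change
    op: str                     # "insert", "update", or "delete"
    key: int                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents; None for deletes


# The merged target is keyed by (source, key) so rows from different
# shards never collide, even if they reuse the same primary keys.
Target = Dict[Tuple[str, int], dict]


def apply_stream(target: Target, events: Iterable[ChangeEvent]) -> int:
    """Apply one stream of change events to the merged target, in order.

    Returns the number of events applied.
    """
    applied = 0
    for ev in events:
        slot = (ev.source, ev.key)
        if ev.op in ("insert", "update"):
            target[slot] = dict(ev.row or {})
        elif ev.op == "delete":
            target.pop(slot, None)
        else:
            raise ValueError(f"unknown operation: {ev.op!r}")
        applied += 1
    return applied


# Two independent streams, one per source host (made-up data).
shard_a = [
    ChangeEvent("shard_a", "insert", 1, {"customer": "alice", "total": 30}),
    ChangeEvent("shard_a", "update", 1, {"customer": "alice", "total": 45}),
]
shard_b = [
    ChangeEvent("shard_b", "insert", 1, {"customer": "bob", "total": 12}),
    ChangeEvent("shard_b", "insert", 2, {"customer": "carol", "total": 7}),
    ChangeEvent("shard_b", "delete", 2),
]

merged: Target = {}
for stream in (shard_a, shard_b):
    apply_stream(merged, stream)
```

In this sketch, scaling up means adding another stream to the loop; the apply logic itself does not change, which mirrors the point about adding more streams of replication data.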

Roger: How critical is the real-time aspect of modern IT? How quickly is it growing?

Robert: It's growing very quickly, and in some cases quicker than some company IT departments and the technology they support are able to cope. Replication has for a long time been the solution for this scale-out process, but the flows of this replication data are changing.

One of the key drivers behind the adoption of Hadoop, Cassandra and similar databases is the ability to process the data in parallel to get numbers in real time. You can see this in a wide range of markets, from banking to social networking and online stores.

As we get access to more information, the services supporting it need to deliver that information at an ever faster rate. We all want the lowest rate on our plane ticket purchases, while receiving the absolute best benefits and service, and all those different elements rely on real-time analysis.

Roger: What does IT think of this?

Robert: Of course, this also presents a completely different problem for IT departments. They must work out how to get the data into a system where it can be analyzed quickly. Your active transactional dataset does not live in the same place as your analysis tools, and the two may be based on completely different quantities of raw data.

Transactional databases might be conveniently sharded into 50 or 100 different RDBMS instances of 100GB each, but analysis needs to process all 5,000GB or 10,000GB of data collectively to get meaningful information. That means that the IT infrastructure needs an effective way to combine and transfer this active data.
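
The arithmetic behind the shard figures, and the reason per-shard results cannot simply be read in isolation, can be illustrated with a small Python sketch. The shard contents and the `total_size_gb` and `merge_counts` helpers are made up for illustration; only the 50 and 100 shard counts and the 100GB shard size come from the interview.

```python
from collections import Counter
from typing import Iterable, List, Mapping

SHARD_SIZE_GB = 100  # per-shard size quoted in the interview


def total_size_gb(shard_count: int, shard_size_gb: int = SHARD_SIZE_GB) -> int:
    """Total data volume that a collective analysis has to cover."""
    return shard_count * shard_size_gb


def merge_counts(per_shard: Iterable[Mapping[str, int]]) -> Counter:
    """Combine per-shard counts into one global tally."""
    merged: Counter = Counter()
    for counts in per_shard:
        merged.update(counts)  # adds counts key by key
    return merged


# Hypothetical per-shard sales counts: "gadget" never wins locally,
# yet it is the best seller once all shards are combined.
shards: List[Mapping[str, int]] = [
    {"widget": 5, "gadget": 4},
    {"gizmo": 6, "gadget": 4},
    {"doohickey": 7, "gadget": 4},
]

local_winners = [max(counts, key=counts.get) for counts in shards]
global_winner = merge_counts(shards).most_common(1)[0][0]

print(total_size_gb(50), total_size_gb(100))
print(local_winners, global_winner)
```

Each shard on its own reports a different top product, and none of them reports the true overall leader, which is why the data has to be brought together before analysis.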

It's also clear from recent advancements in querying and processing techniques built on top of Hadoop that Hadoop itself is becoming a more real-time tool. Spark, Storm and similar engines provide very fast query and analysis on very large datasets, taking advantage of the distributed nature of Hadoop and the increasing RAM and CPU power of each new generation of hardware. Compatibility with Spark and similar live query mechanisms in Hadoop will form a key part of the next evolution of Hadoop deployments.

Roger: How key is the role of Big Data in developing your solutions? How important is the term Big Data to you?

Robert: Big Data has been a significant requirement for our customers for some time, but we have definitely seen a shift recently from the scale-out, sharded nature of the typical RDBMS towards concentrating that information for analysis in Big Data stores. As that movement of data becomes real-time, it will be critical that the tools we develop make the transfer and management of replicated data as easy as possible for our customers.

As the provider of the tools that enable our customers to easily share and transfer data, we find Big Data as important to us as it is to them. Of course, transactional databases are not going away, and we certainly don't expect that to change, but Hadoop and other Big Data solutions are being brought in to work alongside these active data stores. Continuent will certainly be looking to expand our solutions and techniques to bridge the gap between RDBMS and Big Data.

More Stories By Roger Strukhoff

Roger Strukhoff (@IoT2040) is Executive Director of the Tau Institute for Global ICT Research, with offices in Illinois and Manila. He is Conference Chair of @CloudExpo & @ThingsExpo, and Editor of SYS-CON Media's CloudComputing BigData & IoT Journals. He holds a BA from Knox College & conducted MBA studies at CSU-East Bay.
