Welcome!

@CloudExpo Authors: Pat Romanski, Elizabeth White, Liz McMillan, Yeshim Deniz, Zakia Bouachraoui

Related Topics: @DXWorldExpo, Open Source Cloud, @CloudExpo

@DXWorldExpo: Article

The Big Challenge of Big Data and Hadoop Integration

Details of Cédric Carbone's upcoming session at 12th Cloud Expo / Cloud Expo New York [June 10-13, 2013]

Enterprises can't close their doors just because integration tools won't cope with the volume of information that their systems produce.  As each day goes by, their information will become larger and more complicated, and enterprises must constantly struggle to manage the integration of dozens (or hundreds) of systems.

Apache Hadoop has quickly become the technology of choice for enterprises that need to perform complex analysis of petabytes of data, but few are aware of its potential to handle large-scale integration work. By using effective tools, integrators can process the complex transformation, synchronization, and orchestration tasks required in a high-performance, low cost, infinitely scalable way.

In his session at 12th Cloud Expo | Cloud Expo New York [June 10-13, 2013], Cédric Carbone will discuss how Hadoop can be used to integrate disparate systems and services, and provide a demonstration of the process for designing and deploying common integration tasks.



About the Speaker:
Cédric Carbone is CTO of Talend, where he manages the data integration, data quality, Big Data, MDM and ESB products lines with an international team of more than 140 R&D engineers. He's also a Board Member at the Eclipse Foundation and OW2 consortium.

Carbone has lectured at several universities on technical topics such as XML or Web Services. He holds a Master's Degree in Computer Science and an Advanced Degree in Document Engineering.

More Stories By Jeremy Geelan

Jeremy Geelan is Chairman & CEO of the 21st Century Internet Group, Inc. and an Executive Academy Member of the International Academy of Digital Arts & Sciences. Formerly he was President & COO at Cloud Expo, Inc. and Conference Chair of the worldwide Cloud Expo series. He appears regularly at conferences and trade shows, speaking to technology audiences across six continents. You can follow him on twitter: @jg21.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


CloudEXPO Stories
Using new techniques of information modeling, indexing, and processing, new cloud-based systems can support cloud-based workloads previously not possible for high-throughput insurance, banking, and case-based applications. In his session at 18th Cloud Expo, John Newton, CTO, Founder and Chairman of Alfresco, described how to scale cloud-based content management repositories to store, manage, and retrieve billions of documents and related information with fast and linear scalability. He addressed the challenges of scaling document repositories to this level; architectural approaches for coordinating data; search and storage technologies, Solr, and Amazon storage and database technologies; the breadth of use cases that modern content systems need to support; how to support user applications that require subsecond response times.
A valuable conference experience generates new contacts, sales leads, potential strategic partners and potential investors; helps gather competitive intelligence and even provides inspiration for new products and services. Conference Guru works with conference organizers to pass great deals to great conferences, helping you discover new conferences and increase your return on investment.
Poor data quality and analytics drive down business value. In fact, Gartner estimated that the average financial impact of poor data quality on organizations is $9.7 million per year. But bad data is much more than a cost center. By eroding trust in information, analytics and the business decisions based on these, it is a serious impediment to digital transformation.
With more than 30 Kubernetes solutions in the marketplace, it's tempting to think Kubernetes and the vendor ecosystem has solved the problem of operationalizing containers at scale or of automatically managing the elasticity of the underlying infrastructure that these solutions need to be truly scalable. Far from it. There are at least six major pain points that companies experience when they try to deploy and run Kubernetes in their complex environments. In this presentation, the speaker will detail these pain points and explain how cloud can address them.
Containers and Kubernetes allow for code portability across on-premise VMs, bare metal, or multiple cloud provider environments. Yet, despite this portability promise, developers may include configuration and application definitions that constrain or even eliminate application portability. In this session we'll describe best practices for "configuration as code" in a Kubernetes environment. We will demonstrate how a properly constructed containerized app can be deployed to both Amazon and Azure using the Kublr platform, and how Kubernetes objects, such as persistent volumes, ingress rules, and services, can be used to abstract from the infrastructure.