Click here to close now.

Welcome!

@CloudExpo Authors: Bart Copeland, Elizabeth White, Ed Featherston, Tom Lounibos, Pat Romanski

Related Topics: @CloudExpo, Java IoT, Microservices Expo, Open Source Cloud, Agile Computing, Apache

@CloudExpo: Article

Big Data Security for Apache Hadoop

Ten Tips for HadoopWorld attendees

Big Data takes center stage today at the Strata Conference & Hadoop World in New York, the world’s largest gathering of the Apache Hadoop™ community. A key conversation topic will be how organizations can improve data security for Hadoop and the applications that run on the platform. As you know, Hadoop and similar data stores hold a lot of promise for organizations to finally gain some value out of the immense amount of data they're capturing. But HDFS, Hive and other nascent NoSQL technologies were not necessarily designed with comprehensive security in mind. Often what happens as big data projects grow is sensitive data like HIPAA data, PII and financial records get captured and stored. It's important this data remains secure at rest.

I polled my fellow co-workers at Gazzang last week, and asked them to come up with a top ten list for securing Apache Hadoop. Here's what they delivered. Enjoy:

Think about security before getting started – You don’t wait until after a burglary to put locks on your doors, and you should not wait until after a breach to secure your data. Make sure a serious data security discussion takes place before installing and feeding data into your Hadoop cluster.

Consider what data may get stored – If you are using Hadoop to store and run analytics against regulatory data, you likely need to comply with specific security requirements. If the stored data does not fall under regulatory jurisdiction, keep in mind the risks to your public reputation and potential loss of revenue if data such as personally identifiable information (PII) were breached.

Encrypt data at rest and in motion – Add transparent data encryption at the file layer as a first step toward enhancing the security of a big data project. SSL encryption can protect big data as it moves between nodes and applications.

As Securosis analyst Adrian Lane wrote in a recent blog, “File encryption addresses two attacker methods for circumventing normal application security controls. Encryption protects in case malicious users or administrators gain access to data nodes and directly inspect files, and it also renders stolen files or disk images unreadable. It is transparent to both Hadoop and calling applications and scales out as the cluster grows. This is a cost-effective way to address several data security threats.”

Store the keys away from the encrypted data – Storing encryption keys on the same server as the encrypted data is akin to locking your house and leaving the key in your front door. Instead, use a key management system that separates the key from the encrypted data.

Institute access controls – Establishing and enforcing policies that govern which people and processes can access data stored within Hadoop is essential for keeping rogue users and applications off your cluster.

Require multi-factor authentication - Multi-factor authentication can significantly reduce the likelihood of an account being compromised or access to Hadoop data being granted to an unauthorized party.

Use secure automation – Beyond data encryption, organizations should look to DevOps tools such as Chef or Puppet for automated patch and configuration management.

Frequently audit your environment – Project needs, data sets, cloud requirements and security risks are constantly changing. It’s important to make sure you are closely monitoring your Hadoop environment and performing frequent checks to ensure performance and security goals are being met.

Ask tough questions of your cloud provider – Be sure you know what your cloud provider is responsible for. Will they encrypt your data? Who will store and have access to your keys? How is your data retired when you no longer need it? How do they prevent data leakage?

Centralize accountability – Centralizing the accountability for data security ensures consistent policy enforcement and access control across diverse organizational silos and data sets.

Did we miss anything? If so, please comment below, and enjoy Strata +HadoopWorld.

More Stories By David Tishgart

After spending years at large corporations including Dell, AMD and BMC, David Tishgart joined the startup ranks leading product marketing for Gazzang. Focused on security for big data, he helps communicate the benefits and challenges that big data can present, offering practical solutions. When not ranting about encryption and key management, you can find David clamoring for a big data application that can fine tune his fantasy football team.

@CloudExpo Stories
SYS-CON Media announced today that CloudBees, the Jenkins Enterprise company, has launched ad campaigns on SYS-CON's DevOps Journal. CloudBees' campaigns focus on the business value of Continuous Delivery and how it has been recognized as a game changer for IT and is now a top priority for organizations, and the best ways to optimize Jenkins to ensure your continuous integration environment is optimally configured.
The 4th International Internet of @ThingsExpo, co-located with the 17th International Cloud Expo - to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA - announces that its Call for Papers is open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than
SYS-CON Events announced today that ProfitBricks, the provider of painless cloud infrastructure, will exhibit at SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. ProfitBricks is the IaaS provider that offers a painless cloud experience for all IT users, with no learning curve. ProfitBricks boasts flexible cloud servers and networking, an integrated Data Center Designer tool for visual control over the...
DevOps is about increasing efficiency, but nothing is more inefficient than building the same application twice. However, this is a routine occurrence with enterprise applications that need both a rich desktop web interface and strong mobile support. With recent technological advances from Isomorphic Software and others, it is now feasible to create a rich desktop and tuned mobile experience with a single codebase, without compromising performance or usability.
The most often asked question post-DevOps introduction is: “How do I get started?” There’s plenty of information on why DevOps is valid and important, but many managers still struggle with simple basics for how to initiate a DevOps program in their business. They struggle with issues related to current organizational inertia, the lack of experience on Continuous Integration/Delivery, understanding where DevOps will affect revenue and budget, etc. In their session at DevOps Summit, JP Morgenthal...
"We provide a web application framework for building really sophisticated web applications that run on a browser without any installation need so we get used for biotech, defense, and banking applications," noted Charles Kendrick, CTO and Chief Architect at Isomorphic Software, in this SYS-CON.tv interview at @DevOpsSummit (http://DevOpsSummit.SYS-CON.com), held June 9-11, 2015, at the Javits Center in New York
"Plutora provides release and testing environment capabilities to the enterprise," explained Dalibor Siroky, Director and Co-founder of Plutora, in this SYS-CON.tv interview at @DevOpsSummit, held June 9-11, 2015, at the Javits Center in New York City.
DevOps tends to focus on the relationship between Dev and Ops, putting an emphasis on the ops and application infrastructure. But that’s changing with microservices architectures. In her session at DevOps Summit, Lori MacVittie, Evangelist for F5 Networks, will focus on how microservices are changing the underlying architectures needed to scale, secure and deliver applications based on highly distributed (micro) services and why that means an expansion into “the network” for DevOps.
"The idea of polyglot persistence is you have to apply the right database for the job - you always have to have many different databases in play. We offer that whole system as a service," explained Raj Singh, Developer Advocate for IBM Cloud Data Services, in this SYS-CON.tv interview at 16th Cloud Expo, held June 9-11, 2015, at the Javits Center in New York City.
In his session at 16th Cloud Expo, Simone Brunozzi, VP and Chief Technologist of Cloud Services at VMware, reviewed the changes that the cloud computing industry has gone through over the last five years and shared insights into what the next five will bring. He also chronicled the challenges enterprise companies are facing as they move to the public cloud. He delved into the "Hybrid Cloud" space and explained why every CIO should consider ‘hybrid cloud' as part of their future strategy to achie...
SYS-CON Events announced today that WHOA.com, an ISO 27001 Certified secure cloud computing company, participated as “Bronze Sponsor” of SYS-CON's 16th International Cloud Expo® New York, which took place June 9-11, 2015, at the Javits Center in New York City, NY. WHOA.com is a leader in next-generation, ISO 27001 Certified secure cloud solutions. WHOA.com offers a comprehensive portfolio of best-in-class cloud services for business including Infrastructure as a Service (IaaS), Secure Cloud Desk...
SYS-CON Events announced today that Intelligent Systems Services will exhibit at the 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Established in 1994, Intelligent Systems Services Inc. is located near Washington, DC, with representatives and partners nationwide. ISS’s well-established track record is based on the continuous pursuit of excellence in designing, implementing and supporting nationwide clients’ ...
The Internet of Things is not only adding billions of sensors and billions of terabytes to the Internet. It is also forcing a fundamental change in the way we envision Information Technology. For the first time, more data is being created by devices at the edge of the Internet rather than from centralized systems. What does this mean for today's IT professional? In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists addressed this very serious issue of pro...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo in Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading in...
SYS-CON Events announced today that MangoApps will exhibit at SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. MangoApps provides private all-in-one social intranets allowing workers to securely collaborate from anywhere in the world and from any device. Social, mobile, and easy to use. MangoApps has been named a "Market Leader" by Ovum Research and a "Cool Vendor" by Gartner. 20,000+ business custome...
IT data is typically silo'd by the various tools in place. Unifying all the log, metric and event data in one analytics platform stops finger pointing and provides the end-to-end correlation. Logs, metrics and custom event data can be joined to tell the holistic story of your software and operations. For example, users can correlate code deploys to system performance to application error codes. In his session at DevOps Summit, Michael Demmer, VP of Engineering at Jut, will discuss how this can...
SYS-CON Events announced today that kintone has been named “Bronze Sponsor” of SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. kintone promotes cloud-based workgroup productivity, transparency and profitability with a seamless collaboration space, build your own business application (BYOA) platform, and workflow automation system.
Live Webinar with 451 Research Analyst Peter Christy. Join us on Wednesday July 22, 2015, at 10 am PT / 1 pm ET In a world where users are on the Internet and the applications are in the cloud, how do you maintain your historic SLA with your users? Peter Christy, Research Director, Networks at 451 Research, will discuss this new network paradigm, one in which there is no LAN and no WAN, and discuss what users and network administrators gain and give up when migrating to the agile world of clo...
SYS-CON Events announced today that Harbinger Systems will exhibit at SYS-CON's 17th International Cloud Expo®, which will take place on November 3–5, 2015, at the Santa Clara Convention Center in Santa Clara, CA. Harbinger Systems is a global company providing software technology services. Since 1990, Harbinger has developed a strong customer base worldwide. Its customers include software product companies ranging from hi-tech start-ups in Silicon Valley to leading product companies in the US a...
The cloud has transformed how we think about software quality. Instead of preventing failures, we must focus on automatic recovery from failure. In other words, resilience trumps traditional quality measures. Continuous delivery models further squeeze traditional notions of quality. Remember the venerable project management Iron Triangle? Among time, scope, and cost, you can only fix two or quality will suffer. Only in today's DevOps world, continuous testing, integration, and deployment upend...