|By David Tishgart||
|October 27, 2012 09:00 AM EDT||
Big Data takes center stage today at the Strata Conference & Hadoop World in New York, the world’s largest gathering of the Apache Hadoop™ community. A key conversation topic will be how organizations can improve data security for Hadoop and the applications that run on the platform. As you know, Hadoop and similar data stores hold a lot of promise for organizations to finally gain some value out of the immense amount of data they're capturing. But HDFS, Hive and other nascent NoSQL technologies were not necessarily designed with comprehensive security in mind. Often what happens as big data projects grow is sensitive data like HIPAA data, PII and financial records get captured and stored. It's important this data remains secure at rest.
I polled my fellow co-workers at Gazzang last week, and asked them to come up with a top ten list for securing Apache Hadoop. Here's what they delivered. Enjoy:
Think about security before getting started – You don’t wait until after a burglary to put locks on your doors, and you should not wait until after a breach to secure your data. Make sure a serious data security discussion takes place before installing and feeding data into your Hadoop cluster.
Consider what data may get stored – If you are using Hadoop to store and run analytics against regulatory data, you likely need to comply with specific security requirements. If the stored data does not fall under regulatory jurisdiction, keep in mind the risks to your public reputation and potential loss of revenue if data such as personally identifiable information (PII) were breached.
Encrypt data at rest and in motion – Add transparent data encryption at the file layer as a first step toward enhancing the security of a big data project. SSL encryption can protect big data as it moves between nodes and applications.
As Securosis analyst Adrian Lane wrote in a recent blog, “File encryption addresses two attacker methods for circumventing normal application security controls. Encryption protects in case malicious users or administrators gain access to data nodes and directly inspect files, and it also renders stolen files or disk images unreadable. It is transparent to both Hadoop and calling applications and scales out as the cluster grows. This is a cost-effective way to address several data security threats.”
Store the keys away from the encrypted data – Storing encryption keys on the same server as the encrypted data is akin to locking your house and leaving the key in your front door. Instead, use a key management system that separates the key from the encrypted data.
Institute access controls – Establishing and enforcing policies that govern which people and processes can access data stored within Hadoop is essential for keeping rogue users and applications off your cluster.
Require multi-factor authentication - Multi-factor authentication can significantly reduce the likelihood of an account being compromised or access to Hadoop data being granted to an unauthorized party.
Use secure automation – Beyond data encryption, organizations should look to DevOps tools such as Chef or Puppet for automated patch and configuration management.
Frequently audit your environment – Project needs, data sets, cloud requirements and security risks are constantly changing. It’s important to make sure you are closely monitoring your Hadoop environment and performing frequent checks to ensure performance and security goals are being met.
Ask tough questions of your cloud provider – Be sure you know what your cloud provider is responsible for. Will they encrypt your data? Who will store and have access to your keys? How is your data retired when you no longer need it? How do they prevent data leakage?
Centralize accountability – Centralizing the accountability for data security ensures consistent policy enforcement and access control across diverse organizational silos and data sets.
Did we miss anything? If so, please comment below, and enjoy Strata +HadoopWorld.
[webcast] Mastering DevOps Automation | @DevOpsSummit @IBMDevOps #IBM #Cloud #DevOps #ContinuousDelivery
Deploying applications in hybrid cloud environments is hard work. Your team spends most of the time maintaining your infrastructure, configuring dev/test and production environments, and deploying applications across environments – which can be both time consuming and error prone. But what if you could automate provisioning and deployment to deliver error free environments faster? What could you do with your free time?
Jul. 27, 2016 09:17 AM EDT
SYS-CON Events announced today that Venafi, the Immune System for the Internet™ and the leading provider of Next Generation Trust Protection, will exhibit at @DevOpsSummit at 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Venafi is the Immune System for the Internet™ that protects the foundation of all cybersecurity – cryptographic keys and digital certificates – so they can’t be misused by bad guys in attacks...
Jul. 27, 2016 09:15 AM EDT Reads: 1,343
The best-practices for building IoT applications with Go Code that attendees can use to build their own IoT applications. In his session at @ThingsExpo, Indraneel Mitra, Senior Solutions Architect & Technology Evangelist at Cognizant, provided valuable information and resources for both novice and experienced developers on how to get started with IoT and Golang in a day. He also provided information on how to use Intel Arduino Kit, Go Robotics API and AWS IoT stack to build an application tha...
Jul. 27, 2016 09:15 AM EDT Reads: 1,107
Amazon has gradually rolled out parts of its IoT offerings in the last year, but these are just the tip of the iceberg. In addition to optimizing their back-end AWS offerings, Amazon is laying the ground work to be a major force in IoT – especially in the connected home and office. Amazon is extending its reach by building on its dominant Cloud IoT platform, its Dash Button strategy, recently announced Replenishment Services, the Echo/Alexa voice recognition control platform, the 6-7 strategic...
Jul. 27, 2016 09:00 AM EDT Reads: 426
UpGuard has become a member of the Center for Internet Security (CIS), and will continue to help businesses expand visibility into their cyber risk by providing hardening benchmarks to all customers. By incorporating these benchmarks, UpGuard's CSTAR solution builds on its lead in providing the most complete assessment of both internal and external cyber risk. CIS benchmarks are a widely accepted set of hardening guidelines that have been publicly available for years. Numerous solutions exist t...
Jul. 27, 2016 09:00 AM EDT Reads: 593
IoT generates lots of temporal data. But how do you unlock its value? You need to discover patterns that are repeatable in vast quantities of data, understand their meaning, and implement scalable monitoring across multiple data streams in order to monetize the discoveries and insights. Motif discovery and deep learning platforms are emerging to visualize sensor data, to search for patterns and to build application that can monitor real time streams efficiently. In his session at @ThingsExpo, ...
Jul. 27, 2016 08:45 AM EDT Reads: 1,048
Choosing the right cloud for your workloads is a balancing act that can cost your organization time, money and aggravation - unless you get it right the first time. Economics, speed, performance, accessibility, administrative needs and security all play a vital role in dictating your approach to the cloud. Without knowing the right questions to ask, you could wind up paying for capacity you'll never need or underestimating the resources required to run your applications.
Jul. 27, 2016 08:00 AM EDT Reads: 477
Verizon Communications Inc. (NYSE, Nasdaq: VZ) and Yahoo! Inc. (Nasdaq: YHOO) have entered into a definitive agreement under which Verizon will acquire Yahoo's operating business for approximately $4.83 billion in cash, subject to customary closing adjustments. Yahoo informs, connects and entertains a global audience of more than 1 billion monthly active users** -- including 600 million monthly active mobile users*** through its search, communications and digital content products. Yahoo also co...
Jul. 27, 2016 08:00 AM EDT Reads: 586
"Avere Systems is a hybrid cloud solution provider. We have customers that want to use cloud storage and we have customers that want to take advantage of cloud compute," explained Rebecca Thompson, VP of Marketing at Avere Systems, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 27, 2016 08:00 AM EDT Reads: 2,081
"We view the cloud not really as a specific technology but as a way of doing business and that way of doing business is transforming the way software, infrastructure and services are being delivered to business," explained Matthew Rosen, CEO and Director at Fusion, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 27, 2016 07:45 AM EDT Reads: 1,567
There will be new vendors providing applications, middleware, and connected devices to support the thriving IoT ecosystem. This essentially means that electronic device manufacturers will also be in the software business. Many will be new to building embedded software or robust software. This creates an increased importance on software quality, particularly within the Industrial Internet of Things where business-critical applications are becoming dependent on products controlled by software. Qua...
Jul. 27, 2016 06:30 AM EDT Reads: 1,463
When it comes to cloud computing, the ability to turn massive amounts of compute cores on and off on demand sounds attractive to IT staff, who need to manage peaks and valleys in user activity. With cloud bursting, the majority of the data can stay on premises while tapping into compute from public cloud providers, reducing risk and minimizing need to move large files. In his session at 18th Cloud Expo, Scott Jeschonek, Director of Product Management at Avere Systems, discussed the IT and busin...
Jul. 27, 2016 06:30 AM EDT Reads: 1,820
In addition to all the benefits, IoT is also bringing new kind of customer experience challenges - cars that unlock themselves, thermostats turning houses into saunas and baby video monitors broadcasting over the internet. This list can only increase because while IoT services should be intuitive and simple to use, the delivery ecosystem is a myriad of potential problems as IoT explodes complexity. So finding a performance issue is like finding the proverbial needle in the haystack.
Jul. 27, 2016 04:45 AM EDT Reads: 2,269
Machine Learning helps make complex systems more efficient. By applying advanced Machine Learning techniques such as Cognitive Fingerprinting, wind project operators can utilize these tools to learn from collected data, detect regular patterns, and optimize their own operations. In his session at 18th Cloud Expo, Stuart Gillen, Director of Business Development at SparkCognition, discussed how research has demonstrated the value of Machine Learning in delivering next generation analytics to imp...
Jul. 27, 2016 04:30 AM EDT Reads: 2,498
"We host and fully manage cloud data services, whether we store, the data, move the data, or run analytics on the data," stated Kamal Shannak, Senior Development Manager, Cloud Data Services, IBM, in this SYS-CON.tv interview at 18th Cloud Expo, held June 7-9, 2016, at the Javits Center in New York City, NY.
Jul. 27, 2016 04:30 AM EDT Reads: 1,308
Large scale deployments present unique planning challenges, system commissioning hurdles between IT and OT and demand careful system hand-off orchestration. In his session at @ThingsExpo, Jeff Smith, Senior Director and a founding member of Incenergy, will discuss some of the key tactics to ensure delivery success based on his experience of the last two years deploying Industrial IoT systems across four continents.
Jul. 27, 2016 04:00 AM EDT Reads: 1,563
With the proliferation of both SQL and NoSQL databases, organizations can now target specific fit-for-purpose database tools for their different application needs regarding scalability, ease of use, ACID support, etc. Platform as a Service offerings make this even easier now, enabling developers to roll out their own database infrastructure in minutes with minimal management overhead. However, this same amount of flexibility also comes with the challenges of picking the right tool, on the right ...
Jul. 27, 2016 03:45 AM EDT Reads: 1,188
With over 720 million Internet users and 40–50% CAGR, the Chinese Cloud Computing market has been booming. When talking about cloud computing, what are the Chinese users of cloud thinking about? What is the most powerful force that can push them to make the buying decision? How to tap into them? In his session at 18th Cloud Expo, Yu Hao, CEO and co-founder of SpeedyCloud, answered these questions and discussed the results of SpeedyCloud’s survey.
Jul. 27, 2016 03:30 AM EDT Reads: 969
Organizations planning enterprise data center consolidation and modernization projects are faced with a challenging, costly reality. Requirements to deploy modern, cloud-native applications simultaneously with traditional client/server applications are almost impossible to achieve with hardware-centric enterprise infrastructure. Compute and network infrastructure are fast moving down a software-defined path, but storage has been a laggard. Until now.
Jul. 27, 2016 03:15 AM EDT Reads: 1,773
DevOps at Cloud Expo – being held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
Jul. 27, 2016 02:45 AM EDT Reads: 2,287