|By Srinivasan Sundara Rajan||
|February 14, 2011 06:00 PM EST||
Data Warehousing As A Cloud Candidate
Over the past year, we have started seeing greater support for Cloud from major vendors and Cloud is here to stay. The bigger impact is that, the path is clearly drawn for the enterprises to adopt Cloud. With this in mind, it is time to identify the potential for existing data center applications to be migrated to Cloud.
Most of the major IT majors predict a HYBRID Delivery will be future, where by the future enterprises needs to look for a delivery model that comprises of certain work loads on Clouds and some of them continue to be on data centers and then look for a model that will integrate them together.
Before we go further into a blue print of How Data warehouses fit within a HYBRID Cloud environment, we will see the salient features of Data warehouses and how the Cloud tenants make them a very viable work load to be moved to Cloud.
A data warehouse is a subject oriented, integrated, time variant and non volatile collection of data in support of management's decision making process.
Data Warehousing Usage
Cloud Tenant Value Proposition
ETL (Extract, Cleaning, Transform, Load) process is subject to variable patterns. Normally we may get large files over the week end or in night time to be processed and loaded.
It is better to use the COMPUTE resources on demand for the ETL as they require , rather than having a fixed capacity
OLAP (Online Analytical Processing) and related processing needs for MOLAP (Multi dimensional OLAP) and / or ROLAP (Relational OLAP) are highly compute intensive and requires stronger processing needs
High Performance Computing and ability to scale up on demand, tenants of Cloud will be highly aligned to this need
Physical architecture needs are complex in a data warehousing environment.
Most of the IaaS , PaaS offerings like Azure platform, Amazon EC2 have built in provisions for a highly available architecture, with most of the day to day administration is abstracted from the enterprises.
The below are some of the advantages of SQL Azure Platform
Multiple Software and platform needs,
The product stack of data warehousing environment is really huge and most organizations will normally find it difficult to get into a ideal list of software and platforms and tools for their BI platform. platform. SaaS for applications like data cleansing or address validation and PaaS for reporting like Microsoft SQL Azure reporting will be ideal to solve the tools and platform maze.
The following are the ideal steps for migrating a in-premise data warehouse system to a cloud platform, for the sake of case study , Microsoft Windows Azure platform is chosen as the target platform.
1. Create Initial Database / Allocate Storage / Migrate Data
The existing STAR Schema design of the existing data warehousing system can be migrated to Cloud platform as it is. And migrating to a Relational database platform like SQL Azure should be straightforward. To migrate the data, the initial storage allocations of the existing database on the data center needs to be calculated and the same amount Storage resources will be allocated on the Cloud.
You can store any amount of data, from kilobytes to terabytes, in SQL Azure. However, individual databases are limited to 10 GB in size. To create solutions that store more than 10 GB of data, you must partition large data sets across multiple databases and use parallel queries to access the data.
Once a high scalable database infrastructure is setup on SQL Azure platform , the following are some of the methods in which the data from the existing on-premise data warehouses can be moved to SQL Azure.
Traditional BCP Tool : BCP is a command line utility that ships with Microsoft SQL Server. It bulk copies data between SQL Azure (or SQL Server) and a data file in a user-specified format. The bcp utility that ships with SQL Server 2008 R2 is fully supported by SQL Azure. You can use BCP to backup and restore your data on SQL Azure You can import large numbers of new rows into SQL Azure tables or export data out of tables into data files by using the bcp utility.
The following tools are also useful, if you existing Data warehouse is in Sql Server within the data center.
You can transfer data to SQL Azure by using SQL Server 2008 Integration Services (SSIS). SQL Server 2008 R2 or later supports the Import and Export Data Wizard and bulk copy for the transfer of data between an instance of Microsoft SQL Server and SQL Azure.
SQL Server Migration Assistant (SSMA for Access v4.2) supports migrating your schema and data from Microsoft Access to SQL Azure.
2. Set Up ETL & Integration With Existing On Premise Data Sources
After the initial load of the data warehouse on Cloud, it required to be continuously refreshed with the operational data. This process needs to extract data from different data sources (such as flat files, legacy databases, RDBMS, ERP, CRM and SCM application packages).
This process will also carry out necessary transformations such as joining of tables, sorting, applying various filters.
The following are typical options available in Sql Azure platform to build a ETL platform between the On Premise and data warehouse hosted on cloud. The tools mentioned above on the initial load of the data also holds good for ETL tool, however they are not repeated to avoid duplication.
SQL Azure Data Sync :
- Cloud to cloud synchronization
- Enterprise (on-premise) to cloud
- Cloud to on-premise.
- Bi-directional or sync-to-hub or sync-from-hub synchronization
The following diagram courtesy of Vendor will give a over view of how the SQL Azure Data Sync can be used for ETL purposes.
Integration provides common Biztalk Server integration capabilities (e.g. pipeline, transforms, adapters) on Windows Azure, using out-of-box integration patterns to accelerate and simplify development. It also delivers higher level business user enablement capabilities such as Business Activity Monitoring and Rules, as well as self-service trading partner community portal and provisioning of business-to-business pipelines. The following diagram courtesy of the vendor shows how the Windows Azure Appfabric Integration can be used as a ETL platform.
3. Create CUBES & Other Analytics Structures
The multi dimensional nature of OLAP requires a analytical engine to process the underlying data and create a multi dimensional view and the success of OLAP has resulted in a large number of vendors offering OLAP servers using different architectures.
MLOAP : A Proprietary multidimensional database with a aim on performance.
ROLAP : Relational OLAP is a technology that provides sophisticated multidimensional analysis that is performed on open relational databases. ROLAP can scale to large data sets in the terabyte range.
HOLAP : Hybrid OLAP is an attempt to combine some of the features of MOLAP and ROLAP technology.
SQL Azure Database does not support all of the features and data types found in SQL Server. Analysis Services, Replication, and Service Broker are not currently provided as services on the Windows Azure platform.
At this time there is no direct support for OLAP and CUBE processing on SQL Azure, however with the HPC (High Performance Computing ) attributes using multiple Worker roles, manually aggregation of the data can be achieved.
4. Generate Reports
Reporting consists of analyzing the data stored in the data warehouse in multiple dimensions and generate standard reports for business intelligence and also generate ad-hoc reports. These reports present data in graphical/tabular form and also provide statistical analysis features. These reports should be rendered as Excel, PDF and other formats.
It is better to utilize the SaaS based or PaaS based reporting infrastructure rather than custom coding all the reports.
SQL Azure Reporting enables developers to enhance their applications by embedding cloud based reports on information stored in a SQL Azure database. Developers can author reports using familiar SQL Server Reporting Services tools and then use these reports in their applications which may be on-premises or in the cloud.
SQL Azure Reporting also currently can connect only to SQL Azure databases.
The above steps will provide a path to migrate on premise Data warehousing applications to Cloud. As we needed lot of support from the vendor in terms of IaaS, PaaS and SaaS, Microsoft Azure Platform is chosen as a platform to support the case study. With several features integrated as part of this, Microsoft Cloud Platform positioned to be one of the leading platform for BI on Cloud.
The following diagram indicates a blue print of a typical Cloud BI Organization on a Microsoft Azure Platform.
Performance is the intersection of power, agility, control, and choice. If you value performance, and more specifically consistent performance, you need to look beyond simple virtualized compute. Many factors need to be considered to create a truly performant environment. In their General Session at 15th Cloud Expo, Phil Jackson, Development Community Advocate at SoftLayer, and Harold Hannon, Sr. Software Architect at SoftLayer, to discuss how to take advantage of a multitude of compute option...
Sep. 25, 2014 03:30 PM EDT Reads: 1,851
Come learn about what you need to consider when moving your data to the cloud. In her session at 15th Cloud Expo, Skyla Loomis, a Program Director of Cloudant Development at Cloudant, will discuss the security, performance, and operational implications of keeping your data on premise, moving it to the cloud, or taking a hybrid approach. She will use real customer examples to illustrate the tradeoffs, key decision points, and how to be successful with a cloud or hybrid cloud solution.
Sep. 22, 2014 10:00 PM EDT Reads: 2,370
In today's application economy, enterprise organizations realize that it's their applications that are the heart and soul of their business. If their application users have a bad experience, their revenue and reputation are at stake. In his session at 15th Cloud Expo, Anand Akela, Senior Director of Product Marketing for Application Performance Management at CA Technologies, will discuss how a user-centric Application Performance Management solution can help inspire your users with every appli...
Sep. 22, 2014 03:30 PM EDT Reads: 2,016
With the explosion of the cloud, more businesses are transitioning to a recurring revenue model to generate reliable sales, grow profits, and open new markets. This opportunity requires businesses to get to market quickly with the pricing and packaging options customers want. In addition, you will want to take advantage of the ensuing tidal wave of data to more effectively upsell, cross-sell and manage your customers. All of this is possible, but only with the right approach. At 15th Cloud Exp...
Sep. 12, 2014 11:45 PM EDT Reads: 1,516
Planning scalable environments isn't terribly difficult, but it does require a change of perspective. In his session at 15th Cloud Expo, Phil Jackson, Development Community Advocate for SoftLayer, will broaden your views to think on an Internet scale by dissecting a video publishing application built with The SoftLayer Platform, Message Queuing, Object Storage, and Drupal. By examining a scalable modular application build that can handle unpredictable traffic, attendees will able to grow your de...
Sep. 12, 2014 11:30 PM EDT Reads: 1,823
The cloud provides an easy onramp to building and deploying Big Data solutions. Transitioning from initial deployment to large-scale, highly performant operations may not be as easy. In his session at 15th Cloud Expo, Harold Hannon, Sr. Software Architect at SoftLayer, will discuss the benefits, weaknesses, and performance characteristics of public and bare metal cloud deployments that can help you make the right decisions.
Sep. 11, 2014 10:30 PM EDT Reads: 2,300
Over the last few years the healthcare ecosystem has revolved around innovations in Electronic Health Record (HER) based systems. This evolution has helped us achieve much desired interoperability. Now the focus is shifting to other equally important aspects – scalability and performance. While applying cloud computing environments to the EHR systems, a special consideration needs to be given to the cloud enablement of Veterans Health Information Systems and Technology Architecture (VistA), i.e....
Sep. 10, 2014 06:30 PM EDT Reads: 3,708
Cloud and Big Data present unique dilemmas: embracing the benefits of these new technologies while maintaining the security of your organization’s assets. When an outside party owns, controls and manages your infrastructure and computational resources, how can you be assured that sensitive data remains private and secure? How do you best protect data in mixed use cloud and big data infrastructure sets? Can you still satisfy the full range of reporting, compliance and regulatory requirements? I...
Sep. 10, 2014 03:00 PM EDT Reads: 2,798
Scott Jenson leads a project called The Physical Web within the Chrome team at Google. Project members are working to take the scalability and openness of the web and use it to talk to the exponentially exploding range of smart devices. Nearly every company today working on the IoT comes up with the same basic solution: use my server and you'll be fine. But if we really believe there will be trillions of these devices, that just can't scale. We need a system that is open a scalable and by using...
Sep. 6, 2014 06:45 PM EDT Reads: 5,108
Is your organization struggling to deal with skyrocketing volumes of digital assets? The amount of data is growing exponentially and organizations are having a hard time managing this growth. In his session at 15th Cloud Expo, Amar Kapadia, Senior Director of Open Cloud Strategy at Seagate, will walk through the essential considerations when developing a cloud storage strategy. In this discussion, you will understand the challenges IT is facing, why companies need to move to cloud, and how the...
Sep. 6, 2014 02:00 PM EDT Reads: 1,814
If cloud computing benefits are so clear, why have so few enterprises migrated their mission-critical apps? The answer is often inertia and FUD. No one ever got fired for not moving to the cloud – not yet. In his session at 15th Cloud Expo, Michael Hoch, SVP, Cloud Advisory Service at Virtustream, will discuss the six key steps to justify and execute your MCA cloud migration.
Sep. 6, 2014 11:00 AM EDT Reads: 1,841
The 16th International Cloud Expo announces that its Call for Papers is now open. 16th International Cloud Expo, to be held June 9–11, 2015, at the Javits Center in New York City brings together Cloud Computing, APM, APIs, Security, Big Data, Internet of Things, DevOps and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speak...
Sep. 5, 2014 02:15 PM EDT Reads: 3,032
Most of today’s hardware manufacturers are building servers with at least one SATA Port, but not every systems engineer utilizes them. This is considered a loss in the game of maximizing potential storage space in a fixed unit. The SATADOM Series was created by Innodisk as a high-performance, small form factor boot drive with low power consumption to be plugged into the unused SATA port on your server board as an alternative to hard drive or USB boot-up. Built for 1U systems, this powerful devic...
Sep. 4, 2014 07:15 PM EDT Reads: 9,070
SYS-CON Events announced today that Gridstore™, the leader in software-defined storage (SDS) purpose-built for Windows Servers and Hyper-V, will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Gridstore™ is the leader in software-defined storage purpose built for virtualization that is designed to accelerate applications in virtualized environments. Using its patented Server-Side Virtual C...
Sep. 4, 2014 01:45 PM EDT Reads: 2,384
SYS-CON Events announced today that Cloudian, Inc., the leading provider of hybrid cloud storage solutions, has been named “Bronze Sponsor” of SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Cloudian is a Foster City, Calif.-based software company specializing in cloud storage. Cloudian HyperStore® is an S3-compatible cloud object storage platform that enables service providers and enterprises to bui...
Sep. 3, 2014 07:00 PM EDT Reads: 3,308
SYS-CON Events announced today that TechXtend (formerly Programmer’s Paradise), a leading value-added provider of server and storage virtualization, and r-evolution will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. TechXtend (formerly Programmer’s Paradise) is a leading value-added provider of software, systems and solutions for corporations, government organizations, and academic instit...
Sep. 3, 2014 01:00 PM EDT Reads: 2,583
Every healthy ecosystem is diverse. This is especially true in cloud ecosystems, where portability and interoperability are more important than old enterprise models of proprietary ownership. In his session at 15th Cloud Expo, Mark Baker, Server Product Manager at Canonical/Ubuntu, will discuss how single vendors used to take the lead in creating and delivering technology, but in a cloud economy, where users want tools of their preference, when and where they need them, it makes no sense.
Sep. 3, 2014 09:00 AM EDT Reads: 2,664
The emergence of cloud computing and Big Data warrants a greater role for the PMO to successfully manage enterprise transformation driven by these powerful trends. As the adoption of cloud-based services continues to grow, a governance model is needed to orchestrate enterprise cloud implementations and harness the power of Big Data analytics. In his session at 15th Cloud Expo, Mahesh Singh, President of BigData, Inc., to discuss how the Enterprise PMO takes center stage not only in developing th...
Sep. 2, 2014 11:00 PM EDT Reads: 1,982
The consumption economy is here and so are cloud applications and solutions that offer more than subscription and flat fee models and at the same time are available on a pure consumption model, which not only reduces IT spend but also lowers infrastructure costs, and offers ease of use and availability. In their session at 15th Cloud Expo, Ermanno Bonifazi, CEO & Founder of Solgenia, and Ian Khan, Global Strategic Positioning & Brand Manager at Solgenia, will discuss this shifting dynamic with ...
Sep. 2, 2014 11:00 PM EDT Reads: 1,524
Cloud computing started a technology revolution; now DevOps is driving that revolution forward. By enabling new approaches to service delivery, cloud and DevOps together are delivering even greater speed, agility, and efficiency. No wonder leading innovators are adopting DevOps and cloud together! In his session at DevOps Summit, Andi Mann, Vice President of Strategic Solutions at CA Technologies, will explore the synergies in these two approaches, with practical tips, techniques, research dat...
Sep. 2, 2014 02:30 PM EDT Reads: 3,835