|By Robert Eve||
|December 6, 2011 09:30 AM EST||
Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility is the first book published on the topic of data virtualization. Along with an overview of data virtualization and its advantages, it presents ten case studies of organizations that have adopted data virtualization to significantly improve business decision making, decrease time-to-solution and reduce costs. This article describes data virtualization adoption at one of the enterprises profiled, Pfizer Inc.
Pfizer Inc. is a biopharmaceutical company that develops, manufactures and markets medicines for both humans and animals. As the world's largest drug manufacturer, Pfizer operates globally with 111,500 employees and a presence in over 100 countries.
Worldwide Pharmaceutical Sciences (PharmSci) is a group of scientists responsible for enabling what drugs Pfizer will bring to market. This group designs, synthesizes and manufactures all drugs that are part of clinical trials and toxicology testing within Pfizer.
For this case study, we interviewed Dr. Michael C. Linhares, Ph.D and Research Fellow. Linhares heads up the Business Information Systems (BIS) team within PharmSci.
BIS is responsible for portfolio and resource management across all of PharmSci's projects. This involves designing, building and supporting systems that deliver data to executive teams and staff to help them make decisions regarding how to allocate available resources - both people and dollars - across the overall portfolio of over 100 projects annually.
The Business Problem
A major challenge for PharmSci is the fact that it has a complex portfolio of projects that is constantly changing.
According to Linhares, "Every week, something new comes up and we need to ensure that the right information is communicated to the right people. The people making decisions about resource allocation need easy and simple methods for obtaining that information. One aspect of this is that some people learn the information first and they need to communicate it to others who are responsible for making decisions based on the information. This creates an information-sharing challenge."
Linhares estimates that there are 80 to 100 information producers within PharmSci and over 1,000 information consumers, including the executives who seek a full picture of the project portfolio - financial data, project data, people data and data about the pharmaceutical compounds themselves.
The Technical Problem
The data required is created in and managed by different applications, each developed by a different team, stored in multiple sources managed by different technologies, and the applications don't talk to each other.
This makes it very difficult to access summary information across all projects. Examples would be identifying how much money is being spent on all projects in the project management system, what the next milestones are and when each will be met, and who is working on each project. "We needed a solution that would allow us to pull all this information together in an agile way."
When Linhares joined PharmSci, there was very little in the way of effective information integration. Most integration was done manually by exporting data from various systems into Excel spreadsheets and then either combining spreadsheets or taking the spreadsheet data and moving it into Access or SQL Server databases. With no real security controls, this approach also lacked scalability and opportunities for reuse, generated multiple copies of the spreadsheets (with various changes), and it often took weeks to build a spreadsheet with only a 50% chance that it would include all of the data required.
To be successful, the solution to these data integration and reporting problems had to provide the following:
- A single, integrated view of all data sources with a common set of naming conventions
- A flexible middle layer that would be independent of both the data sources on the back end and the reporting tools on the front end to facilitate easy change management
- Shared metadata and business rule functionality so there would be a single point for managing and monitoring the solution
- A development platform that supported fast, iterative development and, therefore, continuous process improvement
Three Options Considered
BIS considered three solution architectures to meet their business and technical challenges.
- Traditional Information Factory: The first option was a traditional approach of an integrated, scalable information factory. Pfizer had already implemented information factories in the division using a combination of Informatica ETL tools, Oracle databases and custom-built reporting applications. However, according to Linhares, an information factory "seemed like overkill. We didn't have high volumes of data, nor did we need the inherent complexity of using ETL tools to transform and move data while making sure we included all the detailed data we might possibly ever need over time." Furthermore, because of the way the information factories were managed within Pfizer, change management entailed significant overhead. However, the architectural concepts of an information factory were not going to be ignored in the final solution.
- Single Vendor Stack: A second possible approach was to implement the solution in a single integrated technology (SQL Server with integration services). Major disadvantages were the lack of access to multiple data source types, the need to move data multiple times and the lack of an integrated metadata repository for understanding and organizing the data model.
- Data Virtualization: The third option was to create a federated data virtualization layer that integrated and accessed the underlying data sources through virtual views of the data. By leaving the source data in place, this approach would eliminate the issues inherent in copying and moving all the data (which Linhares described as unnecessary, "non-value added" activities). With the right technology and mix of products, data virtualization would enable PharmSci to migrate from inefficient, off-line spreadmarts to online access to integrated information that could be rapidly tailored and reused to dramatically increase its value to the organization.
The Data Virtualization Solution - Architecture
Pfizer's solution is the PharmSci Portfolio Database (PSPD), a federated data delivery framework implemented with the Composite Data Virtualization Platform.
Data virtualization enables the integration of all PharmSci data sources into a single reporting schema of information that can be accessed by all front-end tools and users. The solution architecture includes the following components:
Trusted Data Sources: There are many sources of data for PSPD; they are geographically dispersed, store data in a variety of formats across a multivendor, heterogeneous data environment. Here are some examples:
- Enterprise Project Management (EPM) is a SQL Server database of WRD's drug portfolio project plans. It includes detailed project schedules and milestones.
- The Global Information Factory (GIF) is an Oracle-based data warehouse of monthly finance data.
- OneSource, a database of corporate-level drug portfolio information is itself a unified set of Composite views across several different sources built by another group within Pfizer.
- Flat files are provided by the Finance Department on actual resource use.
- SharePoint lists are small SharePoint databases accessed using a web service.
- There are other data sources as well, including custom-built systems. As Linhares pointed out, "It doesn't matter what data sources we have. With a virtual approach, we are not limited by the types of data we need to access."
Data Virtualization Layer: The Composite Data Virtualization Platform forms the data virtualization layer that enables the solution to be independent of the data sources and front-end tools. It provides abstracted access to all of the data sources and delivers the data through virtual views. These views effectively present the PharmSci Portfolio Database as subject-specific data marts. The Composite metadata repository manages data lineage and business rules.
Consuming Applications: The flexibility of the platform is demonstrated by the varied reporting applications that use the information in PSPD. Examples include:
- SAP Business Objects for ad hoc queries, standard reports and dashboards.
- TIBCO Spotfire for analytics and access to data through standard presentation reports.
- Web services for parameterized queries.
- Data services to provide data for downstream applications.
- QuickViews (web pages built using DevExpress, a .NET toolkit) for access to live data.
SharePoint Portal: Branded as "InfoSource," this team collaboration web portal is the front-end interface that provides integrated access to PSPD data for all PharmSci customers through the consuming applications described above.
The Data Virtualization Solution - Best Practices
Linhares and team applied a number of data virtualization best practices when implementing the architecture described above.
Two Layers of Abstraction: Linhares stressed the importance of building two clear levels of abstraction into the data virtualization architecture. The first level abstracts Sources (the information abstraction layer), the second consumers (the reporting abstraction layer).
"We built a representation of the data in Composite. If a source is ever changed by the owner, which often happens, we can update the representation in the information abstraction layer quickly. This allows control of all downstream data in one location."
The second level of abstraction is the one between the reporting schema and the front-end reporting tools. A consolidated and integrated set of information is exposed as a single schema. This allows BIS to be system agnostic and support the use of whatever tool is best for the customer. All of the reporting tools use the same reporting abstraction layer; they always get the same answer to the same question because there is only a single source of data.
Consolidated Business Rules: Another key piece of the solution is the ability to include the business rules about how PharmSci manages its data within these abstraction layers. The business rules are embedded in the view definitions and are applied consistently at the same point.
Rapid Application Development Process: Prior to data virtualization, data integration was the slowest step for BIS in fulfilling a customer request for information. Now it's typically the fastest. "For example, a request that came in Friday morning and was completed by that afternoon. The customer's response was an amazed, ‘What do you mean you already have it done?'"
BIS uses a simple development process. The first step is what Linhares calls "triage" - looking at what the customer wants, estimating how long it will take and communicating that to the customer.
BIS does not spend a lot of time documenting the requirements of the solution. Instead, the group first creates a prototype on paper in the form of a simple data flow, then creates the necessary virtual views, gives the customer web access to the views and asks: "Is this what you wanted?"
The customer can then play with the result and respond with any changes or additions needed. BIS arrives at the final solution working with the customer in an iterative process.
Summary of Benefits
Linhares described several major benefits of the data virtualization solution.
The ability to provide integrated data in context: Data virtualization has enabled BIS to replace isolated silos of data with a data delivery platform that integrates different types and sources of data into a comprehensive package of value-added information. Instead of only the team leader and a core group of eight to ten people knowing about a project, the entire organization has access to relevant project information.
The independence of the data virtualization layer: "This is one of the huge benefits of data virtualization. It allows me to manage and monitor everything in one place and it makes change management easy for BIS and transparent to users."
Fast, iterative development environment: The data delivery infrastructure already exists in the data virtualization layer (defined data sources, standard naming conventions, access methods, etc.) so when a request for information comes in, BIS can quickly put it together for the customer.
Elimination of manual effort throughout PharmSci: According to Linhares, people initially resisted going away from their spreadsheets. But once there was a single source for the data and it was all available through InfoSource, there was a dramatic reduction in the need to have meetings to reconcile spreadsheet data among teams.
• • •
Editor's Note: Robert Eve is the co-author, along with Judith R. Davis, of Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility, the first book published on the topic of data virtualization. The complete Pfizer case study, along with nine others enterprise are available in the book.
SYS-CON Events announced today Isomorphic Software, the global leader in high-end, web-based business applications, will exhibit at SYS-CON's DevOps Summit 2015 New York, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Isomorphic Software is the global leader in high-end, web-based business applications. We develop, market, and support the SmartClient & Smart GWT HTML5/Ajax platform, combining the productivity and performance of traditional desktop software ...
Dec. 19, 2014 06:00 PM EST Reads: 616
“In the past year we've seen a lot of stabilization of WebRTC. You can now use it in production with a far greater degree of certainty. A lot of the real developments in the past year have been in things like the data channel, which will enable a whole new type of application," explained Peter Dunkley, Technical Director at Acision, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 19, 2014 06:00 PM EST Reads: 1,126
“We help people build clusters, in the classical sense of the cluster. We help people put a full stack on top of every single one of those machines. We do the full bare metal install," explained Greg Bruno, Vice President of Engineering and co-founder of StackIQ, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 19, 2014 06:00 PM EST Reads: 1,152
The cloud is becoming the de-facto way for enterprises to leverage common infrastructure while innovating and one of the biggest obstacles facing public cloud computing is security. In his session at 15th Cloud Expo, Jeff Aliber, a global marketing executive at Verizon, discussed how the best place for web security is in the cloud. Benefits include: Functions as the first layer of defense Easy operation –CNAME change Implement an integrated solution Best architecture for addressing network-l...
Dec. 19, 2014 05:00 PM EST Reads: 1,148
The BPM world is going through some evolution or changes where traditional business process management solutions really have nowhere to go in terms of development of the road map. In this demo at 15th Cloud Expo, Kyle Hansen, Director of Professional Services at AgilePoint, shows AgilePoint’s unique approach to dealing with this market circumstance by developing a rapid application composition or development framework.
Dec. 19, 2014 04:45 PM EST Reads: 745
AppZero has announced that its award-winning application migration software is now fully qualified within the Microsoft Azure Certified program. AppZero has undergone extensive technical evaluation with Microsoft Corp., earning its designation as Microsoft Azure Certified. As a result of AppZero's work with Microsoft, customers are able to easily find, purchase and deploy AppZero from the Azure Marketplace. With just a few clicks, users have an Azure-based solution for moving applications to the...
Dec. 19, 2014 02:00 PM EST Reads: 672
The major cloud platforms defy a simple, side-by-side analysis. Each of the major IaaS public-cloud platforms offers their own unique strengths and functionality. Options for on-site private cloud are diverse as well, and must be designed and deployed while taking existing legacy architecture and infrastructure into account. Then the reality is that most enterprises are embarking on a hybrid cloud strategy and programs. In this Power Panel at 15th Cloud Expo (http://www.CloudComputingExpo.com...
Dec. 19, 2014 11:30 AM EST Reads: 2,282
"BSQUARE is in the business of selling software solutions for smart connected devices. It's obvious that IoT has moved from being a technology to being a fundamental part of business, and in the last 18 months people have said let's figure out how to do it and let's put some focus on it, " explained Dave Wagstaff, VP & Chief Architect, at BSQUARE Corporation, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 19, 2014 11:00 AM EST Reads: 1,839
The move in recent years to cloud computing services and architectures has added significant pace to the application development and deployment environment. When enterprise IT can spin up large computing instances in just minutes, developers can also design and deploy in small time frames that were unimaginable a few years ago. The consequent move toward lean, agile, and fast development leads to the need for the development and operations sides to work very closely together. Thus, DevOps become...
Dec. 19, 2014 10:00 AM EST Reads: 1,945
"Our premise is Docker is not enough. That's not a bad thing - we actually love Docker. At ActiveState all our products are based on open source technology and Docker is an up-and-coming piece of open source technology," explained Bart Copeland, President & CEO of ActiveState Software, in this SYS-CON.tv interview at DevOps Summit at Cloud Expo®, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 19, 2014 09:00 AM EST Reads: 1,879
Verizon Enterprise Solutions is simplifying the cloud-purchasing experience for its clients, with the launch of Verizon Cloud Marketplace, a key foundational component of the company's robust ecosystem of enterprise-class technologies. The online storefront will initially feature pre-built cloud-based services from AppDynamics, Hitachi Data Systems, Juniper Networks, PfSense and Tervela. Available globally to enterprises using Verizon Cloud, Verizon Cloud Marketplace provides a one-stop shop fo...
Dec. 19, 2014 08:00 AM EST Reads: 1,861
SYS-CON Events announced today that Windstream, a leading provider of advanced network and cloud communications, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Windstream (Nasdaq: WIN), a FORTUNE 500 and S&P 500 company, is a leading provider of advanced network communications, including cloud computing and managed services, to businesses nationwide. The company also offers broadband, p...
Dec. 19, 2014 07:00 AM EST Reads: 2,154
The Internet of Things is not new. Historically, smart businesses have used its basic concept of leveraging data to drive better decision making and have capitalized on those insights to realize additional revenue opportunities. So, what has changed to make the Internet of Things one of the hottest topics in tech? In his session at @ThingsExpo, Chris Gray, Director, Embedded and Internet of Things, discussed the underlying factors that are driving the economics of intelligent systems. Discover ...
Dec. 19, 2014 06:30 AM EST Reads: 2,144
SYS-CON Events announced today that IDenticard will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. IDenticard™ is the security division of Brady Corp (NYSE: BRC), a $1.5 billion manufacturer of identification products. We have small-company values with the strength and stability of a major corporation. IDenticard offers local sales, support and service to our customers across the United States and Canada...
Dec. 19, 2014 04:00 AM EST Reads: 1,958
SYS-CON Events announced today that AIC, a leading provider of OEM/ODM server and storage solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. AIC is a leading provider of both standard OTS, off-the-shelf, and OEM/ODM server and storage solutions. With expert in-house design capabilities, validation, manufacturing and production, AIC's broad selection of products are highly flexible and are conf...
Dec. 19, 2014 04:00 AM EST Reads: 1,781
Leysin American School is an exclusive, private boarding school located in Leysin, Switzerland. Leysin selected an OpenStack-powered, private cloud as a service to manage multiple applications and provide development environments for students across the institution. Seeking to meet rigid data sovereignty and data integrity requirements while offering flexible, on-demand cloud resources to users, Leysin identified OpenStack as the clear choice to round out the school's cloud strategy. Additional...
Dec. 19, 2014 02:00 AM EST Reads: 1,879
DevOps Summit 2015 New York, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that it is now accepting Keynote Proposals. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete...
Dec. 18, 2014 09:45 PM EST Reads: 983
“DevOps is really about the business. The business is under pressure today, competitively in the marketplace to respond to the expectations of the customer. The business is driving IT and the problem is that IT isn't responding fast enough," explained Mark Levy, Senior Product Marketing Manager at Serena Software, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 08:00 PM EST Reads: 1,221
Mobile commerce traffic is surpassing desktop, yet less than 20% of sales in the U.S. are mobile commerce sales. In his session at 15th Cloud Expo, Dan Franklin, Segment Manager, Commerce, at Verizon Digital Media Services, defined mobile devices and discussed how next generation means simplification. It means taking your digital content and turning it into instantly gratifying experiences.
Dec. 18, 2014 12:00 PM EST Reads: 1,102