By Gary Hamilton, Jocelyn Quimbo, Saurabh Verma | November 30, 2009 02:00 PM EST
SaaS has rapidly evolved from an online application source into a provider of application building blocks such as:
- Platform-as-a-Service (PaaS)
- Infrastructure-as-a-Service (IaaS)
- Database-as-a-Service (DaaS)
DaaS is the latest entrant into the "as a Service" realm and typically provides tools for defining logical data structures, data services such as APIs and web service interfaces, customizable user interfaces, and policies for data storage, backup, recovery and export. To ensure successful DaaS implementations, developers and database professionals need to address the traditional challenges of data design and performance tuning. They also need to address new challenges introduced by the lack of physical access for backup, recovery and integration.
What Is DaaS?
DaaS provides traditional database features, typically data definition, storage and retrieval, on a subscription basis over the web. To subscribers, DaaS appears as a black box that supports logical data operations and logical data stores in which customers can see only their own organization's data. Physical access is seen as a security risk and is therefore not available. As with SaaS, DaaS vendors build and manage data centers incorporating best practices in security, backup, recovery and customer support. Data services are typically provided as SOAP or REST APIs that allow users to define data structures, perform CRUD operations, manage entitlements and query the database using a subset of standard SQL.
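The "black box" view described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory class, `LogicalDataService`, standing in for a vendor endpoint; the class name, method names and tenant identifiers are illustrative only and do not correspond to any real DaaS API. The point is that every operation is logical and scoped to one subscriber's data.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LogicalDataService:
    """Hypothetical in-memory stand-in for a multi-tenant DaaS endpoint."""

    # tenant id -> record id -> record fields (never exposed directly)
    _stores: Dict[str, Dict[str, dict]] = field(default_factory=dict)

    def create(self, tenant: str, record_id: str, fields: dict) -> None:
        store = self._stores.setdefault(tenant, {})
        if record_id in store:
            raise KeyError(f"record {record_id!r} already exists")
        store[record_id] = dict(fields)

    def read(self, tenant: str, record_id: str) -> Optional[dict]:
        # A tenant can only ever look inside its own logical store.
        record = self._stores.get(tenant, {}).get(record_id)
        return dict(record) if record is not None else None

    def update(self, tenant: str, record_id: str, changes: dict) -> None:
        store = self._stores.get(tenant, {})
        if record_id not in store:
            raise KeyError(f"record {record_id!r} not found")
        store[record_id].update(changes)

    def delete(self, tenant: str, record_id: str) -> None:
        self._stores.get(tenant, {}).pop(record_id, None)


service = LogicalDataService()
service.create("acme", "c-1", {"name": "Tara"})
service.update("acme", "c-1", {"city": "Boston"})

acme_view = service.read("acme", "c-1")     # the owning organization's view
other_view = service.read("globex", "c-1")  # another subscriber sees nothing
```

A real service would add authentication, entitlements and a query language on top of these four operations, but the isolation boundary would sit in roughly the same place.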
Real-World Examples: Force.com and Amazon SimpleDB
Two real-world examples of DaaS are Salesforce.com's Force.com, which provides data services in its toolkit for building applications, and Amazon's SimpleDB, which provides an API for creating data stores which can be used for applications or pure data storage.
Force.com
Force.com supports the Model-View-Controller paradigm for application development where Model refers to the data model.
- Database schema: Developers can configure pick list values for fields in standard CRM objects (tables), or create custom objects and fields via the Salesforce.com Setup menu. Data elements can also be defined programmatically through the Metadata API, which is used by the Force.com IDE, an add-on for Eclipse. Lookup fields and parent-child relationships allow foreign key relationships between tables.
- CRUD operations: Data entry, updates and deletes can be performed using Force.com pages that are automatically generated for each table, or through the Force.com Web Services API. Apex, Force.com's programming language, provides the ability to develop object-oriented code to perform data operations.
- Database queries: Querying data is done through SOQL, Force.com's subset of SQL. SOQL provides read-only access via the Web Services API or Apex.
- Stored procedures: Custom business rules can be implemented as Triggers, which are written in Apex and are the equivalent of database stored procedures.
- Pro: Force.com database development and functionality parallel traditional database development.
- Con: A Force.com database requires careful design and coding.
Amazon SimpleDB
Amazon.com's SimpleDB service appears to be geared to developing applications quickly with minimal effort on database design and definition:
- Database schema: SimpleDB stores data in "domains," the equivalent of a spreadsheet tab. Once a domain is created, attributes (fields) are created as records are added to the domain. Each item (record) requires a unique ID string, and attributes are added as name-value pairs such as ("First Name", "Tara"). Items are limited to 256 name-value pairs, and domains are limited to 1 billion attributes.
- CRUD operations: SimpleDB uses Put to insert and update items, Get to retrieve an item by its unique ID, and Delete to remove items.
- Database queries: SimpleDB supports a subset of SQL for read-only access to data. SimpleDB does not support queries across domains, so SQL joins are not available. The Developer Guide suggests storing related data in a single domain as a workaround.
- Stored procedures: SimpleDB currently does not support stored procedures.
- Pro: SimpleDB allows rapid development of web-based applications requiring data services.
- Con: SimpleDB does not support joins, foreign keys or stored procedures. Porting complex applications to SimpleDB may not be feasible.
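The schema-free model described above can be sketched in a few lines. This is an in-memory illustration only: the `Domain` class and its `put`, `get` and `delete` methods are hypothetical names chosen to mirror the operations in the list, not the actual SimpleDB API, and the sketch simplifies each attribute to a single value. The 256-pair limit is the figure quoted in the text.

```python
MAX_PAIRS_PER_ITEM = 256  # per-item limit quoted in the text


class Domain:
    """Hypothetical model of a domain: items keyed by a unique ID string."""

    def __init__(self, name):
        self.name = name
        self._items = {}  # item id -> {attribute name: value}

    def put(self, item_id, pairs):
        # Put both inserts new items and updates existing ones.
        merged = dict(self._items.get(item_id, {}))
        merged.update(pairs)
        if len(merged) > MAX_PAIRS_PER_ITEM:
            raise ValueError("item would exceed the name-value pair limit")
        self._items[item_id] = merged

    def get(self, item_id):
        return dict(self._items.get(item_id, {}))

    def delete(self, item_id):
        self._items.pop(item_id, None)

    def attribute_names(self):
        # No schema is declared up front; attributes exist once items use them.
        names = set()
        for attrs in self._items.values():
            names.update(attrs)
        return sorted(names)


contacts = Domain("contacts")
contacts.put("item-001", {"First Name": "Tara"})
contacts.put("item-001", {"City": "Boston"})  # same ID, so this is an update
contacts.put("item-002", {"First Name": "Lee", "Phone": "555-0100"})
```

Note that the two items carry different attributes, which is what makes the spreadsheet-tab analogy only approximate.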
What Is Data Management?
One could argue that data management began as soon as humans invented written communication. Cataloging, bookkeeping and archiving, which are all forms of data management, are known to have existed in ancient times. In recent history, the first computerized database management systems started to evolve in the 1960s, when the primary data storage media were magnetic tapes.[1] In the 1980s, Data Management also became known as Data Resource Management and Enterprise Information Management as organizations recognized corporate data as assets that must be managed. The publication of the DAMA-DMBOK Guide [2] this year is a major step toward formalizing data management as a science and practice. The data management functions discussed in this article are based upon this guide.
The DAMA-DMBOK Guide identifies ten data management functions found in most organizations. These functions are briefly described below:
- Data Governance - planning, supervision and control over data management and use
- Data Architecture Management - as an integral part of the enterprise architecture
- Data Development - analysis, design, building, testing, deployment and maintenance
- Database Operations Management - support for structured physical data assets
- Data Security Management - ensuring privacy, confidentiality and appropriate access
- Reference & Master Data Management - managing golden versions and replicas
- Data Warehousing & Business Intelligence Management - enabling access to decision support data for reporting and analysis
- Document & Content Management - storing, protecting, indexing and enabling access to data found in unstructured sources (electronic files and physical records)
- Meta Data Management - integrating, controlling and delivering meta data
- Data Quality Management - defining, monitoring and improving data quality
Figure 1 shows the scope of each of these functions. This article focuses on the Data Development and Database Operations Management functions as they relate to DaaS. It points out similarities and differences between managing data that resides in a DaaS environment and managing data in a non-DaaS environment, as well as key implementation challenges in present-day technologies.
DaaS Data Management
How does data management in a DaaS environment differ from traditional environments?
All of the data management functions shown in Figure 1 apply to DaaS, but with a twist introduced by the additional layer of abstraction presented by the Service-Oriented Architecture (SOA) that defines the DaaS. As with other types of SOA implementations, a DaaS provider hides the physical implementation and the complexities of managing the data stores from its consumers, while at the same time providing a ubiquitous, language-agnostic API, typically XML-based, to enter, retrieve and manipulate data.
For those familiar with relational database management systems (RDBMS) such as Oracle and SQL Server, DaaS is to RDBMS as RDBMS is to flat files. In both cases, the abstractions introduced by the newer platform simplify access to and management of data, but in some cases limit what you can do with the data. Limitations are usually overcome as the platform matures. For example, a DaaS query language might initially allow access to one object (table) at a time, but future releases may allow two or more objects to be joined together.
Just as the logical abstractions of an RDBMS hide the fact that tables are implemented as files, DaaS objects might actually be implemented as RDBMS tables behind the scenes. However, a DaaS consumer does not need to know that and does not have to be concerned with the maintenance of the underlying RDBMS. Table 1 shows more examples of how DaaS differs from RDBMS, using Force.com and Oracle as bases for comparison.
DaaS Challenges
DaaS presents many advantages and promises. However, adopters of this new paradigm may find some new challenges, some of which are highlighted below using Force.com as an example.
Data Design
Joins
To optimize performance and simplify data access, DaaS typically places limits on resource-intensive queries and reports. For queries, this may mean that certain join statements or outer joins are not supported, or that the number of entities that can be queried is limited. Similarly, report writers may limit joins by controlling which entities are available. Approaches for dealing with joins include copying some attributes of master objects into child objects, or writing code to merge master and detail results.
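Both workarounds can be sketched in client code. The example below assumes the master and detail rows have already been fetched by two separate single-object queries and arrive as lists of dictionaries; the function names, the field names (`id`, `account_id`) and the sample data are illustrative assumptions, not part of any vendor API.

```python
from collections import defaultdict


def merge_master_detail(masters, details, master_key, foreign_key):
    """Attach each detail row to its master row (a client-side join)."""
    by_parent = defaultdict(list)
    for row in details:
        by_parent[row[foreign_key]].append(row)
    # Masters without details keep an empty list, like a left outer join.
    return [
        {**master, "details": by_parent.get(master[master_key], [])}
        for master in masters
    ]


def copy_master_fields(masters, details, master_key, foreign_key, fields):
    """Denormalize: copy selected master attributes onto each detail row."""
    lookup = {master[master_key]: master for master in masters}
    enriched_rows = []
    for row in details:
        parent = lookup.get(row[foreign_key], {})
        enriched = dict(row)
        for name in fields:
            enriched["master_" + name] = parent.get(name)
        enriched_rows.append(enriched)
    return enriched_rows


accounts = [{"id": "A1", "name": "Acme"}, {"id": "A2", "name": "Globex"}]
contacts = [
    {"id": "C1", "account_id": "A1", "last_name": "Ng"},
    {"id": "C2", "account_id": "A1", "last_name": "Ruiz"},
]

joined = merge_master_detail(accounts, contacts, "id", "account_id")
flat = copy_master_fields(accounts, contacts, "id", "account_id", ["name"])
```

The merge approach keeps the stored data normalized at the cost of extra client work per query. The copy approach makes single-object queries sufficient, but the copied attributes must be kept in sync whenever the master changes.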
Physical Database Access
Performance Tuning
DaaS by nature hides the underlying details of the physical database implementation. At present, troubleshooting and performance tuning require cobbling together various tools and approaches. While the best practice is always to follow vendor recommendations carefully, empirical data can be gathered via commercial performance testing tools, custom scripts and code, manual testing and vendor profiling tools. Analyze the results carefully and consult the vendor if its best practices do not mesh with the data.
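As one illustration of the "custom scripts" option, the sketch below times repeated calls to a query function and summarizes the observed latency. The `time_query` helper and the placeholder `sample_query` are hypothetical; in practice `run_query` would wrap a call to the vendor's API, and results would vary with network conditions and any vendor-side caching.

```python
import statistics
import time


def time_query(run_query, runs=5):
    """Call run_query several times and summarize wall-clock latency."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return {
        "runs": runs,
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


def sample_query():
    # Placeholder workload standing in for a remote DaaS query.
    return sum(range(10000))


report = time_query(sample_query, runs=5)
```

Reporting the median alongside the maximum helps separate typical response time from occasional slow calls, which is useful evidence to bring to a vendor.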
Data Partitioning
Since DaaS provides logical database services, there is no standard for partitioning data. The best practice is to review vendor documentation on performance, especially for large data volumes. Approaches to data partitioning include defining tables or namespaces in lieu of partitions, creating indexed fields as partition filters, creating hierarchies and entitlements to control data visibility, or licensing multiple DaaS instances.
Backup and Recovery
While DaaS provides high-performance tools for querying and exporting data, it can be difficult to perform a "database dump" that exports data, metadata and code in one operation. And once the data is "dumped," there may be no facility to rebuild a database from the dump. Administrators used to these features in on-premises software must develop custom scripts for dumping and loading data for DaaS.
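A custom dump-and-load script might follow the shape sketched below. Everything here is an assumption for illustration: `FakeDaasClient` is an in-memory stand-in for a vendor client, and its `define_object` and `insert` methods are hypothetical. The sketch covers metadata and data only; code such as triggers would need a separate export path.

```python
import json


class FakeDaasClient:
    """Hypothetical stand-in for a vendor client; real APIs will differ."""

    def __init__(self):
        self.schemas = {}  # object name -> list of field names
        self.rows = {}     # object name -> list of records

    def define_object(self, name, fields):
        self.schemas[name] = list(fields)
        self.rows.setdefault(name, [])

    def insert(self, name, row):
        self.rows[name].append(dict(row))


def dump_database(client):
    """Serialize metadata and data together as one JSON document."""
    snapshot = {
        name: {"fields": fields, "rows": client.rows[name]}
        for name, fields in client.schemas.items()
    }
    return json.dumps(snapshot, indent=2, sort_keys=True)


def load_database(client, dump_text):
    """Rebuild objects first, then reload their rows."""
    snapshot = json.loads(dump_text)
    for name, body in snapshot.items():
        client.define_object(name, body["fields"])
        for row in body["rows"]:
            client.insert(name, row)


source = FakeDaasClient()
source.define_object("contact", ["first_name", "city"])
source.insert("contact", {"first_name": "Tara", "city": "Boston"})

dump_text = dump_database(source)

target = FakeDaasClient()
load_database(target, dump_text)
```

Against a real service, the load step would also have to respect dependencies between objects, for example loading parent records before the children that reference them.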
Transaction Processing
Some DaaS implementations allow the equivalent of stored procedures to support referential integrity and transaction logic. Where they are not available, one workaround is the tried-and-true polling service that looks for updated records and performs the appropriate operations for inserts, deletes and updates. Regardless of the approach, pay careful attention to commit/rollback logic and error handling.
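One polling cycle can be sketched as follows. The sketch assumes each change carries an operation name, a modification marker and the affected record; the `poll_changes` function, the change format and the handlers are hypothetical. It advances a watermark only past changes that were applied successfully, so a failed change is retried on the next cycle rather than silently skipped.

```python
def poll_changes(fetch_changed_since, handlers, watermark):
    """Process one polling cycle and return the new watermark."""
    changes = sorted(fetch_changed_since(watermark), key=lambda c: c["modified"])
    for change in changes:
        try:
            handlers[change["op"]](change["record"])
        except Exception:
            # Stop at the first failure; the watermark stays behind this
            # change, so it is picked up again on the next cycle.
            break
        watermark = change["modified"]
    return watermark


feed = [
    {"op": "insert", "modified": 1, "record": {"id": "r1"}},
    {"op": "update", "modified": 2, "record": {"id": "r1"}},
    {"op": "delete", "modified": 3, "record": {"id": "r2"}},
]


def fetch(since):
    # Stand-in for a query on a last-modified field.
    return [change for change in feed if change["modified"] > since]


applied = []


def failing_delete(record):
    raise RuntimeError("downstream system unavailable")


handlers = {
    "insert": lambda record: applied.append(("insert", record["id"])),
    "update": lambda record: applied.append(("update", record["id"])),
    "delete": failing_delete,
}

new_watermark = poll_changes(fetch, handlers, 0)
```

Because a change may be retried, handlers in a real service should be idempotent, and the watermark should be persisted between cycles.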
Benefits
Despite some of the limitations listed above, DaaS adoption is being driven by multiple factors that speed application delivery including:
Ease of Deployment
Without the need to procure, install and configure equipment, DaaS can be rapidly deployed. And since vendors have already done extensive performance tuning on logical data services, there may be little need for further performance tuning or extensive data design.
Platform Independence
Because DaaS is web-based, most vendors comply with web standards, providing interoperability with desktops, servers and development tools from many vendors. Web service APIs for SOAP or REST are typically interoperable with multiple development platforms such as Adobe, Java, Mac OS X Cocoa, .NET and Visual Basic, Ruby on Rails, Perl, PHP and Python.
Simplified Database Administration
Database administrators may not need to understand SQL or APIs to configure DaaS databases. Data objects, custom fields, validation rules and data entry forms may be configured via the DaaS user interface. Because these are logical operations, changes are usually available immediately via reports, SQL queries and web services.
Standardized Data Integration
DaaS web services provide programmatic access to data via the vendor's API. Ubiquitous support for SOAP and REST services in ETL tools, middleware and application servers facilitates integration with most platforms. And since most DaaS vendors provide text-based file import and export, batch processing or semi-manual procedures allow integration with legacy systems and new applications.
Conclusion
Database as a Service can be an important tool in a developer's toolbox for rapid development, as well as for organizations that have more limited IT infrastructure resources. As illustrated above, there are some limitations, but the overall benefits to the project often outweigh them. Regardless of the approach, Database as a Service is something all data architects need to know and understand as we move into the next decade. It is critical to enabling the movement of enterprise applications to the cloud.
References
[1] Olle, T. William, 2006, "Nineteen Sixties History of Data Base Management" (ISBN: 978-0-397-34637-3), Springer Boston.
[2] DAMA International, 2009, "The DAMA Guide to the Data Management Body of Knowledge" (ISBN: 0977140083), Technics Publications, LLC.
Published November 30, 2009
Copyright © 2009 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Gary Hamilton
Gary Hamilton is a Global Services expert at Acumen Solutions, a leading business and technology consulting firm with offices across the U.S. and Europe. He is a hands-on leader skilled in application development, delivery and operations. Over the last five years, Gary has established expertise in CRM systems, with a focus on web services and systems integrations.
More Stories By Jocelyn Quimbo
Jocelyn Quimbo is a Senior Data Architect/Integrator at Acumen Solutions, a leading business and technology consulting firm with offices across the U.S. and Europe. Acumen Solutions helps clients turn customers into advocates, suppliers into partners and leverage systems and data for effective business decisions and process automation. Jocelyn holds an MS in Computer and Information Science from Florida State University.
More Stories By Saurabh Verma
Saurabh Verma is Director of Global Services at Acumen Solutions, a leading business and technology consulting firm with offices across the U.S. and Europe. He is PMP certified with more than 13 years of progressive technical services and program management experience. Saurabh is an expert in strategic and management consulting with a focus on the telecommunications industry in the Americas and EMEA.