@CloudExpo: Blog Feed Post

Data Analytics in the Cloud: Two Cool NoSQL ‘Big Data’ Options for the SMB

Some estimates suggest that by 2015 the digital universe will grow to 8 zettabytes of data (1 Zettabyte = 1,000,000,000,000,000,000,000 bytes).

Much has been written in recent years about “Big Data” and its implications for information management and data analytics. Simply put, Big Data is data that is too large to process using traditional methods. By ‘traditional methods’ we mean relational database management systems (RDBMS), in which data is organized into a set of formally described tables and typically accessed using Structured Query Language (SQL). These systems were designed decades ago, when data was much more structured and less accessible.

With the development of web technologies and open-source architectures, database management systems have also evolved. The most notable expression of this is MySQL, which is open source, easily accessible to the beginner, and often bundled into software packages in some variation of the LAMP stack. Yet more than half of today’s digital data is unstructured data from social networks, mobile devices, web applications, and similar sources, which fits poorly into relational tables.

While Big Data has become a “big” buzzword in the IT industry today – similar to and, in many ways, a consequence of the Cloud computing phenomenon – and has spun off many kinds of definitions, the essence of the phenomenon can be summed up in the following O’Reilly definition: “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.”

The need to understand and manage Big Data has become the bread and butter of IT and engineering teams at major tech companies like Google, Amazon, Facebook, and Twitter, as well as other organizations that serve millions of users. But what solutions are available to the SMB, the average-sized business? A Techaisle survey of over 800 SMBs, released in April 2012, found that 34 percent of US mid-market businesses currently using business intelligence are also interested in Big Data analytics.

In its recent “Hype Cycle for Big Data 2012” emerging technologies report, the research firm Gartner states that column-store DBMS, cloud computing, and in-memory database management systems will be the three most transformational technologies in the next five years. The same report lists complex event processing, content analytics, context-enriched services, hybrid cloud computing, information capabilities framework, and telematics among the emerging technologies that Gartner also considers transformational.

[Figure: Gartner Hype Cycle for Big Data, 2012]

The time has arrived for SMBs to seriously start thinking about Big Data solutions. As one source has well stated, “It may take a while but eventually any good technology embraced by large enterprises trickles its way down to small and mid-sized businesses in some appropriately modified and re-priced form. It will be no different for modern business analytics tools. The time could be ripe for mid-range customers to start thinking about either modernising their data warehouses or data marts if they are lucky enough to have any, or come up with a plan to install a business analytics platforms if they don’t.”

With this in mind, here are two important “Big Data” solutions for the SMB to keep an eye on.

Google BigQuery

BigQuery was introduced in limited preview in November 2011 and made publicly available on May 1, 2012, fulfilling Google’s desire to “bring Big Data analytics to all businesses via the cloud.” With BigQuery, Google has developed a data analytics solution that offers an easy-to-use, quickly scalable framework for examining massive amounts of data in the cloud using familiar SQL. As its tagline suggests, BigQuery allows one to “analyze terabytes of data with just a click of a button.”

The setup process for BigQuery takes less than five minutes. Log in to the Google APIs Console and either create a new project or select an existing one. Click Services in the left-hand sidebar to open the API services table, and then enable BigQuery.

Once BigQuery is enabled, click the “BigQuery” link and choose to manage your data through the “web interface” tool.

You’ll then be presented with a screen that resembles the basic contours of a traditional MySQL environment, but in a much simpler form. Google has provided a set of sample tables under publicdata:samples. Click the drop-down and you’ll be presented with a list of these samples. Click on “natality” and then “details”. This brings up the Centers for Disease Control and Prevention (CDC) Birth Vital Statistics, covering all birth data available for the 50 states, the District of Columbia, and New York City from 1969 to 2008. The data set contains over 137M rows!

In order to run a sample query, go back to the homepage for the “BigQuery Browser Tool Tutorial” and select “Run a Query”. You’ll now be presented with a series of sample SQL queries. Choose the one that will select the 10 heaviest children by birth weight that were born in the United States between 1969 and 2008:

SELECT weight_pounds, state, year, gestation_weeks FROM publicdata:samples.natality
ORDER BY weight_pounds DESC LIMIT 10;

Copy and paste the query into the Compose Query text box and select “Run Query”. Within seconds, the query extracts the 10 largest birth weights from 137M records spanning roughly 40 years of data!
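The same query could also be issued from a script rather than the browser tool. The sketch below is an illustration under stated assumptions, not part of the original walkthrough: it assumes the google-cloud-bigquery Python client is installed and authenticated, and that the sample table is reachable under the standard-SQL name `bigquery-public-data.samples.natality` (the article uses the older `publicdata:samples.natality` form). The helper names `build_heaviest_births_query` and `run_query` are hypothetical.

```python
def build_heaviest_births_query(limit: int = 10) -> str:
    """Return a standard-SQL version of the article's sample query."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Assumption: the public sample table is addressed by its standard-SQL name.
    return (
        "SELECT weight_pounds, state, year, gestation_weeks\n"
        "FROM `bigquery-public-data.samples.natality`\n"
        "ORDER BY weight_pounds DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str):
    """Run the query with the google-cloud-bigquery client (needs credentials)."""
    # Imported lazily so the query builder works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # picks up application-default credentials
    return list(client.query(sql).result())  # blocks until the job finishes
```

Calling `run_query(build_heaviest_births_query())` should return the ten rows as a list, with each column readable by name (for example `row.weight_pounds`). Keeping the query text in its own function makes it easy to inspect before any billable query is sent.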

What is amazing about the BigQuery interface is the scale of data it can present to the user in almost no time. Users can, of course, create their own tables by importing data from their local environment or from Google Cloud Storage. With BigQuery, the opportunities for slicing and dicing large data sets are almost limitless.

BIME

BIME (pronounced “beam”) is a French startup that has partnered with Google to create a front-end application for BigQuery that can be used as a business analytics tool. The application runs on Amazon Web Services and can import data from BigQuery or from a variety of cloud and non-cloud sources. Its tagline is “Mine Your Own Business.” In its own words, BIME “is a revolutionary approach to data analysis and dashboarding. It allows you to analyze your data through interactive data visualizations and create stunning dashboards from the Web.”

The relationship between Google’s BigQuery and BIME is best captured in the screenshot below, which shows how BIME can be used to import and slice and dice the CDC Birth statistics discussed above.

BIME offers a free, no-obligation 10-day trial that is very easy to sign up for. Once you have signed up for a free account, go to “Create a Connection”.

You’ll then need to define the data source from which you wish to import your data set. For very large data sets, you will need to select BimeDB, which requires credit card information and is charged at either $0.50 or $1.00 per hour, depending on the size of the data sets required.

For more conventional data sets, you can import your data sets directly from the desktop. BIME offers an Excel-like environment in which data sets of any size can be sliced and diced and pivoted to derive the desired analytics.

In this example, we queried Google’s sample CDC birth statistics table in BigQuery to extract the top 500 birth weights from 1969 to 2008, and then derived the average birth weight for a sampling of five states: Alabama, North Dakota, South Carolina, Texas, and Washington.
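The two-step analysis described here (take the heaviest N births, then average by state) can be sketched in plain SQL. The example below runs against an in-memory SQLite table with a handful of made-up rows, purely to show the shape of the query. The rows are illustrative and are not CDC figures, the table name `natality` and the helper `average_weight_by_state` are hypothetical, and BigQuery's SQL dialect may differ in details.

```python
import sqlite3

# Made-up rows for illustration only: (state, year, weight_pounds).
SAMPLE_ROWS = [
    ("AL", 1975, 9.0),
    ("AL", 1990, 11.0),
    ("TX", 1980, 10.0),
    ("TX", 2001, 12.0),
    ("WA", 1999, 8.0),
    ("ND", 1985, 6.0),
]


def average_weight_by_state(rows, top_n, states):
    """Average weight per state among the top_n heaviest rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE natality (state TEXT, year INTEGER, weight_pounds REAL)"
    )
    conn.executemany("INSERT INTO natality VALUES (?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in states)
    # Inner query keeps the heaviest top_n rows; outer query filters and averages.
    sql = f"""
        SELECT state, AVG(weight_pounds) AS avg_weight
        FROM (
            SELECT state, weight_pounds
            FROM natality
            ORDER BY weight_pounds DESC
            LIMIT ?
        )
        WHERE state IN ({placeholders})
        GROUP BY state
        ORDER BY state
    """
    result = conn.execute(sql, [top_n, *states]).fetchall()
    conn.close()
    return result


print(average_weight_by_state(SAMPLE_ROWS, top_n=4, states=["AL", "TX", "WA"]))
# → [('AL', 10.0), ('TX', 11.0)]
```

Washington drops out of the result because its only row is not among the four heaviest. Note the consequence: averaging over a top-N subset describes the heaviest births in each state, not the state's overall average birth weight.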

Following the 10-day free trial period, BIME users can upgrade to a scaled price plan depending on the data analysis needs of their business.

In conclusion, it bears mentioning that “Big Data” is big business not only for large corporations but for SMBs as well. The discussion above has outlined two data analytics solutions that are easily accessible and scalable for the everyday small or medium-sized business. Within the emerging technology spectrum, Big Data is critically important: companies that can easily and efficiently slice and dice this data to identify consumer trends, produce market forecasts, and offer stakeholders up-to-date analysis and metrics will immediately set themselves apart from other players in the industry. Consider BigQuery and BIME today for your SMB data analytics solutions!

More Stories By Hovhannes Avoyan

Hovhannes Avoyan is the CEO of PicsArt, Inc.,
