Welcome!

Cloud Expo Authors: Liz McMillan, Vincent Brasseur, Pat Romanski, Elizabeth White, Roger Strukhoff

Related Topics: Cloud Expo, Java, SOA & WOA, Virtualization, Search, Web 2.0

Cloud Expo: Blog Feed Post

The Next Big Thing: WeeData

Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing

‘Big Data’ has a problem, and that problem is its name.

Dig deep into the big data ecosystem, or spend any time at all talking with its practitioners, and you should quickly start hitting the Vs. Initially Volume, Velocity and Variety, the Vs rapidly bred like rabbits. Now we have a plethora of new V-words, including Value, Veracity, and more. Every new presentation on big data, it seems, feels obligated to add a V to the pile.

Gartner stands by the original three, stating earlier this year that

Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.

(my emphasis)

But by latching onto the ‘big’ part of the name, and reinforcing that with the ‘volume’ V, we become distracted and run the risk of rapidly missing the point. The implication from a whole industry is that size matters. Bigger is better. If you don’t collect everything, you’re woefully out of touch. And if you’re not counting in petas, exas, zettas or yottas, how on earth do you live with the shame?

From the outset, though, size was only part of the picture. Streams of data from social networks, traffic management systems or stock control processes raise a lot of challenges because of the speed with which data must be ingested, or the rapidity with which actionable decisions must be taken. Data volumes may only be a few gigabytes or – oh, the embarrassment – megabytes, but the challenge is still very real. Combining data of different types from disparate sources also creates opportunities. Video from traffic cameras, combined with the logs of pressure sensors beneath the roads, weather data from forecasters and historic records, and social networking comments about the smoothness (or otherwise) of the morning commute build a rich picture that no single source can provide. The initial data sets may be large, but the subset that is actually relevant to the problem being studied will often be far smaller. Variety is the challenge here, not volume.

Of all the Vs clamouring to join the initial three, the most compelling to me must surely be Value. What is the value of the insight offered by this data, regardless of how big it is, how fast it’s coming at me, or how many formats it comprises?

The correct interpretation of the right analysis, performed on the optimal set of data. Surely that’s what we should celebrate? Sometimes, certainly, that analysis will be performed on mind-bogglingly huge data stores. But sometimes it won’t, and the results can be just as valuable. Is it better to collect everything and then extract the gigabyte or two of data you actually need, or does it make more sense to know what you’re interested in and devise a collection strategy that gets you the right data in the first place?

Big Data is impressive stuff. This fledgling market segment is full of interesting companies doing remarkable things. The ability to hold genomes, city traffic systems, internet search logs, or complex financial models in memory and to manipulate them in order to derive insight in real time is stunning and transformative. The switch from highly structured relational databases to schema-less data stores creates new opportunities, and new challenges.

Let’s celebrate the value of data volume when it’s appropriate to do so, but let’s not create the patently false impression that big is best.

Ben Kepes and I jokingly used Twitter to announce a new company at last year’s Defrag. WeeData (‘wee‘ as in small, not the other kind) was to concern itself with everything that big data’s champions did not, and that’s a very large market indeed. We’re hoping that the people who rushed to sign up as customers, investors, and Board members got the joke…

Read the original blog entry...

More Stories By Paul Miller

Paul Miller works at the interface between the worlds of Cloud Computing and the Semantic Web, providing the insights that enable you to exploit the next wave as we approach the World Wide Database. He blogs at www.cloudofdata.com.

@CloudExpo Stories
High-performing enterprise Software Quality Assurance (SQA) teams validate systems that are ready for use - getting most actively involved as components integrate and form complete systems. These teams catch and report on defects, making sure the customer gets the best software possible. SQA teams have leveraged automation and virtualization to execute more thorough testing in less time - bringing Dev and Ops together, ensuring production readiness. Does the emergence of DevOps mean the end of E...
Scott Jenson leads a project called The Physical Web within the Chrome team at Google. Project members are working to take the scalability and openness of the web and use it to talk to the exponentially exploding range of smart devices. Nearly every company today working on the IoT comes up with the same basic solution: use my server and you'll be fine. But if we really believe there will be trillions of these devices, that just can't scale. We need a system that is open a scalable and by using ...
The Internet of Things is tied together with a thin strand that is known as time. Coincidentally, at the core of nearly all data analytics is a timestamp. When working with time series data there are a few core principles that everyone should consider, especially across datasets where time is the common boundary. In his session at Internet of @ThingsExpo, Jim Scott, Director of Enterprise Strategy & Architecture at MapR Technologies, discussed single-value, geo-spatial, and log time series dat...
"Verizon offers public cloud, virtual private cloud as well as private cloud on-premises - many different alternatives. Verizon's deep knowledge in applications and the fact that we are responsible for applications that make call outs to other systems. Those systems and those resources may not be in Verizon Cloud, we understand at the end of the day it's going to be federated," explained Anne Plese, Senior Consultant, Cloud Product Marketing at Verizon Enterprise, in this SYS-CON.tv interview at...
"For the past 4 years we have been working mainly to export. For the last 3 or 4 years the main market was Russia. In the past year we have been working to expand our footprint in Europe and the United States," explained Andris Gailitis, CEO of DEAC, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
P2P RTC will impact the landscape of communications, shifting from traditional telephony style communications models to OTT (Over-The-Top) cloud assisted & PaaS (Platform as a Service) communication services. The P2P shift will impact many areas of our lives, from mobile communication, human interactive web services, RTC and telephony infrastructure, user federation, security and privacy implications, business costs, and scalability. In his session at @ThingsExpo, Robin Raymond, Chief Architect...
The term culture has had a polarizing effect among DevOps supporters. Some propose that culture change is critical for success with DevOps, but are remiss to define culture. Some talk about a DevOps culture but then reference activities that could lead to culture change and there are those that talk about culture change as a set of behaviors that need to be adopted by those in IT. There is no question that businesses successful in adopting a DevOps mindset have seen departmental culture change, ...
The Domain Name Service (DNS) is one of the most important components in networking infrastructure, enabling users and services to access applications by translating URLs (names) into IP addresses (numbers). Because every icon and URL and all embedded content on a website requires a DNS lookup loading complex sites necessitates hundreds of DNS queries. In addition, as more internet-enabled ‘Things' get connected, people will rely on DNS to name and find their fridges, toasters and toilets. Acco...
"Cloud consumption is something we envision at Solgenia. That is trying to let the cloud spread to the user as a consumption, as utility computing. We want to allow the people to just pay for what they use, not a subscription model," explained Ermanno Bonifazi, CEO & Founder of Solgenia, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Enthusiasm for the Internet of Things has reached an all-time high. In 2013 alone, venture capitalists spent more than $1 billion dollars investing in the IoT space. With "smart" appliances and devices, IoT covers wearable smart devices, cloud services to hardware companies. Nest, a Google company, detects temperatures inside homes and automatically adjusts it by tracking its user's habit. These technologies are quickly developing and with it come challenges such as bridging infrastructure gaps,...
SYS-CON Media announced that Centrify, a provider of unified identity management across cloud, mobile and data center environments that delivers single sign-on (SSO) for users and a simplified identity infrastructure for IT, has launched an ad campaign on Cloud Computing Journal. The ads focus on security: how an organization can successfully control privilege for all of the organization’s identities to mitigate identity-related risk without slowing down the business, and how Centrify provides ...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, discussed how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP HANA...
"SAP had made a big transition into the cloud as we believe it has significant value for our customers, drives innovation and is easy to consume. When you look at the SAP portfolio, SAP HANA is the underlying platform and it powers all of our platforms and all of our analytics," explained Thorsten Leiduck, VP ISVs & Digital Commerce at SAP, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
"We help companies that are using a lot of Software as a Service. We help companies manage and gain visibility into what people are using inside the company and decide to secure them or use standards to lock down or to embrace the adoption of SaaS inside the company," explained Scott Kriz, Co-founder and CEO of Bitium, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at Internet of @ThingsExpo, James Kirkland, Chief Ar...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, discussed how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP HANA...
Bit6 today issued a challenge to the technology community implementing Web Real Time Communication (WebRTC). To leap beyond WebRTC’s significant limitations and fully leverage its underlying value to accelerate innovation, application developers need to consider the entire communications ecosystem.
The definition of IoT is not new, in fact it’s been around for over a decade. What has changed is the public's awareness that the technology we use on a daily basis has caught up on the vision of an always on, always connected world. If you look into the details of what comprises the IoT, you’ll see that it includes everything from cloud computing, Big Data analytics, “Things,” Web communication, applications, network, storage, etc. It is essentially including everything connected online from ha...
Cloud Expo 2014 TV commercials will feature @ThingsExpo, which was launched in June, 2014 at New York City's Javits Center as the largest 'Internet of Things' event in the world.
SYS-CON Events announced today that Windstream, a leading provider of advanced network and cloud communications, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Windstream (Nasdaq: WIN), a FORTUNE 500 and S&P 500 company, is a leading provider of advanced network communications, including cloud computing and managed services, to businesses nationwide. The company also offers broadband, p...