By David Smith | October 24, 2012 06:39 PM EDT
The O'Reilly Strata conferences are always great fun to attend, and this latest installment in New York City is no exception. This one is super-busy, though: the conference has been sold out for weeks -- and not just marketing-sold-out, it's fire-department-sold-out. It's non-stop conversations and presentations, and it's tough to move through the hallways in between.
Nonetheless, I thought I'd pause for a couple of minutes and share some of the highlights for me so far.
Ed Kohlwey and Stephanie Beben gave a three-hour tutorial on the RHadoop project, showing the packed room how to crunch big data. They shared how consulting firm Booz Allen Hamilton uses R and Hadoop for data exploration; to run many tasks in parallel; and to sort, sample and join data. They've also created a very handy VirtualBox VM including R, Hadoop, RHadoop and RStudio (along with demonstration script files), which I hope to be able to post a download link for soon.
Stan Humphries from Zillow gave a presentation on how data and statistical analysis drive Zillow's home valuation service. One fascinating tidbit: while Zillow has long used R to fit their valuation model, until recently they recoded the model scoring algorithm in C++ for use on the production site. The process of re-implementing a new version of the model, validating it, and deploying it used to take nine months. But now that they run R in production via the Amazon cloud, without the need to recode the model in another language, the deployment time for new valuation models is just four weeks.
Mike Driscoll from Metamarkets shared the technology behind their data stack: node.js and D3 for visualization; R and Scala for analytics; Druid as the data store; and Hadoop and Kafka for ETL. Druid is Metamarkets' home-grown, high-performance data store, which they announced today is now available as open-source software.
In a similar vein, Cloudera announced the release of Impala, an open-source project two years in the making to bring high-performance real-time analytics to Hadoop.
And there were even more announcements: Kaggle launched a partnership with EMC to give Greenplum users direct access to Kaggle's roster of competing data scientists.
It's been a great conference so far, and this is only day one! Looking forward to more great talks and conversations tomorrow.
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By David Smith
David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products. David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.