Welcome!

Cloud Expo Authors: Liz McMillan, Carmen Gonzalez, Elizabeth White, Wallace Sann, Ian Khan

Related Topics: Cloud Expo, Open Source

Cloud Expo: Blog Feed Post

Faster R in Hadoop: rmr 1.3 Now Available

Continuing the Big Data integration of R and Hadoop

The RHadoop project continues the Big Data integration of R and Hadoop, with a new update to its rmr package.

Version 1.3 of rmr improves the performance of map-reduce jobs for Hadoop written in R. New features include: An optional vectorized API for efficient R programming when dealing with small records.

Fast C implementations for serialization and deserialization from and to typedbytes. Other readers and writers work much better in vectorized mode, namely csv and text Additional steps to support structured data better (use more data frames and fewer lists in the API) More forgiving behavior for package loading and bug fixes Also, the documentation has gotten a major overhaul in this version, with pages of combined text, code and graphics generated automatically using the knitr package.




(RHadoop lead developer Antonio Piccolboni provides some background on how knitr is used in these documentation guidelines.)

If you haven't take a look at rmr before, this tutorial by Jeffrey Breen is a great place to get started. Otherwise, check out the wiki pages on the RHadoop github site, linked below. github: RevolutionAnalytics / RHadoop

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.<

David smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on twitter at @RevoDavid

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@CloudExpo Stories
It's time to face reality: "Americans are from Mars, Europeans are from Venus," and in today's increasingly connected world, understanding "inter-planetary" alignments and deviations is mission-critical for cloud. In her session at 15th Cloud Expo, Evelyn de Souza, Data Privacy and Compliance Strategy Leader at Cisco Systems, discussed cultural expectations of privacy based on new research across these elements
Mobile, the cloud and data have upended traditional ways of doing business. Agile, continuous delivery and DevOps have stepped in to hasten product development, but one crucial process still hasn't caught up. Continuous content delivery is the missing limb of the success ecosystem. Currently workers spend countless, non-value add hours working in independent silos, hunting for versions, manually pushing documents between platforms, all while trying to manage the continuous update and flow of mul...
Over the last few years the healthcare ecosystem has revolved around innovations in Electronic Health Record (HER) based systems. This evolution has helped us achieve much desired interoperability. Now the focus is shifting to other equally important aspects - scalability and performance. While applying cloud computing environments to the EHR systems, a special consideration needs to be given to the cloud enablement of Veterans Health Information Systems and Technology Architecture (VistA), i.e....
"At Harbinger we do products as well as services. Our services are with helping companies move their products to the cloud operating systems. Some of the challenges we have seen as far as cloud adoption goes are in the cloud security space," noted Shrikant Pattathil, Executive Vice President at Harbinger Systems, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
After a couple of false starts, cloud-based desktop solutions are picking up steam, driven by trends such as BYOD and pervasive high-speed connectivity. In his session at 15th Cloud Expo, Seth Bostock, CEO of IndependenceIT, cut through the hype and the acronyms, and discussed the emergence of full-featured cloud workspaces that do for the desktop what cloud infrastructure did for the server. He also discussed VDI vs DaaS, implementation strategies and evaluation criteria.
Things are being built upon cloud foundations to transform organizations. This CEO Power Panel at 15th Cloud Expo, moderated by Roger Strukhoff, Cloud Expo and @ThingsExpo conference chair, addressed the big issues involving these technologies and, more important, the results they will achieve. Rodney Rogers, chairman and CEO of Virtustream; Brendan O'Brien, co-founder of Aria Systems, Bart Copeland, president and CEO of ActiveState Software; Jim Cowie, chief scientist at Dyn; Dave Wagstaff, VP ...
“The year of the cloud – we have no idea when it's really happening but we think it's happening now. For those technology providers like Zentera that are helping enterprises move to the cloud - it's been fun to watch," noted Mike Loftus, VP Product Management and Marketing at Zentera Systems, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Bring the world's best IaaS to the world's best PaaS. In their session at 15th Cloud Expo, Animesh Singh, a Senior Cloud Architect and Strategist for IBM Cloud Labs, and Jason Anderson, a Cloud Architect for IBM Cloud Labs, shared their experiences running Cloud Foundry on OpenStack. They focused on how Cloud Foundry and OpenStack complement each other, how they technically integrate using cloud provider interface (CPI), how we could automate an OpenStack setup for Cloud Foundry deployments, an...
"Application monitoring and intelligence can smooth the path in a DevOps environment. In a DevOps environment you see constant change. If you are trying to monitor things in a constantly changing environment, you're going to spend a lot of your job fixing your monitoring," explained Todd Rader, Solutions Architect at AppDynamics, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
"Desktop as a Service is emerging as a very big trend. One of the big influencers of this – for Esri – is that we have a large user base that uses virtualization and they are looking at Desktop as a Service right now," explained John Meza, Product Engineer at Esri, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
What do a firewall and a fortress have in common? They are no longer strong enough to protect the valuables housed inside. Like the walls of an old fortress, the cracks in the firewall are allowing the bad guys to slip in - unannounced and unnoticed. By the time these thieves get in, the damage is already done and the network is already compromised. Intellectual property is easily slipped out the back door leaving no trace of forced entry. If we want to reign in on these cybercriminals, it's hig...
Scott Jenson leads a project called The Physical Web within the Chrome team at Google. Project members are working to take the scalability and openness of the web and use it to talk to the exponentially exploding range of smart devices. Nearly every company today working on the IoT comes up with the same basic solution: use my server and you'll be fine. But if we really believe there will be trillions of these devices, that just can't scale. We need a system that is open a scalable and by using ...
The consumption economy is here and so are cloud applications and solutions that offer more than subscription and flat fee models and at the same time are available on a pure consumption model, which not only reduces IT spend but also lowers infrastructure costs, and offers ease of use and availability. In their session at 15th Cloud Expo, Ermanno Bonifazi, CEO & Founder of Solgenia, and Ian Khan, Global Strategic Positioning & Brand Manager at Solgenia, discussed this shifting dynamic with an ...
More and more file-based and machine generated data is being created every day causing exponential data and content growth, and creating a management nightmare for IT managers. What data centers really need to cope with this growth is a purpose-built tiered archive appliance that enables users to establish a single storage target for all of their applications - an appliance that will intelligently place and move data to and between storage tiers based on user-defined policies. In her session a...
Recurring revenue models are great for driving new business in every market sector, but they are complex and need to be effectively managed to maximize profits. How you handle the range of options for pricing, co-terming and proration will ultimately determine the fate of your bottom line. In his session at 15th Cloud Expo, Brendan O'Brien, Co-founder at Aria Systems, session examined: How time impacts recurring revenue How to effectively handle customer plan changes The range of pricing a...
SAP is delivering break-through innovation combined with fantastic user experience powered by the market-leading in-memory technology, SAP HANA. In his General Session at 15th Cloud Expo, Thorsten Leiduck, VP ISVs & Digital Commerce, SAP, discussed how SAP and partners provide cloud and hybrid cloud solutions as well as real-time Big Data offerings that help companies of all sizes and industries run better. SAP launched an application challenge to award the most innovative SAP HANA and SAP HANA...
Axios Systems on Tuesday announced it has selected CenturyLink Cloud as the hosting platform for Axios Systems’ IT Service Management (ITSM) solutions in Canada. CenturyLink, a global provider of communications and IT services, joins other leading technology providers across North America, Europe and APAC to help Axios’ international customers drive efficiencies and innovation across their service management provision. The arrangement with CenturyLink enables Axios to further strengthen its pres...
"SAP had made a big transition into the cloud as we believe it has significant value for our customers, drives innovation and is easy to consume. When you look at the SAP portfolio, SAP HANA is the underlying platform and it powers all of our platforms and all of our analytics," explained Thorsten Leiduck, VP ISVs & Digital Commerce at SAP, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
The Industrial Internet revolution is now underway, enabled by connected machines and billions of devices that communicate and collaborate. The massive amounts of Big Data requiring real-time analysis is flooding legacy IT systems and giving way to cloud environments that can handle the unpredictable workloads. Yet many barriers remain until we can fully realize the opportunities and benefits from the convergence of machines and devices with Big Data and the cloud, including interoperability, ...
Today’s enterprise is being driven by disruptive competitive and human capital requirements to provide enterprise application access through not only desktops, but also mobile devices. To retrofit existing programs across all these devices using traditional programming methods is very costly and time consuming – often prohibitively so. In his session at @ThingsExpo, Jesse Shiah, CEO, President, and Co-Founder of AgilePoint Inc., discussed how you can create applications that run on all mobile ...