Welcome!

@CloudExpo Authors: Yeshim Deniz, Elizabeth White, Liz McMillan, Zakia Bouachraoui, Pat Romanski

Related Topics: @CloudExpo

@CloudExpo: Blog Feed Post

The R Language & Big Data

Surveys continue to rank R #1 for Data Mining

KDnuggets recently posted its annual poll on data mining software, and the R language retains its #1 ranking as the most commonly-used software for data mining: R is now used by 52.5% of poll respondents, compared with 45% last year.



Donnie Berkholz provides an analysis of the year-on-year trends for Redmonk. He provides the chart below, and notes "the general trend of newer, open-source languages growing at varying speeds (Python followed by R and Hadoop-based options like Hive/Pig), while older languages including Java, SAS, and Matlab are bleeding users".





Meanwhile, the recently-released Rexer Analytics 2011 Data Miner Survey also ranks R #1 amongst data miners overall, and also for corporate, consulting, academic and government use:



Rexer Analytics has conducted their survey since 2007, which provides perspective on the growth of R for data mining over the past five years:



The complete report of the Rexer Data Mining Survey is available on request from Rexer Analytics.

If you're interested in using R for data mining, check out the Revolution Analytics webinar, Introduction to to  R for Data Mining.

Read the original blog entry...

More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire) overseeing the product management of S-PLUS and other statistical and data mining products.<

David smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on twitter at @RevoDavid

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


CloudEXPO Stories
Even if your IT and support staff are well versed in agility and cloud technologies, it can be an uphill battle to establish a DevOps style culture - one where continuous improvement of both products and service delivery is expected and respected and all departments work together throughout a client or service engagement. As a service-oriented provider of cloud and data center technology, Green House Data sought to create more of a culture of innovation and continuous improvement, from our helpdesk on to our product development and cloud service teams. Learn how the Chief Executive team helped guide managers and staff towards this goal with metrics to measure progress, staff hiring or realignment, and new technologies and certifications.
Technology has changed tremendously in the last 20 years. From onion architectures to APIs to microservices to cloud and containers, the technology artifacts shipped by teams has changed. And that's not all - roles have changed too. Functional silos have been replaced by cross-functional teams, the skill sets people need to have has been redefined and the tools and approaches for how software is developed and delivered has transformed. When we move from highly defined rigid roles and systems to more fluid ones, we gain agility at the cost of control. But where do we want to keep control? How do we take advantage of all these new changes without losing the ability to efficiently develop and ship great software? And how should program and project managers adapt?
When Enterprises started adopting Hadoop-based Big Data environments over the last ten years, they were mainly on-premise deployments. Organizations would spin up and manage large Hadoop clusters, where they would funnel exabytes or petabytes of unstructured data.However, over the last few years the economics of maintaining this enormous infrastructure compared with the elastic scalability of viable cloud options has changed this equation. The growth of cloud storage, cloud-managed big data environments, and cloud data warehouses like Snowflake, Redshift, BigQuery and Azure SQL DW, have given the cloud its own gravity - pulling data from existing environments. In this presentation we will discuss this transition, describe the challenges and solutions for creating the data flows necessary to move to cloud analytics, and provide real-world use-cases and benefits obtained through adop...
Docker and Kubernetes are key elements of modern cloud native deployment automations. After building your microservices, common practice is to create docker images and create YAML files to automate the deployment with Docker and Kubernetes. Writing these YAMLs, Dockerfile descriptors are really painful and error prone.Ballerina is a new cloud-native programing language which understands the architecture around it - the compiler is environment aware of microservices directly deployable into infrastructures like Docker and Kubernetes.
Your applications have evolved, your computing needs are changing, and your servers have become more and more dense. But your data center hasn't changed so you can't get the benefits of cheaper, better, smaller, faster... until now. Colovore is Silicon Valley's premier provider of high-density colocation solutions that are a perfect fit for companies operating modern, high-performance hardware. No other Bay Area colo provider can match our density, operating efficiency, and ease of scalability.