
Experience of an R Programmer Using the R Runner Custom Step in SAS Studio Flows

Started 02-12-2024, modified 02-12-2024

Coming to SAS after 6 years of almost exclusively programming in R, I was thrilled to discover ways to combine the strengths of both languages with tools like the R SWAT Package and PROC IML. Most recently I tried the R Runner custom step in SAS Studio Flows, which allows users to code in R, or even upload an entire R script to be used in a flow. The purpose of this post is to share my experience using R Runner and how anyone building a flow may benefit from it.

 

SAS Studio Flows

 

If you are asking yourself, “What is a ‘flow’?”, SAS Studio Flows offer a low-code option for analyzing data with SAS Viya. A flow is a sequence of operations on data. Data and operations are represented in SAS Studio by steps that you can access from the Steps section of the navigation pane. Each step in a flow is represented by a node on the flow canvas. While SAS provides several predefined steps to use in flows, users and developers also build custom steps to perform specific tasks. R Runner is a custom step that allows users to run R scripts and write R code to analyze data within a flow.

 

Here are some additional resources regarding SAS Studio Flows:

Quick Start Video

Documentation

SAS Studio Flow Course

 

Set Up

 

I accessed R Runner through the SAS Studio Custom Steps GitHub repository, which is available to all users.

 

My first step was to make sure my environment met the requirements for R Runner. I asked my SAS Administrator for the following, which are outlined in more detail here:

 

  • A SAS Viya 4 environment, version 2023.08 or later.
  • Access from SAS Viya to an active Python and R environment.
  • The rpy2 Python package, installed and configured.
  • A path to R, available through the R_HOME environment variable.
  • The option to run R within SAS Studio compute sessions, enabled with the -RLANG system option.
  • Recommended: administrators can use the SAS Configurator for Open Source (commonly known as sas-pyconfig) to install and configure Python and R access from SAS Viya.
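
Once R is reachable, a few lines of base R can confirm which R installation is actually in use. The snippet below is only a quick sanity-check sketch, not part of the official setup instructions. It uses nothing beyond base R and can be pasted into the R Runner text box or run in any R session.

```r
# Quick sanity check of the R environment (base R only).
r_home    <- Sys.getenv("R_HOME")   # R_HOME as seen by the running R process
r_version <- R.version.string       # version string of the R that is executing this code

cat("R_HOME:   ", if (nzchar(r_home)) r_home else "(not set)", "\n")
cat("R version:", r_version, "\n")

# Folders where R looks for installed packages
cat("Library paths:\n")
print(.libPaths())
```

If the reported R_HOME or library paths are not what you expect, that is a useful detail to pass along to your SAS Administrator.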

 

Once my environment was set up, I needed to upload the R Runner custom step into SAS Viya following the steps outlined here.

 

*Before continuing, view a tour of R Runner here to familiarize yourself with the interface.

 

Using R Runner

 

Now that R Runner was ready to go, I did what any programmer would do and began my exploration by using the text box to print the ever-famous line, “Hello World!”

 

01_ST_hello1-1536x1058.png


 

Navigating to the “Submitted Code and Results” log, I found the output of my R Code.

 

02_ST_hello2.png

 

Next, I fit a regression model on the SASHELP.CARS table to examine the relationship between Horsepower and Highway MPG, following these steps:

 

  1. Right-click on the R Runner step and select Add input port.

03_ST_regression1.png

  2. From the Libraries pane, drag the SASHELP.CARS table onto the flow and connect it to the R Runner input port. The R code can then reference the input data through the generic table name r_input_table. (For more detail, click here).

04_ST_regression3.gif

  3. Add the model code to the textbox. Notice that instead of supplying CARS as the table name, I specified r_input_table.
# Fit a simple linear regression of highway MPG on horsepower
model <- lm(MPG_Highway ~ Horsepower, data = r_input_table)
# Print coefficients, standard errors, and R-squared
print(summary(model))

05_ST_regression4.png

  4. Run the flow and navigate to the “Submitted Code and Results” log. The program returns a summary of the regression model.

06_ST_regression5.png
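
If you only need a few numbers rather than the full summary, individual results can be pulled out of the model object. The sketch below is illustrative: r_input_table exists only inside R Runner, so the code builds a stand-in from R's built-in mtcars data, with mpg and hp standing in for MPG_Highway and Horsepower. The values it prints therefore describe mtcars, not SASHELP.CARS.

```r
# Stand-in for r_input_table; inside R Runner the input port supplies this data frame.
# mtcars columns are used here only as an assumption so the example runs anywhere.
r_input_table <- data.frame(
  MPG_Highway = mtcars$mpg,
  Horsepower  = mtcars$hp
)

model <- lm(MPG_Highway ~ Horsepower, data = r_input_table)

# Extract single values instead of printing the whole summary.
slope     <- coef(model)[["Horsepower"]]   # estimated change in MPG per unit of horsepower
r_squared <- summary(model)$r.squared      # share of variance explained by the model

cat("Slope for Horsepower:", round(slope, 4), "\n")
cat("R-squared:", round(r_squared, 4), "\n")
```

Inside a flow, you would drop the stand-in data frame and keep only the lines from lm() onward.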

 

In addition to printing R output, R Runner can modify SAS tables with R code within a flow.

 

For example, let’s say I want to filter the SASHELP.CARS table by Make, keeping only Audis. The tidyverse set of packages in R includes convenient data transformation functions. In my environment, tidyverse had been installed with the SAS Configurator for Open Source, so all I needed to do was build my flow and include library(tidyverse) in my R code to load the tidyverse packages. To create a table containing only Audi vehicles, I followed these steps:

 

  1. Follow steps 1 and 2 of the previous example to connect SASHELP.CARS to R Runner.
  2. Right-click R Runner and add an output port, selecting outputtable.

07_ST_filter1-1.png

  3. Right-click the output port and add a table, providing the library WORK and the table name AUDI.

08_ST_filter2.png

09_ST_filter3.png

  4. Click R Runner again and add code to the textbox.
# Load the tidyverse packages, including dplyr for filter()
library(tidyverse)
# Keep only the rows where Make is Audi
AUDI <- r_input_table %>% filter(Make == "Audi")

  5. Enter AUDI in the “Provide name of R data frame…” textbox. R Runner uses this name to find the data frame to write to the defined output table.

10_ST_filter4.png

  6. Run the flow and navigate to the WORK library to open WORK.AUDI.

11_ST_filter5.png
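
If tidyverse is not available in your environment, the same filter can be written in base R. The following is a minimal sketch, not the approach used in the flow above. The small data frame is a hypothetical stand-in for r_input_table, which the input port supplies inside R Runner.

```r
# Hypothetical stand-in for r_input_table; the rows are made up for illustration.
r_input_table <- data.frame(
  Make  = c("Audi", "BMW", "Audi", "Acura"),
  Model = c("A4", "X5", "TT", "MDX"),
  stringsAsFactors = FALSE
)

# Base R counterpart of: r_input_table %>% filter(Make == "Audi")
# Like filter(), subset() drops rows where the condition is NA.
AUDI <- subset(r_input_table, Make == "Audi")

print(AUDI)
```

As in the tidyverse version, the data frame name (AUDI) is what you would enter in the “Provide name of R data frame…” textbox.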

 

Submitting an R Script

 

Next, I put the same code in an R script and gave R Runner the location of the script, which produced the same results. This method is especially helpful for longer programs: you can build the script with the editing capabilities of an IDE such as RStudio, then execute it within a flow.
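
One way to make such a script convenient to develop in an IDE is to create a small test table whenever r_input_table is missing, since that object exists only when the script runs inside R Runner. The sketch below rests on that assumption. The file name and the sample rows are hypothetical, and base R is used so it runs without extra packages.

```r
# filter_audi.R (hypothetical file name)

# Outside R Runner there is no r_input_table, so build a tiny stand-in for local testing.
# Inside a flow this block is skipped because the input port has already created the table.
if (!exists("r_input_table")) {
  r_input_table <- data.frame(
    Make        = c("Audi", "BMW", "Audi"),
    MPG_Highway = c(31, 25, 28),
    stringsAsFactors = FALSE
  )
}

# Same filter as in the text box example, written in base R.
AUDI <- subset(r_input_table, Make == "Audi")

cat("Rows kept:", nrow(AUDI), "\n")
```

With this guard in place, the same file can be run unchanged in RStudio and in the flow.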

 

12_ST_script1.png

 

13_ST_filter6.png

 

In some Viya environments, the folder button on the right may return an incomplete path. If you receive an error, locate your R script in the file system and paste its full path into the text box instead, prefixed with sasserver:.

 

14_ST_folder-1.png

 

15_ST_script2.png

 

16_ST_script3.png

 

17_ST_script4.png

 

Data Visualization

 

You can even take advantage of R’s graphing capabilities within a flow. Below is an example that creates a histogram using the ggplot2 package in R. Currently, this feature is limited to creating PDFs, but future developments will expand R Runner’s graphing capabilities.

 

I created a histogram of the Highway MPG data from SASHELP.CARS following these steps:

 

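  1. Connect the SASHELP.CARS table to R Runner following steps 1 and 2 from the previous examples.
  2. Add code to the textbox to create and save the plot.
# Load ggplot2 for plotting
library(ggplot2)
# Histogram of highway MPG with white bars outlined in blue
p <- ggplot(r_input_table, aes(x = MPG_Highway)) +
  geom_histogram(fill = "white", color = "blue") +
  ggtitle("Highway MPG")
# Save the plot as a PDF; replace /home/student with a folder you can write to
ggsave("MPG_Hist.pdf", p, path = "/home/student")

  3. Locate the PDF in your file directory.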

18_ST_graphing1.png

 

19_ST_graphing2.png
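
The sketch below shows a slightly more defensive variant of the plotting code. It sets an explicit bin count and page size, and falls back to base R graphics when ggplot2 is not installed. Several things here are assumptions: the mtcars stand-in for r_input_table, the temporary output folder, and the fallback itself, which the walkthrough above did not test inside R Runner.

```r
# Stand-in for r_input_table (assumption); inside R Runner the input port supplies it.
r_input_table <- data.frame(MPG_Highway = mtcars$mpg)

# tempdir() keeps the sketch runnable anywhere; in a flow, use a folder you can write to.
out_file <- file.path(tempdir(), "MPG_Hist.pdf")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(r_input_table, aes(x = MPG_Highway)) +
    geom_histogram(bins = 10, fill = "white", color = "blue") +
    ggtitle("Highway MPG")
  # ggsave() picks the PDF device from the file extension; width and height are in inches.
  ggsave(out_file, plot = p, width = 6, height = 4)
} else {
  # Base R fallback: open a PDF device, draw the histogram, then close the device.
  pdf(out_file, width = 6, height = 4)
  hist(r_input_table$MPG_Highway, breaks = 10, col = "white", border = "blue",
       main = "Highway MPG", xlab = "MPG_Highway")
  dev.off()
}

cat("Plot written:", file.exists(out_file), "\n")
```

Setting bins explicitly also avoids the message ggplot2 prints when it has to pick a default bin count.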

 

Installing R Packages

 

When configuring R with SAS Viya using the SAS Configurator for Open Source, certain R packages are suggested as part of the installation manifest. If there are specific R packages you know you will need to use as part of your R script, please let your SAS administrator know so they can include those packages in the configuration. As of now, users cannot install R packages directly using R Runner.
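
Before sending a request to your administrator, it can help to check which packages are already available. The sketch below does this with requireNamespace(), which returns FALSE rather than raising an error when a package is absent. The package list is hypothetical and should be edited to match your script.

```r
# Packages the script depends on (hypothetical list; edit to match your script).
needed <- c("stats", "tidyverse", "ggplot2")

# Check each package without attaching it; the result is a named logical vector.
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  cat("Ask your SAS administrator to add:", paste(missing_pkgs, collapse = ", "), "\n")
} else {
  cat("All required packages are available.\n")
}
```

Run inside R Runner, this reports on the R environment the flow actually uses, which may differ from the R installation on your own machine.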

 

Future Developments

 

R Runner developers are planning improvements and maintain an upstream repository for this purpose here.

 

Summary

 

R Runner serves as an excellent tool for integrating the capabilities of both R and SAS within flows. The most current version of R Runner can be accessed here.

 

References

 

R SWAT and PROC IML

 

SAS Studio Flows Quick Start Video

 

SAS Studio Flows Documentation

 

SAS Studio Flows Course

 

R Runner Communities Post

 

SAS Studio Custom Steps -- R Runner Git Repository

 

Custom Step Installation Instructions

 

R Runner Upstream GitHub Repository

 

 

Find more articles from SAS Global Enablement and Learning here.
