Experience of an R Programmer Using the R Runner Custom Step in SAS Studio Flows
Coming to SAS after 6 years of almost exclusively programming in R, I was thrilled to discover ways to combine the strengths of both languages with tools like the R SWAT Package and PROC IML. Most recently I tried the R Runner custom step in SAS Studio Flows, which allows users to code in R, or even upload an entire R script to be used in a flow. The purpose of this post is to share my experience using R Runner and how anyone building a flow may benefit from it.
SAS Studio Flows
If you are asking yourself, “What is a ‘flow’?”, SAS Studio Flows offer a low-code option for analyzing data with SAS Viya. A flow is a sequence of operations on data. Data and operations are represented in SAS Studio by steps that you can access from the Steps section of the navigation pane. Each step in a flow is represented by a node on the flow canvas. While SAS provides several predefined steps to use in flows, users and developers also build custom steps to perform specific tasks. R Runner is a custom step that allows users to run R scripts and write R code to analyze data within a flow.
For additional resources on SAS Studio Flows, see the Quick Start Video and the documentation listed under References.
Set Up
I accessed R Runner through the SAS Studio Custom Steps GitHub repository, which is available to all users.
My first step was to make sure my environment met the requirements for R Runner. I asked my SAS Administrator for the following, which are outlined in more detail here:
- A Viya 4 Environment 2023.08 or later.
- SAS Viya needs access to an active Python and R environment.
- The rpy2 Python package is installed and configured.
- A path to R is available through the R_HOME environment variable.
- The option to run R within SAS Studio compute sessions is enabled with the -RLANG system option.
- Recommended: Administrators could make use of the SAS Configurator for Open Source (also commonly known as sas-pyconfig) to install and configure Python and R access from SAS Viya.
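Once the requirements are in place, a quick way to see which R installation is actually being used is to run a few base R calls, either from an R session on the compute server or later from the R Runner text box. This is only a minimal sketch using base R functions; it is not part of the official setup instructions, and the output will differ between environments.

```r
# Report which R installation this session is running.
r_home_env    <- Sys.getenv("R_HOME")  # empty string if the variable is unset
r_home_actual <- R.home()              # directory R itself reports as its home

cat("R version:    ", R.version.string, "\n")
cat("R_HOME (env): ", if (nzchar(r_home_env)) r_home_env else "<not set>", "\n")
cat("R.home():     ", r_home_actual, "\n")
```

If the two paths disagree, or R_HOME is not set, that is worth mentioning to your administrator before debugging anything else.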
Once my environment was set up, I needed to upload the R Runner custom step into SAS Viya following the steps outlined here.
*Before continuing, view a tour of the R Runner interface here to familiarize yourself with the interface.
Using R Runner
Now that R Runner was ready to go, I did what any programmer would do and began my exploration by using the text box to print the ever-famous line, “Hello World!”
Navigating to the “Submitted Code and Results” log, I found the output of my R Code.
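The screenshots are not reproduced here, so the following is a minimal version of that first test rather than the exact text from the original image:

```r
# Simplest possible smoke test for the R Runner text box
msg <- "Hello World!"
print(msg)
# [1] "Hello World!"
```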
Next, I submitted a regression model with the following steps to analyze the SASHELP.CARS table to understand the relationship between Horsepower and Highway MPG.
- Right-click on the R Runner step and select Add input port.
- From the Libraries pane, select the SASHELP.CARS table and drag it to the flow and connect it to the R Runner input port. This allows me to reference the input data source using the generic table name, r_input_table. (For more detail, click here).
- Add the model code to the textbox. Notice instead of supplying CARS for the table name, I specified r_input_table.
model <- lm(MPG_Highway ~ Horsepower, data=r_input_table)
print(summary(model))
- Run the flow and navigate to the “Submitted Code and Results” log. The program returns a summary of the regression model.
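If you only need a few numbers from the model rather than the full printed summary, the fitted object can be queried directly. The sketch below can be tried in any R session: it builds a stand-in r_input_table from the mtcars data set that ships with R (renaming mpg and hp purely to mirror the column names above, so the numbers will not match SASHELP.CARS). Inside R Runner you would skip the stand-in and use the real r_input_table.

```r
# Stand-in for the table R Runner supplies; mtcars ships with base R.
# Columns are renamed only to mirror the SASHELP.CARS example.
r_input_table <- data.frame(
  MPG_Highway = mtcars$mpg,
  Horsepower  = mtcars$hp
)

model <- lm(MPG_Highway ~ Horsepower, data = r_input_table)

# Pull individual pieces out of the fit instead of printing the whole summary
coefs     <- coef(model)               # named vector: (Intercept), Horsepower
r_squared <- summary(model)$r.squared  # proportion of variance explained

cat("Slope for Horsepower:", round(coefs[["Horsepower"]], 4), "\n")
cat("R-squared:           ", round(r_squared, 4), "\n")
```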
In addition to printing R output, R Runner has the ability to modify SAS tables with R code within a flow.
For example, let’s say I want to filter the SASHELP.CARS table by Make, keeping only Audis. The tidyverse set of packages in R includes convenient data transformation functions. Because tidyverse had already been installed with the SAS Configurator for Open Source, all I needed to do was build my flow and include library(tidyverse) in my R code to load the packages. To create a table with only Audi vehicles, I followed these steps:
- Follow steps 1 and 2 of the previous example to connect SASHELP.CARS to R Runner.
- Right-click R Runner and add an output port, selecting outputtable.
- Right-click the output port and add a table, providing the library WORK and table name AUDI.
- Click R Runner again and add code to the textbox.
library(tidyverse)
AUDI <- r_input_table %>% filter(Make == "Audi")
- Enter AUDI in the “Provide name of R data frame…” textbox. R Runner uses this name to find the data frame it writes to the defined output table.
- Run the flow and navigate to the WORK library to open WORK.AUDI.
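If tidyverse is not available in your environment, the same filter can be written in base R. The sketch below uses a small, made-up stand-in for r_input_table so it can be run anywhere; the rows are illustrative and not copied from SASHELP.CARS. Using which() mirrors the behavior of filter(), which drops rows where the condition is NA.

```r
# Hypothetical stand-in for r_input_table with two illustrative columns
r_input_table <- data.frame(
  Make  = c("Acura", "Audi", "Audi", "BMW"),
  Model = c("MDX", "A4", "A6", "325i"),
  stringsAsFactors = FALSE
)

# Base R equivalent of: r_input_table %>% filter(Make == "Audi")
AUDI <- r_input_table[which(r_input_table$Make == "Audi"), , drop = FALSE]

# Sanity checks before the data frame is handed to the output port
stopifnot(nrow(AUDI) > 0, all(AUDI$Make == "Audi"))
print(AUDI)
```

If the filter unexpectedly returns zero rows on real data, one thing to check is whether the character values carry leading or trailing blanks, in which case comparing against trimws(r_input_table$Make) may help.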
Submitting an R Script
Next, I put the same code in an R script and provided the location of the script to R Runner, which gave the same results. This method is especially helpful for running longer programs, and it lets users build the script with the editing capabilities of an IDE such as RStudio and then execute it within a flow.
In some Viya environments, if using the folder button on the right, the path provided may not be complete. If you receive an error, find the location of your R script in the file system and copy/paste it to the text box instead with sasserver: in front of the path.
Data Visualization
You can even take advantage of R’s graphing capabilities within a flow. Below is an example creating a histogram using the ggplot2 package in R. This feature is limited to creating PDFs currently, but future developments will expand R Runner’s graphing capabilities.
I created a histogram of the Highway MPG data from SASHELP.CARS following these steps:
- Connect the SASHELP.CARS dataset to R Runner following steps 1-2 from the previous examples.
- Add code to create and save the plot in the textbox.
library(ggplot2)  # needed unless tidyverse is already loaded in this step
p <- ggplot(r_input_table, aes(x = MPG_Highway)) +
  geom_histogram(fill = "white", color = "blue") +
  ggtitle("Highway MPG")
ggsave("MPG_Hist.pdf", p, path = "/home/student")
- Locate the PDF in your file directory.
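Because the plot is written to disk rather than shown in the results tab, it can be useful to confirm from R that the file really was created. The sketch below shows the idea with base R graphics and a temporary directory so that it runs in any R session; the stand-in data comes from mtcars, and whether R Runner handles the base pdf() device the same way it handles ggsave() is an assumption that has not been verified here.

```r
# Hypothetical stand-in data; in R Runner this would be r_input_table
r_input_table <- data.frame(MPG_Highway = mtcars$mpg)

# Write to a temporary directory here; the article used /home/student
out_file <- file.path(tempdir(), "MPG_Hist.pdf")

pdf(out_file)                       # open a PDF graphics device
hist(r_input_table$MPG_Highway,
     main = "Highway MPG", xlab = "MPG_Highway",
     col = "white", border = "blue")
invisible(dev.off())                # close the device so the file is written

# Confirm the file exists and is not empty
stopifnot(file.exists(out_file), file.size(out_file) > 0)
cat("Plot written to:", out_file, "\n")
```

The same file.exists() check works after ggsave() by pointing it at the path and file name passed to that call.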
Installing R Packages
When configuring R with SAS Viya using the SAS Configurator for Open Source, certain R packages are suggested as part of the installation manifest. If there are specific R packages you know you will need to use as part of your R script, please let your SAS administrator know so they can include those packages in the configuration. As of now, users cannot install R packages directly using R Runner.
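Before asking for packages, it helps to know exactly which ones are missing. A minimal sketch, using only base R, is to test each dependency with requireNamespace(), which reports availability without attaching the package. The package names in the vector are examples; replace them with whatever your script needs.

```r
# Packages the script depends on; edit this vector to match your own script
needed <- c("stats", "tidyverse", "ggplot2")

# requireNamespace() returns TRUE/FALSE without attaching the package
available <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- needed[!available]
if (length(missing_pkgs) > 0) {
  cat("Ask your SAS administrator to install:",
      paste(missing_pkgs, collapse = ", "), "\n")
} else {
  cat("All required packages are available.\n")
}
```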
Future Developments
R Runner developers are in the process of planning improvements and maintain an upstream repository for this purpose here.
Summary
R Runner serves as an excellent tool for integrating the capabilities of both R and SAS within flows. The most current version of R Runner can be accessed here.
References
SAS Studio Flows Quick Start Video
SAS Studio Flows Documentation
SAS Studio Custom Steps -- R Runner Git Repository
Custom Step Installation Instructions
R Runner Upstream GitHub Repository
Find more articles from SAS Global Enablement and Learning here.