Get it now: A 3-1 Special on R!
If you thought we were done helping you get your data into SAS Viya for Learners (VFL), read this: you were wrong. This library article will cater to our R lovers out there. Another way to handle the 100mb restriction for uploads is to do your data preprocessing elsewhere – like R. You can then import the cleaned data into SAS VFL to use our visualizations, dashboards, advanced machine learning tools, etc.
But wait – what exactly is the 3-1 special? Well, here’s a teaser for this article, bullet point style:
Now, let’s get into those finer details. Let’s start from the top by launching Jupyter from the main SAS Drive page:
Once Jupyter launches, you’ll see a browser akin to the following:
Since we’re in the R spirit of things, let’s open an R Notebook by clicking here:
For my example, I’m going to use data from the ggplot2 library – which is used to create graphics in R. If you’re just here to learn, then please follow along. But, if you’ve already got your own R dataset, please upload it to VFL using the approach outlined in one of my earlier posts:
See what I did there? I cited myself. That’s how we get more cites.
Anyway, to access ggplot2, type the following in the first command line:
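For reference, that first cell would typically hold the standard attach call. This is a minimal sketch that assumes ggplot2 is already installed in the notebook's R kernel; if it is not, install.packages("ggplot2") would be needed first.

```r
# Attach ggplot2 so its functions and bundled datasets are available
library(ggplot2)
```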
And go ahead and submit that line, if you’re eager to see output. As a reminder there are two ways to submit lines of code in Jupyter. The first is the old-school play button:
The second is to click in the cell and press Shift+Enter.
Now that ggplot2 is loaded, let’s examine which datasets are available in the package. In the second cell, type:
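One way to list the bundled datasets is base R's data() function. The sketch below assumes ggplot2 is installed; the variable name available is mine, chosen only for illustration.

```r
# List every dataset that ships with ggplot2
available <- data(package = "ggplot2")

# The result holds a character matrix; keep the dataset name and its description
available$results[, c("Item", "Title")]
```

Calling data(package = "ggplot2") on its own, without saving the result, shows the same listing.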
And submit. Yup, it’s really that easy. Now examine all the data sets – there are a lot!
Again – and sneaky, sneaky – this helps us locate data readily available for analytics. In our case, all of these datasets should be useful for visualizations, as they’re part of the ggplot2 package.
Now, let’s transfer some data from R into SAS. Since cars go vroom, vroom, the mpg data set is as good a place to start as any. First, let’s get a better understanding of mpg by examining the data. Type the following in cell 3:
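There are several reasonable ways to examine a data frame, and which one belongs in cell 3 is a matter of taste. A minimal sketch using base R helpers, assuming ggplot2 is installed:

```r
library(ggplot2)

# First few rows, to get a feel for the values
head(mpg)

# Column names and types in compact form
str(mpg)

# Number of rows and columns
dim(mpg)
```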
A small sample of the output:
The next step is to load the R package that will help us convert the R data frame to a SAS data file. This line will load the requisite package:
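A sketch of that cell, assuming foreign is present in the notebook's R installation (it normally ships with R as a recommended package):

```r
# Attach foreign, which provides write.foreign() for exporting to SAS, SPSS, and Stata
library(foreign)
```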
Yes, foreign is the magic we’ll need. Go ahead and submit that cell. Moreover, given formatting and data structure issues that differ across R and SAS, the easiest approach to convert from R to SAS is to export the R data frame to a .txt file, along with a SAS program that reads that file in.
Wait, what? We’re going to convert the R file to .txt and then read it into SAS via a SAS program?
To use foreign, I’ll refer you first to some good documentation, found here: https://cran.r-project.org/web/packages/foreign/foreign.pdf. The main nugget we need is on page 23:
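If the PDF isn't handy, the same arguments can be checked from inside the notebook. A small sketch, assuming foreign is installed:

```r
library(foreign)

# Print the argument list of write.foreign():
#   df       - the data frame to export
#   datafile - path of the text file that will hold the data
#   codefile - path of the generated program that reads datafile
#   package  - target software: "SPSS", "Stata", or "SAS"
args(write.foreign)

# Open the full help page
?write.foreign
```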
Ponder those arguments for a bit. Let’s continue with the call that I used – and then unpack it a bit:
For my environment, this call starts with the mpg data frame. The second argument writes the data to a .txt file saved in my casuser library. The third argument creates mpg.sas – in the same casuser folder – which is a SAS program that will read in the .txt file.
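A call along those lines might look like the sketch below. The output folder is a placeholder: tempdir() keeps the example runnable anywhere, and you would swap in the path to your own casuser folder. Converting mpg with as.data.frame() is a precaution of mine, since mpg ships as a tibble; it is not a step taken from the walkthrough above.

```r
library(foreign)
library(ggplot2)

# Placeholder output folder (assumption): replace with your casuser path in VFL
out_dir  <- tempdir()
datafile <- file.path(out_dir, "mpg.txt")   # text file that will hold the data
codefile <- file.path(out_dir, "mpg.sas")   # SAS program that reads mpg.txt

# Write the data file and the matching SAS program
write.foreign(as.data.frame(mpg), datafile, codefile, package = "SAS")

# Inspect the generated SAS program before opening it in SAS Studio
cat(readLines(codefile), sep = "\n")
```

Printing the program is worth doing once, because it shows the file path SAS will read from and the dataset name it will create.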
Let’s submit that cell. We’ll know it’s done processing when the grey dot no longer appears in the tab for our Notebook:
Alright, now onto the SAS Studio part of the programming. Yay! As shown in my earlier posts on Ways to Handle the 100mb Data Upload Restriction in SAS Viya for Learners, let’s navigate to SAS Studio in VFL:
Now find your casuser folder under Explorer. Ensure that mpg.sas and mpg.txt are now included in your library. The requisite clicks are shown below:
We’re getting close! The last big step is to double-click on mpg.sas to open it. Then submit.
And just like that – the R file has been converted to a SAS file. As shown from the code above, no permanent library is assigned, so we’ll find rdata in the WORK library:
One housekeeping item before we move on. It’s highly recommended that you delete the .txt file – and perhaps the R file – after you’ve uploaded your data. Why? Well, this prevents you from having duplicate information – and saves your cloud space. Moreover, you can use the compressing and file-combining hacks from my earlier posts, as needed.
2 of 3 insights completed. This last one is the grand finale!
Look once more at mpg.txt. In particular, open it and scroll to the end:
Do you see what I see? The cursor shows that mpg.txt is an editable file – one that’s already uploaded in VFL. Thus, my spidey senses tell me that this file can be much bigger than 100mb, so long as you cut-and-paste an equivalent data structure into the file.
And guess what? My spidey senses are correct! I artificially inflated mpg.txt to 196mb by simply copying-and-pasting the same data a bunch of times:
Now there was a bit of latency in the copying-and-pasting, so be patient. But the file is greater than 100mb and SAS VFL was still able to read in the file without issue:
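The same inflation could in principle be scripted from the R notebook instead of pasted by hand. The sketch below is only an illustration of the idea: the paths use tempdir() as a placeholder, copies is a name I made up, and I have not verified how VFL treats a file enlarged this way compared with one edited in SAS Studio.

```r
library(foreign)
library(ggplot2)

# Placeholder paths (assumption): replace tempdir() with your casuser folder
datafile <- file.path(tempdir(), "mpg.txt")
codefile <- file.path(tempdir(), "mpg.sas")
write.foreign(as.data.frame(mpg), datafile, codefile, package = "SAS")

# Read the existing rows, then write them back several times over,
# which mimics copying and pasting the same block of data
rows   <- readLines(datafile)
copies <- 5   # hypothetical; raise this until the file is as large as you need
writeLines(rep(rows, times = copies), datafile)

# Check the resulting size in megabytes
file.size(datafile) / 1024^2
```

Because the rows are repeated verbatim, the generated mpg.sas program should still match the file's layout.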
So, ladies and gentlemen, I present to you Hack #3 for getting your data larger than 100mb into SAS VFL. And this Hack came with the added bonus of showing you how to convert R data frames to SAS datasets and how to access data already part of R packages. That’s a 3-1 special!