In case you haven’t heard: all the cool kids are now using SAS Viya for Learners 4 (VFL4) for teaching and learning.
SAS Viya for Learners 3.5 had a great run as an integrated code, low-code, and no-code experience for academics. But SAS Viya for Learners 4 gives academics the latest-and-greatest version of our SAS Viya platform. And if you want the full list of reasons I think you should switch today, please see my earlier SAS Communities article, found here.
One exciting development mentioned in my previous article was found under the Jupyter Notebook section. And for those of you who didn’t memorize that article, the section read: “Leverage updated Python + R packages, including the rdatasets package in Python, which provides over 2,000 datasets for teaching and learning!”
This SAS Communities article shows exactly how to leverage those datasets.
I’ll start with the “what”. What is rdatasets? Well, it’s a user-created package that compiles datasets from a large number of commonly used R packages into a single Python package. Yes, this is the beauty of open-source collaboration: someone spent a LOT of their free time creating something incredibly valuable for the broader learning community. And we thank you, Vincent! More details on the package can be found here: https://pypi.org/project/rdatasets/
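Gratitude for community-generated content aside, here is the GitHub repo that contains the Python notebook we’ll use in this demonstration. Our three-part quest: examine all the datasets available in rdatasets, load an interesting dataset, and convert that DataFrame to a CAS table.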
Let’s get started!
Part 1: Examine all the datasets available in rdatasets
# Import the required packages.
import rdatasets
import pandas as pd
rdatasets.summary()
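With over 2,000 entries, that index is easier to digest once you filter it. Here is a minimal sketch of one way to search it with pandas. The helper name find_datasets and the column names Package, Item, Title, and Rows are assumptions on my part (they follow the Rdatasets index layout), so check rdatasets.summary().columns in your own session before relying on them. The small hand-built frame is only a stand-in so the sketch runs without the package installed.

```python
import pandas as pd


def find_datasets(summary, keyword, min_rows=0):
    """Filter a dataset index by a keyword in the title and a minimum row count."""
    # Case-insensitive, literal (non-regex) match on the Title column (assumed name).
    title_match = summary["Title"].str.contains(keyword, case=False, na=False, regex=False)
    # Keep only datasets with at least min_rows rows (Rows is an assumed column name).
    big_enough = summary["Rows"] >= min_rows
    return summary.loc[title_match & big_enough, ["Package", "Item", "Title", "Rows"]]


# Stand-in index for illustration only; in the notebook,
# pass rdatasets.summary() instead of this hand-built frame.
demo_index = pd.DataFrame(
    {
        "Package": ["AER", "datasets", "datasets"],
        "Item": ["Affairs", "iris", "mtcars"],
        "Title": [
            "Fair's Extramarital Affairs Data",
            "Edgar Anderson's Iris Data",
            "Motor Trend Car Road Tests",
        ],
        "Rows": [601, 150, 32],
    }
)

matches = find_datasets(demo_index, "affairs", min_rows=100)
print(matches)
```

In the notebook, something like find_datasets(rdatasets.summary(), "housing", min_rows=500) would narrow the list to larger datasets on a topic, assuming the column names match.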
Part 2: Load an interesting dataset
# Get the data ready to load
from rdatasets import data
# Load the "Affairs" dataset from the "AER" package
affairs_data = data(package='AER', item='Affairs')
# Let's check out a sample of the data
print(affairs_data.head()) # Print the first few rows of the dataset
Part 3: Convert that pandas DataFrame to a CAS table
# Load the packages needed to reach the CAS engine in SAS Viya
import os
import swat
# Set up the CAS connection using the host and access token from the environment
conn = swat.CAS(os.environ['CAS_CONTROLLER'], 5570, password=os.environ['ACCESS_TOKEN'])
# Push the Affairs data to the Public caslib to use in SAS VA + SAS Model Studio (note ==> a promoted table can't be overwritten if it already exists)
cas_table = conn.upload_frame(affairs_data, casout=dict(name='affairs_data', caslib='public', promote=True))
# Push the Affairs data to CAS in your CASUSER folder (to use in SAS Studio)
cas_table = conn.upload_frame(affairs_data, casout=dict(name='affairs_data', replace=True))
# Examine the tables in casuser... because why not?
conn.tableInfo(caslib = 'casuser')
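Since a promoted table can't simply be overwritten, rerunning the Public upload will fail once the table exists. Below is a hedged sketch of one workaround: check for the table, drop it if found, then upload and promote a fresh copy. The helper name refresh_promoted_table is hypothetical, and the sketch assumes your account is allowed to drop tables in the target caslib and that the tableExists and dropTable actions are reachable from the connection the same way tableInfo is above. Keep in mind that Public is shared, so dropping a table there affects anyone else using it.

```python
def refresh_promoted_table(conn, frame, name, caslib="public"):
    """Drop an existing table of the same name, then upload and promote a fresh copy."""
    # tableExists reports 0 under the "exists" key when no such table is found.
    if conn.tableExists(name=name, caslib=caslib)["exists"]:
        # quiet=True keeps the drop from complaining if the table is already gone.
        conn.dropTable(name=name, caslib=caslib, quiet=True)
    # Same upload call as in the notebook, with promote passed as a real boolean.
    return conn.upload_frame(frame, casout=dict(name=name, caslib=caslib, promote=True))
```

In the notebook, the call would look like cas_table = refresh_promoted_table(conn, affairs_data, 'affairs_data').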
Happy hackin’!