Hello everyone,
Analytic models come in many shapes, sizes, and complexities. A common SAS use case is to run the entire model management lifecycle within SAS software; however, we recognize that this is not always the case. You may need to work with SAS objects outside of SAS, whether that means deploying models created in Python or R to SAS, or creating a model in SAS and deploying it elsewhere.
To assist in the latter use case, I am here to introduce the new open source packages sas-scoring-translator-python (pysct) for Python and sas-scoring-translator-r (rsct) for R, which translate SAS scoring code to those languages. This makes it easy to call SAS models from your application (and it is also handy if you just want to learn a bit more about how the SWAT package works). The tools are available on the sassoftware GitHub.
Get the SAS Scoring Code
As you might know by now, SAS has a lot of interfaces where you can build models (it can even build one for you with AutoML). After you create your great models, you want to use them, so what can you do? Export the scoring code, of course.
First, in Model Studio, go to Pipeline Comparison.
Next, select your model (which doesn't have to be published to MAS), and export its scoring code.
This results in a zip file with your SAS scoring code (keep the file name dmcas_epscorecode.sas in mind; we'll use it later on).
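If you want to confirm which score code file the archive holds without extracting it, Python's standard zipfile module can list its members. The sketch below is only an illustration: the helper name list_score_code is made up here, and the demo builds a tiny in-memory archive as a stand-in for the real export rather than reading an actual Model Studio zip.

```python
import io
import zipfile


def list_score_code(zip_source):
    """Return the names of the .sas members inside a zip archive.

    zip_source can be a file path or a file-like object.
    (Hypothetical helper for illustration; not part of pysct.)
    """
    with zipfile.ZipFile(zip_source) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(".sas")]


# Stand-in archive so the sketch runs anywhere; with a real export you
# would pass the path of the zip file downloaded from Model Studio.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("dmcas_epscorecode.sas", "data sasep.out; enddata;")
    archive.writestr("readme.txt", "placeholder member for the demo")
demo.seek(0)

sas_files = list_score_code(demo)
print(sas_files)
```

The base file name found this way is what tells you which translation function to pick later on.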
You don't have to unzip the file, but if you take a look inside, you will find something like the example below depending on which model you are using.
/*
* This score code file references one or more analytic stores that are located in the caslib "Models".
* This score code file references the following analytic-store tables:
* _28LWD4IVTS9I294A6F893FNBY_ast
*/
/*----------------------------------------------------------------------------------*/
/* Product: Visual Data Mining and Machine Learning */
/* Release Version: V2021.1.1 */
/* Component Version: V2020.1.5 */
/* CAS Version: V.04.00M0P05162021 */
/* SAS Version: V.04.00M0P051621 */
/* Site Number: 70180938 */
/* Host: sas-cas-server-default-client */
/* Encoding: utf-8 */
/* Java Encoding: UTF8 */
/* Locale: en_US */
/* Project GUID: f8b37e46-c893-4cf5-8183-0802c729b1b0 */
/* Node GUID: 25d24897-fad4-4e1d-bc94-1a7d93340ade */
/* Node Id: 28LWD4IVTS9I294A6F893FNBY */
/* Algorithm: Gradient Boosting */
/* Generated by: sasdemo */
/* Date: 26JUL2021:19:12:08 */
/*----------------------------------------------------------------------------------*/
data sasep.out;
dcl package score _28LWD4IVTS9I294A6F893FNBY();
dcl double "P_BAD1" having label n'Predicted: BAD=1';
dcl double "P_BAD0" having label n'Predicted: BAD=0';
dcl nchar(32) "I_BAD" having label n'Into: BAD';
dcl nchar(4) "_WARN_" having label n'Warnings';
dcl double EM_EVENTPROBABILITY;
dcl nchar(12) EM_CLASSIFICATION;
dcl double EM_PROBABILITY;
varlist allvars [_all_];
method init();
_28LWD4IVTS9I294A6F893FNBY.setvars(allvars);
_28LWD4IVTS9I294A6F893FNBY.setkey(n'C08002467175AC235F1C68321869975F6170F229');
end;
method post_28LWD4IVTS9I294A6F893FNBY();
dcl double _P_;
if "P_BAD0" = . then "P_BAD0" = 0.8005033557;
if "P_BAD1" = . then "P_BAD1" = 0.1994966443;
if MISSING("I_BAD") then do ;
_P_ = 0.0;
if "P_BAD1" > _P_ then do ;
_P_ = "P_BAD1";
"I_BAD" = ' 1';
end;
if "P_BAD0" > _P_ then do ;
_P_ = "P_BAD0";
"I_BAD" = ' 0';
end;
end;
EM_EVENTPROBABILITY = "P_BAD1";
EM_CLASSIFICATION = "I_BAD";
EM_PROBABILITY = MAX("P_BAD1", "P_BAD0");
end;
method run();
set SASEP.IN;
_28LWD4IVTS9I294A6F893FNBY.scoreRecord();
post_28LWD4IVTS9I294A6F893FNBY();
end;
method term();
end;
enddata;
This looks fine, as long as you know enough SAS. However, if you don't, or if you want to score tables with the distributed power of SAS Viya using Python or R, this doesn't really help you. Of course, you could use SWAT and translate all of this by hand, but that would be neither efficient nor quick. This is why we are here! pysct, the Python Scoring Code Translator (and its R counterpart, rsct), will help you. It reads the zip file and translates it for you. Let's look at it.
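For readers who don't read DS2, the post-processing method in the score code above boils down to a few rules: fill missing probabilities with fixed default values, pick the class with the larger probability, and derive the three EM_ columns. Here is a plain-Python sketch of that logic. It is a hand-written illustration, not what pysct generates (the translator delegates scoring to the analytic store on the server); the function name post_process is made up, and the class labels are written without the padding spaces used in the DS2 code.

```python
import math

# Default values copied from the DS2 code, used when a probability is missing
DEFAULT_P_BAD0 = 0.8005033557
DEFAULT_P_BAD1 = 0.1994966443


def _is_missing(value):
    """Treat None and NaN as the Python counterpart of a SAS missing value."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def post_process(p_bad1=None, p_bad0=None, i_bad=None):
    """Mirror the DS2 post method for one record (hypothetical helper)."""
    if _is_missing(p_bad0):
        p_bad0 = DEFAULT_P_BAD0
    if _is_missing(p_bad1):
        p_bad1 = DEFAULT_P_BAD1

    # Only classify when the model did not already return a label
    if i_bad is None or str(i_bad).strip() == "":
        best = 0.0
        if p_bad1 > best:
            best = p_bad1
            i_bad = "1"
        if p_bad0 > best:
            best = p_bad0
            i_bad = "0"

    return {
        "P_BAD1": p_bad1,
        "P_BAD0": p_bad0,
        "I_BAD": i_bad,
        "EM_EVENTPROBABILITY": p_bad1,
        "EM_CLASSIFICATION": i_bad,
        "EM_PROBABILITY": max(p_bad1, p_bad0),
    }


record = post_process(p_bad1=0.7, p_bad0=0.3)
print(record["I_BAD"], record["EM_PROBABILITY"])  # 1 0.7
```

Calling post_process() with no arguments falls back to the default values, so the record is classified as "0".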
We will start with Python, but if you are interested only in R, feel free to jump down to the R section.
Python Scoring Code Translator
First, let's install the package from SAS GitHub:
## Install directly from git if you don't have it
pip install git+https://github.com/sassoftware/sas-scoring-translator-python.git
The tool is quite easy to use. Look at the reference table below and find the row that matches where your model came from and its scoring code type. In our example, the model comes from Model Studio, and the scoring code file is the one I asked you to remember earlier, dmcas_epscorecode.sas, so we use the EPS_translate() function. Other model and scoring code combinations are listed in the following table.
Interface | Code Type | Base File Name | Translation Function |
---|---|---|---|
Model Studio | DataStep | dmcas_scorecode.sas | pysct.DS_translate() |
Model Studio | DS2 | dmcas_epscorecode.sas | pysct.EPS_translate() |
Visual Text Analytics | Sentiment - CAS Procedure | scoreCode.sas | pysct.nlp_sentiment_translate() |
Visual Text Analytics | Categories - CAS Procedure | scoreCode.sas | pysct.nlp_category_translate() |
Visual Text Analytics | Topics - CAS Procedure | AstoreScoreCode.sas | pysct.nlp_topics_translate() |
Visual Text Analytics | Concepts - CAS Procedure | ScoreCode.sas | pysct.nlp_concepts_translate() |
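If you script this choice, the table can be held as a small lookup. The sketch below copies the interface, code type, and function names from the table; the dictionary, the shortened code type keys, and the helper translator_name are illustrative assumptions, not part of pysct. It only returns the function name as a string, so it runs without the package installed.

```python
# (interface, code type) -> pysct function name, copied from the table above.
# The text analytics code types are shortened to their first word.
TRANSLATORS = {
    ("Model Studio", "DataStep"): "DS_translate",
    ("Model Studio", "DS2"): "EPS_translate",
    ("Visual Text Analytics", "Sentiment"): "nlp_sentiment_translate",
    ("Visual Text Analytics", "Categories"): "nlp_category_translate",
    ("Visual Text Analytics", "Topics"): "nlp_topics_translate",
    ("Visual Text Analytics", "Concepts"): "nlp_concepts_translate",
}


def translator_name(interface, code_type):
    """Return the name of the translation function for a table row."""
    try:
        return TRANSLATORS[(interface, code_type)]
    except KeyError:
        raise ValueError(
            f"No translator listed for {interface!r} / {code_type!r}"
        ) from None


name = translator_name("Model Studio", "DS2")
print(name)  # EPS_translate

# With pysct installed, the function itself could then be fetched with
# getattr(pysct, name).
```

Keying on the interface and code type rather than on the base file name avoids the ambiguity of scoreCode.sas, which appears in more than one row.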
With just the following call, you will have everything you need. In my case, it looks like the code below.
import pysct
out = pysct.EPS_translate(
in_file = "C:/score_code_Gradient Boosting.zip", ## path to your file (yes, zipped, you don't have to worry)
out_caslib = "casuser", ## the caslib of the output table (after data scored)
out_castable = "hmeq", ## the table name of the output table (after data scored)
in_caslib = "public", ## the caslib table you want to score
in_castable = "hmeq", ## the table name of the table you want to score
copyVars="ALL", ## by default SAS only returns the scored output, use "ALL" if you want to copy all table vars, or just omit if you don't want to copy
out_file="gradientBoosting.py" ## the output file path
)
out.keys()
Sample response:
The file was successfully written to gradientBoosting.py
dict_keys(['ds2_raw', 'py_code', 'out_caslib', 'out_castable', 'out_file'])
By default, pysct writes the file to your current working directory. All of the code can be found in the out object in case you want to see it, but let's take a look at the output file, gradientBoosting.py.
## SWAT package needed to run the codes, below the packages in pip and conda
# documentation: https://github.com/sassoftware/python-swat/
# pip install swat
# conda install -c sas-institute swat
import swat
## Defining tables and models variables
in_caslib = "public"
in_castable = "hmeq"
out_caslib = "casuser"
out_castable = "hmeq"
astore_name = "_28LWD4IVTS9I294A6F893FNBY_ast"
astore_file_name = "_28LWD4IVTS9I294A6F893FNBY_ast.sashdat"
## Connecting to SAS Viya
conn = swat.CAS(hostname = "myserver.com", ## change if needed
port = 8777,
protocol='http', ## change protocol to cas and port to 5570 if using binary connection (unix)
username='username', ## use your own credentials
password='password') ## we encourage using .authinfo
## Loading model to memory
## assuming the model is already inside the viya server
conn.table.loadTable(caslib= "Models",
path = astore_file_name, #case sensitive
casOut = {"name": astore_name,
"caslib": "Models"}
)
score_table = conn.CASTable(name = in_castable,
caslib = in_caslib
)
column_names = score_table.columns.tolist()
## loading astore actionset and scoring
conn.loadActionSet("astore")
conn.astore.score(table = {"caslib": in_caslib, "name": in_castable},
out = {"caslib": out_caslib, "name": out_castable, "replace": True},
copyVars = column_names,
rstore = {"name": astore_name, "caslib": "Models"}
)
## Obtaining output/results table
scored_table = conn.CASTable(name = out_castable,
caslib = out_caslib)
scored_table.head()
And the magic is done. You just have to edit the connection (swat.CAS) with your credentials and server name, and your code is ready to use in Python.
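One way to avoid typing credentials into the generated script is to read them from the environment and pass them to swat.CAS as keyword arguments. This is a sketch of an alternative to the .authinfo approach mentioned in the generated comments, not a replacement for it. The environment variable names (MY_CAS_HOST and so on) and the helper cas_connection_kwargs are hypothetical; the keyword names match the swat.CAS call in the generated code, and the defaults are the port and protocol shown there.

```python
import os


def cas_connection_kwargs(env=None):
    """Build the keyword arguments for swat.CAS from environment variables.

    The variable names are made up for this example; use whatever your
    deployment defines.
    """
    env = os.environ if env is None else env
    required = ("MY_CAS_HOST", "MY_CAS_USER", "MY_CAS_PASSWORD")
    missing = [key for key in required if key not in env]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "hostname": env["MY_CAS_HOST"],
        "port": int(env.get("MY_CAS_PORT", "8777")),
        "protocol": env.get("MY_CAS_PROTOCOL", "http"),
        "username": env["MY_CAS_USER"],
        "password": env["MY_CAS_PASSWORD"],
    }


# Demo with a stand-in mapping instead of the real environment
demo_env = {
    "MY_CAS_HOST": "myserver.com",
    "MY_CAS_USER": "username",
    "MY_CAS_PASSWORD": "password",
}
kwargs = cas_connection_kwargs(demo_env)
print(kwargs["hostname"], kwargs["port"], kwargs["protocol"])

# In the generated script, the connection line would then become:
# conn = swat.CAS(**kwargs)
```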
Even though it uses some default values (or copies them from your scoring code file), you are free to change things as you see fit. At this point, though, you have a good starting point for better integration.
R Scoring Code Translator
# Since the package is not available on CRAN, you have to install it from our GitHub repository
# we recommend using the remotes package
# install.packages("remotes") # uncomment if you don't have it yet
remotes::install_github("sassoftware/sas-scoring-translator-r")
Interface | Code Type | Base File Name | Translation Function |
---|---|---|---|
Model Studio | DataStep | dmcas_scorecode.sas | DS_translate() |
Model Studio | DS2 | dmcas_epscorecode.sas | EPS_translate() |
Visual Text Analytics | Sentiment - CAS Procedure | scoreCode.sas | nlp_sentiment_translate() |
Visual Text Analytics | Categories - CAS Procedure | scoreCode.sas | nlp_category_translate() |
Visual Text Analytics | Topics - CAS Procedure | AstoreScoreCode.sas | nlp_topics_translate() |
Visual Text Analytics | Concepts - CAS Procedure | ScoreCode.sas | nlp_concepts_translate() |
With a couple of lines, we can translate our code. We don't even need to unzip the scoring code.
## load the package
library("rsct")
output_infos <- EPS_translate(in_file = "C:/score_code_Gradient Boosting.zip", ## path to your file (yes, zipped, you don't have to worry)
out_caslib = "casuser", ## the caslib of the output table (after data scored)
out_castable = "hmeq_scored", ## the table name of the output table (after data scored)
in_caslib = "public", ## the caslib table you want to score
in_castable = "hmeq", ## the table name of the table you want to score
copyVars = "ALL", ## by default SAS only returns the scored output, use "ALL" if you want to copy all table vars, or just omit if you don't want to copy
out_file = "gb_translated.R" ## the output file path
)
names(output_infos)
Sample response:
File successfully written to gb_translated.R
[1] "r_code" "out_file" "out_caslib" "out_castable"
gb_translated.R was written to your working directory, but you could also set a full path. The output_infos object is a list with details in case you need to use the results somewhere else. Look at the output code:
## install swat package from github if needed, uncomment OS version
# install.packages('https://github.com/sassoftware/R-swat/releases/download/v1.6.1/R-swat-1.6.1-linux64.tar.gz',repos=NULL, type='file') ## linux
# install.packages('https://github.com/sassoftware/R-swat/releases/download/v1.6.1/R-swat-1.6.1-win64.tar.gz',repos=NULL, type='file') ## windows
# install.packages('https://github.com/sassoftware/R-swat/releases/download/v1.6.1/R-swat-1.6.1-REST-only-osx64.tar.gz',repos=NULL, type='file') ## osx
## Load library
library("swat")
## Defining tables and models variables
in_caslib <- "public"
in_castable <- "hmeq"
out_caslib <- "casuser"
out_castable <- "hmeq_scored"
astore_name <- "_28LWD4IVTS9I294A6F893FNBY_ast"
astore_file_name <- "_28LWD4IVTS9I294A6F893FNBY_ast.sashdat"
## Connecting to SAS Viya
conn <- CAS(hostname = "myserver.com", ## change if needed
port = 8777,
protocol='http', ## change protocol to cas and port to 5570 if using binary connection (unix)
username='sasusername', ## use your own credentials
password='password') ## we encourage using .authinfo
## Loading model to memory
cas.table.loadTable(conn,
caslib= "Models",
path = astore_file_name , #case sensitive
casOut = list(name = astore_name,
caslib = "Models")
)
## Defining scoring table obtaining column names
score_table <- defCasTable(conn,
tablename = in_castable,
caslib = in_caslib)
column_names <- names(score_table)
## loading astore actionset and scoring
loadActionSet(conn, "astore")
cas.astore.score(conn,
table = list(caslib= in_caslib, name = in_castable),
out = list(caslib = out_caslib, name = out_castable, replace = TRUE),
copyVars = column_names,
rstore = list(name = astore_name, caslib = "Models")
)
## Obtaining output/results table
scored_table <- defCasTable(conn,
tablename = out_castable,
caslib = out_caslib)
head(scored_table)
As you can see, it is almost ready to use. You just have to edit the CAS connection (CAS) with your credentials and server name, and your code is ready to be used in your R environment. You are free to change it as you please. This is a good way to start understanding how SAS and open source can work together.