The SAS Code Node in SAS Model Studio versus SAS Enterprise Miner
The SAS Code node in SAS Model Studio is extremely versatile. If you want to accomplish something that can’t easily be done with the out-of-the-box Model Studio nodes, try using the SAS Code node! It can be used to run pretty much any SAS code. This blog will illustrate using the SAS Code node for:
- Pre-processing
- Supervised Learning
- Post-processing
SAS Code Node Interface
Some of you will remember the SAS Code Node in SAS Enterprise Miner. The SAS Code Node in SAS Model Studio is fairly similar, in that you run SAS code in a node within your pipeline. The interface, however, is quite a bit different.
In SAS Model Studio, the SAS Code Node is found under “Miscellaneous.” After you add the SAS Code Node to your pipeline and select it, an Open code editor button appears on the right.
Within the code editor are two panes:
- Training code
- Scoring code
The SAS Code Node does NOT change the original training data set in the pipeline. Therefore, if you want to change values of a variable, create new variables, or delete observations, you need to do this in the Scoring Code pane.
For you Enterprise Miner users, recall that EM also had a Training Code pane and Score Code pane.
[Screenshot: Training code pane]
[Screenshot: Score code pane]
SAS Code Node processing
Processing works differently in SAS Model Studio than in SAS Enterprise Miner. Below are a few differences in how the SAS Code node works in the two products:
SAS Code Node Examples
Let’s look at some examples using the SAS Code Node:
- Pre-processing
- Supervised Learning
- Post-processing
Pre-processing Example
Let’s say you want to accomplish some pre-processing in the SAS Code Node. Examples might include:
- Engineering features. For example, you might want to:
  - Create a new variable that is some function of a variable that already exists in your pipeline. Remember that new variables must be created in the Scoring Code pane.
- Selectively applying imputations or transformations. For example, you could write logic to:
  - Calculate the skewness of each input variable.
  - Log transform those input variables that have a skewness > 3.14.
- Selecting only a subset of the data. For example, include only observations where home values are greater than 300,000.
- Modifying the metadata. For example, you might want to change a variable’s role or level. You CANNOT, however, change the target role.
Remember that some of this pre-processing can be done from Model Studio’s Data Tab or with the Manage Variables node. Check first to see if the functionality you need is available there, before you reinvent the wheel.
To change metadata, use %dmcas_metaChange. For example, as shown below, we set the role of VALUE (home value) to REJECTED. This variable will not be deleted from the data set, but it will not be used in modeling. We set the role of NewValue to INPUT, so that NewValue will be considered as an input variable in our models.
Remember! Modification of the data that you want to pass on to subsequent nodes and to publishing must be done within the scoring code.
Below is a simple code example.
/* Training Code */
/* Replace variable VALUE with NewValue */
%dmcas_metaChange(NAME= VALUE, ROLE=REJECTED, LEVEL=INTERVAL);
%dmcas_metaChange(NAME= NewValue, ROLE=INPUT, LEVEL=INTERVAL);
/* Scoring Code */
length NewValue 8;
if 'VALUE'n < 100000 then NewValue = VALUE * 2;
else NewValue = VALUE * 2.1;
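The "log transform skewed inputs" idea from the list above can be tried outside Model Studio first. The following is only a minimal stand-alone sketch: WORK.HOMES, its values, and the new variable name LogValue are made up for illustration, and the skewness check is done by eye from the PROC MEANS output rather than automatically.

```sas
/* Stand-alone sketch; WORK.HOMES and its values are made up for illustration */
data work.homes;
   input VALUE;
   datalines;
50000
120000
250000
4000000
;
run;

/* Check skewness before deciding whether to transform */
proc means data=work.homes n mean skewness;
   var VALUE;
run;

data work.homes_scored;
   set work.homes;
   /* In Model Studio, only the statements below would go in the Scoring Code pane */
   length LogValue 8;
   if VALUE > 0 then LogValue = log(VALUE);
   else LogValue = .;   /* log is undefined for zero or negative values */
run;

proc print data=work.homes_scored;
run;
```

In a pipeline, the new variable would presumably also need its metadata set in the Training Code pane, following the same pattern used for NewValue above: %dmcas_metaChange(NAME=LogValue, ROLE=INPUT, LEVEL=INTERVAL);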
Supervised Learning Example
There may be a supervised learning algorithm or option that is not available in the existing nodes. The SAS Code Node lets you use SAS code to run any supervised learning algorithm (or option). Once you have created the SAS Code Node, you can Move it to Supervised Learning, after which it is treated like any other supervised learning node. It will be compared to the other models in your Model Comparison node, and you can publish, deploy, etc. the model. You can even get interpretability graphs, such as PD plots, LIME, etc.!
/* Training Code */
/* Fit a gradient boosting model on the data passed into the node */
proc gradboost data=&dm_data
     numBin=20 maxdepth=6 maxbranch=2 minleafsize=5
     minuseinsearch=1 ntrees=10 learningrate=0.1 samplingrate=0.5 lasso=0 ridge=0 seed=1234;
   /* Add interval and class inputs only if the pipeline has any */
   %if &dm_num_interval_input %then %do;
      input %dm_interval_input / level=interval;
   %end;
   %if &dm_num_class_input %then %do;
      input %dm_class_input / level=nominal;
   %end;
   /* The target level depends on the target's measurement level */
   %if "&dm_dec_level" = "INTERVAL" %then %do;
      target %dm_dec_target / level=interval;
   %end;
   %else %do;
      target %dm_dec_target / level=nominal;
   %end;
   &dm_partition_statement;
   ods output
      VariableImportance = &dm_lib..VarImp
      Fitstatistics = &dm_data_outfit
   ;
   /* Save the model state for scoring and assessment */
   savestate rstore=&dm_data_rstore;
run;
/* Report variable importance as a table and as a bar chart */
%dmcas_report(dataset=VarImp, reportType=Table, description=%nrbquote(Variable Importance));
%dmcas_report(dataset=VarImp, reportType=BarChart, category=Variable, response=RelativeImportance, description=%nrbquote(Relative Importance Plot));
Post-Processing Example
During post-processing you may wish to:
- Summarize data
- Create tailored graphs or tables from modeling results using the %dmcas_report macro
- Generate ODS output
/* Training Code */
data &dm_lib..bethsamp;
set &dm_data(obs=500);
residual = BAD1 - P_BAD1;
run;
%dmcas_report(dataset=bethsamp,
reportType=ScatterPlot,
x=P_BAD1,
y=residual,
description=%nrbquote(Scatter Plot),
yref=0);
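The "summarize data" item can be sketched in the same style. This is only an illustration under stated assumptions: it reuses names that appear in the example above (&dm_data, &dm_lib, P_BAD1, and %dmcas_report with reportType=Table), it assumes P_BAD1 is present in the data reaching the node, and the table name predsummary and its column names are made up.

```sas
/* Training Code (sketch) */
/* Summarize the predicted probability; predsummary is a hypothetical table name */
proc means data=&dm_data noprint;
   var P_BAD1;
   output out=&dm_lib..predsummary(drop=_type_ _freq_)
          n=n_obs mean=mean_p min=min_p max=max_p;
run;

/* Show the one-row summary as a table in the node results */
%dmcas_report(dataset=predsummary, reportType=Table,
              description=%nrbquote(Predicted Probability Summary));
```

Writing the summary to &dm_lib mirrors the scatter plot example, where the data set passed to %dmcas_report is also created in that library.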
Macros and macro variables
Similar to SAS Enterprise Miner, SAS Model Studio has a bunch of pre-built macros and macro variables included in the software out-of-the-box.
Macros and Macro Variables in General
Macros
- Are invoked with a leading % (macro language statements such as %LET also start with %)
- Macro statement examples (each %LET creates a macro variable):
%LET iterations = 10;
%LET singer = Taylor Swift;
Macro Variables
- Macro variables
  - Are referenced with a leading &
  - Can be:
    - Global or
    - Local (defined inside a macro and used inside a macro)
  - Let you substitute text into your program
  - Could be a variable name, a numeral, or any text string
- Macro variables example:
do i = 1 to &iterations;
Title "Performed by &singer";
- Creating your own macros:
MACRO EXAMPLE
%MACRO printYearlyElectric (datayear=);
   proc print data = bethdata.electric&datayear;
      title "Electricity Generation Data for &datayear";
      var generation_mwh energysource state
          oilproduction oilimports price;
      format price dollar6.;
   run;
%MEND printYearlyElectric;
TO INVOKE MACRO
%printYearlyElectric (datayear=2020)
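To make the global versus local distinction and the role of quotation marks concrete, here is a small sketch in Base SAS. The macro name showScope is hypothetical and exists only for this illustration.

```sas
/* Global macro variable, created in open code */
%let singer = Taylor Swift;

/* showScope is a hypothetical macro used only for this illustration */
%macro showScope;
   %local iterations;          /* exists only while the macro runs */
   %let iterations = 10;
   %put NOTE: inside macro, iterations=&iterations and singer=&singer;
%mend showScope;

%showScope

/* Double quotes: &singer resolves. Single quotes: it is left as literal text. */
title  "Performed by &singer";
title2 'Performed by &singer';
```

This is also why the title statements in the examples above need straight double quotes: with single quotes or typographic quotes, the macro variable reference does not resolve as intended.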
Macros and Macro Variables in EM and VDMML
In SAS Enterprise Miner and SAS Model Studio, use the macro variables and the variables macros to reference information about:
- imported data sets
- target and input variables
- exported data sets
- files that store the scoring code
- et cetera
Use the utility macros to manage data and format output. Utility macros accept arguments.
EM versus VDMML Macros and Macro Variables
More extensive list of SAS Code Node macros:
Extensive list of SAS Code Node macro variables:
Summary:
The SAS Code node lets you bring the flexibility of your own SAS code into your SAS Model Studio pipeline. The SAS Code node extends SAS Model Studio’s functionality by including any SAS procedure or any SAS DATA steps. Examples of tasks you can accomplish with the SAS Code node (and these are just a few examples) are:
- Create custom score code
- Munge data
- Build custom predictive models
- Format SAS output
- Create graphs, plots, and tables to meet your exact specifications
- Modify metadata of individual variables
Data exported by the SAS Code node can be used by subsequent nodes in your SAS Model Studio pipeline.
Keep in mind that there are already a ton of tasks available to you with the out-of-the-box nodes in SAS Model Studio. If you can’t find exactly what you need there, well, I’m not saying you’re high maintenance, but hey…if the expensive Italian leather shoe fits, then wear it! I’m not here to judge!
But seriously, sometimes you just need to do something that’s not already built into a node. This is where the complete versatility of the SAS Code Node will really help you out!
For More Information:
- Executing SAS Code in SAS Visual Data Mining and Machine Learning Pipelines YouTube by Wendy Czika
- Tip: How to include the SAS Code node in SAS® Visual Data Mining and Machine Learning's Model Studio...
- Wendy Czika and Peter Garakios Paper SAS3236-2019 Playing Favorites: Our Top 10 Model Studio Feature...
Selected examples on GitHub
- Example to access data from predecessor node
- Example to bin model and plot assessments
- Example to create model assessment report
- Examples SAS Code Node
Model Studio Documentation
SAS Code Node in Enterprise Miner
- SAS® Enterprise Miner™ 15.1 Extension Nodes: Developer’s Guide: SAS Code Node
- Using the SAS® Code Node in SAS® Enterprise Miner™
- SAS Enterprise Miner macro variables reference
Macro Primers