If you are a SAS Enterprise Miner user whose organization is evolving its analytics environment with SAS Viya, you may be wondering how to take advantage of what Viya has to offer without losing your significant investment in Enterprise Miner projects. This tip shows how Enterprise Miner process flows can execute code in the SAS Viya environment and how results from SAS Viya can be incorporated back into your Enterprise Miner project. This mechanism is often referred to as the SAS Enterprise Miner “bridge” to SAS Viya.
Note: This tip is written using SAS Enterprise Miner 14.1, which is based on SAS 9.4. The bridge mechanism described herein may also work for other versions.
The diagram in the figure below depicts a simple process in which the standard home equity sample data set is used to train models using the HP Regression node in SAS 9.4 and several modeling procedures (NNET, TREESPLIT, and GENSELECT) available in SAS Viya. The XML for this diagram and other examples can be found in our GitHub repository at github.com/sassoftware/em-bridge2viya.
SAS Code nodes in Enterprise Miner are used to make a connection to the SAS Viya environment, submit code to execute, and retrieve results. The connection to the SAS Viya environment is established using long-existing capabilities in SAS/CONNECT, a SAS client/server toolset that allows a SAS client session (in this case SAS 9.4 in Enterprise Miner) to connect to a SAS server session (in this case a SAS Viya server). The only aspects of SAS/CONNECT that you need to understand are how to sign on to the remote server, how to upload/download data, and how to submit code to execute. The example code included in this tip demonstrates these functions.
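The three SAS/CONNECT operations named above can be sketched as a bare skeleton. This is only an outline under the same placeholder convention used throughout this tip (values in “<>” must be supplied); the %put line is an illustrative addition that writes the remote session's SAS version to the log so you can see that the code really ran on the SAS Viya side.

```sas
/* Runs in the local SAS 9.4 session (the SAS/CONNECT client) */
%let viya=<SAS Viya server>;
signon viya.<port #> user=<username> pwd='<your password>';

rsubmit;
   /* Everything between RSUBMIT and ENDRSUBMIT is sent to the remote SAS Viya session */
   %put NOTE: Remote SAS version is &sysvlong;
endrsubmit;

/* Back in the local session: close the connection */
signoff;
```

Data movement (PROC UPLOAD and PROC DOWNLOAD) also goes inside the RSUBMIT block, as the later examples in this tip show.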
Important: Be sure you have SAS/CONNECT installed on the SAS Enterprise Miner machine. To check this, run the following code from a SAS program editor (you can access a program editor from the View menu in SAS Enterprise Miner) and look for "SAS/CONNECT" in the Log output (use Ctrl+F to find it).
proc setinit; run;
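If you prefer a programmatic check over searching the log, the %SYSPROD macro function offers one possible approach. Note the assumptions: %SYSPROD reports whether a product is licensed (1 means licensed), which is not quite the same as confirming that it is installed, and the product name connect and the macro name check_connect are choices made for this sketch.

```sas
/* Hypothetical helper macro: reports whether SAS/CONNECT is licensed */
%macro check_connect;
   %if %sysprod(connect) = 1 %then
      %put NOTE: SAS/CONNECT is licensed in this session.;
   %else
      %put WARNING: SAS/CONNECT does not appear to be licensed in this session.;
%mend check_connect;

%check_connect
```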
The SAS Viya platform provides analytics as “actions” that run within the CAS (Cloud Analytic Services) distributed in-memory execution environment. These actions can be invoked through APIs from a number of different languages (SAS, Python, Java, Lua, REST, etc.); in this case we will be calling them from the SAS language. Thus, the SAS Viya server that we connect to using SAS/CONNECT will in turn initiate a CAS session and execute the actions by submitting SAS procedure calls, as shown in the figure below. The SAS session in SAS Viya therefore acts as a server in relation to the SAS 9.4 session, but as a client in relation to the CAS server.
Much of the work in this process has to do with ensuring that data is properly transferred among these sessions and results are processed appropriately. Let’s look at the various steps required.
The first node in the flow, the "Home Equity" node, is an Input Data node that requires that the "Home Equity" data source be defined. To define this data source you can simply select "Help --> Generate Sample Data Sources..." and select the Home Equity data set. The flow then proceeds to partition the data into training and validation sets, and imputes missing values.
The first code node in this template (“Load Data in SAS Viya”) combines the partitioned data into a single table and uploads it to the CAS in-memory execution engine in SAS Viya. Even though Enterprise Miner has already partitioned the original data set into separate training and validation data sets, the procedures in SAS Viya do not accept separate training and validation data sets. Therefore, we use a DATA step to append the validation data set to the training data set and add a partition indicator variable (“_partind_” in this example, but you can call it anything you want). The SAS Viya procedures use the partition indicator to recognize and use the training and validation partitions appropriately.
data work.trainTable;
   set &em_import_data(in=_a) &em_import_validate(in=_b);
   if _a then _partind_ = 1;
   else _partind_ = 0;
run;
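Before uploading, it can be worth confirming that the partition indicator was assigned as intended. The following optional check is not part of the original flow; it simply tabulates _partind_ on its own and against the target BAD, so you can see how many rows landed in each partition.

```sas
/* Optional sanity check on the combined table */
proc freq data=work.trainTable;
   /* 1 = training rows, 0 = validation rows */
   tables _partind_ _partind_*BAD / missing;
run;
```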
We then sign on to the remote server using the signon command from SAS/CONNECT, providing appropriate values for the placeholders denoted by “<>”. Finally, we upload the data to the CAS server, first attempting to delete any existing version since attempting to overwrite an existing global table will fail.
%let viya=<SAS Viya server>;
signon viya.<port #> user=<username> pwd='<your password>';

rsubmit;
   cas mycas host="<CAS server>" port=<CAS server port>;
   libname mycaslib CAS caslib="casuser";

   proc delete data=mycaslib.trainTable;
   run;

   proc upload data=work.trainTable out=mycaslib.trainTable (PROMOTE=YES);
   run;

   libname mycaslib clear;
endrsubmit;
signoff;
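To confirm that the upload worked, one option is to list the tables in the caslib from the SAS Viya session. The sketch below is an optional addition rather than part of the original node; it assumes you are still signed on (that is, it runs after signon and before signoff) and reuses the same “<>” placeholders and the casuser caslib from the code above.

```sas
rsubmit;
   cas mycas host="<CAS server>" port=<CAS server port>;

   /* List the in-memory tables in casuser; the promoted trainTable should appear */
   proc casutil;
      list tables incaslib="casuser";
   quit;

   /* End the CAS session started for this check */
   cas mycas terminate;
endrsubmit;
```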
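To avoid placing a plain-text password in your code, you can encode it with PROC PWENCODE and use the encoded string that is written to the log as the pwd= value.
proc PWENCODE in="<your password>";
run;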
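Because every later step depends on the connection, you may want the node to report clearly when sign-on does not succeed. One possible approach is the CMACVAR= option of the SIGNON statement, which stores a status code in a macro variable (0 indicates a successful sign-on). The macro variable name signon_rc and the helper macro check_signon are hypothetical names chosen for this sketch.

```sas
%let viya=<SAS Viya server>;
signon viya.<port #> user=<username> pwd='<your password>' cmacvar=signon_rc;

/* Hypothetical helper macro: report the sign-on status in the log */
%macro check_signon;
   %if &signon_rc ne 0 %then
      %put WARNING: Sign-on to SAS Viya did not return 0 (CMACVAR=&signon_rc).;
   %else
      %put NOTE: Signed on to SAS Viya.;
%mend check_signon;

%check_signon
```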
Now that the data set that your Enterprise Miner flow is using is loaded into the CAS execution engine in the SAS Viya platform, you can submit code to perform analytical tasks on that data using SAS Viya procedures. In this case, we have three SAS Code nodes to submit code to train three different types of models. The process is the same for each, so we will just look at “SAS Viya Treesplit” here, which uses PROC TREESPLIT to train a decision tree in SAS Viya.
First, use the %em_register macro to register data set keys that can be used later to plot results. For more information on this process, refer to the tip “Create Graphs in a SAS Code Node Using %em_report”.
/* Register EM tables */
%em_register(key=casRoc, type=DATA);
%em_register(key=casLift, type=DATA);
%em_register(key=casFitStat, type=DATA);
Then, just as in the data loading node described previously, we sign on to the remote server using the signon command from SAS/CONNECT, providing appropriate values for the placeholders denoted by “<>”. We also use the %syslput statement to copy values from Enterprise Miner macro variables to macro variables in the SAS Viya session.
%let viya=<SAS Viya server>;
signon viya.<port #> user=<username> pwd='<your password>';

/* Initialize macro variables for SAS Viya */
%syslput EMLib = &EM_LIB;
%syslput EMnodeid = &em_nodeid;
%syslput em_interval_input = %em_interval_input;
%syslput em_nominal_input = %em_nominal_input;
%syslput EMScoreFile = &EM_FILE_EMFLOWSCORECODE;
%syslput casRoc = %scan(&em_user_casRoc, 2, .);
%syslput casLift = %scan(&em_user_casLift, 2, .);
%syslput casFitStat = %scan(&em_user_casFitStat, 2, .);

filename _fref2 "&EM_FILE_EMFLOWSCORECODE";
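If a later step fails with an unresolved macro reference, it helps to check that the values set with %syslput actually arrived in the remote session. The following optional debugging sketch is not part of the original node; it must run after the signon and %syslput statements above and before signoff, and it only writes the remote values to the log.

```sas
rsubmit;
   /* Executed in the SAS Viya session: print name=value pairs for the transferred variables */
   %put NOTE: Remote values: &=EMLib &=EMnodeid &=casRoc &=casLift &=casFitStat;
endrsubmit;
```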
We can then submit code to execute in the SAS session in SAS Viya using rsubmit. Note that the CAS statement starts a CAS session, and the LIBNAME statement then defines a libref to a CAS library in that session. In this case we train a decision tree using PROC TREESPLIT, assess the model on the scored data set, and download the score code and assessment results back to the SAS 9.4 session.
/* Remote submit statement to SAS Viya */
rsubmit;
   cas mycas host="<CAS server>" port=<CAS server port>;
   libname mycaslib CAS caslib="casuser";

   %let casScoreTable = mycaslib.&EMnodeid._scoreTable;
   %let casScoreFile = %sysfunc(pathname(work))/score.sas;

   proc treesplit data=mycaslib.trainTable;
      target BAD / level=NOMINAL;
      input &em_interval_input / level=interval;
      input &em_nominal_input / level=nominal;
      prune reducederror;
      partition rolevar=_partind_ (train="1" valid="0");
      code file="&casScoreFile";
      score out=&casScoreTable copyvars=(BAD _PARTIND_);
   run;

   filename _fref "&casScoreFile";

   /* Move score file to EM SAS 9.4 */
   proc download infile=_fref outfile=_fref2;
   run;
   filename _fref;

   /* Assess the scored table */
   proc assess data=&casScoreTable maxiters=5000 nbins=20 ncuts=20 debug;
      var P_BAD1;
      target BAD / event="1" level=NOMINAL;
      fitstat pvar=P_BAD0 / pevent="0";
      by _partind_;
      ods output ROCInfo=&casRoc LiftInfo=&casLift FitStat=&casFitStat;
   run;

   proc delete data=&casScoreTable;
   run;

   /* Move the SAS Viya assessment results to EM 9.4 */
   proc download inlib=work outlib=&emLib;
      select &casRoc &casLift &casFitStat / mt=DATA;
   run;

   libname mycaslib clear;
endrsubmit;
signoff;
filename _fref2;
Now we can use the %em_report macro to create plots to be displayed for this node in SAS Enterprise Miner.
data &em_user_casRoc;
   set &em_user_casRoc;
   label oneminusspecificity = "1 - Specificity";
   oneminusspecificity = 1 - specificity;
run;

/* Define reports to be displayed in the Results window */
%em_report(key=casRoc, viewtype=lineplot, x=oneminusspecificity, y=sensitivity, group=_partind_, description=SAS Viya ROC, autoDisplay=Y);
%em_report(key=casLift, viewtype=lineplot, x=depth, y=cumlift, group=_partind_, description=SAS Viya Lift, autoDisplay=Y);
%em_report(key=casFitStat, viewtype=DATA, description=SAS Viya Fit Statistics);
%em_report(key=odsOutput, viewtype=FILEVIEWER, description=SAS Viya Output, AutoDisplay=Y);

%em_model(DATA=, TARGET=BAD, ASSESS=Y, FITSTATISTICS=Y, CLASSIFICATION=Y, RESIDUALS=Y);
The %em_model() macro provides Enterprise Miner with information about the target being modeled and what post processing actions to take for this model.
After running this node, note that all of the standard results (ROC/lift curves, fit statistics table) are provided as expected.
The final node in our example flow shows that the Model Comparison node can be used to compare models from native Enterprise Miner modeling nodes with models from SAS Viya (produced by the SAS Code nodes). At this point, it does not matter that the models were trained in a separate environment – they are all represented the same way for easy model comparison.
This tip demonstrated how the SAS Code node can be used in SAS Enterprise Miner to call SAS Viya to train models and how the models and results can be brought back into the Enterprise Miner project for consumption and further use. In fact, this same “bridge” mechanism can be used to execute any desired code in the SAS Viya platform from SAS Enterprise Miner (i.e., not just to train models). For more information on the analytical procedures available in SAS Viya, refer to the SAS Visual Data Mining and Machine Learning documentation.
In the upcoming SAS 9.4M4 release, SAS Enterprise Miner 14.2 will contain a new node dedicated to invoking SAS Viya code. This “SAS Viya” node will use the same SAS/CONNECT mechanism described above, but much of it will be packaged in a cleaner set of macros for you to invoke, along with other built-in conveniences.
Please visit our GitHub repository at github.com/sassoftware/em-bridge2viya for more examples that demonstrate this “bridge” mechanism with SAS Code nodes or the new “SAS Viya” node available in the upcoming 14.2 release of SAS Enterprise Miner.