
How to create a custom TensorFlow node in SAS Visual Forecasting with GUI parameters

Started 04-06-2022, modified 04-06-2022

SAS® Visual Forecasting (VF) lets users create custom modeling nodes, programmed either in an open-source language (Python or R) or in SAS. These custom nodes can then be added to forecasting pipelines alongside the default, state-of-the-art modeling nodes provided by SAS. This brings several advantages to open-source forecasting processes, such as scalability, model governance, security, and production deployment. In addition, SAS Visual Forecasting supports parallel task execution: open-source code can be adapted to use the parallelization capabilities of SAS Viya and can therefore model multiple time series at once.

The step-by-step procedure for configuring open-source packages in your SAS Viya environment and building custom modeling nodes from scratch is provided in this link. In this article, I discuss how to create custom GUI parameter forms for a custom LSTM TensorFlow forecasting node (see Figure 1).

 


Figure 1: Example of custom LSTM TensorFlow forecasting node with custom UI parameters on right-side pane.

 

Prerequisites

The open-source language (Python or R) and the SAS EXTLANG package must be installed in your Viya environment. These installations are usually managed by your SAS Viya administrator. The detailed steps for configuring these packages in your Viya environment are provided in this link.

Preparing Custom node files

A SAS VF modeling node consists of the following four files:

  1. metadata.json
  2. template.json
  3. validation.xml
  4. code.sas

A zip file containing all four files is attached to this article. Let us discuss each of these files in detail, focusing on the blocks that you need to adapt to create a new custom modeling node.

 

metadata.json

This file contains the build version of the node. This information can be obtained by downloading any modeling node from SAS VF and reading the metadata.json file after unzipping the node file (see Figure 2).

{
"buildVersion" : "B.023"
}

Figure 2: Example - metadata.json file
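
As a rough illustration of the step above, the build version can be read from a downloaded node archive with a few lines of Python. This is only a sketch: the function name read_build_version is hypothetical, and the demo uses an in-memory archive as a stand-in for a node downloaded from SAS VF.

```python
import io
import json
import zipfile


def read_build_version(node_zip) -> str:
    """Return the buildVersion stored in a node archive's metadata.json."""
    with zipfile.ZipFile(node_zip) as archive:
        # Locate metadata.json even if the archive nests it in a folder.
        member = next(n for n in archive.namelist() if n.endswith("metadata.json"))
        return json.loads(archive.read(member))["buildVersion"]


# Demo: an in-memory archive standing in for a downloaded node (assumption).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("metadata.json", json.dumps({"buildVersion": "B.023"}))
demo.seek(0)

build_version = read_build_version(demo)
print(build_version)
```

For a real node, pass the path of the downloaded zip file instead of the in-memory buffer.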

template.json

This file contains the key information of the node: the name and description of the forecasting node, the default values of the attributes displayed in the parameter form, and the format of the output attributes. The contents of this file, as designed for the LSTM TensorFlow Forecasting node, are shown in Figure 3. To adapt this file for a new forecasting node, make changes at the following rows:

  1. Row 7 & Row 17: Name of the forecasting node
  2. Row 8 & Row 18: Description of the forecasting node
  3. Row 28 - Row 38: Default values of the attributes in the parameter form. You can modify these attributes by adding a new attribute row that conforms to the format shown, or by deleting the attributes that are no longer required in the new node.
  4. Row 67: Specify the Application id – “forecasting”, “text” or “datamining”. For SAS Visual Forecasting node, specify “forecasting”.
1 {
2   "creationTimeStamp" : null,
3   "modifiedTimeStamp" : null,
4   "createdBy" : null,
5   "modifiedBy" : null,
6   "id" : null,
7   "name" : "LSTM TensorFlow Forecasting",
8   "description" : "LSTM Forecasting with the EXTLANG Package and TensorFlow.",
9   "revision" : 0,
10  "version" : 3,
11  "prototype" : {
12    "creationTimeStamp" : null,
13    "modifiedTimeStamp" : null,
14    "createdBy" : null,
15    "modifiedBy" : null,
16    "id" : null,
17    "name" : "LSTM TensorFlow Forecasting",
18    "description" : "LSTM Forecasting with the EXTLANG Package and TensorFlow.",
19    "revision" : 0,
20    "iconCode" : null,
21    "imageUri" : null,
22    "executionProviderId" : "Compute",
23    "code" : null,
24    "classification" : null,
25    "group" : null,
26    "status" : "undefined",
27    "componentProperties" : {
28      "_NINPUT" : 12,
29      "_MAXEPOCH" : 100,
30      "_LEARNING_RATE" : 0.01,
31      "_BATCH_SIZE" : 16,
32      "_SEED" : 12345,
33      "_ES_MIN_DELTA" : 0.01,
34      "_ES_PATIENCE" : 5,
35      "_NUM_LSTM_LAYER" : 30,
36      "_arimaxInclude" : true,
37      "_esmInclude" : true,
38      "_holdoutSampleSize" : 0,
39      "_idmInclude" : false,
40      "_idmMethod" : "BEST",
41      "_intermittencySensitivity" : 2,
42      "_minobs" : 1,
43      "_minobsSeason" : 2,
44      "_minobsTrend" : 1,
45      "_selectionCriteria" : "MAPE",
46      "_ucmInclude" : false,
47      "dataSpecification" : {
48        "outAttributesList" : [ {
49          "columns" : [ "NOBS", "N", "NMISS", "MIN", "MAX", "MEAN", "STDDEV", "_STATUS_" ],
50          "name" : "OUTSUM"
51        }, {
52          "columns" : [ "_REGION_", "_SELECT_", "_MODEL_", "DFE", "N", "NOBS", "NMISSA", "NMISSP", "NPARMS", "TSS", "SST", "SSE", "MSE", "RMSE", "UMSE", "URMSE", "MAPE", "MAE", "RSQUARE", "ADJRSQ", "AADJRSQ", "RWRSQ", "AIC", "AICC", "SBC", "APC", "MAXERR", "MINERR", "MAXPE", "MINPE", "ME", "MPE", "MDAPE", "GMAPE", "MINPPE", "MAXPPE", "MPPE", "MAPPE", "MDAPPE", "GMAPPE", "MINSPE", "MAXSPE", "MSPE", "SMAPE", "MDASPE", "GMASPE", "MINRE", "MAXRE", "MRE", "MRAE", "MDRAE", "GMRAE", "MASE", "MINAPES", "MAXAPES", "MAPES", "MDAPES", "GMAPES" ],
53          "name" : "OUTSTAT"
54        }, {
55          "columns" : [ "_MODEL_", "_MODELTYPE_", "_DEPTRANS_", "_SEASONAL_", "_TREND_", "_INPUTS_", "_EVENTS_", "_OUTLIERS_", "_STATUS_", "_SOURCE_" ],
56          "name" : "OUTMODELINFO"
57        } ]
58      }
59    },
60    "systemAttributes" : null,
61    "validationModel" : null,
62    "validationErrors" : null,
63    "version" : 3,
64    "componentPropertyMetadata" : { },
65    "eTag" : "W/\"1620053011943485000\""
66  },
67  "applicationId" : "forecasting",
68  "classification" : "pluggable",
69  "providerId" : "CustomTemplate",
70  "group" : "modeling",
71  "hidden" : false,
72  "eTag" : "W/\"1620053011943410000\""
73 }

Figure 3: Example - template.json file 

 

The other rows in this file are system-generated and should be left unchanged. Further details on these components are provided in this link.
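
The changes listed above can also be scripted. The following Python sketch is one possible way to do it, not part of the attached node: the function adapt_template, the node name, and the parameter _MY_PARAM are hypothetical, and the dictionary is a trimmed stand-in for the file in Figure 3.

```python
import copy
import json


def adapt_template(template: dict, name: str, description: str, defaults: dict) -> dict:
    """Return a copy of a template.json structure adapted for a new node."""
    adapted = copy.deepcopy(template)
    # Name and description appear at the top level and inside "prototype".
    for level in (adapted, adapted["prototype"]):
        level["name"] = name
        level["description"] = description
    # Default values shown in the parameter form.
    adapted["prototype"]["componentProperties"].update(defaults)
    # Visual Forecasting nodes use the "forecasting" application id.
    adapted["applicationId"] = "forecasting"
    return adapted


# Trimmed stand-in for the template.json shown in Figure 3.
template = {
    "name": "LSTM TensorFlow Forecasting",
    "description": "LSTM Forecasting with the EXTLANG Package and TensorFlow.",
    "prototype": {
        "name": "LSTM TensorFlow Forecasting",
        "description": "LSTM Forecasting with the EXTLANG Package and TensorFlow.",
        "componentProperties": {"_NINPUT": 12, "_holdoutSampleSize": 0},
    },
    "applicationId": "forecasting",
}

new_template = adapt_template(
    template,
    name="My Custom Forecasting",             # hypothetical node name
    description="Hypothetical custom node.",  # hypothetical description
    defaults={"_MY_PARAM": 0.5},              # hypothetical parameter
)
print(json.dumps(new_template, indent=2))
```

Working on a deep copy leaves the original template untouched, so the same base file can be reused for several nodes.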

 

validation.xml

For any custom modeling node, this file defines the schema (type, name, display name, and so on) of the parameters that a user sees on the right-side panel of the SAS Visual Forecasting modeling interface. The default values of the node parameters are defined in the template.json file (as described in the section above), and the validation.xml file specifies which parameters the user is allowed to change. Figure 4 shows the example validation.xml file for the Python LSTM TensorFlow Forecasting node.

 

1 <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
2 <validationModel eTag="&quot;Y29tLnNhcy5hbmFseXRpY3MudmFsaWRhdGlvbi5yZXByZXNlbnRhdGlvbnMuVmFsaWRhdGlvbk1vZGVs1&quot;" description="LSTM Forecasting with the EXTLANG Package and TensorFlow." name="LSTM TensorFlow Forecasting" revision="0">
3     <links/>
4     <version>3</version>
5     <properties>
6         <group style="nested" displayName="TensorFlow Keras Model Options" array="false" enabledWhenValue="false">
7             <clientProperties>
8                 <property name="initialState">
9                     <value>collapsed</value>
10                </property>
11            </clientProperties>
12            <properties>
13                <property type="integer" required="true" selector="false" id="5b0823db-e35e-438d-89f7-b741d9e2a162" name="_NINPUT" displayName="Input Window Width" array="false" enabledWhenValue="false">
14                    <clientProperties/>
15                    <constraints/>
16                </property>
17                <property type="integer" required="true" selector="false" id="5b0823db-e35e-438d-89f7-b741d9e2a162" name="_NUM_LSTM_LAYER" displayName="Maximum number of LSTM Layer" array="false" enabledWhenValue="false">
18                    <clientProperties/>
19                    <constraints/>
20                </property>
21                <property type="integer" required="true" selector="false" id="5b0823db-e35e-438d-89f7-b741d9e2a162" name="_MAXEPOCH" displayName="Maximum Number of Epochs" array="false" enabledWhenValue="false">
22                    <clientProperties/>
23                    <constraints/>
24                </property>
25                <property type="double" required="true" selector="false" id="286b4b5c-1c7e-4b0c-a38b-b886620eb16b" name="_LEARNING_RATE" displayName="Learning Rate For Optimizer" array="false" enabledWhenValue="false">
26                    <clientProperties/>
27                    <constraints/>
28                </property>
29                <property type="integer" required="true" selector="false" id="a3339521-07c2-4fef-9209-ab6ae3105192" name="_BATCH_SIZE" displayName="Minibatch Size" array="false" enabledWhenValue="false">
30                    <clientProperties/>
31                    <constraints/>
32                </property>
33                <property type="integer" required="true" selector="false" id="a3339521-07c2-4fef-9209-ab6ae3105192" name="_SEED" displayName="Seed For Random Number" array="false" enabledWhenValue="false">
34                    <clientProperties/>
35                    <constraints/>
36                </property>
37                <property type="double" required="true" selector="false" id="a3339521-07c2-4fef-9209-ab6ae3105192" name="_ES_MIN_DELTA" displayName="Early Stopping Delta Parameter" array="false" enabledWhenValue="false">
38                    <clientProperties/>
39                    <constraints/>
40                </property>
41                <property type="integer" required="true" selector="false" id="a3339521-07c2-4fef-9209-ab6ae3105192" name="_ES_PATIENCE" displayName="Early Stopping Stagnation Parameter" array="false" enabledWhenValue="false">
42                    <clientProperties/>
43                    <constraints/>
44                </property>
45            </properties>
46        </group>
47        <group style="nested" displayName="Model Selection" array="false" enabledWhenValue="false">
48            <clientProperties>
49                <property name="initialState">
50                    <value>collapsed</value>
51                </property>
52            </clientProperties>
53            <properties>
54                <property type="integer" required="true" selector="false" id="8e183dec-6f0f-48be-b01a-2036508d025b" name="_holdoutSampleSize" displayName="Number of data points used in the holdout sample" description="Specifies the number of data points used for validation." array="false" enabledWhenValue="false">
55                    <clientProperties/>
56                    <constraints>
57                        <range min="0" includeMin="true" includeMax="true" enabledWhenValue="false"/>
58                    </constraints>
59                </property>
60                <property type="double" required="false" selector="false" id="14e6e717-bdaf-4cda-9a49-109f78d3c27f" name="_holdoutSamplePercent" displayName="Percentage of data points used in the holdout sample" description="Specifies the maximum percentage of data points used for validation. Holdout percentage will override holdout size if it results in fewer observations." array="false" enabledWhen="_holdoutSampleSize &gt;=1" enabledWhenValue="false">
61                    <clientProperties>
62                        <property name="hideWhenDisabled">
63                            <value>true</value>
64                        </property>
65                    </clientProperties>
66                    <constraints>
67                        <range min="0" max="100" includeMin="true" includeMax="false" enabledWhenValue="false"/>
68                    </constraints>
69                </property>
70            </properties>
71        </group>
72    </properties>
73 </validationModel>

Figure 4: Example - validation.xml file
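
Because parameter names must line up between validation.xml and the componentProperties block of template.json, a quick consistency check can help when adapting the files. The sketch below is an assumption-laden illustration: the function missing_defaults, the shortened XML, and the parameter _NUM_UNITS are all hypothetical. A name reported by the check is not necessarily an error (an optional property may have no default), but it is worth a second look.

```python
import xml.etree.ElementTree as ET


def missing_defaults(validation_xml: str, component_properties: dict) -> list:
    """List parameter names in validation.xml without a default in template.json."""
    root = ET.fromstring(validation_xml.strip())
    # Only <property> elements with an id are form parameters here;
    # clientProperties entries such as "initialState" carry no id.
    names = [p.get("name") for p in root.iter("property") if p.get("id")]
    return [n for n in names if n not in component_properties]


# Shortened, hypothetical stand-in for a validation.xml file.
validation_xml = """
<validationModel name="Demo" revision="0">
  <properties>
    <group style="nested" displayName="Options">
      <clientProperties>
        <property name="initialState"><value>collapsed</value></property>
      </clientProperties>
      <properties>
        <property type="integer" id="p1" name="_NINPUT" displayName="Input Window Width"/>
        <property type="integer" id="p2" name="_NUM_UNITS" displayName="Units"/>
      </properties>
    </group>
  </properties>
</validationModel>
"""

# Stand-in for prototype.componentProperties in template.json.
component_properties = {"_NINPUT": 12, "_MAXEPOCH": 100}

missing = missing_defaults(validation_xml, component_properties)
print(missing)
```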

 

To adapt this file for a new forecasting node, make changes at the following places:

  1. Row 2: Specify the description and name of the forecasting node
  2. Row 6: Specify the display name. This corresponds to the name given to the parameter window.
  3. Row 13 – Row 16: This block defines one property, that is, one parameter displayed on the right-side panel in SAS Visual Forecasting. To add a new parameter, copy the block and change the following attributes:
    1. type: either “integer” or “double”
    2. name: the name of the parameter as used in the code.sas file
    3. displayName: the label of the parameter shown in the parameter form

Leave all the other attributes unchanged, and delete the blocks of parameters that are not required in your new node. For further details on these attributes, please go to the following link.
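
A new property block can also be generated rather than copied by hand. The following Python sketch mirrors the attributes of the blocks in Figure 4. The function make_property and the parameter _DROPOUT are hypothetical, and generating a fresh id for each parameter is my assumption; the source does not state how ids should be chosen.

```python
import uuid
import xml.etree.ElementTree as ET


def make_property(name: str, display_name: str, value_type: str = "integer") -> ET.Element:
    """Build one <property> element shaped like the blocks in validation.xml."""
    if value_type not in ("integer", "double"):
        raise ValueError("value_type must be 'integer' or 'double'")
    prop = ET.Element("property", {
        "type": value_type,
        "required": "true",
        "selector": "false",
        "id": str(uuid.uuid4()),  # assumption: a fresh id per parameter
        "name": name,
        "displayName": display_name,
        "array": "false",
        "enabledWhenValue": "false",
    })
    # Empty child elements, as in the existing blocks.
    ET.SubElement(prop, "clientProperties")
    ET.SubElement(prop, "constraints")
    return prop


# Hypothetical new parameter for illustration.
prop = make_property("_DROPOUT", "Dropout Rate", "double")
xml_text = ET.tostring(prop, encoding="unicode")
print(xml_text)
```

The printed element can be pasted inside the inner properties element of the relevant group.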

 

code.sas

In this file, we define the run-time SAS code that is executed in SAS Visual Forecasting pipelines to generate forecasts. The code for running the LSTM TensorFlow process is shown in Figure 5. We write the open-source code inside the SAS Visual Forecasting TSMODEL procedure. This lets the user take advantage of the capabilities of SAS VF's procedures, because the data is automatically preprocessed and accumulated before the open-source code is executed. In addition, by using the TSMODEL procedure, the open-source algorithm runs in the distributed in-memory compute engine of SAS Viya. For further information on automatic data preprocessing and distribution, refer to this publication.

 

1  /*----------------------------------------------------------------------+
2   | LSTM Forecasting with the EXTLANG Package and TensorFlow 
3   | 
4   | Any questions, please contact:
5   |
6   | Taiyeong.Lee@sas.com for TensorFlow python code 
7   | Javier.Delgado@sas.com for the EXTLANG package 
8   | Iman.VasheghaniFarahani@sas.com for Visual Forecasting pluggable code 
9  +------------------------------------------------------------------------*/
10 
11 ods output OutInfo = _outInformation;
12 
13 proc tsmodel data=&vf_libIn.."&vf_inData"n lead = &vf_lead.
14             outobj=(outfor  = &vf_libOut.."&vf_outFor"n
15             outSelect = &vf_libOut.."&vf_outSelect"n
16             outStat = &vf_libOut.."&vf_outStat"n
17             outmodelinfo = &vf_libOut.."&vf_outModelInfo"n
18             outvarstatus=&vf_libOut..outvarstatus
19             pylog=&vf_libOut..pylog)
20             outarray = &vf_libOut..outarray
21             outlog = &vf_libOut.."&vf_outLog"n;
22     id &vf_timeID interval = &vf_timeIDInterval setmissing = &vf_setMissing trimid = LEFT;
23     %vf_varsTSMODEL;
24 
25     *define the by variables if exist;
26     %if "&vf_byVars" ne "" %then %do;
27        by &vf_byVars;
28     %end;
29     
30     outarray tf_fcst;
31     require atsm tsm extlang;
32     submit;
33     /* specify common options  */
34     NINPUT = &_NINPUT;                      /* input window width, time_steps     */ 
35     NHOLDOUT = &_holdoutSampleSize;         /* holdout sample size                */
36 
37     /* some TensorFlow Keras model options */
38 
39     MAXEPOCH = &_MAXEPOCH.;                 /* maximum number of epochs           */
40     LEARNING_RATE = &_LEARNING_RATE.;       /* learning rate for optimizer        */ 
41     BATCH_SIZE = &_BATCH_SIZE.;             /* minibatch size                     */
42     SEED = &_SEED.;                         /* seed for random number             */ 
43     ES_MIN_DELTA = &_ES_MIN_DELTA;          /* early stopping delta parameter     */
44     ES_PATIENCE = &_ES_PATIENCE.;           /* early stopping stagnation parameter*/
45     NUM_LSTM_LAYER = &_NUM_LSTM_LAYER.;     /* number of LSTM units (see row 97)  */
46 
47     declare object py(PYTHON3); 
48     rc = py.Initialize();
49     rc = py.AddVariable(&vf_depVar,'ALIAS','TARGET') ;
50     rc = py.AddVariable(&vf_timeID);
51     rc = py.AddVariable(NINPUT);
52     rc = py.AddVariable(NHOLDOUT);
53     rc = py.AddVariable(_LEAD_); /*pass the predefined variable to TF*/ 
54     rc = py.AddVariable(MAXEPOCH);
55     rc = py.AddVariable(LEARNING_RATE);
56     rc = py.AddVariable(BATCH_SIZE);
57     rc = py.AddVariable(SEED);
58     rc = py.AddVariable(ES_MIN_DELTA);
59     rc = py.AddVariable(ES_PATIENCE);
60     rc = py.AddVariable(NUM_LSTM_LAYER);
61     rc = py.AddVariable(tf_fcst,"READONLY","NO","ARRAYRESIZE","YES","ALIAS",'PREDICT'); 
62     * rc = py.AddEnvVariable('_TKMBPY_DEBUG_FILES_PATH', &log_folder);
63 
64     /* The beginning of TensorFlow python code */
65     rc = py.PushCodeLine("import numpy as np");
66     rc = py.PushCodeLine("import tensorflow as tf");
67     rc = py.PushCodeLine("from tensorflow import keras");
68     rc = py.PushCodeLine("from sklearn.preprocessing import StandardScaler");
69     rc = py.PushCodeLine("from tensorflow.keras.callbacks import EarlyStopping");
70     rc = py.PushCodeLine("time_steps = int(NINPUT)");
71     rc = py.PushCodeLine("lead = int(_LEAD_)");
72     rc = py.PushCodeLine("nholdout = int(NHOLDOUT)");
73     rc = py.PushCodeLine("maxepoch = int(MAXEPOCH)");
74     rc = py.PushCodeLine("learning_rate = float(LEARNING_RATE)");
75     rc = py.PushCodeLine("batch_size = int(BATCH_SIZE)");
76     rc = py.PushCodeLine("es_min_delta = float(ES_MIN_DELTA)");
77     rc = py.PushCodeLine("es_patience = int(ES_PATIENCE)");
78     rc = py.PushCodeLine("seed = int(SEED)");
79     rc = py.PushCodeLine("np.random.seed(seed)");
80     rc = py.PushCodeLine("tf.random.set_seed(seed)");
81     rc = py.PushCodeLine("x = TARGET[0:len(TARGET)-lead]");
82     rc = py.PushCodeLine("x = np.reshape(x, (x.shape[0], 1))");
83     rc = py.PushCodeLine("date = np.reshape(DATE, (DATE.shape[0], 1))");
84     rc = py.PushCodeLine("scaler = StandardScaler()");
85     rc = py.PushCodeLine("std_x  = scaler.fit_transform(x)");
86     rc = py.PushCodeLine("inputdata, targetdata = [], []");
87     rc = py.PushCodeLine("for i in range(len(std_x) - time_steps):");
88     rc = py.PushCodeLine("  inputdata.append(std_x[i: (i+time_steps),])");
89     rc = py.PushCodeLine("  targetdata.append(std_x[i+time_steps,])");
90     rc = py.PushCodeLine("inputdata  = np.array(inputdata)");
91     rc = py.PushCodeLine("targetdata = np.array(targetdata)");
92     rc = py.PushCodeLine("datalength = len(inputdata)");
93     rc = py.PushCodeLine("ntrain = datalength - nholdout");
94     rc = py.PushCodeLine("xtrain, ytrain = inputdata[0:ntrain, ], targetdata[0:ntrain,]");
95     rc = py.PushCodeLine("xvalid, yvalid = inputdata[ntrain:, ], targetdata[ntrain:,]");
96     rc = py.PushCodeLine("model = keras.Sequential()");
97     rc = py.PushCodeLine("model.add(keras.layers.LSTM(int(NUM_LSTM_LAYER), input_shape=(xtrain.shape[1], xtrain.shape[2])))");
98     rc = py.PushCodeLine("model.add(keras.layers.Dense(1))");
99     rc = py.PushCodeLine("early_stopping = EarlyStopping(monitor='val_loss', min_delta=es_min_delta, patience=es_patience, restore_best_weights=True)");
100    rc = py.PushCodeLine("model.compile(loss='mean_squared_error',optimizer=keras.optimizers.Adam(learning_rate))");
101    rc = py.PushCodeLine("fit_history = model.fit(x=xtrain, y=ytrain, validation_data= (xvalid, yvalid),epochs=maxepoch, batch_size=batch_size, shuffle=False, callbacks=[early_stopping])");
102    rc = py.PushCodeLine("std_pred_train = model.predict(xtrain)");
103    rc = py.PushCodeLine("std_pred_valid = model.predict(xvalid)");
104    rc = py.PushCodeLine("pred_train = scaler.inverse_transform(std_pred_train)");
105    rc = py.PushCodeLine("pred_valid = scaler.inverse_transform(std_pred_valid)");
106    rc = py.PushCodeLine("init_window_pred = np.full([time_steps,1], np.nan)");
107    rc = py.PushCodeLine("pred = np.concatenate((init_window_pred, pred_train, pred_valid), axis=0)");
108    rc = py.PushCodeLine("if nholdout > 0:");
109    rc = py.PushCodeLine("	length = len(xvalid)");
110    rc = py.PushCodeLine("	ylast = yvalid[length-1:length]");
111    rc = py.PushCodeLine("	xlast = xvalid[length-1:length]");
112    rc = py.PushCodeLine("else:");
113    rc = py.PushCodeLine("	length = len(xtrain)");
114    rc = py.PushCodeLine("	xlast = xtrain[length-1:length]");
115    rc = py.PushCodeLine("	ylast = ytrain[length-1:length]");
116    rc = py.PushCodeLine("xnew = np.copy(xlast)");
117    rc = py.PushCodeLine("for i in range(time_steps):");
118    rc = py.PushCodeLine("	if(i < time_steps-1):");
119    rc = py.PushCodeLine("		xnew.itemset((0,i,0), xlast.item(0,i+1,0))");
120    rc = py.PushCodeLine("	if(i == (time_steps-1)):");
121    rc = py.PushCodeLine("		xnew.itemset((0,i,0), ylast.item(0,0))");
122    rc = py.PushCodeLine("std_fcst = list()");
123    rc = py.PushCodeLine("for k in range(lead):");
124    rc = py.PushCodeLine("	std_pred_xlast = model.predict(xlast)");
125    rc = py.PushCodeLine("	std_fcst.append(std_pred_xlast[0:1,0])");
126    rc = py.PushCodeLine("	xnew = np.copy(xlast)");
127    rc = py.PushCodeLine("	for i in range(time_steps):");
128    rc = py.PushCodeLine("		if(i < time_steps-1):");
129    rc = py.PushCodeLine("			xnew.itemset((0,i,0), xlast.item(0,i+1,0))");
130    rc = py.PushCodeLine("		if(i == (time_steps-1)):");
131    rc = py.PushCodeLine("			xnew.itemset((0,i,0), ylast.item(0,0))");
132    rc = py.PushCodeLine("	xlast = np.copy(xnew)");
133    rc = py.PushCodeLine("std_fcstarray = np.array(std_fcst)");
134    rc = py.PushCodeLine("forecast = scaler.inverse_transform(std_fcstarray)");
135    rc = py.PushCodeLine("pred_all = np.concatenate((pred, forecast), axis=0)");
136    rc = py.PushCodeLine("pred_all = np.reshape(pred_all, pred_all.shape[0])");
137    rc = py.PushCodeLine("PREDICT = pred_all");
138    /* The ending of TF python code */
139    rc = py.Run();  
140    /* Store the execution and resource usage statistics logs */
141    declare object pylog(OUTEXTLOG);
142    rc = pylog.Collect(py,'EXECUTION');
143    declare object outvarstatus(OUTEXTVARSTATUS);
144    rc = outvarstatus.Collect(py);
145    declare object pyExmSpec(EXMSPEC);
146    rc = pyExmSpec.open();
147    rc = pyExmSpec.setOption('METHOD','PERFECT');
148    rc = pyExmSpec.setOption('NLAGPCT',0);
149    rc = pyExmSpec.setOption('PREDICT','tf_fcst');
150    rc = pyExmSpec.close();
151    
152    declare object dataFrame(tsdf);
153    declare object diagnose(diagnose);
154    declare object diagSpec(diagspec);
155    declare object inselect(selspec); 
156    declare object forecast(foreng);
157    
158    /*initialize the tsdf object and assign the time series roles: setup dependent and independent variables*/
159    rc = dataFrame.initialize();
160    rc = dataFrame.AddSeries(tf_fcst);
161    rc = dataFrame.addY(&vf_depVar);
162    
163    /*Run model selection and forecast*/                     
164    rc = inselect.Open(1); 
165    rc = inselect.AddFrom(pyExmSpec);
166    rc = inselect.close(); 
167    
168    /*initialize the foreng object with the diagnose result and run model selecting and generate forecasts;*/         
169    rc = forecast.initialize(dataFrame);
170    rc = forecast.AddFrom(inselect);
171    rc = forecast.setOption('lead', &vf_lead);
172    rc = forecast.setOption('back', &vf_back);
173    
174    %if "&vf_allowNegativeForecasts" eq "FALSE" %then %do;
175        rc = forecast.setOption('fcst.bd.lower',0);
176    %end;
177    rc = forecast.Run();
178
179    /*collect forecast results*/
180    declare object outFor(outFor);
181    declare object outSelect(outSelect);
182    declare object outStat(outStat);
183    declare object outModelInfo(outModelInfo);
184
185    /*collect the forecast and statistic-of-fit from the foreng object run results; */
186    rc = outFor.collect(forecast);
187    rc = outSelect.collect(forecast); 
188    rc = outStat.collect(forecast);  
189    rc = outModelInfo.collect(forecast);
190 endsubmit;
191 run;
192
193 /* generate outinformation CAS table */
194 data &vf_libOut.."&vf_outInformation"n;
195     set work._outInformation;
196 run;

Figure 5: Example - code.sas file

 

 

Let us now discuss where specific changes are required to adapt this template code for your new node:

  • Row 13 – Row 30: This is system-generated code. It uses the built-in data preparation macros to read the input data prepared by the data node of the VF pipeline. The output tables, such as outFor and outSelect, are also declared in this step.
  • Row 31: The “require” statement ensures that the external language package (EXTLANG) is available in the Viya runtime environment. EXTLANG is required to run open-source code inside Visual Forecasting.
  • Row 34 – Row 35: Specify the common input parameters: the input window width and the holdout sample size.
  • Row 39 – Row 45: Here, we specify the TensorFlow Keras model parameters. For your own custom node, specify your model parameters here.
  • Row 47: Declares Python as the open-source programming language.
  • Row 49 – Row 61: Specifies the variables that are shared between the SAS and Python code. Adapt this block for your custom node.
  • Row 65 – Row 137: The TensorFlow Python code. Here, we push the Python code line by line into the SAS compute engine. An alternative is to place the Python script on the deployment (where it is accessible to all worker nodes) and call it from code.sas using the InterpreterObject.PushCodeFile method (see Figure 6).
  • Row 139 – Row 196: This block is system-generated code and need not be changed.

A detailed description of each system-generated macro variable is provided in the following link.
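
To make the data preparation in the Python part easier to follow, here is a minimal, generic sketch of the two ideas involved: building sliding-window training pairs from a series, and producing multi-step forecasts by feeding each prediction back into the input window. It is not the code from Figure 5. It uses plain Python lists, and a hypothetical stand-in predictor (naive_trend) takes the place of the Keras model.

```python
def make_windows(series, time_steps):
    """Split a series into (input window, next value) training pairs."""
    inputs, targets = [], []
    for i in range(len(series) - time_steps):
        inputs.append(series[i:i + time_steps])
        targets.append(series[i + time_steps])
    return inputs, targets


def recursive_forecast(predict, last_window, lead):
    """Forecast `lead` steps ahead, feeding each prediction back as input."""
    window = list(last_window)
    forecasts = []
    for _ in range(lead):
        next_value = predict(window)
        forecasts.append(next_value)
        # Drop the oldest value and append the newest prediction.
        window = window[1:] + [next_value]
    return forecasts


def naive_trend(window):
    """Hypothetical stand-in for a trained model: extend the last difference."""
    return window[-1] + (window[-1] - window[-2])


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
inputs, targets = make_windows(series, time_steps=3)
forecasts = recursive_forecast(naive_trend, series[-3:], lead=2)
print(inputs, targets, forecasts)
```

With a window width of 3, the six observations yield three training pairs, and the two forecasts continue the linear trend of the toy series.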

 


 

Figure 6: An alternative approach that calls the Python script using the PushCodeFile method. Line 63 in the code.sas file calls the external LSTM_Python_Code.py script. Update the path so that it points to the location of your own script.
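
If you prefer the line-by-line approach of Figure 5 but want to maintain the Python code in a separate script, the PushCodeLine statements can be generated from that script. The sketch below is a hypothetical helper, not a SAS utility. It assumes that doubling embedded double quotes is sufficient escaping, and it does not handle the characters & and %, which the SAS macro processor may interpret inside double-quoted strings, nor Python strings that span several lines.

```python
def to_push_code_lines(python_source: str, obj: str = "py") -> list:
    """Turn Python source text into SAS PushCodeLine statements (sketch)."""
    statements = []
    for line in python_source.splitlines():
        if not line.strip():
            continue  # skip blank lines; indentation of other lines is kept
        # Assumption: embedded double quotes are escaped by doubling them.
        escaped = line.replace('"', '""')
        statements.append(f'rc = {obj}.PushCodeLine("{escaped}");')
    return statements


# Small demo script (hypothetical content).
source = 'import numpy as np\n\nfor i in range(3):\n    print("i =", i)\n'
statements = to_push_code_lines(source)
print("\n".join(statements))
```

Review the generated statements before pasting them into code.sas, especially any line that contains quotes or macro trigger characters.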

 

Zipping the Custom node

Once all four files are created, the next step is to zip them and import the zipped file through The Exchange in SAS Visual Forecasting (see Figure 7).

 


Figure 7: How to add Custom forecasting node on Model Studio.
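
The zipping step can be scripted as well. The following Python sketch checks that all four files exist and writes them to an archive. The function zip_node and the archive name are hypothetical, and placing the files at the root of the archive is my assumption about the expected layout.

```python
import tempfile
import zipfile
from pathlib import Path

NODE_FILES = ("metadata.json", "template.json", "validation.xml", "code.sas")


def zip_node(folder, zip_path) -> Path:
    """Zip the four node files found in `folder` into `zip_path`."""
    folder, zip_path = Path(folder), Path(zip_path)
    missing = [f for f in NODE_FILES if not (folder / f).is_file()]
    if missing:
        raise FileNotFoundError(f"missing node files: {missing}")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in NODE_FILES:
            # arcname keeps the files at the archive root (assumed layout).
            archive.write(folder / name, arcname=name)
    return zip_path


# Demo with placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for name in NODE_FILES:
        (Path(tmp) / name).write_text("{}")
    out = zip_node(tmp, Path(tmp) / "my_custom_node.zip")
    with zipfile.ZipFile(out) as archive:
        zipped_names = sorted(archive.namelist())

print(zipped_names)
```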

 

Summary

In this article, I have explained how to create a custom forecasting node in SAS Visual Forecasting by making the necessary changes to the template files provided with this article (a ready-to-use zipped Python_LSTM_Forecasting node is available for download). An experienced data scientist can develop the custom open-source forecasting code and then create a GUI parameter form by following the steps above. With the UI parameter form, business users (coders or non-coders) can then change the model parameters and adapt the underlying forecasting code to their business needs.

Acknowledgments

I would like to thank Taiyeong Lee for sharing the TensorFlow Python code, Javier Delgado for sharing the zipped LSTM TensorFlow VF node, and Iman Vasheghani Farahani for sharing details of the attributes in the VF node.

