Question How can I deploy score code for anomaly detection from a Data Mining and Machine Learning project in Model Studio?
Answer Currently, in a Data Mining and Machine Learning project in Model Studio, you can deploy score code only for a predictive model, that is, for a branch of the pipeline that includes a Supervised Learning node. However, you might want the score code from the Anomaly Detection node, which uses an unsupervised learning method and therefore does not involve the target variable. The SVDD procedure used by the node runs the support vector data description algorithm to detect anomalies, or outliers, in your data based on the input variables only. Its score code is saved in an analytic store. To deploy this analytic store for scoring new data, you can emulate having a model in that branch of your pipeline: add a SAS Code node after the Anomaly Detection node and move it to Supervised Learning. Your pipeline should now look as follows:
Now, in the SAS Code node, you can include code that simulates a column of target predictions. Note that if your data has no true target, you can either create a pseudo target or use another variable in your data set that is not used as an input to the Anomaly Detection node. If your target is interval, enter the following code in the Scoring code pane of the SAS Code node editor to create a pseudo variable of target predictions:
length P_target 8;
P_target = .;
Here, replace target in both statements above with the name of the actual target variable. If the target is binary or nominal, use the following code instead to create a pseudo variable for the posterior probability of the target event:
length P_targetlevel 8;
P_targetlevel = .;
Here, replace target with the target variable name and level with the event level of the target. For example, if the target variable is named BAD with event level 1, the variable name would be P_BAD1. Note that these variable names assume that the resulting name is 32 characters or fewer in length.
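As a concrete illustration of the naming rule, the BAD example above could be entered in the Scoring code pane roughly as follows. This is a minimal sketch for that one case; the comments are added here for explanation only, and the variable name must be adjusted to your own target and event level.

```sas
/* Placeholder posterior probability for target BAD, event level 1. */
/* The name is built as P_ + target name + event level.             */
length P_BAD1 8;

/* Set to missing: no supervised model actually produces this value. */
P_BAD1 = .;
```

The column carries no real predictions. It only gives the branch the prediction variable that it needs to be treated like a supervised model.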
From the Pipeline Comparison tab, you can then deploy your score code in the usual ways (register, publish, or download), just as you would for a supervised model. This approach also works for deploying score code from other Data Mining Preprocessing nodes, such as Clustering.