About WendyCzika

WendyCzika · ‎04-09-2020

Yes, your data has been partitioned by default, and those results are computed only on the training observations. See this post for how to re-run your pipeline with different partition settings. You can change the partition settings in either global settings or project settings (it is easiest to change them for a project before you have run any nodes in a pipeline). https://communities.sas.com/t5/SAS-Communities-Library/Asked-amp-Answered-How-to-change-data-source-or-data-partition/ta-p/532628

WendyCzika · ‎04-01-2020

In any of the Supervised Learning nodes in Model Studio, there are model assessment results that include various types of lift and gain plots, where the data is sorted in descending order by predicted probability of event then divided into demi-deciles (every 5%). I'm not sure if that is what you mean or something else?
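
As a rough illustration of the sorting and binning described above, here is a minimal Python sketch of cumulative lift at 5% depth steps. The function name `cumulative_lift` is hypothetical, and the tie handling and bin-boundary rounding are assumptions; Model Studio's exact computation may differ.

```python
def cumulative_lift(y_true, p_event, n_bins=20):
    """Cumulative lift at each depth bin (5% steps when n_bins=20).

    y_true:  list of 0/1 flags, 1 meaning the target event occurred.
    p_event: list of predicted event probabilities, same order as y_true.
    Bin boundaries and tie handling are illustrative assumptions.
    """
    if len(y_true) != len(p_event) or not y_true:
        raise ValueError("y_true and p_event must be non-empty and equal in length")

    # Sort observations by predicted event probability, highest first.
    order = sorted(range(len(p_event)), key=lambda i: p_event[i], reverse=True)
    sorted_y = [y_true[i] for i in order]

    n = len(sorted_y)
    overall_rate = sum(sorted_y) / n
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no events")

    lifts = []
    for b in range(1, n_bins + 1):
        # Number of observations in the top b/n_bins fraction of the data.
        cutoff = max(1, round(b * n / n_bins))
        top = sorted_y[:cutoff]
        lifts.append((sum(top) / len(top)) / overall_rate)
    return lifts
```

With 40 observations of which the 4 highest-scored are the only events, the first 5% bin has a lift of 10 and the lift falls to 1 at 100% depth.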

WendyCzika · ‎03-11-2020

For logistic regression and lift/gain charts, you need a categorical target (binary or nominal). If you are getting those predicted plots instead of the lift/gain plots, your target might have a level of Interval, so you might want to check in your Input Data node that the target has the correct level set.

WendyCzika · ‎02-19-2020

In Model Studio, you can't specify priors for data that has already been sampled. The percentages that you enter in Project Settings are the ones that you want to achieve via sampling the full data. So since you already have a 50/50 sample, it is trying to create a 5/95 sample, which can't be done. Hope that makes sense.

WendyCzika · ‎02-12-2020

I think this section of the documentation on the Incremental Response node might help to explain how the scores are calculated for the 2 approaches (using a combined model, or separate models for treatment vs. control): https://go.documentation.sas.com/?docsetId=emex&docsetTarget=p002lnittq9wd5n1cf0ujrccc0hv.htm&docsetVersion=14.2&locale=en#n0dxc9jia326f9n1a0owv7gb6cjx

WendyCzika · ‎01-13-2020

Question

How can I deploy score code for anomaly detection from a Data Mining and Machine Learning project in Model Studio?

Answer

Currently in a Data Mining and Machine Learning project in Model Studio, you can only deploy the score code for a predictive model, i.e. a branch of the pipeline that includes a Supervised Learning node. But perhaps you want the score code from the Anomaly Detection node, which uses an unsupervised learning method (not involving the target variable). The SVDD procedure used in the node performs the support vector data description algorithm to detect anomalies or outliers in your data, based on the input variables only. Its score code is saved in an analytic store.

To be able to deploy this analytic store to score new data, you can emulate having a model in the branch of your pipeline by adding a SAS Code node after the Anomaly Detection node and moving it to Supervised Learning. Your pipeline should now look as follows:

[Image: pipeline with a SAS Code node following the Anomaly Detection node]

Now in the SAS Code node, you can include code to simulate a column of target predictions. Note that if you don't have a true target in your data, you can either create a pseudo one or use another variable in your data set that is not used as an input for the Anomaly Detection node.

If your target is interval, enter the following code into the Scoring code pane of the SAS Code node editor to create a pseudo variable of target predictions:

    length P_target 8;
    P_target = .;

where target is replaced in both places above with the name of the actual target.

If it is a binary or nominal target, you will need to do the following to create a pseudo variable for the posterior probability of the target event:

    length P_targetlevel 8;
    P_targetlevel = .;

where target is replaced with the target variable name, and level is replaced with the event level of the target. For example, if the target variable is named BAD with event level 1, the variable name would be P_BAD1. Note that these variable names assume that the resulting name length is 32 characters or less.

You will then be able to deploy your score code from the Pipeline Comparison tab in various ways (register, publish, download), just as you would for a supervised model. This approach also works for deploying score code from other Data Mining Preprocessing nodes, such as Clustering.
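
To make the naming rule concrete, here is a small Python sketch that builds the pseudo prediction variable name and the matching scoring-code line. The helper names `pseudo_prediction_name` and `scoring_code` are hypothetical. The sketch only rejects names longer than 32 characters; how Model Studio itself shortens longer names is not covered here.

```python
SAS_NAME_MAX = 32  # the post assumes the resulting name is 32 characters or less


def pseudo_prediction_name(target, event_level=None):
    """Build P_<target> (interval) or P_<target><level> (binary/nominal)."""
    if event_level is None:
        name = f"P_{target}"
    else:
        name = f"P_{target}{event_level}"
    if len(name) > SAS_NAME_MAX:
        # Assumption: we simply refuse long names instead of guessing
        # at whatever truncation rule the software applies.
        raise ValueError(f"{name!r} exceeds {SAS_NAME_MAX} characters")
    return name


def scoring_code(target, event_level=None):
    """Return the line to paste into the Scoring code pane."""
    name = pseudo_prediction_name(target, event_level)
    return f"length {name} 8; {name} = .;"
```

For example, `scoring_code("BAD", 1)` produces the line for the P_BAD1 case described above.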

WendyCzika · ‎12-18-2019

This might help (includes link to code in GitHub): https://communities.sas.com/t5/SAS-Communities-Library/Autoencoder-analysis-using-PROC-NNET-and-neuralNet-action-set/ta-p/447135

WendyCzika · ‎12-04-2019

I'm not sure why it's behaving like that for you. When I do the same thing making a split just on 1 input, in the score code of the Score node, I see just that split. Do you have data preprocessing nodes in the flow before the Decision Tree node like Impute or Transform Variables that you are seeing the score code for? When you view the results of the Decision Tree node, is the tree the one that you created interactively?

WendyCzika · ‎12-03-2019

The score code should represent the tree that you created in the interactive editor, as long as you didn't re-run the node without the Use Frozen Tree property checked. So you might need to go back into the interactive editor and check that your interactive changes are still there, then when you close out of that and save changes if prompted, the score code in the results of the node should represent that tree and only contain the 3 inputs that were used for interactive splitting.

WendyCzika · ‎12-03-2019

Variable Clustering clusters your inputs for the purpose of dimension reduction. The other two, Cluster and HP Cluster, are for clustering observations for unsupervised segmentation of your data. There is some information about the differences between the 2 procedures used in these nodes here: https://communities.sas.com/t5/SAS-Communities-Library/Tip-K-means-clustering-in-SAS-comparing-PROC-FASTCLUS-and-PROC/ta-p/221369 And more information on the Cluster node here: https://communities.sas.com/t5/SAS-Communities-Library/Tip-Guidelines-for-Choosing-a-Clustering-Method-in-the-Cluster/ta-p/223483 Since these 2 nodes aren't performing hierarchical clustering, I don't believe you can create dendrograms for these.

WendyCzika · ‎10-31-2019

Just checking that you set EM_PMML to Yes before running the Regression node? If not, you might need to rerun it. You can force it to rerun by changing the "Rerun" property on the Input Data node to Yes and then rerunning Regression. If that's not it, are you running with default properties for the Regression node or did you change any? I'm not saying that is necessarily why it is not being produced, but I'm just trying to figure out what is different: running with defaults and one of the sample data sets, I am getting the PMML score code in the results.

WendyCzika · ‎10-29-2019

Two things that might help:

1) From the Properties panel of the Cluster node, you can open up the window for Exported Data, then browse the Train data to see which cluster each observation is assigned to (the columns on the far right of the table will show the cluster assignment).

2) You can run the Segment Profile node (from the Assess tab) after the Cluster node to get an idea of the make-up of the clusters/segments with the various Profile graphs.

WendyCzika · ‎10-09-2019

In the score data, you typically do not have an observed target, so you wouldn't be able to create an ROC curve for that.
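
A minimal Python sketch shows why: every point on an ROC curve is a (false positive rate, true positive rate) pair, and both rates are counted against the observed target. The function name `roc_points` is hypothetical, and this is a generic illustration, not the computation used by any SAS node.

```python
def roc_points(y_true, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    y_true: observed 0/1 target values; scores: predicted event probabilities.
    Without y_true there is nothing to count, so the curve cannot be built.
    """
    if y_true is None:
        raise ValueError("an observed target is required to build an ROC curve")
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both target levels must be present")

    # Walk down the scores from highest to lowest, treating each distinct
    # score as a classification threshold.
    pairs = sorted(zip(scores, y_true), key=lambda t: t[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i, (score, y) in enumerate(pairs):
        tp += y
        fp += 1 - y
        is_last = i == len(pairs) - 1
        # Emit a point only after all observations tied at this score.
        if is_last or pairs[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points
```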

WendyCzika · ‎09-26-2019

It looks like you need to do it as follows:

    ods pmml file=" &pth.\sas_rule&i..XML" encoding="UTF-8";
    proc arbor. ... ...
    code pmml;
    run;
    ods pmml close;

WendyCzika · ‎09-17-2019

No. HPFOREST generates a binary representation of the score code instead of DATA step code because the model is typically too large to even compile as DATA step score code.
