05-05-2014 06:51 PM
I am new to the forum and to SAS. I am trying to find the classification matrix for the Test set after using a Scorecard node, but I can only see it for the training set.
I am uploading my diagram.
Any help would be much appreciated.
05-06-2014 11:46 AM
Thanks for the clear screenshot. It sure helps.
Not sure why you are not seeing the Classification Table for your Test Set. I assume you are seeing it for your Train and Validation, but not for your Test set?
I am looking into that...
In the meantime, I noticed that you have combined two popular approaches to this problem. Here is a quick rundown of the two approaches and why I think you might not want to combine them.
Take a look at the attached image.
Flow A (Data->Partition->IGN->SC) is the most common way to model a binary target in a regulated environment like credit scoring.
Flow B (Data->Partition->IGN->Regression->Cutoff) is a common way to customize a regression when you do not care much about having score points to interpret.
When you combine both approaches the way you showed in your screenshot, I am pretty sure that your Regression and Cutoff nodes get ignored, although I haven't checked this thoroughly. The reason is that the Scorecard node always runs its own regression, which cannot be turned off, and it cannot pick up results from another Regression node. Put another way, the Scorecard node is a model node in its own right.
I hope this helps with your task. I will keep you posted with what I find about the test classification matrix.
05-06-2014 04:03 PM
Thank you for your help. I understand now why I shouldn't have both Regression and Scorecard. In Data Partition I have 66% train and 34% test. First, I am trying to find the parameter estimates for the predictive variables using logistic regression after coarse classification, which confuses me a bit because I have both group variables and WoE variables. Then I am trying to find the classification matrix and ROC curve for both the train and test sets with a 0.5 cutoff. I used the Scorecard node because it was the only node that showed a ROC curve in its results. Any idea how I can do this?
I hope what I am trying to find is possible.
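On the group versus WoE confusion: the group variable is just the bin label produced by coarse classification, while the WoE variable replaces that label with one number per bin. Here is a minimal Python sketch of that idea, not the Interactive Grouping node's actual implementation. The function name and the data are made up for illustration, and the sign convention (non-event share over event share) is an assumption, since some references flip it.

```python
import math

def woe_by_group(groups, y):
    """Weight of evidence per group (hypothetical helper).

    groups: bin label for each observation (the "group" variable)
    y:      1 for an event, 0 for a non-event
    Assumes every group contains at least one event and one non-event;
    otherwise the ratio is undefined and some adjustment is needed.
    """
    total_events = sum(y)
    total_nonevents = len(y) - total_events
    counts = {}
    for g, target in zip(groups, y):
        events, nonevents = counts.get(g, (0, 0))
        counts[g] = (events + target, nonevents + (1 - target))
    return {
        g: math.log((nonevents / total_nonevents) / (events / total_events))
        for g, (events, nonevents) in counts.items()
    }

# Made-up data: two bins of one coarse-classified input.
groups = [1, 1, 1, 1, 2, 2, 2, 2]
target = [1, 0, 0, 0, 1, 1, 1, 0]
woe = woe_by_group(groups, target)
```

In a regression you would normally use either the group variable (as a class input) or the WoE variable (as a single numeric input) for a given characteristic, not both.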
05-06-2014 04:08 PM
I think what you want is to:
-have 66% train and 34% validation in the data partition node.
-if you want to use the 0.5 cutoff, you don't need the Cutoff node. If you want any value other than 0.5, then you do need the Cutoff node to specify that value.
-add a Model Comparison node after any model node (like regression node) to see a ROC curve. Even if you only have one model to compare, this node will give you all the stats you want, including a ROC plot. Notice that the area under the ROC curve is called c-statistic in the Model Comparison node results.
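To make the two outputs concrete, here is a small Python sketch (outside Enterprise Miner) of what a classification table at a 0.5 cutoff and the c-statistic represent. The function names and the data are made up, and treating a probability exactly equal to the cutoff as a predicted event is an assumption; the node's own handling may differ.

```python
def classification_table(y_true, p_event, cutoff=0.5):
    """Return (tp, fp, tn, fn); p_event >= cutoff counts as a predicted event."""
    tp = fp = tn = fn = 0
    for actual, p in zip(y_true, p_event):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def c_statistic(y_true, p_event):
    """Area under the ROC curve as the share of (event, non-event) pairs
    in which the event has the higher predicted probability; ties count half."""
    events = [p for actual, p in zip(y_true, p_event) if actual == 1]
    nonevents = [p for actual, p in zip(y_true, p_event) if actual == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

# Made-up scored holdout data: actual target and predicted event probability.
y_holdout = [1, 0, 1, 0, 0, 1, 0, 1]
p_holdout = [0.9, 0.2, 0.6, 0.55, 0.1, 0.4, 0.3, 0.8]

table = classification_table(y_holdout, p_holdout)  # (3, 1, 3, 1)
auc = c_statistic(y_holdout, p_holdout)             # 0.9375
```

The same two functions can be applied to each partition separately, which is what the per-partition tables in the results amount to.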
I hope it helps,
05-06-2014 04:41 PM
Thanks, I am understanding more and more. So are the validation set and the test set the same thing?
I found ROC plot and I can see classification table for both train and validation sets.
I am also trying to find the accuracy ratio. Is it the same as the Gini coefficient? (Found it.)
I can't find AUC or C-statistic, I am uploading a screenshot from my results.
Thank you so much!
05-07-2014 07:22 AM
I found the AUC; it is called ROC index.
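For reference, the ROC index and the Gini coefficient from the earlier question are tied together by the usual definition for a binary target, Gini = 2 * AUC - 1, and the accuracy ratio is normally the same quantity. A tiny Python sketch of that relationship, with a hypothetical function name and made-up values:

```python
def gini_from_auc(auc):
    """Gini coefficient (accuracy ratio) implied by an ROC index / AUC value."""
    return 2.0 * auc - 1.0

# Made-up values: a random model (AUC 0.5) and a moderately good one (AUC 0.75).
gini_random = gini_from_auc(0.5)    # 0.0
gini_model = gini_from_auc(0.75)    # 0.5
```

So a model with no discriminatory power has a Gini of 0, and a perfect ranking has a Gini of 1.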
My only remaining issue is whether to use a test set or a validation set. From what I have read, they are not the same. There should be a way to see the results for the test set.