SAS Enterprise Miner - Score node: how can I adapt the SAS code to my issue

Hello all,

I am modelling an imbalanced dataset and trying to improve my results with the Cutoff node in SAS Enterprise Miner. For instance, I set the cutoff value to 0.05 and connect a SAS Code node to the Cutoff node, containing the following statement (taken from http://support.sas.com/resources/papers/proceedings12/127-2012.pdf):

I_top_order = EM_CUTOFF;

After the SAS Code node, I connect another Model Comparison node, and finally a Score node in order to obtain the SAS score code.

This SAS code seems to be OK: it contains my new statement. But right after that it creates 10 segments, and the threshold values are exactly the same as when modelling without the cutoff. See this extract of the SAS code:

if (P_top_order1 ge 0.01470274377704) then do;
   b_top_order = 1;
end;
else if (P_top_order1 ge 0.00634736646348) then do;
   b_top_order = 2;
end;
else if (P_top_order1 ge 0.00470666288792) then do;
   b_top_order = 3;

How can I change these values so that they reflect my cutoff value? Could you also tell me how these different values are calculated?

Thanks for your assistance.

Benjamin

Re: SAS Enterprise Miner - Score node: how can I adapt the SAS code to my issue

Posted in reply to Benj_natio

Hi Benjamin,

The workaround that Yogen suggested in the paper you cited will only work if you have a binary target and you are predicting the event "1".

How Yogen's code works

I took a look at the score code produced by the Cutoff node. It creates the EM_CUTOFF variable as a flag indicating whether the predicted probability of the event is higher than a given cutoff. The Cutoff node can determine that cutoff with one of its built-in methods, or you can specify your own in the Cutoff node properties.

For example, I created a binary model to predict the event "good" on the binary target "good_bad". Then I used the Cutoff node with the method "Event Precision Equal Recall", which selected 0.75 as the cutoff. The Cutoff node created the flag EM_CUTOFF based on the new cutoff, as shown below:

     IF P_good_badgood > 0.75 THEN EM_CUTOFF = 1;

     ELSE EM_CUTOFF = 0;

Yogen's workaround is not going to work for my example: the "into" variable I_good_bad holds a target level ("good" or "bad"), so it does not make sense to set it equal to EM_CUTOFF, which is a binary 0/1 flag.

How to fix this

Instead of I_good_bad = EM_CUTOFF, I should use the following in my SAS Code node:

     if EM_cutoff = 1 then I_good_bad="good";

     else I_good_bad="bad";

Now this works!
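To see the two steps together outside Enterprise Miner, here is a minimal sketch as a plain DATA step. The dataset name scored_demo and the four probabilities are made up for illustration, and the length of I_good_bad is an assumption; in a real flow the Cutoff node and the SAS Code node supply these pieces.

```sas
/* Hypothetical demo data: four made-up event probabilities. */
data scored_demo;
   /* Assumed length; the real I_ variable is defined by the model's score code. */
   length I_good_bad $ 4;
   input P_good_badgood;

   /* Flag as created by the Cutoff node in the example above. */
   if P_good_badgood > 0.75 then EM_CUTOFF = 1;
   else EM_CUTOFF = 0;

   /* Corrected workaround: map the 0/1 flag back to the target levels. */
   if EM_CUTOFF = 1 then I_good_bad = "good";
   else I_good_bad = "bad";

   datalines;
0.90
0.76
0.75
0.40
;
run;

proc print data=scored_demo noobs;
run;
```

Because the comparison is strictly greater than, the row with a probability of exactly 0.75 gets EM_CUTOFF = 0 and is classified as "bad".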

This repaired the workaround for my example. I was not exactly sure what you were doing, or what the b_top_order variable is for in your example, but I think this corrected workaround should help you. If not, please explain a bit more and I or someone else from this community can help you refine the code.
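One way to check whether the override actually took effect is to cross-tabulate the flag against the "into" variable in the scored data. This is only a sketch: the dataset scored_check and its four rows are hypothetical stand-ins for your own scored table.

```sas
/* Hypothetical stand-in for a scored table. */
data scored_check;
   length I_good_bad $ 4;
   input EM_CUTOFF I_good_bad $;
   datalines;
1 good
1 good
0 bad
0 bad
;
run;

/* Every EM_CUTOFF = 1 row should fall in the "good" column,
   and every EM_CUTOFF = 0 row in the "bad" column. */
proc freq data=scored_check;
   tables EM_CUTOFF * I_good_bad / norow nocol nopercent;
run;
```

If any counts appear off the diagonal, the assignment in the SAS Code node is probably not being applied after the model's own classification.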

Good luck!

-Miguel
