Benj_natio
Calcite | Level 5

Hello all,

I am modelling an imbalanced dataset and trying to improve my results with the Cutoff node in SAS Enterprise Miner. For instance, I set the cutoff value to 0.05 and connect a SAS Code node to the Cutoff node with the following instruction (from http://support.sas.com/resources/papers/proceedings12/127-2012.pdf):

I_top_order = EM_CUTOFF;

After the SAS Code node, I connect another Model Comparison node, and finally a Score node in order to obtain the SAS score code.

This SAS code seems to be OK: it contains my new instruction. But just after that instruction, the code creates 10 segments, and the values are exactly the same as when modelling without the cutoff. See this extract of the SAS code:

if (P_top_order1 ge 0.01470274377704) then do;
   b_top_order = 1;
end;
else if (P_top_order1 ge 0.00634736646348) then do;
   b_top_order = 2;
end;
else if (P_top_order1 ge 0.00470666288792) then do;
   b_top_order = 3;

How can I change these values to reflect my cutoff value? Could you also tell me how these values are calculated?

Thanks for your assistance.

Benjamin

1 REPLY 1
M_Maldonado
Barite | Level 11

Hi Benjamin,

The workaround that Yogen suggested in the paper you cited will only work if you have a binary target and you are predicting the event "1".

How Yogen's code works

I took a look at the score code produced by the Cutoff node. It creates the EM_Cutoff variable as a flag of whether the probability of event is higher than a certain cutoff. You can find that cutoff through specific methods, or specify your own in the Cutoff node properties.

For example, I created a binary model to predict the event "good" on the binary target "good_bad". Then I used the Cutoff node with the method "Event Precision Equal Recall", which found that 0.75 was a better cutoff. The Cutoff node created the flag EM_Cutoff based on the new cutoff as below:

     IF P_good_badgood > 0.75 THEN EM_CUTOFF = 1;

     ELSE EM_CUTOFF = 0;

Yogen's workaround is not going to work for my example: it does not make sense to set the "into" variable I_good_bad equal to EM_Cutoff, because EM_Cutoff is a binary 0/1 flag while I_good_bad holds the target levels "good" and "bad".

How to fix this

Instead of doing I_good_bad=EM_Cutoff, I should do as below in my SAS Code node.

     if EM_cutoff = 1 then I_good_bad="good";

     else I_good_bad="bad";

Now this works!
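To try the corrected logic outside Enterprise Miner, here is a minimal sketch in a plain DATA step. The data set names (scored, classified) and the three probability values are made up for illustration, and the length of I_good_bad is an assumption; inside Enterprise Miner the Cutoff node already creates EM_CUTOFF, so only the final if/else belongs in the SAS Code node.

```sas
/* Hypothetical scored data: P_good_badgood is the predicted probability of "good" */
data scored;
   input P_good_badgood;
   datalines;
0.90
0.75
0.40
;
run;

data classified;
   set scored;
   length I_good_bad $4;   /* assumed length, long enough for "good" and "bad" */

   /* Flag built the same way as in the Cutoff node score code shown above */
   if P_good_badgood > 0.75 then EM_CUTOFF = 1;
   else EM_CUTOFF = 0;

   /* Map the 0/1 flag back to the target levels */
   if EM_CUTOFF = 1 then I_good_bad = "good";
   else I_good_bad = "bad";
run;

proc print data=classified;
run;
```

Because the comparison is strictly greater than, an observation sitting exactly at the cutoff is classified as "bad".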

This fixed the workaround for my example. I was not exactly sure what you were doing or what the b_top_order variable is for in your example, but I think this fixed workaround should help you. If not, please explain a bit more and someone from this community or I can help you fix the code further.

Good luck!

-Miguel
