ugly_duck_ling
Calcite | Level 5

I have a dataset that I need help with.

 

data help;
input modelstep var1 $ var2 $ var3 $ var4 $ var5 $ outcome auc;
datalines;
0 v1 v2 v3 v4 v5 1 0.003
1 v1 v2 v3 '' v5 1 0.004
2 v1 v2 v3 '' '' 1 0.007
3 '' v2 v3 '' '' 1 0.01 
4 '' '' v3 '' '' 1 0.02
0 v1 v2 '' v4 v5 2 0.005
1 v1 '' '' v4 v5 2 0.006
2 v1 '' '' '' v5 2 0.02
3 v1 '' '' '' '' 2 0.03
;
run;

This dataset represents an iterative backward-selection process for model variables. For each outcome, I am required to identify the model step at which a specified number of variables yields an AUC difference of 0.01, and then take the variables of the previous model step as my model of choice. I want to do that using a lag function to get the table outlined below.

[Image: mock pic.PNG]

Because every row represents a model step (that is, a number of variables leading to the desired AUC difference), I would like to pick, for each outcome, the row just before the model step that leads to the AUC difference of 0.01.

Any help with that, please?

Thanks

Ksharp
Super User
data help;
input modelstep var1 $ var2 $ var3 $ var4 $ var5 $ outcome auc;
datalines;
0 v1 v2 v3 v4 v5 1 0.003
1 v1 v2 v3 . v5 1 0.004
2 v1 v2 v3 . . 1 0.007
3 . v2 v3 . . 1 0.01 
4 . . v3 . . 1 0.02
0 v1 v2 . v4 v5 2 0.005
1 v1 . . v4 v5 2 0.006
2 v1 . . . v5 2 0.02
3 v1 . . . . 2 0.03
;
run;

data want;
 set help;
 /* AUC values from 1 to 4 rows back */
 lag_auc1=lag(auc);  
 lag_auc2=lag2(auc); 
 lag_auc3=lag3(auc); 
 lag_auc4=lag4(auc); 

 /* current AUC minus the AUC from 1 to 4 rows back */
 dif_auc1=dif(auc);  
 dif_auc2=dif2(auc); 
 dif_auc3=dif3(auc); 
 dif_auc4=dif4(auc); 

 /* blank out values that reach back into a different outcome group */
 if outcome ne lag(outcome)  then call missing(lag_auc1,dif_auc1);
 if outcome ne lag2(outcome) then call missing(lag_auc2,dif_auc2);
 if outcome ne lag3(outcome) then call missing(lag_auc3,dif_auc3);
 if outcome ne lag4(outcome) then call missing(lag_auc4,dif_auc4);

run;
ugly_duck_ling
Calcite | Level 5

Thank you so much for the feedback. I have one more question.

 

If I wanted to specify the model step before which a 'dif_auc' was equal to 0.01, how can I code for that?

 

I am guessing:

if dif_auc= 0.01 then modelstep= _N_-1

 

I am not sure.

Ksharp
Super User

Is this what you want?

 

data help;
input modelstep var1 $ var2 $ var3 $ var4 $ var5 $ outcome auc;
datalines;
0 v1 v2 v3 v4 v5 1 0.003
1 v1 v2 v3 . v5 1 0.004
2 v1 v2 v3 . . 1 0.007
3 . v2 v3 . . 1 0.01 
4 . . v3 . . 1 0.02
0 v1 v2 . v4 v5 2 0.005
1 v1 . . v4 v5 2 0.006
2 v1 . . . v5 2 0.02
3 v1 . . . . 2 0.03
;
run;

data want;
 set help;
 lag_auc1=lag(auc);  
 lag_auc2=lag2(auc); 
 lag_auc3=lag3(auc); 
 lag_auc4=lag4(auc); 

 dif_auc1=dif(auc);  
 dif_auc2=dif2(auc); 
 dif_auc3=dif3(auc); 
 dif_auc4=dif4(auc); 

 if outcome ne lag(outcome)  then call missing(lag_auc1,dif_auc1);
 if outcome ne lag2(outcome) then call missing(lag_auc2,dif_auc2);
 if outcome ne lag3(outcome) then call missing(lag_auc3,dif_auc3);
 if outcome ne lag4(outcome) then call missing(lag_auc4,dif_auc4);


/* keep the previous model step only on rows where the one-step AUC change is 0.01 */
lag_modelstep=lag(modelstep);
if round(dif_auc1,1e-6) ne 0.01 or  outcome ne lag(outcome) then call missing(lag_modelstep);

run;
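
The ROUND() call in the comparison above is presumably there to guard against floating-point representation error: a difference such as 0.03 - 0.02 (steps 2 to 3 of outcome 2) need not be stored as exactly 0.01. A minimal sketch for checking this on your own system follows; the variable names d, exact and rounded are hypothetical and not part of the reply.

```sas
/* Illustration only: d, exact and rounded are hypothetical names */
data _null_;
  d = 0.03 - 0.02;                     /* same subtraction as DIF(auc) at step 3 of outcome 2 */
  exact   = (d = 0.01);                /* 1 if the raw difference equals the literal 0.01 */
  rounded = (round(d, 1e-6) = 0.01);   /* 1 if the rounded difference equals 0.01, as tested in the reply */
  put d= best32. exact= rounded=;      /* write the three values to the log */
run;
```

If exact and rounded differ in the log, the unrounded test would have missed that row.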
PGStats
Opal | Level 21

Combine DO UNTIL() loops with BY processing instead of using the awkward LAG or DIF functions:

 

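data want;
/* first pass over the BY group: remember the last step with auc below 0.01 */
do until(last.outcome);
    set help; by outcome;
    if auc < 0.01 then lastStep = modelstep;
    end;
/* second pass over the same BY group: output only that step */
do until(last.outcome);
    set help; by outcome;
    if modelstep = lastStep then output;
    end;
drop lastStep;
run;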

proc print noobs data=want; run;

[Image: PGStats_0-1655145739562.png, PROC PRINT output]

The two edge cases are handled appropriately: if every auc for an outcome is below 0.01, the last model is returned; if no auc for an outcome is below 0.01, no model is returned.
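
The two edge cases can be checked with a small made-up dataset. This is only a sketch: the dataset names edge and want_edge and all of the values are hypothetical, chosen so that outcome 1 has every auc below 0.01 and outcome 2 has none.

```sas
/* Hypothetical test data, not from the original question:
   outcome 1: every auc is below 0.01
   outcome 2: no auc is below 0.01 */
data edge;
input modelstep outcome auc;
datalines;
0 1 0.001
1 1 0.002
2 1 0.005
0 2 0.02
1 2 0.03
;
run;

/* Same double DO UNTIL() pattern, applied to the test data */
data want_edge;
do until(last.outcome);
    set edge; by outcome;
    if auc < 0.01 then lastStep = modelstep;   /* last step still below 0.01 */
    end;
do until(last.outcome);
    set edge; by outcome;
    if modelstep = lastStep then output;       /* never true when lastStep is missing */
    end;
drop lastStep;
run;

proc print noobs data=want_edge; run;
```

With this input, want_edge should contain a single row, modelstep 2 of outcome 1, and nothing for outcome 2, because lastStep is reset to missing at the start of each pass through the DATA step and is never assigned for that group.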

PG
