BookmarkSubscribeRSS Feed
ugly_duck_ling
Calcite | Level 5

I have a dataset that I need help with.

 

data help;
input modelstep var1 $ var2 $ var3 $ var4 $ var5 $ outcome auc;
datalines;
0 v1 v2 v3 v4 v5 1 0.003
1 v1 v2 v3 '' v5 1 0.004
2 v1 v2 v3 '' '' 1 0.007
3 '' v2 v3 '' '' 1 0.01 
4 '' '' v3 '' '' 1 0.02
0 v1 v2 '' v4 v5 2 0.005
1 v1 '' '' v4 v5 2 0.006
2 v1 '' '' '' v5 2 0.02
3 v1 '' '' '' '' 2 0.03
;
run;

This dataset represents an iterative process of backward selection of model variables where
I am required to identify (for each outcome) the model step at which a specified number of variables yields
an AUC difference of 0.01, and then specify the previous model step variables as my model of choice. I want to do that using a lag function to get the table outlined below.

mock pic.PNG
Because every row represents a model step (that is a number of variables leading to the desired AUC difference,
I would like to specify the row just before the model leading to the AUC difference of 0.01 for each outcome.*/

Any help with that, please?

Thanks

4 REPLIES 4
Ksharp
Super User
data help;
input modelstep var1 $ var2 $ var3 $ var4 $ var5 $ outcome auc;
datalines;
0 v1 v2 v3 v4 v5 1 0.003
1 v1 v2 v3 . v5 1 0.004
2 v1 v2 v3 . . 1 0.007
3 . v2 v3 . . 1 0.01 
4 . . v3 . . 1 0.02
0 v1 v2 . v4 v5 2 0.005
1 v1 . . v4 v5 2 0.006
2 v1 . . . v5 2 0.02
3 v1 . . . . 2 0.03
;
run;

data want;
 set help;
 lag_auc1=lag(auc);  
 lag_auc2=lag2(auc); 
 lag_auc3=lag3(auc); 
 lag_auc4=lag4(auc); 

 dif_auc1=dif(auc);  
 dif_auc2=dif2(auc); 
 dif_auc3=dif3(auc); 
 dif_auc4=dif4(auc); 

 if outcome ne lag(outcome)  then call missing(lag_auc1,dif_auc1);
 if outcome ne lag2(outcome) then call missing(lag_auc2,dif_auc2);
 if outcome ne lag3(outcome) then call missing(lag_auc3,dif_auc3);
 if outcome ne lag4(outcome) then call missing(lag_auc4,dif_auc4);

run;
ugly_duck_ling
Calcite | Level 5

Thank you so much for the feedback. I have one more question.

 

If I wanted to specify the model step before which a 'dif_auc' was equal to 0.01, how can I code for that?

 

I am guessing:

if dif_auc= 0.01 then modelstep= _N_-1

 

I am not sure.

Ksharp
Super User

You want this ?

 

data help;
input modelstep var1 $ var2 $ var3 $ var4 $ var5 $ outcome auc;
datalines;
0 v1 v2 v3 v4 v5 1 0.003
1 v1 v2 v3 . v5 1 0.004
2 v1 v2 v3 . . 1 0.007
3 . v2 v3 . . 1 0.01 
4 . . v3 . . 1 0.02
0 v1 v2 . v4 v5 2 0.005
1 v1 . . v4 v5 2 0.006
2 v1 . . . v5 2 0.02
3 v1 . . . . 2 0.03
;
run;

data want;
 set help;
 lag_auc1=lag(auc);  
 lag_auc2=lag2(auc); 
 lag_auc3=lag3(auc); 
 lag_auc4=lag4(auc); 

 dif_auc1=dif(auc);  
 dif_auc2=dif2(auc); 
 dif_auc3=dif3(auc); 
 dif_auc4=dif4(auc); 

 if outcome ne lag(outcome)  then call missing(lag_auc1,dif_auc1);
 if outcome ne lag2(outcome) then call missing(lag_auc2,dif_auc2);
 if outcome ne lag3(outcome) then call missing(lag_auc3,dif_auc3);
 if outcome ne lag4(outcome) then call missing(lag_auc4,dif_auc4);


lag_modelstep=lag(modelstep);
if round(dif_auc1,1e-6) ne 0.01 or  outcome ne lag(outcome) then call missing(lag_modelstep);

run;
PGStats
Opal | Level 21

Combine DO UNTIL() loops with BY processing instead of awkward LAG or DIF functions :

 

data want;
do until(last.outcome);
    set help; by outcome;
    if auc < 0.01 then lastStep = modelstep;
    end;
do until(last.outcome);
    set help; by outcome;
    if modelstep = lastStep then output;
    end;
drop lastStep;
run;

proc print noobs data=want; run;

PGStats_0-1655145739562.png

All auc < 0.01 and no auc < 0.01 cases will be handled appropriately, i.e. by returning the last model and no model, respectively.

PG

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 588 views
  • 1 like
  • 3 in conversation