BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aegeangu
Calcite | Level 5

Hi All,

 

I was asked to verify a risk-score vendor and had 5000 samples with three variables: Client_ID, Bad_flag (0 for good clients, 1 for default ones) and risk_score (range from 300 to 500). The vendor refused to give the formula of the risk_score. 

I wonder if there is any function I can use to calculate AUC in this case by SAS.

 

Please enlighten me and thank you indeed!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

There is a convenient way to get ROC curve and its value. mentioned by @Rick_SAS

 

proc logistic data=have ;

model Bad_flag(event='1')=risk_score /nofit;

roc 'Expected ROC' pred=risk_score;

run;

 

 

 

If your SAS is old try this code;

 

data have;

 set have;

 dummy=1;

run;

proc logistic data=have ;

model Bad_flag(event='1')=risk_score dummy /nofit;

roc 'Expected ROC' pred=risk_score;

roc 'Dummy ROC' pred=dummy ;

run;

View solution in original post

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I would start by in identifying what software you are using, e.g. Enterprise Guide etc., then explain what AUC means (as I expect it means something different to what it means in my industry).  Then show some test data in the form of a datastep:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

And show what the output should look like, or explain further.

To my mind if the vendor will not give you a formula, then its going to be very hard to replicate what they have done.  I mean something simple like removing or keeping missings can drastically change results, so you need to know the rules they have used as well.  

Ksharp
Super User

There is a convenient way to get ROC curve and its value. mentioned by @Rick_SAS

 

proc logistic data=have ;

model Bad_flag(event='1')=risk_score /nofit;

roc 'Expected ROC' pred=risk_score;

run;

 

 

 

If your SAS is old try this code;

 

data have;

 set have;

 dummy=1;

run;

proc logistic data=have ;

model Bad_flag(event='1')=risk_score dummy /nofit;

roc 'Expected ROC' pred=risk_score;

roc 'Dummy ROC' pred=dummy ;

run;

shidehrafigh
Fluorite | Level 6

Hi

 

Thank you for this post. However, I have a technical question:

i'snt ROC calculated/graphed by calculating sensitivity and FP rate for different cut-offs? what I am trying to say is that in this case when we don't know how the model looks like and all we know is the credit score  how can we calculate  the number of TP and FP for different cut-offs?

 

thank you

Ksharp
Super User

Yes. You are right.

 

"how can we calculate  the number of TP and FP for different cut-offs?"

Actually ROC curve is just plot by TP and FP and you can get it by ODS OUTPUT .......

 

If you want calculated it on your own . @Rick_SAS wrote a blog about it before.

 

Here are some code to calculated TP and FP .

options validvarname=any;
libname x v9 'D:\工作文件\花生好车2\备份\hs_data' access=readonly;

data have;
 set x.score_card;
 keep good_bad total_score;
run;

proc sort data=have(keep=total_score) out=score nodupkey;
by descending total_score;
run;
data score;
 set score end=last;
 output;
 if last then do;total_score=total_score-1;output;end;
run;
proc sort data=score;
by total_score;
run;

proc sort data=have;
by good_bad total_score;
run;



proc delete data=want;run;
%macro roc(score=);
data temp;
 set have;
 if total_score<=&score then score_good_bad='bad ';
  else score_good_bad='good';
run;
proc sql;
create table temp1 as
 select good_bad,sum(score_good_bad='good')/count(*) as percent
  from temp
   group by good_bad;
quit;
proc transpose data=temp1 out=temp2;
id good_bad;
var percent;
run;
data temp3;
 set temp2(rename=(good=sensitity bad=_1_minus_specifity));
 score=&score;
 drop _name_;
run;
proc append base=want data=temp3 force;
run;
%mend;



data _null_;
 set score;
 call execute(cats('%roc(score=',total_score,')'));
run;



data roc;
 set want;
 dx=-dif(_1_minus_specifity);
 dy=mean(sensitity,lag(sensitity));
 roc=dx*dy;
run;

proc sql noprint;
select sum(roc) into : roc from roc;
quit;






proc sgplot data=want aspect=1 noautolegend;
lineparm x=0 y=0 slope=1/lineattrs=(color=grey);
series x=_1_minus_specifity y=sensitity;
inset "ROC = &roc"/position=topleft;
xaxis grid;
yaxis grid;
run;




/*方法二  Second Way*/
ods graphics on;
data have2;
 set have;
 retain dummy 1;
run;
proc logistic data=have2;
model good_bad(event='good')=total_score dummy/nofit;
roc 'Expert Predictions' pred=total_score;
roc 'Dummy' pred=dummy;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 6875 views
  • 3 likes
  • 4 in conversation