Re: How to assess quality of prediction in a logistic regression with ...

SBuc · Posted 11-01-2018 06:32 AM

Dear list,

I am trying to predict a dichotomous variable from several covariates (2 continuous, 1 dichotomous), with a random effect for clustering: the animals come from different farms (19 farms; 280 individual records in total).

I fit my model with PROC GLIMMIX and a logit link.

I am used to the PROC LOGISTIC diagnostics: the area under the ROC curve, plus the model's sensitivity and specificity, to judge prediction accuracy.

I am not aware of comparable tools in PROC GLIMMIX, and in particular I do not know whether the same assumptions hold once a random effect is added to a logistic regression model.

Any specific pointers or guides for assessing prediction accuracy in a GLMM?

Ksharp · Posted 11-01-2018 08:49 AM

You could build a confusion matrix and check the goodness-of-fit statistics described in the documentation.

Or plot the ROC curve yourself:

/******** Plot ROC curve ********/
options validvarname=any;
libname x v9 'D:\工作文件\花生好车2\备份\hs_data' access=readonly;

/* Keep the observed class (good_bad) and the score used to rank cases. */
data have;
 set x.score_card;
 keep good_bad total_score;
run;

/* One row per distinct score; each one is used as a cutoff. */
proc sort data=have(keep=total_score) out=score nodupkey;
 by descending total_score;
run;
/* Add a cutoff below the minimum score so the curve reaches (1,1). */
data score;
 set score end=last;
 output;
 if last then do; total_score=total_score-1; output; end;
run;
proc sort data=score;
 by total_score;
run;

proc sort data=have;
 by good_bad total_score;
run;

/* Clear results left over from a previous run. */
proc delete data=want; run;

/* For one cutoff: classify every case, then compute the proportion
   predicted 'good' within each observed class.
   Assumes good_bad takes the values 'good' and 'bad'. */
%macro roc(score=);
data temp;
 set have;
 if total_score<=&score then score_good_bad='bad ';
  else score_good_bad='good';
run;
proc sql;
create table temp1 as
 select good_bad,sum(score_good_bad='good')/count(*) as percent
  from temp
   group by good_bad;
quit;
proc transpose data=temp1 out=temp2;
 id good_bad;
 var percent;
run;
/* Among observed 'good' the proportion is the sensitivity;
   among observed 'bad' it is 1 - specificity. */
data temp3;
 set temp2(rename=(good=sensitivity bad=_1_minus_specificity));
 score=&score;
 drop _name_;
run;
proc append base=want data=temp3 force;
run;
%mend;

/* Run the macro once per cutoff. */
data _null_;
 set score;
 call execute(cats('%roc(score=',total_score,')'));
run;

/* Area under the curve by the trapezoidal rule. */
data roc;
 set want;
 dx=-dif(_1_minus_specificity);
 dy=mean(sensitivity,lag(sensitivity));
 roc=dx*dy;
run;

proc sql noprint;
select sum(roc) into : roc from roc;
quit;

proc sgplot data=want aspect=1 noautolegend;
 lineparm x=0 y=0 slope=1/lineattrs=(color=grey);
 series x=_1_minus_specificity y=sensitivity;
 inset "AUC = &roc"/position=topleft;
 xaxis grid;
 yaxis grid;
run;
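
For the confusion matrix mentioned at the top of this post, a minimal sketch at a single cutoff might look like the following. The cutoff value 600 and the data set name classified are placeholders for illustration only, and the code assumes good_bad holds the values 'good' and 'bad' as in the program above.

```sas
/* Hypothetical cutoff: replace with a value suited to your scores. */
%let cutoff = 600;

/* Classify each case at the chosen cutoff (same rule as in the ROC macro). */
data classified;
 set have;
 length score_good_bad $4;
 if total_score <= &cutoff then score_good_bad = 'bad';
 else score_good_bad = 'good';
run;

/* Observed class in rows, predicted class in columns.
   Row percentages give the share of each observed class
   that was predicted 'good' or 'bad'. */
proc freq data=classified;
 tables good_bad*score_good_bad / nocol nopercent;
run;
```

In the 'good' row, the row percentage under predicted 'good' is the sensitivity; in the 'bad' row, the row percentage under predicted 'bad' is the specificity.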

StatDave · Posted 11-01-2018 10:20 AM

An ROC analysis can be done by saving the predicted probabilities from the GLIMMIX model and passing them to PROC LOGISTIC, as discussed and illustrated in this note.
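
As a rough sketch of that approach (not the code from the note itself), one might save the predicted probabilities with the OUTPUT statement in PROC GLIMMIX and then hand them to the ROC statement in PROC LOGISTIC. The data set name animals and the variable names y, x1, x2, z and farm are assumptions made up for illustration; only the general shape of the model comes from the original question.

```sas
/* Hypothetical names: animals (data set), y (0/1 response),
   x1 x2 (continuous covariates), z (dichotomous covariate), farm (cluster). */
proc glimmix data=animals;
 class farm;
 model y(event='1') = x1 x2 z / dist=binary link=logit solution;
 random intercept / subject=farm;
 /* p_cond uses the estimated farm effects (BLUPs);
    p_marg ignores them and uses the fixed effects only. */
 output out=glmmout pred(ilink)=p_cond
                    pred(noblup ilink)=p_marg;
run;

/* NOFIT skips model fitting; each ROC statement evaluates
   the supplied predicted probabilities against y. */
proc logistic data=glmmout;
 model y(event='1') = / nofit;
 roc 'GLMM, with farm effects' pred=p_cond;
 roc 'GLMM, fixed effects only' pred=p_marg;
run;
```

The two curves answer different questions: the one based on p_cond describes discrimination for animals from the farms already in the data, while the one based on p_marg is closer to what could be expected for an animal from a new farm.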
