I have a test dataset, similar to the one below:
data have;
input ID$ CAT$ GROUP$ VISIT$ LAB STATUS$ BSL_CAT$;
datalines;
a001 1 1 1 1997.02 0 1
a001 1 1 2 1275.52 0 1
a001 4 1 3 180.23 1 1
a001 2 1 4 735.91 0 1
a002 1 2 1 454.16 0 1
a002 1 2 3 1776.52 0 1
a002 3 2 4 73.15 1 1
a003 1 2 1 1700.26 0 1
a003 3 2 2 1621.32 1 1
a003 2 2 4 850.65 0 1
a004 2 3 1 1963.25 0 2
a004 2 3 2 544.87 0 2
a004 4 3 3 768.54 1 2
a004 2 3 4 780.16 0 2
a005 1 2 1 655.24 0 1
a005 2 2 4 722.14 0 1
a006 1 1 1 1472.06 0 1
a006 1 1 4 749.78 0 1
a007 2 1 1 848.88 0 2
a007 2 1 2 1482.78 0 2
a007 3 1 4 735.26 1 2
a008 1 1 1 1752.35 0 1
a008 1 1 2 1698.82 0 1
a008 3 1 3 1871.25 1 1
a008 4 1 4 587.35 1 1
a009 1 3 1 1549.89 0 1
a009 3 3 3 785.52 1 1
a009 1 3 4 384.72 0 1
a010 3 3 1 1211.95 1 3
a010 3 3 4 1596.38 1 3
a011 4 1 1 1785.45 1 4
a011 4 1 4 644.12 1 4
a012 3 3 1 798.28 1 3
a012 3 3 2 742.69 1 3
a012 3 3 3 1423.59 1 3
a012 3 3 4 1089.47 1 3
;
run;
proc print data=have;
run;
where CAT is an ordinal categorical variable with 4 levels;
GROUP denotes the age group the participants are in;
VISIT is the follow-up visit: participants have a baseline visit (VISIT=1) and can then have up to 3 additional follow-up visits (VISIT=2,3,4);
LAB is a specific laboratory value;
STATUS is a binary variable denoting the severity of their disease and is based on CAT: if CAT in (1,2) then STATUS=0 (not severe), else if CAT in (3,4) then STATUS=1 (severe);
BSL_CAT is the baseline value of CAT;
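The variable definitions above can be checked against the data before any modelling. This is a minimal sketch that uses only the variables in HAVE. It assumes CAT, VISIT, STATUS and BSL_CAT are character, as they are read in the INPUT statement.

```sas
/* Cross-tabulate CAT against STATUS to see which CAT levels map to each STATUS value */
proc freq data=have;
  tables CAT*STATUS / norow nocol nopercent;
run;

/* List baseline rows (VISIT=1) where CAT disagrees with the stored BSL_CAT */
proc print data=have;
  where VISIT = '1' and CAT ne BSL_CAT;
run;
```

If the definitions hold, each CAT level appears under exactly one STATUS value and the second step prints no rows.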
I would like to estimate the predictive ability of LAB as a marker for the detection of disease severity (STATUS severe vs. not severe). I want to look at several cut-offs for the LAB variable and assess its diagnostic performance at each one by computing accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and likelihood ratios.
How can I do this?
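For reference, at a single fixed cut-off all of these measures reduce to counts from a 2x2 table of test result by STATUS. The sketch below shows that calculation under several assumptions. The cut-off of 1000 is hypothetical, treating LAB values below the cut-off as a positive test is a guess at the direction, and the data set and variable names other than those in HAVE are made up for illustration. It also treats every row as independent, so it ignores the repeated visits per participant.

```sas
%let cutoff = 1000;   /* hypothetical cut-off, chosen only for illustration */

data PERF (keep=CUTOFF TP FP FN TN ACCURACY SENSITIVITY SPECIFICITY
                PPV NPV LR_POS LR_NEG);
  set have end=last;

  /* Assumption: LAB below the cut-off is treated as a positive test */
  TEST_POS = (LAB < &cutoff);
  SEVERE   = (STATUS = '1');      /* STATUS is read as character in HAVE */

  /* Sum statements accumulate the 2x2 cell counts across rows */
  TP + (TEST_POS and SEVERE);
  FP + (TEST_POS and not SEVERE);
  FN + (not TEST_POS and SEVERE);
  TN + (not TEST_POS and not SEVERE);

  /* On the last row, turn the counts into performance measures */
  if last then do;
    CUTOFF   = &cutoff;
    ACCURACY = (TP + TN) / (TP + FP + FN + TN);
    if TP + FN > 0 then SENSITIVITY = TP / (TP + FN);
    if TN + FP > 0 then SPECIFICITY = TN / (TN + FP);
    if TP + FP > 0 then PPV = TP / (TP + FP);
    if TN + FN > 0 then NPV = TN / (TN + FN);
    /* Likelihood ratios are left missing when a denominator would be zero */
    if TP + FN > 0 and FP > 0 then LR_POS = SENSITIVITY / (1 - SPECIFICITY);
    if TP + FN > 0 and TN > 0 then LR_NEG = (1 - SENSITIVITY) / SPECIFICITY;
    output;
  end;
run;

proc print data=PERF; run;
```

Changing the macro variable and rerunning gives the same measures for another cut-off. Confidence intervals from this layout would not account for the within-participant correlation.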
Here is what I was thinking/I've done so far:
1. Use PROC GLIMMIX (given the longitudinal nature of the observations) to model STATUS as a binary outcome, with LAB and VISIT as covariates and VISIT as a random effect.
2. Then take the predicted probabilities from this model and feed them into PROC LOGISTIC to get a ROC curve. After this I'm pretty much stuck: I don't know how to move forward or how to obtain accuracy, sensitivity, specificity, PPV, NPV and likelihood ratios.
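*MODEL OUTCOME AS BINARY;
proc glimmix data=have noclprint;
  class ID VISIT (ref="1");
  model STATUS (event='1') = LAB VISIT / dist=binary link=logit solution;
  /* R-side compound-symmetry covariance across visits within each participant */
  random VISIT / subject=ID residual type=cs;
  /* Predicted probabilities on the data scale, without BLUPs */
  output out=FITDAT pred(ilink noblup)=predprob;
  nloptions tech=NRRIDG maxiter=1000;
run;
proc print data=FITDAT; run;
*ROC CURVE BASED ON PREDICTED PROBABILITIES FROM GLIMMIX;
proc logistic data=FITDAT;
  /* NOFIT: no model is fit here, the ROC curve uses the supplied predicted probabilities */
  model STATUS (event="1") = / nofit;
  roc 'GLMM Model' pred=predprob;
run;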
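One possible direction, sketched under assumptions rather than offered as a validated approach: the OUTROC= option on the MODEL statement of PROC LOGISTIC writes one row per probability cut-point with the classification counts, and the remaining measures can be derived from those counts. The sketch fits an ordinary logistic model with LAB as the only predictor, so it ignores the correlation between visits of the same participant. The data set names ROCDAT and ROCPERF and the derived variable names are made up for illustration.

```sas
/* Ordinary logistic model with LAB only; OUTROC= saves counts per cut-point */
proc logistic data=have;
  model STATUS(event='1') = LAB / outroc=ROCDAT;
run;

data ROCPERF;
  set ROCDAT;
  /* _POS_/_NEG_ are correctly classified events/nonevents,
     _FALPOS_/_FALNEG_ are the misclassified nonevents/events */
  N           = _POS_ + _NEG_ + _FALPOS_ + _FALNEG_;
  SENSITIVITY = _SENSIT_;
  SPECIFICITY = 1 - _1MSPEC_;
  ACCURACY    = (_POS_ + _NEG_) / N;
  if _POS_ + _FALPOS_ > 0 then PPV = _POS_ / (_POS_ + _FALPOS_);
  if _NEG_ + _FALNEG_ > 0 then NPV = _NEG_ / (_NEG_ + _FALNEG_);
  /* Likelihood ratios are left missing when a denominator would be zero */
  if _1MSPEC_ > 0 then LR_POS = _SENSIT_ / _1MSPEC_;
  if _1MSPEC_ < 1 then LR_NEG = (1 - _SENSIT_) / (1 - _1MSPEC_);
run;

proc print data=ROCPERF;
  var _PROB_ SENSITIVITY SPECIFICITY ACCURACY PPV NPV LR_POS LR_NEG;
run;
```

With a single continuous predictor, the predicted probability is a monotone function of LAB, so each _PROB_ cut-point corresponds to one LAB cut-off. Whether these row-level estimates are acceptable given the repeated measures is exactly the open question.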
Does anybody have any code or suggestions they can share to help?
Thank you kindly.