ROC analysis for repeated measures

Posted 07-21-2023 12:05 AM
(301 views)

I have a test dataset, similar to the one below:

```
data have;
input ID$ CAT$ GROUP$ VISIT$ LAB STATUS$ BSL_CAT$;
datalines;
a001 1 1 1 1997.02 0 1
a001 1 1 2 1275.52 0 1
a001 4 1 3 180.23 1 1
a001 2 1 4 735.91 0 1
a002 1 2 1 454.16 0 1
a002 1 2 3 1776.52 0 1
a002 3 2 4 73.15 1 1
a003 1 2 1 1700.26 0 1
a003 3 2 2 1621.32 1 1
a003 2 2 4 850.65 0 1
a004 2 3 1 1963.25 0 2
a004 2 3 2 544.87 0 2
a004 4 3 3 768.54 1 2
a004 2 3 4 780.16 0 2
a005 1 2 1 655.24 0 1
a005 2 2 4 722.14 0 1
a006 1 1 1 1472.06 0 1
a006 1 1 4 749.78 0 1
a007 2 1 1 848.88 0 2
a007 2 1 2 1482.78 0 2
a007 3 1 4 735.26 1 2
a008 1 1 1 1752.35 0 1
a008 1 1 2 1698.82 0 1
a008 3 1 3 1871.25 1 1
a008 4 1 4 587.35 1 1
a009 1 3 1 1549.89 0 1
a009 3 3 3 785.52 1 1
a009 1 3 4 384.72 0 1
a010 3 3 1 1211.95 1 3
a010 3 3 4 1596.38 1 3
a011 4 1 1 1785.45 1 4
a011 4 1 4 644.12 1 4
a012 3 3 1 798.28 1 3
a012 3 3 2 742.69 1 3
a012 3 3 3 1423.59 1 3
a012 3 3 4 1089.47 1 3
;
run;
proc print data=have;
run;
```

where CAT is an ordinal categorical variable with 4 levels (1 to 4);

GROUP denotes the age group the participants are in;

VISIT is the follow-up visit: participants have a baseline visit (visit=1) and can then have up to 3 additional follow-up visits (visit=2,3,4);

LAB is a specific laboratory value;

STATUS is a binary variable denoting the severity of their disease and is based on CAT: in the data above, if CAT in (1,2) then STATUS=0 (not severe), else if CAT in (3,4) then STATUS=1 (severe);

BSL_CAT is the baseline value of CAT.

I would like to estimate the predictive ability of LAB as a marker for detecting disease severity (STATUS severe vs. not severe). I want to look at several cut-offs for the LAB variable and assess its diagnostic performance at each one by computing accuracy, sensitivity, specificity, positive predictive value, negative predictive value and likelihood ratios.

How can I do this?

Here is what I was thinking/I've done so far:

1. Use PROC GLIMMIX (given the longitudinal nature of the observations) to investigate the association between STATUS as a binary outcome and LAB and VISIT as covariates, with the repeated visits within each ID handled by the RANDOM statement.

2. Then take the predicted probabilities from this model and feed them into a logistic model to get a ROC curve. After this I'm pretty much stuck: I don't know how to move forward or how to obtain accuracy, sensitivity, specificity, PPV, NPV and likelihood ratios.

```
*MODEL OUTCOME AS BINARY;
proc glimmix data=have noclprint;
class ID VISIT (ref="1");
model STATUS (event='1')= LAB VISIT/ dist=binary link=logit solution;
random VISIT/subject=ID residual type=cs;
output out=FITDAT pred(ilink noblup)=predprob;
NLOPTIONS tech=NRRIDG Maxiter=1000;
run;
proc print data=FITDAT; run;
*ROC CURVE BASED ON PREDICTED PROBABILITIES FROM GLIMMIX;
proc logistic data=FITDAT;
model STATUS (event="1")= / nofit;
roc 'GLMM Model' pred=predprob;
run;
```

Does anybody have any code or suggestions they can share to help?

Thank you kindly.

1 ACCEPTED SOLUTION

If your final goal is to find an optimal cutoff, note that statistics such as Youden's index are often used for that. These can be obtained with the ROCPLOT macro (or in PROC LOGISTIC if you have a recent version of SAS Viya). However, the unique predicted probabilities, which are the cutoffs used for the ROC curve, are computed using ALL of the predictor values. With your model, each cutoff is determined by both LAB and VISIT, so it is not possible to talk about cutoffs on LAB alone. If you remove VISIT from the MODEL statement, you can then add the OUTROC= option to the MODEL statement in your PROC LOGISTIC step (the code below assumes that data set is named OR) and merge that data set with your FITDAT data set.

```
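/* Sort the GLIMMIX output by predicted probability; rename it to match OUTROC= */
proc sort data=fitdat out=fitdat2(rename=(predprob=_PROB_)); by predprob; run;
/* Sort the OUTROC= data set (named OR here) by the same key */
proc sort data=or out=or2; by _prob_; run;
/* Merge so that each cutoff row also carries its LAB value */
data or3; merge fitdat2 or2; by _prob_; run;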
```

This gives you a data set (OR3) that shows the LAB value corresponding to each cutoff. That data set also has the cell counts of the 2x2 table associated with each cutoff, along with the sensitivity and 1-specificity statistics. Using those, you can easily compute the other statistics you want as shown in this note on computing various 2x2 table statistics.
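As a rough sketch of that last step, the DATA step below derives the remaining measures from OR3. It assumes OR3 carries the standard OUTROC= variables (_POS_, _NEG_, _FALPOS_, _FALNEG_, _SENSIT_, _1MSPEC_). The output data set name STATS and the new variable names are illustrative choices, not something taken from the thread.

```sas
/* Sketch only: STATS and the new variable names are hypothetical. */
data stats;
  set or3;
  /* OUTROC= counts at each cutoff:
     _POS_    = events correctly predicted as events (true positives)
     _NEG_    = nonevents correctly predicted as nonevents (true negatives)
     _FALPOS_ = nonevents predicted as events (false positives)
     _FALNEG_ = events predicted as nonevents (false negatives) */
  n = _pos_ + _neg_ + _falpos_ + _falneg_;
  /* Rows of FITDAT with no matching cutoff have missing counts and are skipped */
  if n > 0 then do;
    accuracy    = (_pos_ + _neg_) / n;
    sensitivity = _sensit_;
    specificity = 1 - _1mspec_;
    /* Guard each ratio against a zero denominator */
    if (_pos_ + _falpos_) > 0 then ppv = _pos_ / (_pos_ + _falpos_);
    if (_neg_ + _falneg_) > 0 then npv = _neg_ / (_neg_ + _falneg_);
    if _1mspec_ > 0 then lr_pos = _sensit_ / _1mspec_;
    if specificity > 0 then lr_neg = (1 - _sensit_) / specificity;
    /* Youden's index, one common criterion for choosing a cutoff */
    youden = _sensit_ - _1mspec_;
  end;
run;
```

Sorting STATS by descending YOUDEN would then list the candidate LAB cutoffs from most to least favorable under that one criterion.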

5 REPLIES

Yes. It would be better to post this in the Statistical Procedures forum:

https://communities.sas.com/t5/Statistical-Procedures/bd-p/statistical_procedures

Maybe @StatDave could give you a hand.

In my view, you should use the predicted values to build a 2x2 contingency table and get these estimates from it.
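A minimal sketch of that idea, assuming the FITDAT data set and the predprob variable from the question. The 0.5 cutoff is arbitrary and only for illustration, and the names CLASSIFIED and PRED_STATUS are made up here.

```sas
/* Sketch only: CLASSIFIED and PRED_STATUS are hypothetical names. */
data classified;
  set fitdat;
  /* 0.5 is an arbitrary illustrative cutoff, not a recommended value */
  if not missing(predprob) then pred_status = (predprob >= 0.5);
run;

/* Cross-tabulate predicted versus observed status to get the 2x2 cell counts */
proc freq data=classified;
  tables pred_status*status / norow nocol nopercent;
run;
```

Repeating this over several cutoffs gives one 2x2 table per cutoff, from which sensitivity, specificity and the other measures follow.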

A Google search turns up answers:

https://support.sas.com/resources/papers/proceedings/proceedings/sugi27/p261-27.pdf

https://support.sas.com/resources/papers/proceedings/proceedings/sugi22/STATS/PAPER278.PDF

You can find other links as well.

--

Paige Miller
