ZoeUH
Calcite | Level 5

Dear all,

I would like to determine a cutpoint for a diagnostic test, similar to what is shown in the following document: https://www.pharmasug.org/proceedings/2016/SP/PharmaSUG-2016-SP11.pdf

However, the difficulty in my situation is that I do have some missing data. Therefore, I proceeded in the following way:

  1. Performed multiple imputation using PROC MI.
  2. Fit a logistic regression model to each of the imputed datasets and output the results using the OUTROC= option in the MODEL statement.
  3. Combined the results using PROC MIANALYZE.
  4. Ran PROC LOGISTIC with the options INEST= (the combined parameter estimates from PROC MIANALYZE) and MAXITER=0.

These were the steps suggested by this thread (https://communities.sas.com/t5/Statistical-Procedures/How-to-obtain-an-average-ROC-curve-using-multi...) to obtain a single ROC curve. 
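For readers unfamiliar with the combining step (step 3), here is a minimal sketch of Rubin's rules, which is the pooling approach PROC MIANALYZE is based on. It is written in Python purely as an illustration; the function name `pool_rubin` and all numbers are hypothetical and are not taken from this thread.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: the m per-imputation point estimates
    variances: the m per-imputation squared standard errors
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)        # combined point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical slope estimates and variances from m = 3 imputed data sets
slope_est = [0.50, 0.54, 0.46]
slope_var = [0.010, 0.012, 0.008]
pooled_slope, total_var = pool_rubin(slope_est, slope_var)
```

The pooled estimate is what would be passed on to step 4, while the total variance is where the between-imputation variability shows up.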

 

My question is: how do I get standard errors around the cutpoints obtained from this ROC curve so that they correctly reflect the variability across the imputed datasets? Or do I just assume that, on average, my cutpoint will be X from the single ROC curve? Could someone please provide some insight into how to correctly account for the variation between the imputed datasets when calculating a cutpoint?

 

Thanks in advance,

Zoë 

4 REPLIES
StatDave
SAS Super FREQ

You don't need standard errors on the cutpoints (predicted probabilities) to compute the statistics used to find the optimal cutpoint, if that is the goal. With the OUTROC= data set from your step 4 analysis, plus the data set of predicted probabilities obtained by adding an OUTPUT statement with the OUT= and P= options in that step, you can use the ROCPLOT macro to find an optimal cutpoint based on various statistics such as Youden's index or the minimum distance. This only requires the point estimates of the probabilities.
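To make the two criteria named above concrete, here is a small Python sketch of how Youden's index and the minimum distance to the ideal corner can be computed from ROC points. It is not the ROCPLOT macro's implementation; the function name `best_cutpoints` and the ROC values are hypothetical. Each tuple plays the role of one row of an OUTROC= data set (`_PROB_`, `_SENSIT_`, `_1MSPEC_`).

```python
from math import hypot

def best_cutpoints(roc_points):
    """roc_points: list of (cutpoint, sensitivity, one_minus_specificity).

    Returns (cutpoint maximizing Youden's J,
             cutpoint closest to the ideal corner of the ROC plot).
    """
    # Youden's J = sensitivity + specificity - 1 = sensitivity - (1 - specificity)
    by_youden = max(roc_points, key=lambda r: r[1] - r[2])
    # Euclidean distance from the ROC point to (FPR = 0, TPR = 1)
    by_distance = min(roc_points, key=lambda r: hypot(r[2], 1 - r[1]))
    return by_youden[0], by_distance[0]

# Hypothetical ROC points: (probability cutpoint, sensitivity, 1 - specificity)
roc = [
    (0.80, 0.40, 0.05),
    (0.60, 0.70, 0.15),
    (0.40, 0.85, 0.35),
    (0.20, 0.95, 0.60),
]
j_cut, d_cut = best_cutpoints(roc)
```

In this made-up example both criteria pick the same cutpoint (0.60), but they need not agree in general. Note that only point estimates of sensitivity and specificity enter the calculation.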

ZoeUH
Calcite | Level 5

Hi StatDave_sas,

 

Thanks for your response. If I understand you correctly: when the aim is to determine the optimal cutpoint, it is not necessary to have standard errors (from the imputed datasets). Then I have a follow-up question: how would you interpret the obtained cutpoints (for instance, those based on the Youden index)? As an average cutpoint over all imputed datasets?

StatDave
SAS Super FREQ

They are simply the cutpoints from the final model fit to imputed data. Recall that the cutpoints are just predicted probabilities which are computed from the final parameter estimates and the data values (imputed) in the observations.
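As a rough illustration of that last point, the sketch below shows how a predicted probability follows from parameter estimates and a data value, and how a probability cutpoint can be translated back to the marker scale. It assumes a logistic model with a single predictor, which this thread does not state; the function names and coefficient values are hypothetical.

```python
from math import exp, log

def predicted_probability(intercept, slope, x):
    """Predicted event probability from a single-predictor logistic model."""
    return 1.0 / (1.0 + exp(-(intercept + slope * x)))

def marker_value_at(prob_cutpoint, intercept, slope):
    """Marker value whose predicted probability equals the given cutpoint."""
    logit = log(prob_cutpoint / (1.0 - prob_cutpoint))
    return (logit - intercept) / slope

# Hypothetical combined parameter estimates (not from this thread)
b0, b1 = -2.0, 0.5
p = predicted_probability(b0, b1, 4.0)   # linear predictor is 0 here
x_cut = marker_value_at(0.5, b0, b1)     # marker value at a 0.5 cutpoint
```

With more than one predictor, a probability cutpoint no longer maps to a single marker value, so this inversion only applies to the one-predictor case.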

ZoeUH
Calcite | Level 5

Thanks a lot for your explanation. It is clear to me. 
