Determine a cutpoint (from ROC curve) after multiple imputation

ZoeUH · Posted 02-18-2022 05:07 AM

Dear,

I would like to determine a cutpoint for a diagnostic test, similar to what is shown in the following document: https://www.pharmasug.org/proceedings/2016/SP/PharmaSUG-2016-SP11.pdf

However, the difficulty in my situation is that I do have some missing data. Therefore, I proceeded in the following way:

1. Performed multiple imputation using proc MI.
2. Fit a logistic regression model to each of the imputed datasets and output the results using the outroc option in the model statement.
3. Combined the results using proc mianalyze.
4. Used proc logistic with the options inest = (combined parameter estimates from proc mianalyze) and maxiter=0.

These were the steps suggested by this thread (https://communities.sas.com/t5/Statistical-Procedures/How-to-obtain-an-average-ROC-curve-using-multi...) to obtain a single ROC curve.
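As background on the combining step: PROC MIANALYZE pools the per-imputation estimates using Rubin's rules, where the total variance is the average within-imputation variance plus an inflated between-imputation variance. A minimal Python sketch of that calculation for a single parameter follows. The function name `pool_rubin` and the example numbers are hypothetical, and this illustrates the formula only, not SAS's own code.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputations using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)                          # pooled point estimate
    within = mean([se ** 2 for se in std_errors])    # average within-imputation variance
    between = variance(estimates)                    # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between           # total variance
    return q_bar, math.sqrt(total)

# Made-up slope estimates and standard errors from m = 3 imputed datasets
pooled_estimate, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled_estimate, pooled_se)
```

The pooled standard error is larger than any single-imputation standard error here because the estimates differ across imputations, which is exactly the extra uncertainty that multiple imputation is meant to carry through.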

My question is, how do I get standard errors around my cutpoints obtained from this ROC curve to correctly display the variance from the different imputed datasets? Or do I just assume that on average my cutpoint will be X from the single ROC curve? Could someone please provide some insight on how to correctly account for the variation from the different imputed datasets when calculating a cutpoint?

Thanks in advance,

Zoë

StatDave · Posted 02-18-2022 11:03 AM

You don't need standard errors on the cutpoints (predicted probabilities) to compute the statistics used to find the optimal cutpoint, if that is the goal. Take the OUTROC= data set from your step 4 analysis, together with the data set of predicted probabilities obtained by adding an OUTPUT OUT= P= statement in that step, and use the ROCPLOT macro to find an optimal cutpoint based on various statistics such as Youden's index or the minimum distance. This only requires the point estimates of the probabilities.
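To make the two criteria concrete, here is a minimal Python sketch of how Youden's index and the minimum distance to the ideal corner (0, 1) can be computed from ROC points. It is not the ROCPLOT macro's implementation. The function name `best_cutpoints` and the data are made up for illustration, and each tuple is assumed to hold the cutpoint, sensitivity, and 1 - specificity (the quantities stored as _PROB_, _SENSIT_, and _1MSPEC_ in an OUTROC= data set).

```python
import math

def best_cutpoints(roc_points):
    """roc_points: list of (cutpoint, sensitivity, one_minus_specificity)."""
    # Youden's J = sensitivity + specificity - 1 = sensitivity - (1 - specificity)
    by_youden = max(roc_points, key=lambda r: r[1] - r[2])
    # Euclidean distance from the ROC point to the ideal corner (0, 1)
    by_distance = min(roc_points, key=lambda r: math.hypot(r[2], 1 - r[1]))
    return by_youden[0], by_distance[0]

# Made-up ROC points: (cutpoint, sensitivity, 1 - specificity)
roc = [
    (0.8, 0.40, 0.05),
    (0.6, 0.70, 0.15),
    (0.4, 0.98, 0.40),
    (0.2, 1.00, 0.70),
]

youden_cut, distance_cut = best_cutpoints(roc)
print(youden_cut, distance_cut)
```

With these made-up points the two criteria disagree: Youden's index selects the cutpoint 0.4, while the minimum distance selects 0.6, so the choice of statistic matters.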

ZoeUH · Posted 02-21-2022 03:51 AM

Hi StatDave_sas,

Thanks for your response. If I understand you correctly: when the aim is to determine the optimal cutpoint, it is not necessary to have standard errors (from the imputed datasets). Then I have a follow-up question: how would you interpret the obtained cutpoints (for instance, based on the Youden index)? As an average cutpoint over all imputed datasets?

StatDave · Posted 02-21-2022 10:00 AM

They are simply the cutpoints from the final model fit to imputed data. Recall that the cutpoints are just predicted probabilities which are computed from the final parameter estimates and the data values (imputed) in the observations.
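As an illustration of that last point, a predicted probability in a logistic model is the inverse logit of the linear predictor built from the parameter estimates and an observation's values. The Python sketch below assumes a hypothetical one-predictor model with made-up pooled estimates; the function name `predicted_probability` is not from SAS.

```python
import math

def predicted_probability(intercept, slopes, values):
    """Inverse logit of the linear predictor intercept + sum(slope * value)."""
    eta = intercept + sum(b * x for b, x in zip(slopes, values))
    return 1.0 / (1.0 + math.exp(-eta))

# Made-up pooled estimates (intercept, one slope) and one observation's test value
p = predicted_probability(-2.0, [0.5], [4.0])
print(p)  # linear predictor is 0, so the probability is 0.5
```

Each distinct predicted probability computed this way is a candidate cutpoint on the ROC curve.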

ZoeUH · Posted 02-22-2022 03:32 AM

Thanks a lot for your explanation. It is clear to me.
