
Trying to calculate area under the curve, already have sensitivity and specificity

Contributor
Posts: 60

I have a medical claims dataset (Med), as well as newborn screening (NBS) data.  We have several case definitions for identifying if a person has a certain disease using the Med dataset.  Then I compare that against the NBS data to determine True Positive, False Positive, etc.  From there I calculate sensitivity and specificity.

 

So the final dataset has the person's ID, and then flag variables for positive in the medical dataset (Med_pos) and positive in the NBS dataset (NBS_pos).  Those are compared with IF-THEN statements to create True_Pos, False_Pos, False_Neg, and True_Neg.

 

Where I'm getting confused is with calculating the area under the curve.  I know it's related to ROC graphs, but I don't understand how a single point (the coordinates of the sensitivity and specificity values) can provide an area under the curve.  It seems like, in order to plot a curve, I would need several data points.  But I don't know what those data points are.

 

On a side note, I don't actually need a graph of the ROC curve -- I was already provided with an Excel template that will plot the sensitivity and specificity values for each case definition.  But if it's easier to pull AUC from the ROC graph, that's fine too.


All Replies
SAS Super FREQ
Posts: 4,272

Re: Trying to calculate area under the curve, already have sensitivity and specificity

Posted in reply to Wolverine

As you say, you need more than one point. I don't fully understand the process you are following ("compare that against the NBS data to determine True Positive, False Positive, etc."), but look at the article "Computing an ROC curve from basic principles" and see if that clarifies the issues.

Contributor
Posts: 60

Re: Trying to calculate area under the curve, already have sensitivity and specificity

Below are my IF-THEN statements comparing the Med data against the NBS data.

 

IF (Def1_Med_case = 1 AND NBS_pos = 1) THEN True_Pos = 1;
IF (Def1_Med_case = 1 AND NBS_pos = 0) THEN False_Pos = 1;
IF (Def1_Med_case = 0 AND NBS_pos = 1) THEN False_Neg = 1;
IF (Def1_Med_case = 0 AND NBS_pos = 0) THEN True_Neg = 1;

 

I read the article you linked to, but I'm having trouble relating the example in the article to my case definitions.  They use 15 pairs of shoes as a cut-off to determine gender, and then (if I'm understanding correctly) get SAS to calculate sensitivity and specificity over a range of values centered around 15.  But most of my definitions are yes/no: did the person have any claims with a specific diagnosis code or not?

 

As I was researching this, I remember reading something about the difference between binomial and continuous data.  So my data is binomial and the example is continuous, correct? 

Solution
03-26-2018 08:23 AM
SAS Super FREQ
Posts: 4,272

Re: Trying to calculate area under the curve, already have sensitivity and specificity

Posted in reply to Wolverine

Yes, you are correct. The example has a continuous covariate whereas you have a categorical covariate. Sorry for the confusion.

 

ROC curves are usually formed when you have an explanatory variable that is either continuous or at least ordinal with several levels. For a dichotomous explanatory variable, you probably only need to report the sensitivity and specificity. However, you can still draw the empirical ROC curve: it connects the points (0, 0), (1 - Specificity, Sensitivity), and (1, 1), so the area under it is the sum of one triangle and one trapezoid. The answer is

Area = (Sensitivity + Specificity) / 2
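
For anyone who wants to compute this directly, here is a minimal sketch in SAS. It assumes the comparison dataset is called WORK.COMPARE (a hypothetical name, not from the thread) and that Def1_Med_case and NBS_pos are coded 0/1 with no missing values. The output names counts_def1 and auc_def1 are also made up for illustration.

```sas
/* Sketch only. Assumes WORK.COMPARE (hypothetical name) has one row per   */
/* person with 0/1 flags Def1_Med_case and NBS_pos and no missing values.  */

/* Step 1: count the four cells of the 2x2 table.                          */
/* In PROC SQL a logical expression evaluates to 1 or 0, so SUM counts     */
/* the rows where the condition is true.                                   */
proc sql;
   create table counts_def1 as
   select sum(Def1_Med_case = 1 and NBS_pos = 1) as True_Pos,
          sum(Def1_Med_case = 1 and NBS_pos = 0) as False_Pos,
          sum(Def1_Med_case = 0 and NBS_pos = 1) as False_Neg,
          sum(Def1_Med_case = 0 and NBS_pos = 0) as True_Neg
   from work.compare;
quit;

/* Step 2: sensitivity, specificity, and the area under the three-point    */
/* empirical ROC curve, (0,0) -> (1 - Specificity, Sensitivity) -> (1,1).  */
data auc_def1;
   set counts_def1;
   Sensitivity = True_Pos / (True_Pos + False_Neg);
   Specificity = True_Neg / (True_Neg + False_Pos);
   /* Triangle:  (1 - Specificity) * Sensitivity / 2                       */
   /* Trapezoid: Specificity * (Sensitivity + 1) / 2                       */
   /* Their sum simplifies to (Sensitivity + Specificity) / 2.             */
   AUC = (Sensitivity + Specificity) / 2;
run;

proc print data=auc_def1 noobs;
run;
```

Repeating these steps for each case definition gives one AUC per definition. With a single yes/no predictor, the AUC is just the average of sensitivity and specificity, so it adds nothing beyond those two numbers. If a definition has no NBS positives or no NBS negatives, the corresponding ratio will come out missing.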

 
