Wolverine
Quartz | Level 8

I have a medical claims dataset (Med), as well as newborn screening (NBS) data.  We have several case definitions for identifying if a person has a certain disease using the Med dataset.  Then I compare that against the NBS data to determine True Positive, False Positive, etc.  From there I calculate sensitivity and specificity.

 

So the final dataset has the person's ID, and then flag variables for positive in the medical dataset (Med_pos) and positive in the NBS dataset (NBS_pos).  Those are compared with IF-THEN statements to create True_Pos, False_Pos, False_Neg, and True_Neg.

 

Where I'm getting confused is with calculating the area under the curve.  I know it's related to ROC graphs, but I don't understand how a single point (the coordinates of the sensitivity and specificity values) can provide an area under the curve.  It seems like, in order to plot a curve, I would need several data points.  But I don't know what those data points are.

 

On a side note, I don't actually need a graph of the ROC curve -- I was already provided with an Excel template that will plot the sensitivity and specificity values for each case definition.  But if it's easier to pull AUC from the ROC graph, that's fine too.

Rick_SAS
SAS Super FREQ

As you say, you need more than one point. I don't fully understand the process you are following ("compare that against the NBS data to determine True Positive, False Positive, etc."), but look at the article "Computing an ROC curve from basic principles" and see if that clarifies the issues.

Wolverine
Quartz | Level 8

Below are my IF-THEN statements comparing the Med data against the NBS data.

 

IF (Def1_Med_case = 1 AND NBS_pos = 1) THEN True_Pos = 1;
IF (Def1_Med_case = 1 AND NBS_pos = 0) THEN False_Pos = 1;
IF (Def1_Med_case = 0 AND NBS_pos = 1) THEN False_Neg = 1;
IF (Def1_Med_case = 0 AND NBS_pos = 0) THEN True_Neg = 1;

 

I read the article you linked to, but I'm having trouble relating the example in the article to my case definitions.  They use 15 pairs of shoes as a cut-off to determine gender, and then (if I'm understanding correctly) get SAS to calculate sensitivity and specificity over a range of values centered around 15.  But most of my definitions are yes/no: did the person have any claims with a specific diagnosis code or not?

 

As I was researching this, I remember reading something about the difference between binomial and continuous data.  So my data is binomial and the example is continuous, correct? 

Rick_SAS
SAS Super FREQ

Yes, you are correct. The example has a continuous covariate whereas you have a categorical covariate. Sorry for the confusion.

 

ROC curves are usually formed when you have an explanatory variable that is either continuous or at least ordinal with several levels. For a dichotomous explanatory variable you probably only need to report the sensitivity and specificity. However, you can draw the empirical ROC curve, which connects the three points (0, 0), (1 - Specificity, Sensitivity), and (1, 1), and add up the area of the one triangle and one trapezoid that lie under it. The answer is

Area = (Sensitivity + Specificity) / 2
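
To see where the formula comes from: the triangle under the first segment has area (1 - Specificity) * Sensitivity / 2, and the trapezoid under the second segment has area Specificity * (Sensitivity + 1) / 2. The cross terms cancel when you add them, which leaves (Sensitivity + Specificity) / 2.

Here is a minimal sketch of the calculation in SAS. The data set name Have, the output names Counts and Accuracy, and the eight toy rows are made up for illustration; only Def1_Med_case and NBS_pos come from your post. It assumes both flags are coded 0/1 with no missing values.

```sas
/* Hypothetical toy data: one row per person, both flags coded 0/1 */
data Have;
   input ID Def1_Med_case NBS_pos;
   datalines;
1 1 1
2 1 1
3 1 0
4 0 1
5 0 0
6 0 0
7 0 0
8 1 1
;

/* Count the four cells of the 2x2 table.
   In PROC SQL a comparison evaluates to 1 (true) or 0 (false). */
proc sql;
   create table Counts as
   select sum(Def1_Med_case = 1 and NBS_pos = 1) as TP,
          sum(Def1_Med_case = 1 and NBS_pos = 0) as FP,
          sum(Def1_Med_case = 0 and NBS_pos = 1) as FN,
          sum(Def1_Med_case = 0 and NBS_pos = 0) as TN
   from Have;
quit;

/* Sensitivity, specificity, and the area under the three-point ROC curve */
data Accuracy;
   set Counts;
   Sensitivity = TP / (TP + FN);
   Specificity = TN / (TN + FP);
   AUC = (Sensitivity + Specificity) / 2;
run;

proc print data=Accuracy noobs;
run;

/* Optional cross-check: fit the binary flag as the only predictor */
proc logistic data=Have;
   model NBS_pos(event='1') = Def1_Med_case;
run;
```

For the toy data, TP = 3, FP = 1, FN = 1, and TN = 3, so Sensitivity, Specificity, and AUC are all 0.75. As a sanity check, the "c" statistic that PROC LOGISTIC prints in the "Association of Predicted Probabilities and Observed Responses" table should match that AUC, provided the case definition is positively associated with NBS_pos. To handle several case definitions, you could repeat the steps for each flag variable.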

 
