- Trying to calculate area under the curve, already ...

03-20-2018 08:44 AM

I have a medical claims dataset (Med), as well as newborn screening (NBS) data. We have several case definitions for identifying if a person has a certain disease using the Med dataset. Then I compare that against the NBS data to determine True Positive, False Positive, etc. From there I calculate sensitivity and specificity.

So the final dataset has the person's ID, and then flag variables for positive in the medical dataset (Med_pos) and positive in the NBS dataset (NBS_pos). Those are compared with IF-THEN statements to create True_Pos, False_Pos, False_Neg, and True_Neg.

Where I'm getting confused is with calculating the area under the curve. I know it's related to ROC graphs, but I don't understand how a single point (the coordinates of the sensitivity and specificity values) can provide an area under the curve. It seems like, in order to plot a curve, I would need several data points. But I don't know what those data points are.

On a side note, I don't actually need a graph of the ROC curve -- I was already provided with an Excel template that will plot the sensitivity and specificity values for each case definition. But if it's easier to pull AUC from the ROC graph, that's fine too.

Accepted Solutions

Solution

03-26-2018 08:23 AM

Posted in reply to Wolverine

03-20-2018 11:50 AM

Yes, you are correct. The example has a continuous covariate whereas you have a categorical covariate. Sorry for the confusion.

ROC curves are usually formed when you have an explanatory variable that is either continuous or at least ordinal with several levels. For a dichotomous explanatory variable you probably only need to report the sensitivity and specificity. However, you can draw the empirical ROC curve, which connects the three points (0, 0), (1 - Specificity, Sensitivity), and (1, 1), and work out the area under it as the sum of one triangle and one trapezoid. The answer is

Area = (Sensitivity + Specificity) / 2
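To see where that formula comes from, here is the triangle-plus-trapezoid arithmetic written out. This is only a sketch: it assumes the empirical ROC curve is the piecewise-linear path through (0, 0), (1 - Sp, Se), and (1, 1), and the symbols Se and Sp are shorthand for sensitivity and specificity introduced here, not in the original reply.

```latex
\begin{align*}
\text{Triangle (from } x = 0 \text{ to } x = 1-\mathit{Sp}\text{)}
  &= \tfrac{1}{2}\,(1-\mathit{Sp})\,\mathit{Se} \\
\text{Trapezoid (from } x = 1-\mathit{Sp} \text{ to } x = 1\text{)}
  &= \tfrac{1}{2}\,\mathit{Sp}\,(\mathit{Se} + 1) \\
\text{Area}
  &= \tfrac{1}{2}\bigl[(1-\mathit{Sp})\,\mathit{Se} + \mathit{Sp}\,\mathit{Se} + \mathit{Sp}\bigr]
   = \frac{\mathit{Se} + \mathit{Sp}}{2}
\end{align*}
```

For example, a case definition with sensitivity 0.80 and specificity 0.90 would give an area of 0.85.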

All Replies

Posted in reply to Wolverine

03-20-2018 09:20 AM

As you say, you need more than one point. I don't fully understand the process you are following ("compare that against the NBS data to determine True Positive, False Positive, etc."), but look at the article "Computing an ROC curve from basic principles" and see if that clarifies the issues.

Posted in reply to Rick_SAS

03-20-2018 09:56 AM

Below are my IF-THEN statements comparing the Med data against the NBS data.

IF (Def1_Med_case = 1 AND NBS_pos = 1) THEN True_Pos = 1;

IF (Def1_Med_case = 1 AND NBS_pos = 0) THEN False_Pos = 1;

IF (Def1_Med_case = 0 AND NBS_pos = 1) THEN False_Neg = 1;

IF (Def1_Med_case = 0 AND NBS_pos = 0) THEN True_Neg = 1;
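For what it's worth, here is a rough sketch of how the same four comparisons could be tallied into sensitivity, specificity, and the single-point area, using the Area = (Sensitivity + Specificity) / 2 formula from the accepted solution. The dataset names `final` and `Def1_summary` and the eight example rows are made up for illustration; only `Def1_Med_case` and `NBS_pos` come from the statements above, and both are assumed to be coded 0/1 with no missing values.

```sas
/* Hypothetical example data: one row per person, flags coded 0/1. */
data final;
    input ID Def1_Med_case NBS_pos;
    datalines;
1 1 1
2 1 1
3 1 0
4 0 1
5 0 0
6 0 0
7 0 0
8 1 1
;
run;

/* Accumulate the 2x2 cell counts, then compute the summary measures
   once the last observation has been read. */
data Def1_summary (keep=TP FP FN TN Sensitivity Specificity AUC);
    set final end=last;

    /* A comparison evaluates to 1 (true) or 0 (false), so the sum
       statement adds one to the matching counter on each row. */
    TP + (Def1_Med_case = 1 and NBS_pos = 1);
    FP + (Def1_Med_case = 1 and NBS_pos = 0);
    FN + (Def1_Med_case = 0 and NBS_pos = 1);
    TN + (Def1_Med_case = 0 and NBS_pos = 0);

    if last then do;
        Sensitivity = TP / (TP + FN);   /* share of NBS positives flagged by Med */
        Specificity = TN / (TN + FP);   /* share of NBS negatives not flagged    */
        AUC = (Sensitivity + Specificity) / 2;
        output;
    end;
run;

proc print data=Def1_summary noobs;
run;
```

With the made-up rows above, the counts are TP = 3, FP = 1, FN = 1, TN = 3, so sensitivity, specificity, and the area all come out to 0.75. If a dataset had no NBS positives (or no NBS negatives), the corresponding division would return a missing value, so that case would need separate handling.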

I read the article you linked to, but I'm having trouble relating the example in the article to my case definitions. They use 15 pairs of shoes as a cut-off to determine gender, and then (if I'm understanding correctly) get SAS to calculate sensitivity and specificity over a range of values centered around 15. But most of my definitions are yes/no: did the person have any claims with a specific diagnosis code or not?

As I was researching this, I remember reading something about the difference between binomial and continuous data. So my data is binomial and the example is continuous, correct?
