ROC curve: associate predicted probability with pa...

07-28-2010 07:04 AM

Hi,

I'm using PROC LOGISTIC to create a ROC curve. I have the Y variable, which codes the event of interest (cell differentiation: 1 = event = differentiation; 0 = non-event = non-differentiation), and the independent variable X, which is a parameter measuring the differentiation level.

Here is the code:

proc logistic data=total descending noprint;
   model Y=X / outroc=rocdata_sptot;
   output out=sptot1 p=predprob;
run;

After that, I define a cut-off on X to distinguish between events and non-events.

The correct interpretation of my data is: the higher the X, the higher the probability of differentiation (event=1). The interpretation given by SAS is the opposite (i.e. the higher the X, the lower the probability of differentiation). I checked this in the output data set, where a lower probability of the event is associated with higher levels of X.

Is there a statement or any other solution I can use to "reverse" the association made by SAS, so that it correctly fits my data?

Thanks a lot,

Anna

07-28-2010 08:10 AM

Hi Anna,

Try this:

proc logistic data=total descending noprint;
   model Y(event='1')=X / outroc=rocdata_sptot;
   output out=sptot1 p=predprob;
run;

The way I read the documentation, this should give what you need.

Steve Denham
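One way to confirm which response level is actually being modeled is to rerun the step without NOPRINT and read the printed output. The sketch below reuses the data set and variable names from the thread; dropping DESCENDING here is only an assumption made so that a single option, EVENT=, controls the modeled level.

```sas
/* Same model, but with printed output enabled (no NOPRINT) */
/* so the modeled level and the sign of the slope can be checked */
proc logistic data=total;
   model Y(event='1') = X / outroc=rocdata_sptot;
   output out=sptot1 p=predprob;
run;
```

The Response Profile table is followed by a line such as "Probability modeled is Y=1." In the Analysis of Maximum Likelihood Estimates table, a negative estimate for X means the fitted probability of Y=1 falls as X rises.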

07-28-2010 10:43 AM

Dear Steve,

Thank you very much for your answer.

I tried what you suggested, but I still get the same results as before: SAS is still associating low values of predicted probability with high values of X, and I need the opposite.

I also tried some variations, but the results don't change.

Many thanks for your help anyway.

07-28-2010 12:00 PM

Then don't use 'descending'.

But I am curious: why do you want to do that?

07-28-2010 03:00 PM

I've already tried that; I've tried everything I can think of.

I need to do this because that's how my data must be interpreted and handled. It's not for fun; it's because of the biological background.

07-29-2010 06:45 AM

Hi, Anna.

I think Doc@Duke has the right idea here. That is to say, you need to add more covariates to your model or collect more observations. Your logistic model is a case of perfect prediction (i.e. you have only one outcome variable and one covariate). A perfectly predicting model is not a good model; it will give you poor estimates and a poor interpretation. I think SAS will issue a NOTE in the log file.

Message was edited by: Ksharp

07-29-2010 10:11 AM

Anna,

If the descending option is kept, then at least try:

proc logistic data=total descending noprint;
   model Y(event='0')=X / outroc=rocdata_sptot;
   output out=sptot1 p=predprob;
run;

Good luck,

Steve Denham

07-28-2010 09:57 PM

I've got to think that this has something to do with the model or the data.

Have you done plots of the raw data? Maybe your putative biological model isn't reflected in the data. (SAS 9.2's ODS statistical graphics has some that are quite useful. Also, see Frank Harrell's book on regression modeling for diagnostics.)

Have you simplified the model for the presentation here? E.g. are there other covariables in the model? If so, and they are related to X, they can attenuate or even reverse the effect of X.

Doc Muhlbaier

Duke
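A quick look at the raw data along these lines might be sketched as follows. It assumes the data set `total` and the variables `Y` and `X` from the original post; the choice of statistics and of a box plot is illustrative, not something prescribed in the thread.

```sas
/* Summary statistics of X within each level of Y */
proc means data=total n mean median min max;
   class Y;
   var X;
run;

/* Side-by-side box plots of X by Y (PROC SGPLOT is available from SAS 9.2) */
proc sgplot data=total;
   vbox X / category=Y;
run;
```

If X tends to be lower in the Y=1 group than in the Y=0 group, the fitted model is reporting what the data show, and the question becomes one about the data or the coding rather than about a PROC LOGISTIC option.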

07-29-2010 08:56 AM

Ksharp mentioned perfect prediction. You won't get that unless there is total separation of the y's by the x's. We have, however, been assuming that X is a continuous variable. Although a logistic regression can be done with a single binary predictor, it is then just a chi-square test and the ROC is pretty meaningless.

07-29-2010 11:35 AM

Hi, Doc@Duke.

I guess that the sample size is too small or the events are too sparse, so the coefficient estimate would be biased.

So Anna, you can try exact logistic regression, such as:

[pre]
/* Exact logistic regression; Z stands for an additional covariate (placeholder name) */
proc logistic data=yourdataset descending exactonly;
   model Y=X Z;
   exact X Z / estimate=both;
run;
[/pre]

Ksharp

08-02-2010 04:12 AM

Dear All,

Many thanks for your help!

I'm currently working on the problem, considering all your suggestions.

08-11-2010 10:20 AM

Hi again,

I'm going crazy with this issue; once again, I haven't been able to solve it.

The problem is quite simple. I tried all the possible combinations, but I can never find the right one.

** I am using PROC LOGISTIC to build a ROC curve.

** I have a binary "event" variable (dependent). The two categories correspond to Differentiation and Undifferentiation.

** I have only one continuous predictor (independent).

** The right interpretation of the data is: the higher the predictor, the higher the differentiation.

I tried all the combinations obtained by changing how the event is coded (what is coded as 0 and what as 1) and the event for which the logistic model's probability is modeled. I list them here:

1) Differentiated = 0, Undifferentiated = 1, event for which probability is modeled = 1

------> RESULT: the higher the predictor, the higher the probability of Undifferentiated --> WRONG

2) Differentiated = 0, Undifferentiated = 1, event = 0

------> RESULT: the higher the predictor, the lower the probability of Differentiated --> WRONG

3) Differentiated = 1, Undifferentiated = 0, event = 1

------> RESULT: the higher the predictor, the lower the probability of Differentiated --> WRONG

4) Differentiated = 1, Undifferentiated = 0, event = 0

------> RESULT: the higher the predictor, the higher the probability of Undifferentiated --> WRONG

Do you think it's normal?

I'm convinced there must be something very wrong but I really don't know what to do.

Thanks again,

Anna
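A note on why the four combinations cannot disagree with each other, assuming the standard binary logistic model with a single predictor $x$ (the symbols $\beta_0$ and $\beta_1$ are introduced here for illustration and do not appear in the thread):

```latex
\begin{align*}
\operatorname{logit} P(Y=1 \mid x) &= \beta_0 + \beta_1 x \\
P(Y=0 \mid x) &= 1 - P(Y=1 \mid x) \\
\operatorname{logit} P(Y=0 \mid x) &= -\beta_0 - \beta_1 x
\end{align*}
```

Switching the modeled event, or swapping the 0/1 coding, only flips the signs of the coefficients; the fitted relationship between the predictor and the two groups stays the same. Combinations 1 and 4 both model the probability of Undifferentiated, combinations 2 and 3 both model the probability of Differentiated, and all four report the same thing: in this data set, higher predictor values go with the undifferentiated group.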

10-07-2010 04:01 PM

Hello Anna,

Could you supply a sample of your data to experiment with?

SPR
