deleted_user
Not applicable
Hi,
I'm using PROC LOGISTIC to create a ROC curve. The Y variable codes the event of interest, cell differentiation (1 = event = differentiation; 0 = non-event = non-differentiation), and the independent variable X is a parameter measuring the differentiation level.

Here is the code:

proc logistic data=total descending noprint;
model Y=X / outroc=rocdata_sptot;
output out=sptot1 p=predprob;
run;

After that, I define a cut-off on the X to distinguish between events and non-events.

The correct interpretation of my data is: the higher the X, the higher the probability of differentiation (event=1). The interpretation given by SAS is the opposite (i.e. the higher the X, the lower the probability of differentiation). I checked this in the OUT= data set, where a lower predicted probability of the event is associated with higher values of X.

Is there a statement or any other solution I can use to "reverse" the association made by SAS, so that it correctly fits my data?

Thanks a lot,
Anna
12 REPLIES
SteveDenham
Jade | Level 19
Hi Anna,

Try this:

proc logistic data=total descending noprint;
model Y(event='1')=X / outroc=rocdata_sptot;
output out=sptot1 p=predprob;
run;

The way I read the documentation, this should give what you need.
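
A minimal sketch for checking which event is actually being modeled, assuming the data set and variable names from the question (nothing else about the data is known here). Dropping NOPRINT and selecting two of the displayed tables should show the response ordering, the "Probability modeled is ..." note, and the sign of the estimate for X:

```sas
/* Show only the response profile and the parameter estimates */
ods select ResponseProfile ParameterEstimates;
proc logistic data=total;
  model Y(event='1') = X;   /* model the probability that Y = 1 */
run;
```

If the estimate for X is positive, the fitted probability of Y=1 rises with X; if it is negative, it falls.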

Steve Denham
deleted_user
Not applicable
Dear Steve,
thank you very much for your answer.
I tried what you suggested but I still get the same results as before: SAS is still associating low values of predicted probability with high values of X, and I need the opposite.
I also tried with some variations but the results don't change.
Many thanks for your help anyway 🙂
Ksharp
Super User
Then don't use 'descending'.
But I am curious: why do you want to do that?
deleted_user
Not applicable
I've already tried, I tried everything I can think of.
I need to do this because that's the way my data must be interpreted and handled, it's not for fun but because of the biological background 😉
Ksharp
Super User
Hi Anna,
I think Doc@Duke has the right idea here. That is, you need to add more covariates to your model or collect more observations. Your logistic model is a case of perfect prediction (i.e. you have only one outcome variable and one covariate). A model with perfect prediction is not a good model: it will give you poor estimates and a poor interpretation. I think SAS will issue a NOTE in the log file.
Message was edited by: Ksharp
SteveDenham
Jade | Level 19
Anna,

If the descending option is kept, then at least try:

proc logistic data=total descending noprint;
model Y(event='0')=X / outroc=rocdata_sptot;
output out=sptot1 p=predprob;
run;

Good luck,

Steve Denham
Doc_Duke
Rhodochrosite | Level 12
I've got to think that this has something to do with the model or the data.

Have you done plots of the raw data? Maybe your putative biological model isn't reflected in the data. (SAS 9.2's ODS statistical graphics has some that are quite useful. Also, see Frank Harrell's book on regression modeling for diagnostics.)
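
As one possible way to look at the raw data, here is a sketch that assumes the data set `total` and the variables Y and X from the original post. It draws one box plot of X per outcome group, so you can see directly whether X tends to be higher when Y=1 or when Y=0:

```sas
/* Distribution of X within each outcome group */
proc sgplot data=total;
  vbox X / category=Y;
run;
```

If the box for Y=1 sits below the box for Y=0, the data themselves associate higher X with the non-event, whatever the model options are.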

Have you simplified the model for the presentation here? E.g. are there other covariables in the model? If so, and they are related to X, they can attenuate or even reverse the effect of X.

Doc Muhlbaier
Duke
Doc_Duke
Rhodochrosite | Level 12
Ksharp mentioned perfect prediction. You won't get that unless there is total separation of the Y's by the X's. We have, however, been assuming that X is a continuous variable. Although a logistic regression can be done with a single binary predictor, it is then just a chi-square test and the ROC curve is pretty meaningless.
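
A quick check for separation, sketched under the assumption that X is continuous and that the names `total`, Y and X are as in the original post: compare the range of X in the two outcome groups.

```sas
/* If the X ranges of the two groups do not overlap, */
/* the Y's are completely separated by X             */
proc means data=total n min median max;
  class Y;
  var X;
run;
```

Overlapping ranges mean there is no complete separation, so perfect prediction would not be the explanation.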
Ksharp
Super User
Hi Doc@Duke,
I guess that the sample size is too small or the events are too sparse, so the estimate of the coefficient would be biased.
So Anna, you can try exact logistic regression, such as:

[pre]
proc logistic data=yourdataset descending exactonly;
  /* Z stands for an additional covariate; omit it if X is the only predictor */
  model Y=X Z;
  exact X Z / estimate=both;
run;
[/pre]



Ksharp
deleted_user
Not applicable
Dear All,
many thanks for your help!
I'm currently working on the problem, considering all your suggestions 🙂
deleted_user
Not applicable
Hi again,
I'm going crazy with this issue; again, I wasn't able to solve it in any way.
The problem is quite simple, and I have tried all the possible combinations, but I can never find the right one.

** I am using proc logistic to build a ROC curve.
** I have a binary "event" variable (dependent). The two categories correspond to Differentiation and Undifferentiation.
** I have only 1 continuous predictor (independent)
** The right interpretation of the data is: the higher the predictor, the higher the differentiation

I tried every combination of how the event is coded (what is coded as 0 and what as 1) and which event's probability the logistic model estimates. I list them here:

1) Differentiated = 0
Undifferentiated = 1
Event for which probability is modeled = 1
------> RESULT: the higher the predictor, the higher the prob of UNdiffer --> WRONG


2) Differ = 0
Undiffer = 1
Event = 0
------> RESULT: the higher the predictor, the lower the probability of Differ --> WRONG


3) Differ = 1
Undiffer = 0
Event = 1
-------> RESULT: the higher the predictor, the lower the prob of Differ --> WRONG


4) Differ = 1
Undiff = 0
Event = 0
------> RESULT: the higher the predictor, the higher the prob of UNdiff --> WRONG
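
A note on how these four cases relate, offered as a sketch rather than a diagnosis. Switching which event is modeled only flips the sign of the coefficients, because the two probabilities sum to 1 and logit(1 - p) = -logit(p). The symbols below are generic notation, not taken from any SAS output:

```latex
\operatorname{logit} P(Y=1 \mid X) = \beta_0 + \beta_1 X
\quad\Longrightarrow\quad
\operatorname{logit} P(Y=0 \mid X)
  = \operatorname{logit}\bigl(1 - P(Y=1 \mid X)\bigr)
  = -\beta_0 - \beta_1 X
```

Read this way, all four results are the same statement: in the fitted model, a higher predictor goes with a higher probability of Undifferentiation and a lower probability of Differentiation. Recoding cannot change that, since the direction comes from the data.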


Do you think it's normal?
I'm convinced there must be something very wrong but I really don't know what to do.
Thanks again,
Anna
SPR
Quartz | Level 8
Hello Anna,

Could you supply a sample of your data to experiment with?
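
In the meantime, here is a sketch with simulated data, purely hypothetical and not Anna's data. The data set name `sim`, the seed, the sample size and the coefficients are all assumptions chosen for illustration. Y is generated so that the probability of Y=1 rises with X, which gives a known case to compare against:

```sas
/* Simulate data in which higher X means higher P(Y=1) */
data sim;
  call streaminit(12345);                    /* arbitrary seed */
  do i = 1 to 200;
    X = rand('normal', 0, 1);
    p = 1 / (1 + exp(-(0.5 + 1.2*X)));       /* assumed true model */
    Y = rand('bernoulli', p);
    output;
  end;
  drop i p;
run;

/* Same model as in the question, with the event stated explicitly */
proc logistic data=sim;
  model Y(event='1') = X / outroc=roc_sim;
  output out=sim_pred p=predprob;
run;
```

With this setup the estimate for X should come out positive and `predprob` should increase with X. If the same code on the real data gives a negative estimate, the difference lies in the data and not in the options.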

SPR
