I'm using Proc Logistic to create a ROC curve. I have a Y variable that codes the event of interest (cell differentiation: 1 = event = differentiation; 0 = non-event = non-differentiation) and an independent variable X, which is a parameter measuring the level of differentiation.
After that, I define a cut-off on the X to distinguish between events and non-events.
The correct interpretation of my data is: the higher the X, the higher the probability of differentiation (event=1). The interpretation given by SAS is the opposite (i.e. the higher the X, the lower the probability of differentiation). I checked this in the output dataset, where a lower probability of the event is associated with higher values of my X.
Is there a statement or any other solution I can use to "reverse" the association done by SAS, in order to make it correctly fit my data?
Thank you very much for your answer.
I tried what you suggested but I still have the same results as before: SAS is still associating low values of predicted probability to high values of X, and I need the opposite.
I also tried with some variations but the results don't change.
Many thanks for your help anyway
I think Doc@Duck has the right idea here. That is to say, you need to add more covariates to your model, or you need more observations. Your logistic model has perfect prediction (i.e. you only have one outcome variable and one covariate). A model with perfect prediction is not a good model: it will give you poor estimates and a poor interpretation. I think SAS will issue a NOTE in the log file.
Message was edited by: Ksharp
I've got to think that this has something to do with the model or the data.
Have you done plots of the raw data? Maybe your putative biological model isn't reflected in the data. (SAS 9.2's ODS statistical graphics has some that are quite useful. Also, see Frank Harrell's book on regression modeling for diagnostics.)
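As a minimal sketch of that kind of raw-data check, something like the following should show whether X really tends to be higher in the differentiated group. The dataset name `cells` and the variable names `y` and `x` are assumptions standing in for the real ones.

```sas
/* Hypothetical names: dataset CELLS, outcome Y (1 = differentiated, */
/* 0 = undifferentiated), continuous predictor X.                    */

/* Summary of X within each outcome group */
proc means data=cells n mean median min max;
   class y;
   var x;
run;

/* Side-by-side box plots of X by outcome group */
proc sgplot data=cells;
   vbox x / category=y;
run;
```

If the mean and median of X come out lower for y = 1 than for y = 0, the data themselves carry the negative association, and the model is only reporting it.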
Have you simplified the model for the presentation here? E.g. are there other covariables in the model? If so, and they are related to X, they can attenuate or even reverse the effect of X.
Ksharp mentioned perfect prediction. You won't get that unless there is total separation of the y's by the x's. We have, however, been assuming that X is a continuous variable. Although a logistic regression can be done with a single binary predictor, it is then just a chi-square test and the ROC is pretty meaningless.
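To make "total separation" concrete, here is a small sketch on toy data invented purely for illustration (the dataset name `separated` and its values are not from this thread). Every y = 0 observation has a smaller x than every y = 1 observation, so a single cut-off on x classifies all points correctly.

```sas
/* Toy data, invented for illustration only:             */
/* all Y=0 rows have X <= 4 and all Y=1 rows have X >= 6 */
data separated;
   input x y @@;
   datalines;
1 0  2 0  3 0  4 0  6 1  7 1  8 1  9 1
;
run;

/* With data like these, the maximum likelihood estimate of the  */
/* slope does not exist; check the log and the convergence       */
/* status in the output for a message about separation.          */
proc logistic data=separated;
   model y(event='1') = x;
run;
```

I would expect PROC LOGISTIC to flag the separation rather than return usable estimates here. If the real run shows no such message, separation is probably not what is going on.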
I'm going crazy with this issue: once again, I wasn't able to solve it in any way.
The problem looks quite simple, and I have tried all the possible combinations, but I can never find the right one.
** I am using proc logistic to build a ROC curve.
** I have a binary "event" variable (dependent). The two categories correspond to Differentiation and Undifferentiation.
** I have only 1 continuous predictor (independent)
** The right interpretation of the data is: the higher the predictor, the higher the differentiation
I tried all the combinations obtained by changing how the event is coded (what is coded as 0 and what as 1) and the event whose probability the logistic model estimates. I will list them here:
1) Differentiated = 0
Undifferentiated = 1
Event for which probability is modeled = 1
------> RESULT: the higher the predictor, the higher the prob of UNdiffer --> WRONG
2) Differ = 0
Undiffer = 1
Event = 0
------> RESULT: the higher the predictor, the lower the probability of Differ --> WRONG
3) Differ = 1
Undiffer = 0
Event = 1
-------> RESULT: the higher the predictor, the lower the prob of Differ --> WRONG
4) Differ = 1
Undiff = 0
Event = 0
------> RESULT: the higher the predictor, the higher the prob of UNdiff --> WRONG
Do you think it's normal?
I'm convinced there must be something very wrong but I really don't know what to do.
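For what it's worth, the four results listed above all say the same thing: under every coding, higher values of the predictor go with a lower fitted probability of differentiation. Recoding the response or switching the modeled event only flips the sign of the slope; it cannot change the direction of the association that is in the data. A minimal sketch for pinning the coding down explicitly and reading off that direction follows, assuming a dataset called `cells` with `y` = 1 for differentiated cells and predictor `x` (these names, and the output dataset names, are placeholders).

```sas
/* Hypothetical names: CELLS, Y (1 = differentiated), X.         */
/* EVENT='1' states explicitly which level of Y is modeled, so   */
/* the result does not depend on the default response ordering.  */
proc logistic data=cells;
   model y(event='1') = x / outroc=roc_points;
   output out=pred p=phat;             /* fitted P(Y=1) per row   */
   ods output ParameterEstimates=pe;   /* capture the coefficients */
run;

/* A negative estimate for X means P(Y=1) falls as X rises */
proc print data=pe;
   var Variable Estimate StdErr ProbChiSq;
run;

/* Inspect fitted probabilities along increasing X */
proc sort data=pred;
   by x;
run;

proc print data=pred(obs=20);
   var x y phat;
run;
```

If the estimate for `x` is negative with this explicit coding, no statement will turn it around. The thing to check then is the data: whether 0 and 1 were assigned to the right cells, and whether X is measured on the scale you think it is.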