Hi,
I have a model y (disease status, 0 or 1) = sex + treatment, but there are no events for females on treatment B, as shown below:
| Sex    | Disease | Treatment A | Treatment B |
| Female | 1       | 1           | 0           |
| Female | 0       | 8           | 18          |
| Male   | 1       | 9           | 5           |
| Male   | 0       | 87          | 86          |
I used the following code:
proc logistic data=xxx;
  class sex treatment;
  model disease(event="1") = treatment sex sex*treatment;
  oddsratio treatment / diff=ref;
run;
I got the following warnings from SAS:
WARNING: There is possibly a quasi-complete separation of data points. The maximum likelihood
estimate may not exist.
WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based
on the last maximum likelihood iteration. Validity of the model fit is questionable.
SAS also reported these odds ratios (95% CI):
Female: <0.001 (<0.001, >999.999)
Male: 1.55 (0.48, 5.01)
I understand that the OR and CI for females are not meaningful, but what about the OR and CI for males? Is it still valid to interpret them? Any suggestions for this situation if I want to keep both factors in the model?
Thank you very much for your help!
Nancy
Using your code, those cell counts generate slightly different results from the ones you report. The treatment odds ratios are:
female: >999.999
male: 1.779 (0.573, 5.525)
When you have separation problems due to sparseness like this, a useful alternative is Firth's penalized likelihood method (the FIRTH option in the MODEL statement), which gives:
female: 6.529 (0.214, 198.994)
male: 1.708 (0.570, 5.116)
Or, for fairly small sample problems like this, use the exact method:
female: 2.000 (0.105, infinity)
male: 1.774 (0.509, 7.023)
Notice that the treatment odds ratio for males doesn't change much.
The Firth and exact results are both produced by this code (remove the firth option to see the regular, asymptotic results):
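/* Cell counts are read in the order sex -> disease -> trt */
data a;
  do sex='f','m';
    do disease=1,0;
      do trt='a','b';
        input count @@;   /* @@ holds the line so several counts are read per record */
        output;
      end;
    end;
  end;
datalines;
1 0
8 18
9 5
87 86
;
proc logistic data=a;
  freq count;                                        /* each row stands for COUNT subjects */
  class sex trt / param=ref;
  model disease(event="1") = sex trt(sex) / firth;   /* treatment effect nested within sex */
  oddsratio trt / diff=ref;
  exact trt(sex) / estimate=both;
run;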
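A quick hand check of why the male odds ratio is barely affected by the empty female cell. This is only a sketch, assuming the default Wald confidence limits and that the odds ratio compares treatment a with the reference level b. With sex and the treatment effect within each sex both in the model, the fit is saturated, so the asymptotic estimate for males is just the cross-product ratio of the male 2x2 table, and the female counts do not enter it:

```latex
\[
\widehat{OR}_{\text{male}} = \frac{9 \times 86}{87 \times 5} = \frac{774}{435} \approx 1.7793
\]
\[
SE\bigl(\log \widehat{OR}_{\text{male}}\bigr)
  = \sqrt{\tfrac{1}{9} + \tfrac{1}{87} + \tfrac{1}{5} + \tfrac{1}{86}} \approx 0.5781
\]
\[
95\%\ \text{CI} = \exp\bigl(\log 1.7793 \pm 1.96 \times 0.5781\bigr) \approx (0.573,\ 5.525)
\]
\[
\widehat{OR}_{\text{female}} = \frac{1 \times 18}{8 \times 0}
  \quad \text{(zero in the denominator, so no finite estimate)}
\]
```

The first three lines reproduce the asymptotic male result quoted above (1.779 with limits 0.573 and 5.525); the last line shows that the zero cell only breaks the female estimate.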
Yes, you can still use the estimates for males.
You can also look at exact logistic regression, and/or try adding a small value to the empty cell (female, treatment B) to see what happens; the male estimates shouldn't change.
I don't remember the exact reason why, but I remember looking into it (years back) and seeing that it was valid. It had to do with the added value being too small to affect the results in the end.
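To make the "add a small value" idea concrete, here is the arithmetic for one common choice, adding 0.5 to every cell. The 0.5 and the choice to adjust all cells are assumptions for illustration; the post above does not say which value or which cells to use. The odds ratios compare treatment A with treatment B:

```latex
\[
\widehat{OR}_{\text{female}} = \frac{1.5 \times 18.5}{8.5 \times 0.5} = \frac{27.75}{4.25} \approx 6.529
\]
\[
\widehat{OR}_{\text{male}} = \frac{9.5 \times 86.5}{87.5 \times 5.5} = \frac{821.75}{481.25} \approx 1.708
\]
```

These values coincide with the Firth point estimates quoted earlier in the thread (6.529 and 1.708), and the male value stays close to the unadjusted 1.779. If the constant is added only to the empty female cell, the male table is untouched, so the male cross-product ratio does not change at all.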
Did you try PROC CATMOD?