07-25-2017 07:48 PM
Dear SAS support board,
I ran SAS code for a multiple logistic regression analysis. In the output, the parameter estimates and the odds ratio (OR) point estimates look inconsistent: exponentiating an estimate (the logit) should give the point estimate, but it does not. The p-value of an estimate also disagrees with the significance implied by the confidence interval of its point estimate. Here is part of the output to explain:
Effect       F Value   Num DF   Den DF   Pr > F
group          25.25        1      749   <.0001
age.order       2.81        4      746   0.0249
sex             8.02        1      749   0.0048
Analysis of Maximum Likelihood Estimates
Parameter      Estimate   Standard Error   t Value   Pr > |t|
Intercept       -3.1525           0.1934    -16.30     <.0001
group 2          0                     .         .          .
group 3          0.3609           0.0718      5.03     <.0001
age.order 2     -0.6164           0.4560     -1.35     0.1769
age.order 3      0.3061           0.3236      0.95     0.3446
age.order 4      0.3948           0.2401      1.64     0.1006
age.order 5      0.6882           0.2651      2.60     0.0097
sex F            0.5458           0.1927      2.83     0.0048
Odds Ratio Estimates
Effect              Point Estimate   95% Confidence Limits
group 3 vs 1                 2.058      1.552      2.728
age.order 2 vs 1             1.169      0.301      4.547
age.order 3 vs 1             2.941      0.963      8.985
age.order 4 vs 1             3.214      1.248      8.277
age.order 5 vs 1             4.310      1.612     11.521
sex 2 vs 1                   2.979      1.398      6.350
In the first table, both age and sex have a significant effect on the outcome. The second table shows that only age.order 5 and sex are significant. The third table (odds ratios and 95% CIs), however, shows that age.order 4 also has a significant odds ratio, with a CI that does not cross 1, even though its p-value in the second table is 0.1006. When I exponentiate the estimate for age.order 4 (0.3948), I don't get its point estimate in the third table (3.214). How can the estimate and the point estimate be inconsistent?
Note: this is a domain analysis for three sample groups, and this output is for the domain of sample group 2. I have the same problem in the other two domains (groups). I checked the number of observations in each domain, and it matches the size of the corresponding group. What could be the problem? A statistician told me that the estimates are more accurate than the odds ratios, and that I have to figure out whether something is wrong with the odds ratio results.
Any thoughts please?
07-25-2017 08:25 PM
It looks like it's the categorical variables that are off, so I suspect you didn't specify your parametrization correctly. You didn't post your code so it's hard to say, but I'd consider changing the parametrization to REF and see if you get what you expect.
07-25-2017 08:38 PM
In the code I did it this way:
CLASS group (REF=FIRST) age.order (REF=FIRST) sex (REF=FIRST);
DOMAIN group;
and the output showed this:
Class Level Information
Class       Value   Design Variables
group       1       -1  -1
            2        1   0
            3        0   1
age.order   1       -1  -1  -1  -1
            2        1   0   0   0
            3        0   1   0   0
            4        0   0   1   0
            5        0   0   0   1
sex         1       -1
What do you think?!
07-25-2017 08:51 PM
You tell me
Is the design matrix what you expect? What's the default parameterization and what options are available? Check the documentation for the CLASS statement.
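It looks like effect coding (PARAM=EFFECT) is the default, and you probably want reference coding (PARAM=REF). REF=FIRST only picks the reference level; it does not change the coding. With effect coding, your assumption that the exponential of an estimate equals the OR does not hold, so the code does not match the interpretation you are applying to it.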
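To check this against your numbers, here is a minimal sketch in Python (not SAS; the helper name `odds_ratios_vs_reference` is mine, purely for illustration). It assumes effect coding as shown in your Class Level Information table, where the reference level's implied coefficient is minus the sum of the other levels' coefficients. Under that assumption the OR of level k vs. level 1 is exp(beta_k - beta_1), not exp(beta_k).

```python
import math

def odds_ratios_vs_reference(betas):
    """Odds ratio of each non-reference level vs. the reference level,
    assuming effect coding.

    betas maps each non-reference level to its estimate. Under effect
    coding the reference level's implied coefficient is minus the sum
    of the other levels' coefficients.
    """
    beta_ref = -sum(betas.values())
    return {level: math.exp(b - beta_ref) for level, b in betas.items()}

# Estimates copied from the "Analysis of Maximum Likelihood Estimates" table.
age_order = {2: -0.6164, 3: 0.3061, 4: 0.3948, 5: 0.6882}
sex = {2: 0.5458}

age_or = odds_ratios_vs_reference(age_order)
sex_or = odds_ratios_vs_reference(sex)

for level, value in age_or.items():
    print(f"age.order {level} vs 1: {value:.3f}")
print(f"sex 2 vs 1: {sex_or[2]:.3f}")

# What you computed by hand: exponentiating the estimate directly.
naive_age4 = math.exp(age_order[4])
print(f"exp(0.3948) alone: {naive_age4:.3f}")
```

The values this produces agree with the Odds Ratio Estimates table to three decimals (for example 3.214 for age.order 4 vs 1), so the two tables are consistent with each other. The p-values differ for the same reason: under effect coding, the t test for age.order 4 compares level 4 with the average over all levels, while the odds ratio and its CI compare level 4 with level 1. If you refit with PARAM=REF, I would expect exp(estimate) to match the odds ratios directly.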
07-25-2017 08:58 PM
I found this warning in the log:
WARNING: There is possibly a quasi-complete separation of data points. The maximum likelihood estimate may not exist.
WARNING: The SURVEYLOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity of the model fit is questionable.
This seems scary!!
I'm reading this SAS support usage note but still cannot figure out what to do:
(Usage Note 22599: Understanding and correcting complete or quasi-complete separation problems)
Should I fix the problem by following your suggestion about the CLASS statement, or by following the usage note above?