I just tried to run a logistic model, but the two procedures produced different parameter estimates for the intercept and coefficients. The TYPE3 result is also slightly different.
My understanding is that the result from PROC GENMOD is correct, but I don't understand the output from PROC LOGISTIC. They should not really produce different results.
data work.a;
input y x1 $ x2 $;
datalines;
0 a a
1 a a
1 a b
0 a b
1 b a
0 b a
1 b b
0 b b
1 c a
1 c a
0 c b
1 c b
1 a b
0 a a
1 b a
1 a a
0 c b
0 b b
1 b a
0 c a
;
proc logistic data=work.a outest=work.coeff descending;
class x1 x2;
model y=x1 x2;
run;
proc genmod data=work.a descending;
class x1 x2;
model y=x1 x2 / D=b type3;
ods output ParameterEstimates=work.coeff2(drop=lowerwaldcl upperwaldcl);
run;
proc print data=work.coeff;
run;
proc print data=work.coeff2;
run;
/*Results:*/
/*proc logistic*/
                          Standard        Wald
Parameter  DF  Estimate      Error  Chi-Square  Pr > ChiSq
Intercept   1    0.1612     0.4606      0.1226      0.7263
x1 a        1    0.0806     0.6426      0.0157      0.9002
x1 b        1    0.0806     0.6426      0.0157      0.9002
x2 a        1    0.3852     0.4602      0.7006      0.4026
/*proc genmod*/
                                                       Prob
Obs  Parameter  Level1  DF  Estimate  StdErr  ChiSq   ChiSq
1    Intercept           1   -0.3852  0.9505   0.16  0.6853
2    x1         a        1    0.2419  1.1394   0.05  0.8319
3    x1         b        1    0.2419  1.1394   0.05  0.8319
4    x1         c        0    0.0000  0.0000      .       .
5    x2         a        1    0.7704  0.9204   0.70  0.4026
6    x2         b        0    0.0000  0.0000      .       .
Ruth,
The results are consistent with each other, and both are correct. They were obtained using different expansions of the categorical variables. The GENMOD procedure employs an overparameterized model in which a set of k binary variables is produced when a categorical variable has k levels. SAS refers to this as the GLM parameterization. By default, the LOGISTIC procedure employs a model with k-1 variables in the design matrix. Moreover, the k-1 variables are not binary; each can take one of three values: -1, 0, or 1. This sort of parameterization is referred to as effect coding.
For variable X1, columns of the design matrix given GLM coding and effect coding are as follows:
GLM coding
X1 X1_1 X1_2 X1_3
a 1 0 0
b 0 1 0
c 0 0 1
Effect coding
X1 X1_1 X1_2
a 1 0
b 0 1
c -1 -1
It can be shown that these two parameterizations will yield the same predicted response values. But the parameters do have to be interpreted differently.
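As a quick check on the numbers posted above, the GENMOD estimates can be recovered from the LOGISTIC ones by hand. This is only a sketch: the symbol \eta for the linear predictor (the log odds) is my own notation, and the last level of each variable (x1 = c, x2 = b) is coded -1 under effect coding.

```latex
\begin{aligned}
\eta(c,b) &= 0.1612 - 0.0806 - 0.0806 - 0.3852 = -0.3852
  && \text{(GENMOD Intercept)} \\
\eta(a,b) - \eta(c,b) &= 0.0806 - (-0.0806 - 0.0806) = 0.2418 \approx 0.2419
  && \text{(GENMOD x1 a, up to rounding)} \\
\eta(c,a) - \eta(c,b) &= 0.3852 - (-0.3852) = 0.7704
  && \text{(GENMOD x2 a)}
\end{aligned}
```

So the GENMOD intercept is the log odds at the reference cell (x1 = c, x2 = b), and each GENMOD coefficient is a difference from the reference level, whereas each LOGISTIC coefficient under effect coding is a deviation from the average over levels.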
If you prefer the GLM parameterization (a lot of people do), you can request that parameterization in the LOGISTIC procedure. All you have to do is change your class statement to:
class x1 x2 / param=glm;
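Put together as a full step, the call might look like the sketch below. It reuses the data set and variable names from the original post and lists both classification variables, x1 and x2; I have left out the OUTEST= option, so add it back if you still need the coefficient data set.

```sas
/* Sketch: PROC LOGISTIC with the GLM (dummy-variable) parameterization, */
/* so the last level of each CLASS variable is the reference level,      */
/* as in the PROC GENMOD output shown earlier in the thread.             */
proc logistic data=work.a descending;
  class x1 x2 / param=glm;
  model y = x1 x2;
run;
```

With this parameterization the estimates should line up with the PROC GENMOD output, including the zero rows for x1 = c and x2 = b.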
HTH
Hi Dale,
It is a thorough and clear answer. Thanks a lot!
It seems that SAS has evolved a lot in the past 10 years. The classic book that we (as new starters) use for logistic regression analysis is Logistic Regression Using SAS: Theory and Application by Paul D. Allison. The book is very well written, but it was published in 1999 and has not been updated for newer versions since, so many new SAS features and changes are not reflected in it. For example, PROC LOGISTIC as described in the book has no such parameterization options or CLASS statement.
SAS also highly recommends this book. I hope it will be updated in the near future.
Thanks again.