Programming the statistical procedures from SAS

PROC LOGISTIC vs PROC GENMOD (different results)

Contributor
Posts: 52

I just tried to fit a logistic model, but the two procedures produced different parameter estimates for the intercept and the coefficients. The Type 3 results are also slightly different.

My understanding is that the result from PROC GENMOD is correct, but I don't understand the output from PROC LOGISTIC. The two procedures should not really produce different results.

data work.a;
input y x1 $ x2 $;
datalines;
0 a a
1 a a
1 a b
0 a b
1 b a
0 b a
1 b b
0 b b
1 c a
1 c a
0 c b
1 c b
1 a b
0 a a
1 b a
1 a a
0 c b
0 b b
1 b a
0 c a
;


proc logistic data=work.a outest=work.coeff descending;
  class x1 x2;
  model y=x1 x2;
run;

proc genmod data=work.a descending;
  class x1 x2;
  model y=x1 x2 / D=b type3;
  ods output ParameterEstimates=work.coeff2(drop=lowerwaldcl upperwaldcl);
run;

proc print data=work.coeff;
run;

proc print data=work.coeff2;
run;


/*Results:*/

/*proc logistic*/

                                 Standard          Wald
Parameter      DF    Estimate       Error    Chi-Square    Pr > ChiSq

Intercept       1      0.1612      0.4606        0.1226        0.7263
x1        a     1      0.0806      0.6426        0.0157        0.9002
x1        b     1      0.0806      0.6426        0.0157        0.9002
x2        a     1      0.3852      0.4602        0.7006        0.4026

/*proc genmod*/
                                                                         Prob
Obs    Parameter    Level1    DF    Estimate      StdErr      ChiSq     ChiSq

1     Intercept               1     -0.3852      0.9505       0.16    0.6853
2     x1             a        1      0.2419      1.1394       0.05    0.8319
3     x1             b        1      0.2419      1.1394       0.05    0.8319
4     x1             c        0      0.0000      0.0000        .       .
5     x2             a        1      0.7704      0.9204       0.70    0.4026
6     x2             b        0      0.0000      0.0000        .       .


Accepted Solutions
Solution
‎07-07-2011 04:12 PM
Regular Contributor
Posts: 169

Re: PROC LOGISTIC vs PROC GENMOD (different results)

Ruth,

The results are consistent with each other. Both are correct. However, they have been obtained using different expansions of the categorical variables. The GENMOD procedure employs an overparameterized model in which a set of k binary variables is produced when the number of levels of a categorical variable is k. The parameter for the last level is then set to zero, which is why x1=c and x2=b show DF 0 and an estimate of 0.0000 in your GENMOD output. SAS refers to this as the GLM parameterization. By default, the LOGISTIC procedure employs a model with k-1 variables in the design matrix. Moreover, the k-1 variables are not binary, but can take on one of three values: -1, 0, or 1. This sort of parameterization is referred to as effect coding.

For variable X1, columns of the design matrix given GLM coding and effect coding are as follows:

             GLM coding

   X1    X1_1    X1_2    X1_3

    a     1       0       0

    b     0       1       0

    c     0       0       1

        Effect coding

   X1    X1_1    X1_2

    a     1       0

    b     0       1

    c    -1      -1
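
If you want to inspect these columns for your own data rather than take the tables on trust, one option is to have PROC LOGISTIC write out its design matrix. The following is a minimal sketch, assuming a SAS release in which the OUTDESIGN= and OUTDESIGNONLY options of the PROC LOGISTIC statement are available; the data set names work.design_effect and work.design_glm are made up for illustration.

```sas
/* Effect coding (the PROC LOGISTIC default), written out without fitting the model */
proc logistic data=work.a descending outdesign=work.design_effect outdesignonly;
  class x1 x2 / param=effect;
  model y = x1 x2;
run;

/* GLM coding for the same model */
proc logistic data=work.a descending outdesign=work.design_glm outdesignonly;
  class x1 x2 / param=glm;
  model y = x1 x2;
run;

/* Compare the generated design columns side by side */
proc print data=work.design_effect; run;
proc print data=work.design_glm;    run;
```

The "Class Level Information" table that PROC LOGISTIC prints by default shows the same coding in compact form.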

It can be shown that these two parameterizations will yield the same predicted response values.  But the parameters do have to be interpreted differently.
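
As a quick check on the numbers in the question, the effect-coded estimates can be converted to GLM-coded ones by hand. The sketch below hard-codes the four PROC LOGISTIC estimates from the original post; the data set name work.convert and the variable names are hypothetical. It relies on two facts about these codings: under effect coding the omitted level's effect is minus the sum of the others, and under GLM coding every effect is measured relative to the last level.

```sas
data work.convert;
  /* Effect-coded estimates copied from the PROC LOGISTIC output above */
  b0   = 0.1612;   /* Intercept */
  x1_a = 0.0806;   /* x1 = a    */
  x1_b = 0.0806;   /* x1 = b    */
  x2_a = 0.3852;   /* x2 = a    */

  /* Implied effects of the omitted levels */
  x1_c = -(x1_a + x1_b);   /* -0.1612 */
  x2_b = -x2_a;            /* -0.3852 */

  /* GLM-coded equivalents, with reference levels x1 = c and x2 = b */
  glm_intercept = b0 + x1_c + x2_b;   /* -0.3852 */
  glm_x1_a      = x1_a - x1_c;        /*  0.2418 */
  glm_x1_b      = x1_b - x1_c;        /*  0.2418 */
  glm_x2_a      = x2_a - x2_b;        /*  0.7704 */
run;

proc print data=work.convert noobs;
  var glm_:;
run;
```

These agree with the PROC GENMOD estimates (-0.3852, 0.2419, 0.2419, 0.7704) up to rounding of the four-decimal inputs.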

If you prefer the GLM parameterization (a lot of people do), you can request that parameterization in the LOGISTIC procedure.  All you have to do is change your class statement to:

class x1 x2 / param=glm;
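
For completeness, here is a sketch of the whole step with that CLASS statement in place, using the data set from the question. Both classification variables, x1 and x2, need to be listed. The OUTEST= data set name work.coeff_glm is made up so that it does not overwrite the original work.coeff.

```sas
proc logistic data=work.a outest=work.coeff_glm descending;
  class x1 x2 / param=glm;   /* GLM (less-than-full-rank) coding, last level as reference */
  model y = x1 x2;
run;

proc print data=work.coeff_glm;
run;
```

With this change the PROC LOGISTIC estimates should line up with the PROC GENMOD ones shown in the question.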

HTH



All Replies

Contributor
Posts: 52

Re: PROC LOGISTIC vs PROC GENMOD (different results)

Hi Dale,

It is a thorough and clear answer. Thanks a lot!

It seems that SAS has evolved a lot in the past 10 years. The classic book that we (as new starters) use for logistic regression analysis is Logistic Regression Using SAS: Theory and Application, by Paul D. Allison. The book is very well written. The problem is that it was published in 1999 and has not been updated since, so many new SAS features and changes are not reflected in it. For example, PROC LOGISTIC as described in the book has no such parameterization options and no CLASS statement.

SAS also highly recommends this book, so I hope it will be updated in the near future.

Thanks again.
