Mixed class level coding in PROC GENMOD?

08-13-2007 02:07 PM

Dear all,

I am using SAS 9.1.3 on Windows to perform a logistic-type analysis, and I am treating all my variables as class-level. Suppose I have one independent variable, x1, that I would like to have reference-level coded, and another independent variable, x2, for which I need GLM-type coding (as many estimates as x2 has levels). I am using the /noint option, so, in a way, you could say that I am estimating an intercept for each level of x2.

According to the documentation, GLM coding can only be used as a global option, but, according to the same documentation, variable-level options override the global option.

"Global v-options are applied to all the variables specified in the CLASS statement. If you specify more than one CLASS statement, the global v-options specified on any one CLASS statement apply to all CLASS statements. However, individual CLASS variable v-options override the global v-options."

In principle, that means that my case can easily be solved by using this class statement:

class x1(param=ref) x2 /param=GLM;

However, when the model is fitted, the reference coding instruction gets ignored: SAS gives me one estimate per level for x1, and one fewer than the number of levels for x2. Either there is something wrong in the implementation, in the documentation, or, of course, in my code.

For whoever wants to try it, there's a piece of sample code that shows what I mean.

In fact, I also want to raise a question about what "GLM coding" means. In the documentation, there is an example of a class-level variable with 4 levels. The GLM-design matrix is a diagonal matrix, with 1 on the diagonal and 0 in other cells. That would suggest that this coding generates 4 estimates. Yet, immediately under the table, the documentation states: "Parameter estimates of CLASS main effects using the GLM coding scheme estimate the difference in the effects of each level compared to the last level." This sounds like reference coding.

I'd be grateful for any hint on what I'm doing wrong or misunderstanding. Thanks,

Peter.

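data genmodtest;
  input b_a orientdeg freq_a n_ab;
  datalines;
0.80 0 1 4
0.80 90 2 4
1 0 1 4
1 90 2 4
1.25 0 3 4
1.25 90 2 4
;
run;

proc genmod data=genmodtest;
  /* b_a: reference coding requested per variable; GLM coding requested globally */
  class b_a(param=ref ref='1') orientdeg / param=GLM;
  /* events/trials response: freq_a events out of n_ab trials */
  model freq_a/n_ab = b_a orientdeg / noint dist=binomial;
run;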
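One possible workaround for the situation described above, offered only as a sketch and not as something confirmed in this thread, is to build the indicator columns for the "GLM-style" variable by hand and leave only the reference-coded variable in the CLASS statement. The data set name genmodtest2 and the indicator names o0 and o90 are hypothetical names introduced for illustration.

```sas
/* Sketch: hand-made indicators for orientdeg, one per level.      */
/* genmodtest2, o0 and o90 are hypothetical names.                 */
data genmodtest2;
  set genmodtest;
  o0  = (orientdeg = 0);    /* 1 when orientdeg is 0, else 0  */
  o90 = (orientdeg = 90);   /* 1 when orientdeg is 90, else 0 */
run;

proc genmod data=genmodtest2;
  /* only b_a is a CLASS variable, reference coded against level 1 */
  class b_a(param=ref ref='1');
  /* o0 and o90 act as one intercept per level of orientdeg */
  model freq_a/n_ab = o0 o90 b_a / noint dist=binomial;
run;
```

With this layout the model has one parameter for each level of orientdeg and one parameter for each non-reference level of b_a, which is the mix of codings the question asks for.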


Posted in reply to peterc_2

08-23-2007 05:59 AM

In my opinion, PARAM=GLM and PARAM=REF are two sides of the same kind of parameterization. Both provide comparisons to a reference level; the only differences are:

- you cannot choose the reference level with PARAM=GLM: it is always the last formatted value in alphabetical order that is chosen as the reference;

- you will not see the zeroed coefficient for the reference with PARAM=REF.
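To make the two points above concrete, here is a sketch for a CLASS variable with 4 levels, as in the documentation example quoted in the question. The symbols (mu for the intercept, alpha and beta for the level effects, eta for the linear predictor) are notation introduced here, and the last level is assumed to be the reference.

```latex
% Design columns for one 4-level CLASS variable (rows = levels 1..4)
X_{\mathrm{GLM}} =
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix},
\qquad
X_{\mathrm{REF}} =
\begin{pmatrix}
1 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 1\\
0 & 0 & 0
\end{pmatrix}

% GLM coding with an intercept: the four columns sum to the intercept
% column, so one parameter is aliased and the last one is set to zero.
\eta_j = \mu + \alpha_j,\quad \alpha_4 = 0
\;\Longrightarrow\;
\alpha_j = \eta_j - \eta_4 \quad (j = 1,2,3)

% Reference coding: the same differences, without the zeroed parameter.
\eta_j = \mu + \beta_j \ (j = 1,2,3),\qquad \eta_4 = \mu
\;\Longrightarrow\;
\beta_j = \eta_j - \eta_4
```

So the GLM design matrix does have 4 columns, but once an intercept (or another full set of indicator columns) is in the model, the last column carries no extra information. That is why the estimates read as differences from the last level.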

I prefer to use PARAM=GLM to show that explicit zero, and use a format to re-order my values and have the reference be the last one.
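A minimal sketch of that format trick, using the sample data from the question. The format name bafmt and its labels are made up for illustration, and the sketch assumes GENMOD's default ORDER=FORMATTED ordering of CLASS levels.

```sas
/* Hypothetical format: labels chosen so that b_a = 1 sorts last */
proc format;
  value bafmt 0.80 = 'a: 0.80'
              1.25 = 'b: 1.25'
              1    = 'z: 1.00 (ref)';
run;

proc genmod data=genmodtest;
  format b_a bafmt.;
  /* default GLM coding: the last formatted level is the reference */
  class b_a orientdeg;
  model freq_a/n_ab = b_a orientdeg / dist=binomial;
run;
```

The output should then list the 'z: 1.00 (ref)' level with a zero estimate, and the other two b_a estimates are differences from it.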

Also useful: PARAM=EFFECT compares each level to the overall mean, which means that all coefficients for a variable sum to zero. This is nice since it avoids the problem of choosing a meaningful reference; but, just like PARAM=REF, it does not display the last coefficient, so you have to compute it yourself.
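For a variable with k levels, the computation of the undisplayed coefficient follows directly from the sum-to-zero property. The beta symbols are notation introduced here, shown with k = 4 for the design columns.

```latex
% Effect coding for a 4-level variable: the last level is coded -1 in every column
X_{\mathrm{EFFECT}} =
\begin{pmatrix}
 1 &  0 &  0\\
 0 &  1 &  0\\
 0 &  0 &  1\\
-1 & -1 & -1
\end{pmatrix}

% The k-1 displayed coefficients determine the last one:
\sum_{j=1}^{k} \beta_j = 0
\;\Longrightarrow\;
\beta_k = -\sum_{j=1}^{k-1} \beta_j
```

For example, if the two displayed coefficients of a 3-level variable are 0.4 and -0.1, the coefficient of the last level is -(0.4 - 0.1) = -0.3.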

On my computer (SAS 9.1.3 SP4), this code

proc genmod data=genmodtest;
  class b_a(param=ref) orientdeg(param=effect);
  model freq_a/n_ab = b_a orientdeg / noint;
run;

gives a mixed parameterization without error.

I don't know if it solves your problem, but I hope it is of some help.

Olivier


Posted in reply to peterc_2

08-27-2009 11:03 AM

Hi Peter,

I know you posted this 2 years ago, but my colleagues and I are having the same problem with an analysis we are doing. Did you ever find a solution to param=glm overriding everything? We want some variables GLM coded to get accurate odds ratios from ESTIMATE statements, but we need others to be reference coded for interaction terms.

Thanks,

Amanda
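For readers unfamiliar with the ESTIMATE approach mentioned above, here is a minimal sketch using the sample data from the first post. It only illustrates an odds ratio from a GLM-coded variable and does not address mixing codings. The label text is arbitrary, and the coefficients assume the b_a levels are ordered 0.80, 1, 1.25.

```sas
proc genmod data=genmodtest;
  /* no PARAM= given, so GENMOD's default GLM coding is used */
  class b_a orientdeg;
  model freq_a/n_ab = b_a orientdeg / dist=binomial;
  /* contrast on the logit scale: level 0.80 minus level 1.25;   */
  /* the EXP option exponentiates it, giving an odds ratio here  */
  estimate 'b_a 0.80 vs 1.25' b_a 1 0 -1 / exp;
run;
```

Because GLM coding keeps one column per level, the contrast coefficients map one-to-one onto the levels, which is what makes such ESTIMATE statements easy to write.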
