Programming the statistical procedures from SAS

Weighted effect coding in regressions

Frequent Contributor
Posts: 111

Weighted effect coding in regressions

Hi,

I would like to fit multinomial logit regressions with effect coding (-1 0 1), but the group sizes are unequal. Because of this, the intercepts correspond to the mean of the group means rather than the real grand mean. Is there a way in SAS to get adjusted intercepts that take the unbalanced data into account?

 

Thank you.
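
For reference, the "mean of means" behaviour follows from the sum-to-zero constraint that effect coding imposes. A sketch for a single factor with K groups, where \mu_k is the logit in group k and n_k its (possibly weighted) size; these symbols are notation introduced here, not something taken from SAS output:

```latex
% Ordinary effect coding: coefficients sum to zero
\mu_k = \alpha + \beta_k, \qquad \sum_{k=1}^{K} \beta_k = 0
\;\Longrightarrow\;
\alpha = \frac{1}{K} \sum_{k=1}^{K} \mu_k

% Weighted effect coding: size-weighted coefficients sum to zero
\mu_k = \alpha_w + \beta_k^{w}, \qquad \sum_{k=1}^{K} n_k \beta_k^{w} = 0
\;\Longrightarrow\;
\alpha_w = \frac{\sum_{k=1}^{K} n_k \mu_k}{\sum_{k=1}^{K} n_k}
```

The weighted constraint corresponds to coding the omitted group K as -n_k / n_K (instead of -1) in the column for group k. Note that even \alpha_w is a weighted mean of logits, which is generally not the logit of the pooled proportion.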

Super User
Posts: 9,758

Re: Weighted effect coding in regressions

Are you fitting a generalized logit regression or a cumulative logit regression?


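/* Build a test data set with a three-level response */
data class;
  set sashelp.class end=last;
  output;
  /* After the last row, append four records with a third sex level 'N' */
  if last then do;
    sex='N'; weight=34.5;  height=123.4; output;
    sex='N'; weight=134.5; height=23.4;  output;
    sex='N'; weight=74.5;  height=223.4; output;
    sex='N'; weight=44.5;  height=93.4;  output;
  end;
run;

/* Generalized logit model with the EQUALSLOPES option */
proc logistic data=class;
  model sex = weight height / link=glogit equalslopes;
run;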

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

Generalized logit regressions.

 

What does EQUALSLOPES stand for?

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

My model looks like this:

 


proc logistic;
  class sex eduM3(ref='2') / param=effect;
  model edu3(ref='2') = sex eduM3 / link=glogit rsquare;
  weight pond / norm;
  where model=1;
run;

 

edu3 has 3 categories.

 

PROC Star
Posts: 185

Re: Weighted effect coding in regressions

Some thoughts:

 

1)  Both of the predictor variables in your MODEL statement are on a categorical scale. That is fine, if appropriate, but consequently there are no "slopes" and hence the discussion about the EQUALSLOPES option is moot. I'll add that if you check the SAS documentation (always a good idea), you'll see that the EQUALSLOPES option affects slopes associated with a continuous predictor, and has no impact on intercepts.

2) You have not provided enough information about your study. What is "pond" and how is it related to the study design? What is "model"? In general, if you want a good and appropriate answer to your question, you'll need to provide enough information.

3) Be sure that you understand how different coding systems work. See http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm   The underlying model is the same, regardless of coding system. If you intend to interpret the parameter estimates, then you have to understand the coding system. If you do interpretation based on predicted values, then the coding system is moot because algebraically it all works out to the same thing.
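
A rough way to check point 3 is to fit the same model under two codings and compare the predicted probabilities. This is a sketch only: the data set name MYDATA is hypothetical, the WEIGHT and WHERE statements are left out for brevity, and the IP_0, IP_1, IP_2 variable names assume the response levels are 0, 1 and 2.

```sas
/* Effect coding (-1 0 1) */
proc logistic data=mydata;              /* MYDATA is a hypothetical name */
  class sex eduM3(ref='2') / param=effect;
  model edu3(ref='2') = sex eduM3 / link=glogit;
  output out=pred_effect predprobs=individual;  /* adds IP_<level> variables */
run;

/* Reference (dummy) coding (0 1) */
proc logistic data=mydata;
  class sex eduM3(ref='2') / param=ref;
  model edu3(ref='2') = sex eduM3 / link=glogit;
  output out=pred_ref predprobs=individual;
run;

/* The predicted probabilities should agree up to numerical tolerance */
proc compare base=pred_effect compare=pred_ref method=absolute criterion=1e-6;
  var IP_0 IP_1 IP_2;
run;
```

If PROC COMPARE reports no differences beyond the tolerance, only the parameter estimates (and their interpretation) changed between the two fits, not the fitted model.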

 

 

 

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

1) Ok.

2) I am not really sure what information I should provide. The variable pond is a weight variable that makes the sample match the population. I need to model education (3 categories) with a series of socioeconomic variables (in my example, I try with only 2). The problem is still present without the weight and whatever the independent variables are.

3) It's probably just something I don't understand about effect coding (-1 0 1). I want to figure out how I can replicate observed distributions with regression parameters. I have no problem doing this with parameters from dummy coding (0 1).
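
One possible way to get "weighted" effect codes is to build the design columns by hand, coding the omitted level as -n_k/n_ref instead of -1. The sketch below is hedged throughout: MYDATA, CODED and the w_ variables are hypothetical names, sex and eduM3 are assumed to be numeric with levels 0/1 and 0/1/2 and no missing values, and unweighted counts are used (replace `sum(sex=0)` with `sum(pond*(sex=0))` and so on to use the sampling weight).

```sas
/* Group sizes needed for the weighted effect codes */
proc sql noprint;
  select sum(sex=0), sum(sex=1),
         sum(eduM3=0), sum(eduM3=1), sum(eduM3=2)
    into :ns0, :ns1, :nm0, :nm1, :nm2
    from mydata;                        /* MYDATA is a hypothetical name */
quit;

data coded;
  set mydata;
  /* Omitted levels (sex=1, eduM3=2) get -n_k/n_ref rather than -1 */
  w_sex = ifn(sex=0,   1, -&ns0/&ns1);
  w_m0  = ifn(eduM3=0, 1, ifn(eduM3=2, -&nm0/&nm2, 0));
  w_m1  = ifn(eduM3=1, 1, ifn(eduM3=2, -&nm1/&nm2, 0));
run;

/* The hand-built columns enter the model as ordinary numeric predictors */
proc logistic data=coded;
  model edu3(ref='2') = w_sex w_m0 w_m1 / link=glogit;
run;
```

With this coding each column sums to zero over the sample, so the intercepts refer to a size-weighted centre of the groups. In a logit model that is still a quantity on the logit scale, so it need not reproduce the observed marginal percentages exactly.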

PROC Star
Posts: 185

Re: Weighted effect coding in regressions

Now I have lots more thoughts, but before I share I'd like to know more about your study. Something like a Methods section from a manuscript or report (which you have to write eventually anyway) would be a good start.

 

And could you clarify what you mean by making "the sample fit the population"? In what way does the sample not represent the population (e.g., do you have stratification or clustering or unequal weighting)?

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

The original weight was unequal and was used to make the sample adequately represent the population by age/sex/country. But as I said, I don't think it matters, because the problem is the same with or without the WEIGHT statement.

 

There is no proper method yet. It's in development.

PROC Star
Posts: 185

Re: Weighted effect coding in regressions

I recommend that you look into the SURVEYLOGISTIC procedure to deal with unequal weights. Search   lexjansen.com  for useful papers on SURVEYLOGISTIC and read the SURVEYLOGISTIC documentation to see whether this procedure would be appropriate for your study.
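
As a starting point, the posted model might translate to SURVEYLOGISTIC roughly as follows. This is a sketch under assumptions: MYDATA is a hypothetical data set name, pond is taken to be the sampling weight, and no strata or clusters are specified because none have been described.

```sas
proc surveylogistic data=mydata;        /* MYDATA is a hypothetical name */
  class sex eduM3(ref='2') / param=effect;
  model edu3(ref='2') = sex eduM3 / link=glogit;
  weight pond;                          /* sampling weight */
  where model=1;                        /* subset as in the posted code */
  /* add STRATA and CLUSTER statements here if the design has them */
run;
```

For subpopulation analyses the SURVEYLOGISTIC documentation describes a DOMAIN statement, which may be preferable to subsetting with WHERE when variance estimates matter.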

 

If you have data in hand, which appears to be the case, then the methods by which those data were acquired, the definitions of variables, etc. are already determined, and you would be able to share those if you chose.

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

I don't understand. I don't have a problem with the weight or the data. I just want to know how we can compute descriptive statistics (such as education by language) from parameters estimated with effect coding (-1 0 1) rather than dummy coding (0 1).

PROC Star
Posts: 185

Re: Weighted effect coding in regressions

A different coding system will not "make the sample fit the population." You really ought to look into SURVEYLOGISTIC. It might be just the right tool for your problem, and it is able to make the weighted predictions that you appear to be interested in.

Super User
Posts: 9,758

Re: Weighted effect coding in regressions

It means these logit models have the same intercept term.

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

I don't understand what you mean. The output of the model is this:

 

Analysis of Maximum Likelihood Estimates

Parameter        edu3  DF  Estimate  Standard Error  Wald Chi-Square  Pr > ChiSq
Intercept        0     1   -1.3085   0.0191          4707.0531        <.0001
Intercept        1     1   -0.0263   0.0104             6.3999        0.0114
sex        0     0     1    0.0978   0.0112            76.6518        <.0001
sex        0     1     1    0.1640   0.00906          327.2032        <.0001
eduM3      0     0     1    1.6999   0.0205          6854.5466        <.0001
eduM3      0     1     1    0.6441   0.0127          2559.4038        <.0001
eduM3      1     0     1   -0.4362   0.0256           290.2973        <.0001
eduM3      1     1     1    0.3134   0.0136           533.6794        <.0001

 

With those parameters, I should be able to replicate the distribution of the independent variables, but I can't (and I think it is because the intercepts are biased due to the unequal group sizes).

 

Table of sex by edu3

sex    edu3=0  edu3=1  edu3=2
0      22.53   46.45   31.03
1      22.86   40.35   36.79

Table of eduM3 by edu3

eduM3  edu3=0  edu3=1  edu3=2
0      34.07   42.57   23.36
1       6.94   52.90   40.16
2       5.23   25.60   69.17

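
One way to get from the estimates above to predicted distributions is to rebuild the effect-coded design values and apply the generalized logit formula by hand. The sketch below assumes that the unlabeled column in the output is the predictor level and the edu3 column is the response category, and that the omitted levels (sex=1, eduM3=2) are the ones coded -1; the data set and variable names are made up for illustration.

```sas
data pred;
  do sex = 0 to 1;
    do eduM3 = 0 to 2;
      /* Effect-coded design values: the omitted level is coded -1 */
      x_sex = ifn(sex = 0, 1, -1);
      x_m0  = ifn(eduM3 = 0, 1, ifn(eduM3 = 2, -1, 0));
      x_m1  = ifn(eduM3 = 1, 1, ifn(eduM3 = 2, -1, 0));

      /* Logits of edu3=0 and edu3=1 versus the reference edu3=2,
         using the estimates copied from the output above */
      eta0 = -1.3085 + 0.0978*x_sex + 1.6999*x_m0 - 0.4362*x_m1;
      eta1 = -0.0263 + 0.1640*x_sex + 0.6441*x_m0 + 0.3134*x_m1;

      /* Convert the two logits to three probabilities */
      denom = 1 + exp(eta0) + exp(eta1);
      p0 = exp(eta0) / denom;
      p1 = exp(eta1) / denom;
      p2 = 1 / denom;
      output;
    end;
  end;
run;

proc print data=pred noobs;
  var sex eduM3 p0 p1 p2;
  format p0 p1 p2 percent8.2;
run;
```

By my arithmetic the sex=0, eduM3=0 cell comes out at roughly 0.34, 0.45 and 0.21. These are cell-level predictions, so they will not match either one-way table directly. To compare with the sex by edu3 table, for example, the six cell predictions would need to be averaged within each sex using the (weighted) distribution of eduM3 in that sex. The intercepts on their own describe the point where all effect codes are zero, not a marginal distribution.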
Super User
Posts: 9,758

Re: Weighted effect coding in regressions

That is odd. The two intercepts should have the same estimate if you use EQUALSLOPES.

Can you post the LOG ?

Frequent Contributor
Posts: 111

Re: Weighted effect coding in regressions

The previous output was without the EQUALSLOPES option. With EQUALSLOPES, I now have this:

 

 

Analysis of Maximum Likelihood Estimates

Parameter        edu3  DF  Estimate  Standard Error  Wald Chi-Square  Pr > ChiSq
Intercept        0     1   -0.7922   0.0106          5571.4516        <.0001
Intercept        1     1   -0.1503   0.00945          253.0618        <.0001
sex        0           1    0.1445   0.00762          359.9070        <.0001
eduM3      0           1    0.9336   0.0107          7599.7073        <.0001
eduM3      1           1    0.1369   0.0117           135.8083        <.0001

 

I'm confused, because it seems that some parameters are missing. The dependent variable has 3 categories, so how should I interpret a parameter such as the one for sex (0.1445)?
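
As far as I can tell from the output, the model fitted with EQUALSLOPES keeps a separate intercept for each logit but shares one coefficient per design column across both logits. In the notation below (introduced here for illustration), x_sex is +1 for sex=0 and -1 for sex=1, and x_m0, x_m1 are the effect codes for eduM3:

```latex
\log \frac{P(\text{edu3} = j)}{P(\text{edu3} = 2)}
  = \alpha_j
  + \beta_{\text{sex}}\, x_{\text{sex}}
  + \beta_{m0}\, x_{m0}
  + \beta_{m1}\, x_{m1},
  \qquad j = 0, 1

% With the estimates above:
% \alpha_0 = -0.7922,\ \alpha_1 = -0.1503,\
% \beta_{\text{sex}} = 0.1445,\ \beta_{m0} = 0.9336,\ \beta_{m1} = 0.1369
```

Read this way, 0.1445 is the amount by which sex=0 raises both logits (edu3=0 versus 2 and edu3=1 versus 2) relative to the centre of the effect coding, and the difference between the two sexes is 2 × 0.1445 = 0.2890 on each logit. The parameters are not missing; the constraint forces the two response functions to share them.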
