Hello!
I'm using PROC GENMOD to build GLM models. For variable selection we generally use the LR test with Type 3 statistics.
As I understand it, the LR test works as follows:
LR = D(model without the variable) - D(model with the variable), where D is the deviance,
which follows a Chi-Square distribution with degrees of freedom equal to the number of levels of the investigated variable minus 1. That holds as long as the model does not include an interaction.
Suppose a model consists of 2 variables and their interaction: A, B, A*B.
So the LR test for variable A is supposed to be:
LR = D(B, A*B) - D(A, B, A*B)
but I happened to find that the deviance is equal for the 2 models B, A*B and A, B, A*B,
so this formula cannot be applied in this case.
Does anybody know how SAS computes the LR statistics for a model with an interaction?
This is very important to me, and I would very much appreciate any help.
Thanks a lot,
Alex
This has nothing to do with DEVIANCE or LR tests or PROC GENMOD. This is the way SAS handles non-hierarchical models, in all SAS PROCs as far as I know.
Your first model, with B and A*B, is non-hierarchical in the sense that the main effect of A is not included even though A appears in an interaction. In this model, the interaction carries extra degrees of freedom: it absorbs the degrees of freedom for the main effect of A, and the parameter estimates for A*B contain the main effect of A along with the interaction effect. In the model with A, B and A*B, those degrees of freedom are removed from A*B and assigned to the main effect of A, and the main effect of A is estimated separately from the effect of A*B.
So the way SAS works, these are the same models, with the exact same fit and the exact same predictions; and the LR test should be zero. You have just requested the same model twice, but with different parameterization.
Example:
title "Model 1";
proc genmod data=sashelp.cars;
class origin drivetrain;
model msrp=origin origin*drivetrain;
output out=model1_out pred=p;
run;
title "Model 2";
proc genmod data=sashelp.cars;
class origin drivetrain;
model msrp=origin drivetrain origin*drivetrain;
output out=model2_out pred=p;
run;
Note the Deviance for each model is identical, and the predictions are identical for every observation. As it should be.
The above applies to cases where both A and B are CLASS, or just B is CLASS.
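One way to check the "identical predictions" claim without reading the two output data sets by eye is PROC COMPARE. This is only a sketch: it assumes model1_out and model2_out were created by the two PROC GENMOD steps above and are in the same row order, and the tolerance of 1e-6 is an arbitrary choice rather than anything from the example itself.

```sas
/* Sketch: compare the predicted values from the two parameterizations.
   Assumes model1_out and model2_out come from the two GENMOD steps above
   and that their observations are in the same order (no ID statement). */
proc compare base=model1_out compare=model2_out
             method=absolute criterion=1e-6   /* arbitrary tolerance */
             briefsummary;
   var p;   /* predicted value written by OUTPUT ... PRED=p */
run;
```

If the two models really are the same fit, the summary should report no values of p that differ by more than the chosen tolerance.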
Paige Miller
Thank you very much, that is a very good explanation. I had a feeling those were the same models.
But the problem is that, when I have main factors and their interaction in the model, the LR Chi-Square statistics for the main factors are not equal to zero (but should be). It looks like SAS makes some adjustment or additional computation.
Please take a look at the output of:
proc genmod data=sashelp.cars;
class origin drivetrain;
model msrp=origin drivetrain origin*drivetrain / link=log dist=gamma type3 scale=pearson;
run;
LR Statistics For Type 3 Analysis
Source | Num DF | Den DF | F Value | Pr > F | Chi-Square | Pr > ChiSq |
Origin | 2 | 419 | 54.78 | <.0001 | 109.56 | <.0001 |
DriveTrain | 2 | 419 | 47.28 | <.0001 | 94.57 | <.0001 |
Origin*DriveTrain | 4 | 419 | 2.11 | 0.0791 | 8.43 | 0.0771 |
I'm wondering how SAS computes the Chi-Square for Origin, which is equal to 109.56, for example.
Thank you,
Alex
"But the problem is that, when I have main factors and their interaction in the model, the LR Chi-Square statistics for the main factors are not equal to zero (but should be)."
They should not be zero, and they are not zero.
Why do you think these should be zero, when there is an effect due to ORIGIN and an effect due to DRIVETRAIN?
When you have a question about how something is calculated, you should always check the SAS documentation for PROC GENMOD and then look under Details.
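As a small sanity check on the Type 3 table quoted earlier in this thread, the printed F Value in each row appears to be the Chi-Square divided by Num DF, which is what one would expect when the scale is estimated (SCALE=PEARSON). The data step below only redoes that arithmetic on the printed numbers. The data set name type3_check and the variable names are made up for illustration, and the step does not reproduce how GENMOD arrives at the Chi-Square itself.

```sas
/* Arithmetic check on the printed Type 3 table (values copied from the post). */
data type3_check;                        /* hypothetical data set name */
   input source :$20. numdf chisq fvalue;
   f_from_chisq = chisq / numdf;         /* Chi-Square divided by Num DF */
   diff = f_from_chisq - fvalue;         /* expected to be small, up to rounding
                                            of the printed values */
   datalines;
Origin            2 109.56 54.78
DriveTrain        2  94.57 47.28
Origin*DriveTrain 4   8.43  2.11
;
run;

proc print data=type3_check noobs;
run;
```

The differences should be no larger than the rounding in the printed table, so the open question is only how the Chi-Square column is obtained, not the F column.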
Paige Miller
You wrote that the LR test should be zero; see your previous post:
"So the way SAS works, these are the same models, with the exact same fit and the exact same predictions; and the LR test should be zero. You have just requested the same model twice, but with different parameterization."
Now you are contradicting yourself.
The problem is that the SAS documentation is very poor; that is why people ask their questions here.
I'm not seeing any contradiction.
Paige Miller