Dofin
Calcite | Level 5

Hello!

I'm using PROC GENMOD to build GLM models. For variable selection we generally use the LR test with Type 3 statistics.

As I understand it, the LR test works as follows:

Dofin_0-1629473707796.png

where D is the deviance,

and the statistic follows a chi-square distribution with degrees of freedom equal to the number of levels of the variable under test minus 1. That holds until the model includes an interaction.

Suppose a model consists of 2 variables and their interaction: A,B,A*B

So, the LR test for variable A is supposed to be:

Dofin_1-1629474021572.png

but I happened to find that the deviance is the same for the two models B, A*B and A, B, A*B,

so this formula cannot be applied in this case.

Does anybody know how SAS computes LR statistics for a model with an interaction?

This is very important to me, and I would greatly appreciate any help.
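
The two formula attachments above are images and are not reproduced in the text. A sketch of what they presumably show, assuming the usual deviance-difference form of the LR statistic with a fixed scale parameter (the subscripts and the symbol k, the number of levels of the variable under test, are illustrative notation):

```latex
% General form: the reduced model omits the variable under test
LR = D_{\text{reduced}} - D_{\text{full}}, \qquad LR \sim \chi^2_{k-1}

% Applied to variable A in the model A, B, A*B
LR_A = D(B,\ A{*}B) - D(A,\ B,\ A{*}B)
```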

 

Thanks a lot,

 

Alex

 

PaigeMiller
Diamond | Level 26

This has nothing to do with DEVIANCE or LR tests or PROC GENMOD. This is the way SAS handles non-hierarchical models, in all SAS PROCs as far as I know.

 

Your first model, with B and A*B, is non-hierarchical in the sense that the main effect of A is not included even though A appears in an interaction. In that model, the interaction has extra degrees of freedom: it absorbs the degrees of freedom for the main effect of A, and the parameter estimates for A*B carry both the main effect of A and the interaction effect. In the model with A, B and A*B, the degrees of freedom for the main effect of A are moved out of A*B and into the main effect of A, so the main effect of A is separated from the effect of A*B.

 

So the way SAS works, these are the same models, with the exact same fit and the exact same predictions; and the LR test should be zero. You have just requested the same model twice, but with different parameterization.

 

Example:

title "Model 1";
/* Non-hierarchical: DriveTrain appears only in the interaction */
proc genmod data=sashelp.cars;
  class origin drivetrain;
  model msrp = origin origin*drivetrain;
  output out=model1_out pred=p;
run;

title "Model 2";
/* Hierarchical: both main effects plus the interaction */
proc genmod data=sashelp.cars;
  class origin drivetrain;
  model msrp = origin drivetrain origin*drivetrain;
  output out=model2_out pred=p;
run;

Note the Deviance for each model is identical, and the predictions are identical for every observation. As it should be.
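
If you would rather not eyeball the predictions, one way to check them is PROC COMPARE on the two output data sets created above. This is only a sketch: the tolerance and comparison method are arbitrary choices on my part, and it assumes both data sets keep the observations in the same order, which should be the case since both come from sashelp.cars.

```sas
/* Compare the predicted values p from the two fits observation by observation. */
/* METHOD=RELATIVE with a small CRITERION= ignores tiny numerical differences.  */
proc compare base=model1_out compare=model2_out
             method=relative criterion=1e-8;
  var p;
run;
```

If the two parameterizations really describe the same model, the summary should report no unequal values for p.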

 

The above applies to cases where both A and B are CLASS, or just B is CLASS.

--
Paige Miller
Dofin
Calcite | Level 5

Thank you very much, very good explanation. I had a feeling those were the same models.

 

But the problem is that, when I have main factors and their interaction in the model, the LR Chi-Square statistics for the main factors are not equal to zero (but they should be). It looks like SAS makes some adjustment or additional computation.

Please take a look at the output of:

proc genmod data=sashelp.cars;
class origin drivetrain;
model msrp=origin drivetrain origin*drivetrain /link=log dist=gamma type3 scale=pearson;
run;

LR Statistics For Type 3 Analysis

Source             Num DF  Den DF  F Value  Pr > F  Chi-Square  Pr > ChiSq
Origin                  2     419    54.78  <.0001      109.56      <.0001
DriveTrain              2     419    47.28  <.0001       94.57      <.0001
Origin*DriveTrain       4     419     2.11  0.0791        8.43      0.0771

 

I'm wondering how SAS computes the Chi-Square for Origin, for example, which is equal to 109.56.

 

Thank you,

 

Alex

PaigeMiller
Diamond | Level 26

But the problem is that, when I have main factors and their interaction in the model, the LR Chi-Square statistics for the main factors are not equal to zero (but they should be).

Should not be zero. And is not zero.

 

Why do you think these should be zero, when there is an effect due to ORIGIN and an effect due to DRIVETRAIN?

 

When you have a question about how something is calculated, you should always check the SAS documentation for PROC GENMOD and then look under Details.
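
As a pointer for that lookup: my understanding of the Type 3 analysis section there (worth verifying against the documentation itself) is that each Type 3 LR statistic compares the full model with the same model refitted under the constraint that the effect's Type 3 contrast is zero, not with a model that simply drops the term from the MODEL statement. Separately, the numbers in the table posted above suggest that Chi-Square is Num DF times F Value. A small DATA step can check that; the data set and variable names here are illustrative, and the values are copied from that table.

```sas
/* Values copied from the Type 3 table in the previous post. */
data type3_check;
  length source $20;
  input source $ numdf fvalue chisq;
  numdf_times_f = numdf * fvalue;   /* candidate reconstruction of Chi-Square */
  diff = chisq - numdf_times_f;     /* gap between reported and reconstructed */
  datalines;
Origin 2 54.78 109.56
DriveTrain 2 47.28 94.57
Origin*DriveTrain 4 2.11 8.43
;
run;

proc print data=type3_check noobs;
run;
```

The gaps come out at no more than 0.01 in absolute value, which is consistent with rounding in the printed F values.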

--
Paige Miller
Dofin
Calcite | Level 5

You wrote that the LR test should be zero; see your previous post.

"So the way SAS works, these are the same models, with the exact same fit and the exact same predictions; and the LR test should be zero. You have just requested the same model twice, but with different parameterization."

 

Now you contradict yourself.

The problem is that the SAS documentation is very poor; that is why people ask their questions here.

 

 

PaigeMiller
Diamond | Level 26

I'm not seeing any contradiction.

--
Paige Miller
