I am working with PROC LOGISTIC on a model whose dependent variable (hebehch_sq001) is "level of physical activity during pandemic impacted VS level of physical activity during pandemic not impacted". My independent variables are sex and a binary variable called chronic_dummy (having at least one chronic disease VS having none). The interaction term (chronic_dummy*sex) is also included.
The variable chronic_dummy appears significant in the "Type 3 Analysis of Effects" table, with a p-value of 0.015 (<0.05), but it isn't significant in the "Analysis of Maximum Likelihood Estimates" table, where its p-value is 0.0651.
From my understanding, when you add the interaction term to the model, the p-value associated with chronic_dummy in the parameter estimates measures the significance of the variable at the reference level of the other variable (Male). But I also ran the model with "Female" as the reference category, and it isn't significant for that category either. The confidence interval for the odds ratio of chronic_dummy also contains 1 for both Male and Female. If it is significant for neither males nor females, why is it significant in the "Type 3 Analysis of Effects"? Can someone explain what is causing that?
Here is my code :
proc logistic data=chronic;
class chronic_dummy (ref = "No chronic disease") sex(ref="Male") / param=glm;
model hebehch_sq001 (event="impacted") = chronic_dummy sex chronic_dummy*sex;
run;
First 2 tables are the results with "Male" as a reference category, and the last 2 tables are for "Female" :
Hello,
Look here :
WHY ARE TYPE III P-VALUES DIFFERENT FROM THE ESTIMATE P-VALUES IN PROC GLM? | SAS FAQ
and here :
Logistics Reg. P-value in "Type 3 Analysis" and "Analysis of Maximum Likelihood"
https://communities.sas.com/t5/Statistical-Procedures/Logistics-Reg-P-value-in-quot-Type-3-Analysis-...
Koen
I think "Type 3 analysis of effects" is only for the main effect.
But "chronic_dummy" is also included in the interaction effect "chronic_dummy*sex", so you can't expect the two tests to be the same thing.
"Analysis of Maximum Likelihood Estimates" is based on the design matrix of the model, while "Type 3 analysis of effects" is not.
To understand what the type 3 tests are doing, run your model in PROC GLM as follows using the E3 option in the MODEL statement. Of course, all numerical results are to be ignored since GLM does not fit a logistic model, but the type 3 tests are constructed exactly the same so that one table can be used.
proc glm data=chronic;
class chronic_dummy (ref = "No chronic disease") sex(ref="Male");
model hebehch_sq001 = chronic_dummy sex chronic_dummy*sex / e3;
ods select EstFunc;
run; quit;
Note the "Type III Estimable Functions" table. This shows the coefficients of the function of model parameters that is tested by each type 3 test for each of the terms in your model. Next, run this equivalent way of specifying your model that includes the effect of chronic_dummy nested within sex.
proc logistic data=chronic;
class chronic_dummy (ref = "No chronic disease") sex(ref="Male") / param=glm;
model hebehch_sq001 (event="impacted") = sex chronic_dummy(sex);
run;
In the parameter estimates table, you should now see both of the chronic_dummy parameters that you saw when you ran the model separately with the changed reference level of sex. This nested form of your interaction model makes it more obvious that the test of each chronic_dummy parameter is a test of the effect of chronic_dummy within each sex level.
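To make the parameterization concrete, here is a sketch of the interaction model in symbols. The notation (C, F, and the betas) is mine and does not appear in the SAS output. It assumes the reference levels "No chronic disease" and "Male" from the CLASS statement above.

```latex
\[
\operatorname{logit} P(\text{impacted})
  = \beta_0 + \beta_1 C + \beta_2 F + \beta_3\, C F,
\qquad C = 1 \text{ if chronic disease},\quad F = 1 \text{ if female}
\]
\[
\text{effect of chronic\_dummy in males } (F = 0):\ \beta_1,
\qquad
\text{effect of chronic\_dummy in females } (F = 1):\ \beta_1 + \beta_3
\]
```

With Male as the reference, the chronic_dummy row of the parameter estimates table tests whether the first quantity is zero. With Female as the reference, it tests the second. The nested form reports both in a single fit.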
Now, re-run your original form of the model adding the LSMEANS and EFFECTPLOT statements.
proc logistic data=chronic;
class chronic_dummy (ref = "No chronic disease") sex(ref="Male") / param=glm;
model hebehch_sq001 (event="impacted") = chronic_dummy sex chronic_dummy*sex;
lsmeans chronic_dummy / diff e plots=none;
effectplot / link noobs;
run;
Look first at the "Coefficients for chronic_dummy Least Squares Means" table (from the E option) which shows the coefficients of the function of model parameters that is estimated and tested for each LS-mean. Note that if you take the difference in the two coefficient vectors the result is exactly the coefficient vector that you saw from the E3 option in GLM for the type 3 test of chronic_dummy. This means that the type 3 test of chronic_dummy is a test of the difference of LS-means which is requested by the DIFF option in the LSMEANS statement. The plot helps you see what is going on. The test of each chronic_dummy parameter tests the change in value along each of the two separate lines representing the sex levels. Note that the LS-mean estimates fall somewhere between the two lines. The type 3 test tests the change in these two estimates.
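If you want to see the same comparison written out explicitly, something like the following sketch should reproduce it. The ESTIMATE coefficients assume that each reference level is ordered last (so chronic_dummy is ordered as chronic, no chronic, and sex as Female, Male). I cannot check that against your data, so use the coefficient table printed by the E option to confirm the order. The label text is mine.

```sas
proc logistic data=chronic;
   class chronic_dummy (ref="No chronic disease") sex (ref="Male") / param=glm;
   model hebehch_sq001 (event="impacted") = chronic_dummy sex chronic_dummy*sex;
   /* Effect of chronic_dummy averaged over the two sexes.            */
   /* Assumed level order: reference levels last. Check the E output. */
   estimate 'chronic effect averaged over sex'
            chronic_dummy 1 -1
            chronic_dummy*sex 0.5 0.5 -0.5 -0.5 / e;
   /* Effect of chronic_dummy within each sex, for comparison. */
   slice chronic_dummy*sex / sliceby=sex diff;
run;
```

If the level order matches the assumption, the ESTIMATE result should agree with the LS-means difference, and the two SLICE differences should correspond to the two chronic_dummy parameters from the nested model.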
You might say, "Okay, but the LS-means difference seems about the same as the difference in each sex level, so why is the type 3 test significant and the parameter tests aren't?" It's the variability. Notice the standard error estimates for the LS-mean difference in the "Differences of chronic_dummy Least Squares Means" table and for the two chronic_dummy parameters in the parameter estimates table from the nested form of your model. I suspect you'll find that the standard error for the LS-means difference is smaller than for each of the chronic_dummy parameters. The smaller standard error allows the type 3 test statistic to be larger and to become significant.
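A rough sketch of the arithmetic, in my own notation: write d_M and d_F for the estimated chronic_dummy effects (log odds differences) within males and within females. In this model the two estimates come from separate subsets of the data (males only, females only), so I treat them as uncorrelated. The numbers in the last line are hypothetical, chosen only to illustrate the pattern, and are not taken from your output.

```latex
\[
\hat d_{LS} = \tfrac{1}{2}\bigl(\hat d_M + \hat d_F\bigr),
\qquad
\operatorname{SE}(\hat d_{LS}) = \tfrac{1}{2}\sqrt{\operatorname{SE}_M^2 + \operatorname{SE}_F^2}
\]
\[
\text{Hypothetical: } \hat d_M = \hat d_F = 0.55,\ \operatorname{SE}_M = \operatorname{SE}_F = 0.30
\;\Rightarrow\;
z_M = z_F = \frac{0.55}{0.30} \approx 1.83 \ (p \approx 0.07),
\qquad
z_{LS} = \frac{0.55}{0.30/\sqrt{2}} \approx 2.59 \ (p \approx 0.01)
\]
```

So an effect of the same size can miss the 0.05 threshold within each sex and still clear it once the two sexes are averaged, because the averaged estimate uses all of the data.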