I applied PROC LOGISTIC to predict a binary outcome variable, coded as 0 and 1. I have eleven predictors. Seven predictors are binary (VAR1 - VAR7), three predictors are continuous variables (VAR9 - VAR11), and one predictor is categorical with four levels (VAR8: EDU4).
Regarding the four-level categorical variable,
(1) I found the "Type 3 Analysis of Effects" was not significant (df = 3, chi-sq = 6.1259, p = 0.1055).
However, one of the three estimates in "Analysis of Maximum Likelihood Estimates" was significant (df = 1, estimate = 0.9442, chi-sq = 5.2407, p = 0.0221) when one level was compared to the reference level (Bachelor's degree or above vs. high school and some post-secondary).
In this case, should I interpret this variable as having a significant effect on my binary outcome?
(3) In the model, I applied a WEIGHT statement in PROC LOGISTIC. Should I use PROC SURVEYLOGISTIC to get a better estimate?
Here is my SAS code using PROC LOGISTIC:
PROC LOGISTIC data=SDE descending;
CLASS VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 / desc order = formatted param = ref;
model DV = VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 VAR9 VAR10 VAR11 / lackfit rsquare;
weight RESPWT;
run;
Thank you for your insights.
The two tests are not testing the same thing.
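When you get a non-significant p-value (p = 0.1055) for the Type 3 test, this means there is not enough evidence that the regression coefficients of the four levels of EDU4 differ from each other. It is a joint 3-df test of whether all four levels have the same effect, and failing to reject it does not prove that they are all the same.
When one of the estimates for the levels of EDU4 is significant (p = 0.0221), this means the coefficient for this level is significantly different from zero. With PARAM=REF, that coefficient is the difference in log odds between this level and the reference level, so this is a 1-df test of one specific comparison.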
These are not testing the same thing. One does not imply the other.
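To look at the individual comparisons behind the overall test, one option is to add an ODDSRATIO statement to the model from the question. This is only a sketch that reuses the data set and variable names posted above. I dropped LACKFIT and RSQUARE to keep it short, and I have not run it against your data.

```sas
/* Sketch: same model as in the question, plus all pairwise EDU4 (VAR8) comparisons */
proc logistic data=SDE descending;
   class VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 / desc order=formatted param=ref;
   model DV = VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 VAR9 VAR10 VAR11;
   weight RESPWT;
   /* Odds ratios with Wald confidence limits for every pair of VAR8 levels */
   oddsratio VAR8 / diff=all cl=wald;
run;
```

These pairwise intervals are not adjusted for multiple comparisons, so one significant pair out of several, alongside a non-significant 3-df test, should be read with caution.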
Without any of the output or input data, the only question I will attempt to answer at this point is the one about SURVEYLOGISTIC. You should use PROC SURVEYLOGISTIC if your data come from a complex sample design, such as a stratified or clustered one, or just about anything other than a simple random sample. The options and requirements for the analysis depend on the sample design.
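As a rough sketch of what that could look like for the model in the question: STRATUM_ID and PSU_ID below are hypothetical names standing in for whatever stratification and cluster variables your survey documentation specifies, and the STRATA and CLUSTER statements should be dropped or changed to match the actual design. The response level is set with EVENT='1' in the MODEL statement, and LACKFIT is left out.

```sas
/* Sketch only: STRATUM_ID and PSU_ID are assumed design variables, not from the original post */
proc surveylogistic data=SDE;
   strata STRATUM_ID;   /* hypothetical stratification variable */
   cluster PSU_ID;      /* hypothetical primary sampling unit (cluster) variable */
   weight RESPWT;       /* sampling weight from the original code */
   class VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 / desc order=formatted param=ref;
   model DV(event='1') = VAR1 VAR2 VAR3 VAR4 VAR5 VAR6 VAR7 VAR8 VAR9 VAR10 VAR11;
run;
```

The point estimates should generally match those from PROC LOGISTIC with the same WEIGHT statement. What changes is the standard errors, and with them the p-values, because the variance estimation accounts for the design.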