Hello SAS Community,
I have a class project that analyzes factors associated with a binary outcome within a CBT intervention program. The dataset includes a categorical exposure variable (cbtmod) with three levels (1, 2, 3), along with other predictors.
The study aims to answer two main research questions:
Mainly, I'm interested in the interaction between cbtmod and sex (coded as 0 for males and 1 for females), as well as cbtmod's interactions with all other variables in the model, as advised by my supervisor.
I've structured the PROC LOGISTIC step for the fully adjusted model to check for interaction as follows:
proc logistic data=mydata;
class cbtmod(ref='1') sex(ref='0') / param=ref;
model outcome(event='1') = cbtmod sex hh_disabilityn pregnant_1_sex Marriage Resstatus cbtmod*sex / lackfit clodds=wald;
contrast 'cbtmod=2 for males' cbtmod 2 cbtmod*sex 0 / est=exp;
contrast 'cbtmod=2 for females' cbtmod 2 cbtmod*sex 1 / est=exp;
contrast 'cbtmod=3 for males' cbtmod 3 cbtmod*sex 0 / est=exp;
contrast 'cbtmod=3 for females' cbtmod 3 cbtmod*sex 1 / est=exp;
run;
Questions:
Interaction Testing: Is this setup correct for testing the interaction between cbtmod and sex? How should I structure my CONTRAST statements to best explore these interaction effects across the CBT modalities by sex?
Fully Adjusted Model Reporting: For the fully adjusted model that includes interaction terms, which values or estimates are essential to report for a complete interpretation? Should I report the aOR for the interaction term or only for the exposure, and should the interaction term stay in the model if I am reporting aORs for the full model?
Handling Zero Cell Counts: For my second research question, the cell for males (sex=0) with outcome=0 and cbtmod=3 has a count of 0. How should this be addressed in the logistic regression analysis? Should I just note it in the limitations section of my report?
Checking Interactions with Other Variables: Given the directive to check for interactions between cbtmod and all other variables, what would be an efficient strategy to approach this without complicating the model unnecessarily?
I value any insights or guidance on refining my approach, particularly regarding CONTRAST statements for interaction testing, strategies for addressing zero cell counts, and managing multiple interactions in logistic regression.
Thank you for your assistance!
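For the sex-specific odds ratios and the zero-cell issue raised above, here is a minimal sketch. It assumes the dataset and variable names from the question (mydata, outcome, cbtmod, sex, and the listed covariates). It uses the ODDSRATIO statement in place of hand-written CONTRAST coefficients, and the FIRTH option of the MODEL statement as one common way to keep estimates finite when a cell is empty. Whether penalized likelihood suits this project is a judgment call, so treat this as a starting point rather than the required approach.

```sas
/* Sketch only: dataset and variable names are taken from the question. */
/* FIRTH is one option for sparse cells, not the only one.              */
proc logistic data=mydata;
class cbtmod(ref='1') sex(ref='0') / param=ref;
/* FIRTH requests penalized likelihood estimation;          */
/* CLPARM=PL asks for profile-likelihood confidence limits. */
model outcome(event='1') = cbtmod sex hh_disabilityn pregnant_1_sex Marriage Resstatus cbtmod*sex
      / firth clparm=pl;
/* Odds ratios for cbtmod 2 vs 1 and 3 vs 1, reported separately at each level of sex. */
oddsratio cbtmod / at(sex=all) diff=ref cl=pl;
run;
```

The joint test of cbtmod*sex in the effects table is the usual place to look for evidence that the cbtmod effect differs by sex. The ODDSRATIO output then gives the cbtmod odds ratios within males and within females.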
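/* Example with sashelp.heart: GLM parameterization, so ESTIMATE coefficients follow the CLASS level order. */
/* Each ESTIMATE below builds the linear predictor for one Sex by Smoking_Status cell with Cholesterol and  */
/* Weight held at fixed values; ILINK converts the log odds to a probability.                                */
proc logistic data = sashelp.heart;
class Status Smoking_Status(ref = 'Non-smoker') Sex(ref = 'Male') / param = glm;
model Status(event = 'Dead') = Smoking_Status | Sex Cholesterol Weight / expb;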
estimate 'Female; Heavy (16-25)' intercept 1 Smoking_Status 1 Sex 1 Smoking_Status*Sex 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Male; Heavy (16-25)' intercept 1 Smoking_Status 1 Sex 0 1 Smoking_Status*Sex 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Female; Light (1-5)' intercept 1 Smoking_Status 0 1 Sex 1 Smoking_Status*Sex 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Male; Light (1-5)' intercept 1 Smoking_Status 0 1 Sex 0 1 Smoking_Status*Sex 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Female; Moderate (6-15)' intercept 1 Smoking_Status 0 0 1 Sex 1 Smoking_Status*Sex 0 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Male; Moderate (6-15)' intercept 1 Smoking_Status 0 0 1 Sex 0 1 Smoking_Status*Sex 0 0 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Female; Very Heavy (> 25)' intercept 1 Smoking_Status 0 0 0 1 Sex 1 Smoking_Status*Sex 0 0 0 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Male; Very Heavy (> 25)' intercept 1 Smoking_Status 0 0 0 1 Sex 0 1 Smoking_Status*Sex 0 0 0 0 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Female; Non-smoker' intercept 1 Smoking_Status 0 0 0 0 1 Sex 1 Smoking_Status*Sex 0 0 0 0 0 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
estimate 'Male; Non-smoker' intercept 1 Smoking_Status 0 0 0 0 1 Sex 0 1 Smoking_Status*Sex 0 0 0 0 0 0 0 0 0 1 Cholesterol 227.4174412 Weight 153.0866808 / exp ilink e cl;
lsmeans Smoking_Status * Sex / at means ilink pdiff oddsratio cl;
run;
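/* Summary statistics for the continuous covariates; the Cholesterol and Weight values */
/* fixed in the ESTIMATE statements above appear to be these sample means.             */
proc means data = sashelp.heart;
var Cholesterol Weight;
run;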
/* Convert a log-odds estimate to odds and to a probability by hand. */
data _null_;
logOdds = -0.7022;
odds = exp(logOdds);
prob = exp(logOdds) / (exp(logOdds) + 1);
put logOdds = odds = prob = ;
run;
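/* The difference of two log-odds estimates, exponentiated, is the odds ratio between the two groups. */
data _null_;
logOddsDiff = -0.7022 - (-0.1660);
oddsRatio = exp(logOddsDiff);
put logOddsDiff = oddsRatio = ;
run;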
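The two DATA steps above apply the standard logit identities. Written out, with \eta denoting a log-odds estimate for one group (the symbols are mine, not from the SAS output):

```latex
\text{odds} = e^{\eta}, \qquad
p = \frac{e^{\eta}}{1 + e^{\eta}}, \qquad
\text{OR}_{1\,\text{vs}\,2} = \frac{e^{\eta_1}}{e^{\eta_2}} = e^{\eta_1 - \eta_2}
```

So exponentiating a difference of two log-odds estimates gives the odds ratio between the two groups, which is what an ESTIMATE statement with the EXP option reports for a difference contrast.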
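/* Same model; a single ESTIMATE gives the Female minus Male difference in log odds within the */
/* Heavy (16-25) group, and EXP turns that difference into an odds ratio.                      */
proc logistic data = sashelp.heart;
class Status Smoking_Status(ref = 'Non-smoker') Sex(ref = 'Male') / param = glm;
model Status(event = 'Dead') = Smoking_Status | Sex Cholesterol Weight / expb;
estimate 'Female; Heavy (16-25) vs Male; Heavy (16-25)' Sex 1 -1 Smoking_Status*Sex 1 -1 / exp ilink e cl;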
lsmeans Smoking_Status * Sex / at means ilink pdiff oddsratio cl;
run;