DATA:
subjid | age | sex | bmi | race | wk1derm | wk1dermthres | wk1global | wk1globalthres | wk1skin | wk1skinthres | wk8derm | wk8dermthres | wk8global | wk8globalthres | wk8skin | wk8skinthres | wk10derm | wk10dermthres | wk10global | wk10globalthres | wk10skin | wk10skinthres |
1 | 65 | 1 | 24.9 | 1 | 5 | 0 | 1 | 1 | 9 | 0 | 0 | 1 | 0 | 1 | 7 | 1 | 1 | 1 | 0 | 1 | 5 | 1 |
2 | 45 | 1 | 23.9 | 2 | 6 | 0 | 0 | 1 | 8 | 0 | 1 | 1 | 0 | 1 | 7 | 1 | 0 | 1 | 0 | 1 | 5 | 1 |
3 | 70 | 0 | 27 | 2 | 6 | 0 | 2 | 0 | 0 | 1 | 3 | 0 | 1 | 1 | 2 | 1 | 0 | 1 | 1 | 1 | 0 | 1 |
4 | 48 | 1 | 26 | 3 | 10 | 0 | 2 | 0 | 7 | 1 | 9 | 0 | 2 | 0 | 6 | 0 | 8 | 0 | 2 | 0 | 7 | 1 |
5 | 36 | 0 | 15.5 | 5 | 15 | 0 | 5 | 0 | 12 | 0 | 12 | 0 | 4 | 0 | 10 | 0 | 12 | 0 | 4 | 0 | 10 | 0 |
I would like to run logistic regression models for questionnaires given to patients at each week. Each questionnaire is a continuous variable but also has a responder threshold:
DERM questionnaire threshold: score=0/1 vs >1
Global questionnaire threshold: score=0/1 vs >1
SKIN questionnaire threshold: score= <8 vs >=8
I want to run regression models to assess the impact of potentially relevant covariates where the SKIN threshold is the independent variable and the other two (DERM and global thresholds) are the dependent variables. I also want to find out what other covariates (e.g., age [continuous], sex, bmi [continuous], race) should be included as predictors in the models.
I have attempted the code for week 1 below and would like to verify that it looks correct:
proc logistic data = "c:mydatahsb2" desc; model wk1DERMthres = wk1SKINthres AGE sex RACE BMI / expb; run;
proc logistic data = "c:mydatahsb2" desc; model wk1Globalthres = wk1SKINthres AGE sex RACE BMI / expb; run;
If I want to fit ANCOVA models for the questionnaires (DERM, Global) as continuous variables (total scores), to evaluate the association of each with the SKIN threshold and to assess whether other variables (e.g., age, sex, race, bmi) affect that association, is this the correct code:
proc logistic data=work.question;
class wk1SKINthres;
model wk1DERM= wk1SKINthres age wk1SKINthres*age/solution;
lsmeans wk1SKINthres*age/tukey line;
run;
proc glm data=work.question;
class wk1SKINthres;
model wk1Global = wk1SKINthres age wk1SKINthres*age/solution;
lsmeans wk1SKINthres*age/tukey line;
run;
proc glm data=work.question;
class wk1SKINthres race;
model wk1DERM= wk1SKINthres race wk1SKINthres*race/solution;
lsmeans wk1SKINthres*race/tukey line;
run;
proc glm data=work.question;
class wk1SKINthres race ;
model wk1Global = wk1SKINthres race wk1SKINthres*race/solution;
lsmeans wk1SKINthres*race/tukey line;
run;
I advise against dichotomizing your variables, since doing so throws away information. If the responses are approximately normally distributed, then PROC GLM is fine. You would never use PROC LOGISTIC with a continuous response. As mentioned in this note on ANCOVA models, you can handle more than one covariate in a single model. For example, the following statements let you test whether the slopes on BMI and AGE differ between the wk1SKINthres groups by looking at the tests for the interactions.
proc glm data=work.question;
class wk1SKINthres;
model wk1DERM= wk1SKINthres bmi bmi*wk1SKINthres age age*wk1SKINthres / solution;
run;
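If the interaction tests in the model above give no evidence of unequal slopes, a common next step is an equal-slopes ANCOVA with adjusted group means. The following is only a sketch under that assumption; the data set name work.question is taken from the question, and the choice of covariates is illustrative. Note that an LSMEANS effect has to be built from CLASS variables only, so a request such as wk1SKINthres*age cannot work while age is continuous.

```sas
/* Sketch only: equal-slopes ANCOVA, assuming the interactions of the     */
/* covariates with wk1SKINthres were not significant in the prior model.  */
proc glm data=work.question;
  class wk1SKINthres;
  /* Common slopes for age and bmi across the two SKIN threshold groups */
  model wk1DERM = wk1SKINthres age bmi / solution;
  /* Covariate-adjusted means per group, their difference, and CIs */
  lsmeans wk1SKINthres / pdiff cl;
run;
quit;
```

Categorical covariates such as race or sex could be added to both the CLASS and MODEL statements in the same way, provided the sample is large enough to support the extra parameters.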
(Note there is no SOLUTION option in PROC LOGISTIC... it always prints the parameter estimates.)
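If the binary responder models are still needed despite the loss of information, a minimal sketch might look like the following. It assumes that the threshold variables are coded 0/1 with 1 as the event of interest, that sex is coded 0/1, and that race is a nominal category rather than a numeric score; none of these codings is confirmed in the thread.

```sas
/* Sketch only: logistic model for the week 1 DERM responder flag.       */
/* Assumes 0/1 coding with 1 = event; race is treated as nominal.        */
proc logistic data=work.question;
  /* Without CLASS, race would enter as a single linear numeric term */
  class wk1SKINthres (ref='0') sex (ref='0') race / param=ref;
  /* EVENT='1' models the probability that wk1DERMthres = 1 */
  model wk1DERMthres (event='1') = wk1SKINthres age sex race bmi / expb;
run;
```

With reference-cell coding (PARAM=REF), each exponentiated estimate can be read as an odds ratio against the reference level. The same statements would apply to wk1Globalthres by swapping the response variable.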