DATA:
subjid | age | sex | bmi | race | wk1derm | wk1dermthres | wk1global | wk1globalthres | wk1skin | wk1skinthres | wk8derm | wk8dermthres | wk8global | wk8globalthres | wk8skin | wk8skinthres | wk10derm | wk10dermthres | wk10global | wk10globalthres | wk10skin | wk10skinthres |
1 | 65 | 1 | 24.9 | 1 | 5 | 0 | 1 | 1 | 9 | 0 | 0 | 1 | 0 | 1 | 7 | 1 | 1 | 1 | 0 | 1 | 5 | 1 |
2 | 45 | 1 | 23.9 | 2 | 6 | 0 | 0 | 1 | 8 | 0 | 1 | 1 | 0 | 1 | 7 | 1 | 0 | 1 | 0 | 1 | 5 | 1 |
3 | 70 | 0 | 27 | 2 | 6 | 0 | 2 | 0 | 0 | 1 | 3 | 0 | 1 | 1 | 2 | 1 | 0 | 1 | 1 | 1 | 0 | 1 |
4 | 48 | 1 | 26 | 3 | 10 | 0 | 2 | 0 | 7 | 1 | 9 | 0 | 2 | 0 | 6 | 0 | 8 | 0 | 2 | 0 | 7 | 1 |
5 | 36 | 0 | 15.5 | 5 | 15 | 0 | 5 | 0 | 12 | 0 | 12 | 0 | 4 | 0 | 10 | 0 | 12 | 0 | 4 | 0 | 10 | 0 |
I would like to run logistic regression models for questionnaires given to patients each week. Each questionnaire score is a continuous variable but also has a responder threshold:
DERM questionnaire threshold: score=0/1 vs >1
Global questionnaire threshold: score=0/1 vs >1
SKIN questionnaire threshold: score= <8 vs >=8
I want to run regression models to assess the impact of potentially relevant covariates where the SKIN threshold is the independent variable and the other two (DERM and global thresholds) are the dependent variables. I also want to find out what other covariates (e.g., age [continuous], sex, bmi [continuous], race) should be included as predictors in the models.
I have attempted the code for week 1 below and wanted to verify that it looks correct:
proc logistic data = "c:\mydata\hsb2" desc;
model wk1DERMthres = wk1SKINthres AGE sex RACE BMI / expb;
run;
proc logistic data = "c:\mydata\hsb2" desc;
model wk1Globalthres = wk1SKINthres AGE sex RACE BMI / expb;
run;
If I want to fit ANCOVA models with the questionnaires (DERM, Global) as continuous variables (total scores), to evaluate the association between DERM and the SKIN threshold and between Global and the SKIN threshold, and to assess whether other variables (e.g., age, sex, race, bmi) affect that association, is this the correct code?
proc logistic data=work.question;
class wk1SKINthres;
model wk1DERM= wk1SKINthres age wk1SKINthres*age/solution;
lsmeans wk1SKINthres*age/tukey line;
run;
proc glm data=work.question;
class wk1SKINthres;
model wk1Global = wk1SKINthres age wk1SKINthres*age/solution;
lsmeans wk1SKINthres*age/tukey line;
run;
proc glm data=work.question;
class wk1SKINthres race;
model wk1DERM= wk1SKINthres race wk1SKINthres*race/solution;
lsmeans wk1SKINthres*race/tukey line;
run;
proc glm data=work.question;
class wk1SKINthres race ;
model wk1Global = wk1SKINthres race wk1SKINthres*race/solution;
lsmeans wk1SKINthres*race/tukey line;
run;
I advise against dichotomizing your variables, since doing so throws away information. If the responses are approximately normally distributed, then PROC GLM is fine. You would never use PROC LOGISTIC with a continuous response. As mentioned in this note on ANCOVA models, you can handle more than one covariate in a single model. For example, these statements let you test whether the slopes on BMI and AGE differ between the wk1SKINthres groups by looking at the tests for the interactions.
proc glm data=work.question;
class wk1SKINthres;
model wk1DERM= wk1SKINthres bmi bmi*wk1SKINthres age age*wk1SKINthres / solution;
run;
(Note there is no SOLUTION option in PROC LOGISTIC... it always prints the parameter estimates.)
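If neither interaction in the model above turns out to matter, a common next step is an equal-slopes ANCOVA, which gives covariate-adjusted group means. The following is only a sketch under that assumption. It reuses the work.question data set and the week 1 variable names from this thread; the choice of the PDIFF and CL options is mine, not something stated earlier.

```sas
/* Equal-slopes ANCOVA: assumes the bmi*wk1SKINthres and          */
/* age*wk1SKINthres interactions were judged negligible.          */
proc glm data=work.question;
class wk1SKINthres;
model wk1DERM = wk1SKINthres bmi age / solution;
/* Adjusted means of wk1DERM per SKIN group, evaluated at the     */
/* average BMI and AGE, with the p-value and confidence limits    */
/* for the difference between the two groups.                     */
lsmeans wk1SKINthres / pdiff cl;
run;
quit;
```

Note that the LSMEANS statement names only the CLASS variable. Continuous covariates such as AGE cannot appear in an LSMEANS effect, which is why the wk1SKINthres*age requests in the original code would not work.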
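If the responder (threshold) analysis is still required despite the loss of information, the logistic models would need sex and race declared as categorical. The following is a hedged sketch rather than a verified analysis. It assumes the data are in work.question, that the threshold variables are coded 1 = responder, and that sex and race are numeric codes for categories, as the sample rows suggest.

```sas
/* Logistic model for the week 1 DERM responder flag.             */
/* Assumption: 1 = responder, so EVENT='1' models P(responder).   */
proc logistic data=work.question;
/* Without CLASS, race would be treated as a continuous score.    */
class sex race / param=ref;
model wk1DERMthres(event='1') = wk1SKINthres age sex race bmi / expb;
run;
```

With a small sample and several race categories, some cells may contain no responders, in which case the estimates can fail to converge. Checking a cross-tabulation of the flags against each categorical covariate first is a sensible precaution.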