Hello,
I have been trying to loop PROC GLIMMIX by following the blog post "An easy way to run thousands of regressions in SAS" on The DO Loop by @Rick_SAS. I believe I followed it step by step, but the looped models give different p-values than the same models written out manually. I have 12 dependent variables and 31 predictors to test. Does anyone have any insight into why the looped version gives different results?
When I checked the Type III tests of fixed effects for the looped model phq9 = gender, I got a p-value of .96, but when I run it manually on the original data set perm.mentalhealth, the p-value is .61.
Any help would be greatly appreciated!
data perm.temp;
  set perm.mentalhealth;
  array var_list[12] trouble_sleeping hurting_yourself interest depressed little_energy appetite feeling_bad
    concentrating moving_slowly PHQ9_SCORE Depression Anxiety;
  do i=1 to dim(var_list);
    VarName=vname(var_list(i));
    put VarName=;
    Outcome=vvalue(var_list[i]);
    array categorical[22] gender age scalp_lesions postauricular erythema eyelid_involvement cheilitis flexural_erythema
      xerosis neck_folds nipple_eczema keratosis palmar hand_eczema ichthyosis
      foot_eczema race education_final insurance alopecia pityriasis pain_severeB;
    do j=1 to dim(categorical);
      categorical_=vname(categorical(j));
      put categorical_=;
      CValue=vvalue(categorical[j]);
      output;
    end;
  end;
  drop i j;
format Depression Depressionn. Anxiety Anxietyy. interest interestt. depressed _depressedd. trouble_sleeping trouble_sleepingg.
little_energy little_energyy. appetite appetitee. feeling_bad feeling_badd. concentrating concentratingg.
moving_slowly moving_slowlyy. hurting_yourself hurting_yourselff. PHQ9_SCORE PHQ9_SCORE_. PHQ2_SCORE PHQ2_SCORE.
gender gender. race race. education_final education. insurance insurance. scalp_lesions scalp_lesionss.
postauricular postauricularr. erythema erythemaa. eyelid_involvement eyelid_involvementt. cheilitis cheilitiss.
flexural_erythema flexural_erythemaa. xerosis xerosiss. neck_folds neck_foldss. nipple_eczema nipple_eczemaa.
keratosis keratosiss. palmar palmarr. hand_eczema hand_eczemaa. ichthyosis ichthyosiss. foot_eczema foot_eczemaa.
age age_bin_. alopecia alopeciaa. pityriasis pityriasiss. pain_severeB painn. ;
run;
data perm.temp2;
  set perm.mentalhealth;
  array var_list[12] trouble_sleeping hurting_yourself interest depressed little_energy appetite feeling_bad
    concentrating moving_slowly PHQ9_SCORE Depression Anxiety;
  do i=1 to dim(var_list);
    VarName=vname(var_list(i));
    put VarName=;
    Outcome=vvalue(var_list[i]);
    array npredictors[9] SCORAD EASI BSA ADSI POEM_SCORE dlqi_score FIVED_SCORE RL_SCORE flare;
    do k=1 to dim(npredictors);
      npredictors_=vname(npredictors(k));
      Value=(npredictors[k]);
      output;
    end;
  end;
  drop k;
format Depression Depressionn. Anxiety Anxietyy. interest interestt. depressed _depressedd. trouble_sleeping trouble_sleepingg.
little_energy little_energyy. appetite appetitee. feeling_bad feeling_badd. concentrating concentratingg.
moving_slowly moving_slowlyy. hurting_yourself hurting_yourselff. PHQ9_SCORE PHQ9_SCORE_. PHQ2_SCORE PHQ2_SCORE.
gender gender. race race. education_final education. insurance insurance. scalp_lesions scalp_lesionss.
postauricular postauricularr. erythema erythemaa. eyelid_involvement eyelid_involvementt. cheilitis cheilitiss.
flexural_erythema flexural_erythemaa. xerosis xerosiss. neck_folds neck_foldss. nipple_eczema nipple_eczemaa.
keratosis keratosiss. palmar palmarr. hand_eczema hand_eczemaa. ichthyosis ichthyosiss. foot_eczema foot_eczemaa.
age age_bin_. alopecia alopeciaa. pityriasis pityriasiss. pain_severeB painn. ;
run;
proc sort data=perm.temp;
by VarName categorical_;
run;
proc sort data=perm.temp2;
by VarName npredictors_;
run;
proc glimmix data=perm.temp method=laplace ;
by VarName categorical_;
class record_id_final CValue outcome ;
model outcome = CValue /link=cumlogit dist=multinomial solution;
random visit /subject=record_id_final;
run;
proc glimmix data=perm.temp2 method=laplace ;
by VarName npredictors_;
class record_id_final Value outcome ;
model outcome = Value /link=cumlogit dist=multinomial solution;
random visit /subject=record_id_final;
run;
manual glimmix:
proc glimmix data=perm.mentalhealth method=laplace order=internal ;
class record_id_final PHQ9_SCORE gender ;
model PHQ9_SCORE= gender /link=cumlogit dist=multinomial solution ;
random visit /subject=record_id_final;
run;
[Screenshots not preserved: the log from the DO loop, and the results for the model phq9 = gender from the manual and the looped runs.]
Why are you making two different datasets from the same input dataset?
What is supposed to be different about the two?
To test whether your transformation reproduces the same analyses, try transposing just one or two variables. Then run the analyses individually and see if the results match.
Things to check include changes in the number of observations and changes in the number of categories for a variable (perhaps your transposed structure is truncating some longer formatted values?).
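One way to run that check is sketched below. It uses the dataset and variable names from the original post; the UPCASE calls are there only because the stored case of the variable names is not known. If the looped data show a different row count, a different set of levels, or an extra level such as "." that comes from VVALUE turning a missing numeric value into a non-missing character string, the two models are not fitting the same data. The manual model also specifies order=internal while the looped model uses formatted character values, so the order of the response levels may be worth comparing as well.

```sas
/* Sketch: compare one outcome/predictor pair in the looped (long) data
   with the same two variables in the original (wide) data. */

/* Levels and counts that PROC GLIMMIX sees in the looped BY group */
proc freq data=perm.temp;
  where upcase(VarName)="PHQ9_SCORE" and upcase(categorical_)="GENDER";
  tables Outcome CValue / missing;
run;

/* The same two variables in the original data set */
proc freq data=perm.mentalhealth;
  tables PHQ9_SCORE gender / missing;
run;
```

The total frequencies and the list of levels in the two outputs should agree if the restructuring preserved the data.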
So I think the culprit is the missing values. With the manual code, SAS just gives me a note saying that it did not include observations with missing values. I tried to delete missing values in my loop, but that deleted far more observations than the manual version excludes. I also tried adding a DELETE statement to the manual version in case SAS was skipping over some of the missing values, but it didn't change the results. Any suggestions on what I could do?
When I run the following code, I get 718 observations, whereas the manual code uses 869.
data perm.temp3;
  set perm.mentalhealth;
  array var_list[2] Anxiety trouble_sleeping;
  do i=1 to dim(var_list);
    if var_list(i)=. then delete;
    VarName=vname(var_list(i));
    put VarName=;
    Outcome=vvalue(var_list[i]);
    array categorical[2] gender age ;
    do j=1 to dim(categorical);
      categorical_=vname(categorical(j));
      put categorical_=;
      CValue=vvalue(categorical[j]);
      if categorical(j)= . then delete;
      output;
    end;
  end;
  drop i j;
format Depression Depressionn. Anxiety Anxietyy. interest interestt. depressed _depressedd. trouble_sleeping trouble_sleepingg.
little_energy little_energyy. appetite appetitee. feeling_bad feeling_badd. concentrating concentratingg.
moving_slowly moving_slowlyy. hurting_yourself hurting_yourselff. PHQ9_SCORE PHQ9_SCORE_. PHQ2_SCORE PHQ2_SCORE.
gender gender. race race. education_final education. insurance insurance. scalp_lesions scalp_lesionss. postauricular postauricularr.
erythema erythemaa. eyelid_involvement eyelid_involvementt. cheilitis cheilitiss. flexural_erythema flexural_erythemaa.
xerosis xerosiss. neck_folds neck_foldss. nipple_eczema nipple_eczemaa. keratosis keratosiss. palmar palmarr.
hand_eczema hand_eczemaa. ichthyosis ichthyosiss. foot_eczema foot_eczemaa. age age_bin_. alopecia alopeciaa.
pityriasis pityriasiss. pain_severeB painn. ;
run;
Testing for missing is a reasonable idea, but you should not be issuing a DELETE at that point. That will stop the whole iteration of the data step, not just skip this one variable in the DO loop.
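data perm.temp3;
  set perm.mentalhealth;
  array var_list Anxiety trouble_sleeping;
  array categorical gender age ;
  do i=1 to dim(var_list);
    /* skip a missing outcome, but keep processing the rest of the row */
    if not missing(var_list[i]) then do j=1 to dim(categorical);
      /* skip a missing predictor without ending the iteration */
      if not missing(categorical[j]) then do;
        /* VarName and Outcome are needed by the later BY and MODEL statements */
        VarName=vname(var_list[i]);
        Outcome=vvalue(var_list[i]);
        categorical_=vname(categorical[j]);
        CValue=vvalue(categorical[j]);
        output;
      end;
    end;
  end;
  drop i j ;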
format
Depression Depressionn. Anxiety Anxietyy. interest interestt.
depressed _depressedd. trouble_sleeping trouble_sleepingg.
little_energy little_energyy. appetite appetitee.
feeling_bad feeling_badd. concentrating concentratingg.
moving_slowly moving_slowlyy. hurting_yourself hurting_yourselff.
PHQ9_SCORE PHQ9_SCORE_. PHQ2_SCORE PHQ2_SCORE.
gender gender. race race. education_final education.
insurance insurance. scalp_lesions scalp_lesionss.
postauricular postauricularr.
erythema erythemaa. eyelid_involvement eyelid_involvementt.
cheilitis cheilitiss. flexural_erythema flexural_erythemaa.
xerosis xerosiss. neck_folds neck_foldss.
nipple_eczema nipple_eczemaa. keratosis keratosiss. palmar palmarr.
hand_eczema hand_eczemaa. ichthyosis ichthyosiss.
foot_eczema foot_eczemaa. age age_bin_. alopecia alopeciaa.
pityriasis pityriasiss. pain_severeB painn.
;
run;
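The difference between DELETE and a conditional OUTPUT can be seen on a small made-up example. The dataset toy and the variables id, y1, and y2 below are hypothetical and are not part of the data in this thread; this is only a sketch of the behavior described above.

```sas
/* Hypothetical toy data: three rows, two outcome variables */
data toy;
  input id y1 y2;
  datalines;
1 1 2
2 . 3
3 4 .
;
run;

/* DELETE: the first missing value ends the whole iteration,
   so the non-missing y2 for id=2 is never written */
data long_delete;
  set toy;
  array ys y1 y2;
  do i=1 to dim(ys);
    if missing(ys[i]) then delete;
    VarName=vname(ys[i]);
    Value=ys[i];
    output;
  end;
  drop i;
run;

/* Conditional OUTPUT: only the missing variable is skipped */
data long_skip;
  set toy;
  array ys y1 y2;
  do i=1 to dim(ys);
    if not missing(ys[i]) then do;
      VarName=vname(ys[i]);
      Value=ys[i];
      output;
    end;
  end;
  drop i;
run;
```

Here long_delete ends up with 3 rows and long_skip with 4: the row for y2 of id=2 is lost in the DELETE version because y1 was missing on that record. The same effect, repeated over many variables, would explain a looped dataset that is smaller than the set of observations the manual model uses.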