I have two datasets that look like the following:
I want to perform ANOVA only if there are more than 3 subjects in each BMI category for each time period (STPER, ENDPER). In my head it would be something along the lines of
%if have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') and have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') and have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') %then %do; proc glm data = have2; class BMI; model log(CONCEN) = BMI; means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs; run;
/*repeat for ENDPER*/
I have very little experience with macros, so I'm not sure if what I am asking is possible.
You don't need full-blown macro code here. Instead you can use the "select ... into :macrovar" facility of PROC SQL to populate a the macrovar &BMI_LIST with a comma separated list of quoted BMI levels that satisfy your requirements:
proc sql noprint;
select quote(strip(bmi)) into :bmi_list separated by ','
from have
where stat='N' and stper>3;
quit;
%put &=bmi_list;
Then run your PROC GLM code, with the additional where statement:
proc glm data = have2;
where bmi in (&bmi_list);
class BMI;
model log(CONCEN) = BMI;
means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs;
run;
Of course, you'd have to run this twice, once for STPER and once for ENDPER.
You don't need full-blown macro code here. Instead you can use the "select ... into :macrovar" facility of PROC SQL to populate a the macrovar &BMI_LIST with a comma separated list of quoted BMI levels that satisfy your requirements:
proc sql noprint;
select quote(strip(bmi)) into :bmi_list separated by ','
from have
where stat='N' and stper>3;
quit;
%put &=bmi_list;
Then run your PROC GLM code, with the additional where statement:
proc glm data = have2;
where bmi in (&bmi_list);
class BMI;
model log(CONCEN) = BMI;
means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs;
run;
Of course, you'd have to run this twice, once for STPER and once for ENDPER.
Since your Have set has multiple statistics perhaps you only meant the rows where the N statistic matters?
Combine the Have and Have2 so that there is a NEW variable that has the N. Then use a Where in the model to select appropriate records.
Not tested code for obvious reasons:
Proc sort data=have2;
by bmi;
run;
Proc sort data=have (where=(stat='N')) out=bmin;
by bmi;
run;
data want;
merge have2
bmin (rename=(stper=stper_n))
;
by bmi;
run;
Proc glm data=want;
where stper_n > 3;
class BMI;
<rest of glm code goes here>
You might save some minor coding by creating a summary set with only the count and use that instead of Have with all the statistics.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.