I have two datasets that look like the following:
I want to perform ANOVA only if there are more than 3 subjects in each BMI category for each time period (STPER, ENDPER). In my head it would be something along the lines of
%if have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') and have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') and have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') %then %do; proc glm data = have2; class BMI; model log(CONCEN) = BMI; means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs; run;
/*repeat for ENDPER*/
I have very little experience with macros, so I'm not sure if what I am asking is possible.
You don't need full-blown macro code here. Instead you can use the "select ... into :macrovar" facility of PROC SQL to populate a the macrovar &BMI_LIST with a comma separated list of quoted BMI levels that satisfy your requirements:
proc sql noprint;
select quote(strip(bmi)) into :bmi_list separated by ','
from have
where stat='N' and stper>3;
quit;
%put &=bmi_list;
Then run your PROC GLM code, with the additional where statement:
proc glm data = have2;
where bmi in (&bmi_list);
class BMI;
model log(CONCEN) = BMI;
means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs;
run;
Of course, you'd have to run this twice, once for STPER and once for ENDPER.
You don't need full-blown macro code here. Instead you can use the "select ... into :macrovar" facility of PROC SQL to populate a the macrovar &BMI_LIST with a comma separated list of quoted BMI levels that satisfy your requirements:
proc sql noprint;
select quote(strip(bmi)) into :bmi_list separated by ','
from have
where stat='N' and stper>3;
quit;
%put &=bmi_list;
Then run your PROC GLM code, with the additional where statement:
proc glm data = have2;
where bmi in (&bmi_list);
class BMI;
model log(CONCEN) = BMI;
means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs;
run;
Of course, you'd have to run this twice, once for STPER and once for ENDPER.
Since your Have set has multiple statistics perhaps you only meant the rows where the N statistic matters?
Combine the Have and Have2 so that there is a NEW variable that has the N. Then use a Where in the model to select appropriate records.
Not tested code for obvious reasons:
Proc sort data=have2; by bmi; run; Proc sort data=have (where=(stat='N')) out=bmin; by bmi; run; data want; merge have2 bmin (rename=(stper=stper_n)) ; by bmi; run; Proc glm data=want; where stper_n > 3; class BMI; <rest of glm code goes here>
You might save some minor coding by creating a summary set with only the count and use that instead of Have with all the statistics.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.