I have two datasets that look like the following:
I want to perform ANOVA only if there are more than 3 subjects in each BMI category for each time period (STPER, ENDPER). In my head it would be something along the lines of
%if have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') and have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') and have.STPER > 3 (where BMI = 'Not Obese' and STAT = 'N') %then %do; proc glm data = have2; class BMI; model log(CONCEN) = BMI; means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs; run;
/*repeat for ENDPER*/
I have very little experience with macros, so I'm not sure if what I am asking is possible.
You don't need full-blown macro code here. Instead you can use the "select ... into :macrovar" facility of PROC SQL to populate a the macrovar &BMI_LIST with a comma separated list of quoted BMI levels that satisfy your requirements:
proc sql noprint;
select quote(strip(bmi)) into :bmi_list separated by ','
from have
where stat='N' and stper>3;
quit;
%put &=bmi_list;
Then run your PROC GLM code, with the additional where statement:
proc glm data = have2;
where bmi in (&bmi_list);
class BMI;
model log(CONCEN) = BMI;
means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs;
run;
Of course, you'd have to run this twice, once for STPER and once for ENDPER.
You don't need full-blown macro code here. Instead you can use the "select ... into :macrovar" facility of PROC SQL to populate a the macrovar &BMI_LIST with a comma separated list of quoted BMI levels that satisfy your requirements:
proc sql noprint;
select quote(strip(bmi)) into :bmi_list separated by ','
from have
where stat='N' and stper>3;
quit;
%put &=bmi_list;
Then run your PROC GLM code, with the additional where statement:
proc glm data = have2;
where bmi in (&bmi_list);
class BMI;
model log(CONCEN) = BMI;
means BMI / hovtest = levene welch;
ods output welch = welch diff = pairs;
run;
Of course, you'd have to run this twice, once for STPER and once for ENDPER.
Since your Have set has multiple statistics perhaps you only meant the rows where the N statistic matters?
Combine the Have and Have2 so that there is a NEW variable that has the N. Then use a Where in the model to select appropriate records.
Not tested code for obvious reasons:
Proc sort data=have2; by bmi; run; Proc sort data=have (where=(stat='N')) out=bmin; by bmi; run; data want; merge have2 bmin (rename=(stper=stper_n)) ; by bmi; run; Proc glm data=want; where stper_n > 3; class BMI; <rest of glm code goes here>
You might save some minor coding by creating a summary set with only the count and use that instead of Have with all the statistics.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.