Hello SAS community,
Is there a way to make this more dynamic? I assume I would have to invoke the macro language. In parentheses are the low/high values for height/weight/BMI for each age band. Different age bands have different low/high values for these three measures (height, weight, BMI).
data output;
set input ;
if 1< age <2 then do;
if height<(low_height_reference_1to2) or height>(high_height_reference_1to2) then flag_height=1;
if weight<(low_weight_reference_1to2) or weight>(high_weight_reference_1to2) then flag_weight=1;
if bmi <(low_bmi_reference_1to2) or bmi>(high_weight_reference_1to2) then flag_bmi=1;
end;
else if 2<= age <3 then do;
if height<(low_height_reference_2to3) or height>(high_height_reference_2to3) then flag_height=1;
if weight<(low_weight_reference_2to3) or weight>(high_weight_reference_2to3) then flag_weight=1;
if bmi <(low_bmi_reference_2to3) or bmi>(high_weight_reference_2to3) then flag_bmi=1;
end;
else if 3<= age <4 then do;
if height<(low_height_reference_3to4) or height>(high_height_reference_3to4) then flag_height=1;
if weight<(low_weight_reference_3to4) or weight>(high_weight_reference_3to4) then flag_weight=1;
if bmi <(low_bmi_reference_3to4) or bmi>(high_weight_reference_3to4) then flag_bmi=1;
end;
..... this repeats until all age blocks are covered.....
run;
Thank you.
1) You probably men to check BMI aginst "high_bmi_reference.." (not "high_weight_reference...).
2) You can use next code assuming variable names are as shown in your example:
%macro chk(age1,age2);
if &age1 le age lt &age2 and
(height lt low_height_reference_&age1.to&age2 or height gt high_height_reference_&age1.to&age2.) or
(weight lt low_weight_reference_&age1.to&age2 or weight gt high_weight_reference_&age1.to&age2.) or
(bmi lt low_bmi_reference_&age1.to&age2 or bmi gt high_bmi_reference_&age1.to&age2) then flag_height=1;
%mend chk;
data output;
set input ;
%chk(1,2);
%chk(2,3);
%chk(3,4);
run;
Thank you for taking a look Shmuel.
Using your suggestion, we have something like this:
%macro chk(age1,age2); if &age1 ge age lt &age2 and (height lt 70 or height gt 80) then flag_height=1; if &age1 ge age lt &age2 and (weight lt 8 or weight gt 12) then flag_weight=1; if &age1 ge age lt &age2 and (bmi lt 14 or bmi gt 20) then flag_bmi=1; %mend chk; data output; set input ; %chk(1,2); %chk(2,3); %chk(3,4); run;
The thing with low/high references for the 3 measures is that they are different based on the age band the person is in. So in the above example, values of 70, 80 for height, 8 and 12 for weight, 14 and 20 for BMI are just examples for one age group. Other age groups would have other values for those reference numbers. I was hoping that part could be dynamic.
I would define six informats (height, weight, bmi) for high/low variables. Then something like this:
/* untested code */
data output;
set input;
w_low = input(age, weight_min.);
w_high = input(age, weight_high.);
flag_weight = weight < w_low or weight > w_heigh;
...
run;
First, put your reference values into a dataset:
data reference;
input agegrp $ low_height_ref high_height_ref low_weight_ref high_weight_ref low_bmi_ref high_bmi_ref;
datalines;
1 70.5 80 12 14 20 30
;
Add additional lines, of course.
Then, set up a format that converts age to agegroup:
proc format;
value agegroup
low - 1 = "0"
1<-<2 = "1"
2-<3 = "2"
;
run;
add additional groups as suited.
Then, load the references into a hash, and use the format to retrieve the reference values from the hash:
data want;
set have;
if _n_ = 1
then do;
length agegroup $8;
declare hash ref(dataset:"reference");
ref.definekey("agegroup");
ref.definedata("low_height_ref","high_height_ref","low_weight_ref","high_weight_ref","low_bmi_ref","high_bmi_ref");
ref.definedone();
call missing(low_height_ref,high_height_ref,low_weight_ref,high_weight_ref,low_bmi_ref,high_bmi_ref);
end;
agegroup = put(age,agegroup.);
if ref.find() = 0
then do;
flag_height = not (low_height_fref <= height <= high_height_ref);
flag_weight = not (low_weight_ref <= weight <= high_weight_ref);
flag_bmi = not (low_bmi_ref <= bmi <= high_bmi_ref);
end;
run;
You only need to maintain the reference dataset and the format; if you want, you can put the format values into a CNTLIN dataset and run PROC FORMAT off that. You could even combine everything in the reference dataset by adding fmtname, type, start, end, sexcl, eexcl and hlo variables (use rename=(agegroup=label) when using the dataset in PROC FORMAT).
Thank you Kurt. Very helpful. I ended up using a different method, but your method works, I checked.
Only one thing, you mentioned the wrong key variable in your hash object. We should call "agegrp" not "agegroup" in the following section of your code:
length agegrp $8;
declare hash ref(dataset:"reference");
ref.definekey("agegrp");
ref.definedata("low_height_ref","high_height_ref","low_weight_ref","high_weight_ref","low_bmi_ref","high_bmi_ref");
ref.definedone();
call missing(low_height_ref,high_height_ref,low_weight_ref,high_weight_ref,low_bmi_ref,high_bmi_ref);
end;
agegrp = put(age,agegroup.);
Thank you.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.