Hello SAS community,
Is there a way to make this more dynamic? I assume I would have to invoke the macro language. In parentheses are the low/high values for height/weight/BMI for each age band. Different age bands have different low/high values for these three measures (height, weight, BMI).
data output;
set input ;
if 1< age <2 then do;
if height<(low_height_reference_1to2) or height>(high_height_reference_1to2) then flag_height=1;
if weight<(low_weight_reference_1to2) or weight>(high_weight_reference_1to2) then flag_weight=1;
if bmi <(low_bmi_reference_1to2) or bmi>(high_weight_reference_1to2) then flag_bmi=1;
end;
else if 2<= age <3 then do;
if height<(low_height_reference_2to3) or height>(high_height_reference_2to3) then flag_height=1;
if weight<(low_weight_reference_2to3) or weight>(high_weight_reference_2to3) then flag_weight=1;
if bmi <(low_bmi_reference_2to3) or bmi>(high_weight_reference_2to3) then flag_bmi=1;
end;
else if 3<= age <4 then do;
if height<(low_height_reference_3to4) or height>(high_height_reference_3to4) then flag_height=1;
if weight<(low_weight_reference_3to4) or weight>(high_weight_reference_3to4) then flag_weight=1;
if bmi <(low_bmi_reference_3to4) or bmi>(high_weight_reference_3to4) then flag_bmi=1;
end;
..... this repeats until all age blocks are covered.....
run;
Thank you.
1) You probably men to check BMI aginst "high_bmi_reference.." (not "high_weight_reference...).
2) You can use next code assuming variable names are as shown in your example:
%macro chk(age1,age2);
if &age1 le age lt &age2 and
(height lt low_height_reference_&age1.to&age2 or height gt high_height_reference_&age1.to&age2.) or
(weight lt low_weight_reference_&age1.to&age2 or weight gt high_weight_reference_&age1.to&age2.) or
(bmi lt low_bmi_reference_&age1.to&age2 or bmi gt high_bmi_reference_&age1.to&age2) then flag_height=1;
%mend chk;
data output;
set input ;
%chk(1,2);
%chk(2,3);
%chk(3,4);
run;
Thank you for taking a look Shmuel.
Using your suggestion, we have something like this:
%macro chk(age1,age2); if &age1 ge age lt &age2 and (height lt 70 or height gt 80) then flag_height=1; if &age1 ge age lt &age2 and (weight lt 8 or weight gt 12) then flag_weight=1; if &age1 ge age lt &age2 and (bmi lt 14 or bmi gt 20) then flag_bmi=1; %mend chk; data output; set input ; %chk(1,2); %chk(2,3); %chk(3,4); run;
The thing with low/high references for the 3 measures is that they are different based on the age band the person is in. So in the above example, values of 70, 80 for height, 8 and 12 for weight, 14 and 20 for BMI are just examples for one age group. Other age groups would have other values for those reference numbers. I was hoping that part could be dynamic.
I would define six informats (height, weight, bmi) for high/low variables. Then something like this:
/* untested code */
data output;
set input;
w_low = input(age, weight_min.);
w_high = input(age, weight_high.);
flag_weight = weight < w_low or weight > w_heigh;
...
run;
First, put your reference values into a dataset:
data reference;
input agegrp $ low_height_ref high_height_ref low_weight_ref high_weight_ref low_bmi_ref high_bmi_ref;
datalines;
1 70.5 80 12 14 20 30
;
Add additional lines, of course.
Then, set up a format that converts age to agegroup:
proc format;
value agegroup
low - 1 = "0"
1<-<2 = "1"
2-<3 = "2"
;
run;
add additional groups as suited.
Then, load the references into a hash, and use the format to retrieve the reference values from the hash:
data want;
set have;
if _n_ = 1
then do;
length agegroup $8;
declare hash ref(dataset:"reference");
ref.definekey("agegroup");
ref.definedata("low_height_ref","high_height_ref","low_weight_ref","high_weight_ref","low_bmi_ref","high_bmi_ref");
ref.definedone();
call missing(low_height_ref,high_height_ref,low_weight_ref,high_weight_ref,low_bmi_ref,high_bmi_ref);
end;
agegroup = put(age,agegroup.);
if ref.find() = 0
then do;
flag_height = not (low_height_fref <= height <= high_height_ref);
flag_weight = not (low_weight_ref <= weight <= high_weight_ref);
flag_bmi = not (low_bmi_ref <= bmi <= high_bmi_ref);
end;
run;
You only need to maintain the reference dataset and the format; if you want, you can put the format values into a CNTLIN dataset and run PROC FORMAT off that. You could even combine everything in the reference dataset by adding fmtname, type, start, end, sexcl, eexcl and hlo variables (use rename=(agegroup=label) when using the dataset in PROC FORMAT).
Thank you Kurt. Very helpful. I ended up using a different method, but your method works, I checked.
Only one thing, you mentioned the wrong key variable in your hash object. We should call "agegrp" not "agegroup" in the following section of your code:
length agegrp $8;
declare hash ref(dataset:"reference");
ref.definekey("agegrp");
ref.definedata("low_height_ref","high_height_ref","low_weight_ref","high_weight_ref","low_bmi_ref","high_bmi_ref");
ref.definedone();
call missing(low_height_ref,high_height_ref,low_weight_ref,high_weight_ref,low_bmi_ref,high_bmi_ref);
end;
agegrp = put(age,agegroup.);
Thank you.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.