Hello,
I'm working with four data sets (6-year, 12-year, 18-year and 24-year measurements), within each data set every observation is unique to a person.
In each of these data sets, there are 13 measures of interest (height, weight, pulse....) that were measured 3 times (height1, height2, height3) there are cases where an individual only had 1 or 2 measurements (i.e. height1=48, height2=48.2, height3= . ).
I'm looking to find the average measurement on cases where the individual had 2 or more measurements (for all 13 variables of interest). My base code works (see below) but I'd like to create a macro so I don't have 26 IF/DO statements x 4 data sets.
Any Suggestions on how to shorten up the code? Any help is greatly appreciated!
%LET WT= WEIGHT; %LET HT= HEIGHT; . . . . DATA Y; SET X; IF &WT.1 NE . AND &WT.2 NE . AND &WT.3 NE . THEN DO; AVG&WT. = (&WT.1+&WT.2+&WT.3)/3; END; IF &WT.1 NE . AND &WT.2 NE . AND &WT.3 = . THEN DO; AVG&WT. = (&WT.1+&WT.2)/2; END; IF &HT.1 NE........ RUN;
It sounds like you want to calculate the mean only if there are at least two values?
if n(of weight1-weight3) >= 2 then weight_avg=mean(of weight1-weight3);
I am not sure that macro code is much use in this situation.
Just replicate the line and change the variable names.
But if you wanted you could create a macro to take in a list of base names and generate the above code. You could even make the minimum number of values and number of replicates parameters.
%macro atleast(list,n=3,min=2);
%local i base;
%do i=1 %to %sysfunc(countw(&list));
%let base=%scan(&list,&i);
if n(of &base.1-&base.&n) >= &min) then &base._avg=mean(of &base.1-&base.&n);
%end;
%mend atleast;
Then you could use it for HEIGHT and WEIGHT;
data want;
set have ;
%atleast(height weight)
run;
It sounds like you want to calculate the mean only if there are at least two values?
if n(of weight1-weight3) >= 2 then weight_avg=mean(of weight1-weight3);
I am not sure that macro code is much use in this situation.
Just replicate the line and change the variable names.
But if you wanted you could create a macro to take in a list of base names and generate the above code. You could even make the minimum number of values and number of replicates parameters.
%macro atleast(list,n=3,min=2);
%local i base;
%do i=1 %to %sysfunc(countw(&list));
%let base=%scan(&list,&i);
if n(of &base.1-&base.&n) >= &min) then &base._avg=mean(of &base.1-&base.&n);
%end;
%mend atleast;
Then you could use it for HEIGHT and WEIGHT;
data want;
set have ;
%atleast(height weight)
run;
I'm thinking a macro is not needed here.
Append all your data into one data set, with a YEAR variable to indicate 6, 12, 18, 24. Then compute the n for each individual and each measurement. This can be done in a data step. If n=1 then in the data step, then the mean is missing. If n=2 or n=3, then (for example regarding height)
height_mean = mean(of height1-height3);
On top of @Tom's suggestion, If your variables of interest share some traits, such as they are at least 2 with the same prefix (shown in the example code ), or all variables except 'ID'/'Group' etc.(not shown), you could use some variants of the following to save some typing:
%macro atleast(list,min=2);
%local i base;
%do i=1 %to %sysfunc(countw(&list));
%let base=%scan(&list,&i);
if n(of &base.:) >= &min then &base._avg=mean(of &base.:);
%end;
%mend atleast;
proc sql;
select DISTINCT PRXCHANGE('s/\d*$//',1,strip(name)) into :list separated by ' ' from dictionary.columns
where libname='WORK' AND MEMNAME='HAVE'
group by PRXCHANGE('s/\d*$//',1,strip(name))
having count(*) >=2
;
QUIT;
data want;
set have ;
%atleast(&list.)
run;
Thank you to everyone for their very helpful assistance!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.