Hello,
I'm working with four data sets (6-year, 12-year, 18-year and 24-year measurements), within each data set every observation is unique to a person.
In each of these data sets, there are 13 measures of interest (height, weight, pulse....) that were measured 3 times (height1, height2, height3) there are cases where an individual only had 1 or 2 measurements (i.e. height1=48, height2=48.2, height3= . ).
I'm looking to find the average measurement on cases where the individual had 2 or more measurements (for all 13 variables of interest). My base code works (see below) but I'd like to create a macro so I don't have 26 IF/DO statements x 4 data sets.
Any Suggestions on how to shorten up the code? Any help is greatly appreciated!
%LET WT= WEIGHT; %LET HT= HEIGHT; . . . . DATA Y; SET X; IF &WT.1 NE . AND &WT.2 NE . AND &WT.3 NE . THEN DO; AVG&WT. = (&WT.1+&WT.2+&WT.3)/3; END; IF &WT.1 NE . AND &WT.2 NE . AND &WT.3 = . THEN DO; AVG&WT. = (&WT.1+&WT.2)/2; END; IF &HT.1 NE........ RUN;
It sounds like you want to calculate the mean only if there are at least two values?
if n(of weight1-weight3) >= 2 then weight_avg=mean(of weight1-weight3);
I am not sure that macro code is much use in this situation.
Just replicate the line and change the variable names.
But if you wanted you could create a macro to take in a list of base names and generate the above code. You could even make the minimum number of values and number of replicates parameters.
%macro atleast(list,n=3,min=2);
%local i base;
%do i=1 %to %sysfunc(countw(&list));
%let base=%scan(&list,&i);
if n(of &base.1-&base.&n) >= &min) then &base._avg=mean(of &base.1-&base.&n);
%end;
%mend atleast;
Then you could use it for HEIGHT and WEIGHT;
data want;
set have ;
%atleast(height weight)
run;
It sounds like you want to calculate the mean only if there are at least two values?
if n(of weight1-weight3) >= 2 then weight_avg=mean(of weight1-weight3);
I am not sure that macro code is much use in this situation.
Just replicate the line and change the variable names.
But if you wanted you could create a macro to take in a list of base names and generate the above code. You could even make the minimum number of values and number of replicates parameters.
%macro atleast(list,n=3,min=2);
%local i base;
%do i=1 %to %sysfunc(countw(&list));
%let base=%scan(&list,&i);
if n(of &base.1-&base.&n) >= &min) then &base._avg=mean(of &base.1-&base.&n);
%end;
%mend atleast;
Then you could use it for HEIGHT and WEIGHT;
data want;
set have ;
%atleast(height weight)
run;
I'm thinking a macro is not needed here.
Append all your data into one data set, with a YEAR variable to indicate 6, 12, 18, 24. Then compute the n for each individual and each measurement. This can be done in a data step. If n=1 then in the data step, then the mean is missing. If n=2 or n=3, then (for example regarding height)
height_mean = mean(of height1-height3);
On top of @Tom's suggestion, If your variables of interest share some traits, such as they are at least 2 with the same prefix (shown in the example code ), or all variables except 'ID'/'Group' etc.(not shown), you could use some variants of the following to save some typing:
%macro atleast(list,min=2);
%local i base;
%do i=1 %to %sysfunc(countw(&list));
%let base=%scan(&list,&i);
if n(of &base.:) >= &min then &base._avg=mean(of &base.:);
%end;
%mend atleast;
proc sql;
select DISTINCT PRXCHANGE('s/\d*$//',1,strip(name)) into :list separated by ' ' from dictionary.columns
where libname='WORK' AND MEMNAME='HAVE'
group by PRXCHANGE('s/\d*$//',1,strip(name))
having count(*) >=2
;
QUIT;
data want;
set have ;
%atleast(&list.)
run;
Thank you to everyone for their very helpful assistance!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.