Hi,
I have a dataset with multiple readings for each person. I need just one row per person: whenever a person has multiple readings, I need the average of those readings. Is there a way to do this in a data step?
Thanks in advance!
Yes, you can do it in a data step, but it would be simpler to use PROC SUMMARY. Can you provide more details?
proc summary data=have nway;
class person;
var variable1 variable2 variable3; /* Whatever list of variables you need goes here */
output out=want mean=;
run;
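To see what the step above produces, here is a minimal sketch on made-up data. The data set HAVE, the variable READING, and the values are assumptions for illustration only, since no real data was posted.

```sas
/* Hypothetical sample data: PERSON and READING are made-up names */
data have;
  input person reading;
  datalines;
1 10
1 20
2 30
;
run;

/* One output row per PERSON, holding the mean of READING */
proc summary data=have nway;
  class person;
  var reading;
  output out=want mean=;
run;

proc print data=want noobs;
run;
```

With this input, WANT should have one row per person: person 1 with reading 15 (the mean of 10 and 20) and person 2 with reading 30. PROC SUMMARY also adds the automatic variables _TYPE_ and _FREQ_, which you can drop if you do not need them.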
Hi,
Thank you so much for posting this! This syntax worked out perfectly for me!
Yes, but it can be a lot of work. Why do you want to do it in a data step?
If you have any sort of "person" identifier then proc means/summary would be a better way to go.
Dummy code as you have provided no actual details:
proc sort data=have;
by personid;
run;
proc summary data=have;
by personid;
var _numeric_;
output out=want (drop=_type_) mean= ;
run;
This creates the data set Want with the mean of all numeric variables, keeping the existing variable names, and adds a variable _freq_ that holds the count of "rows" used to get each summary. _Numeric_ is a special list word SAS uses in some places to indicate "use all numeric variables". If you only want some variables, place the names of those variables on the Var statement instead.
If you want something different then you need to provide more details.
To use a data step you will need to create variables that hold the sum and the count, accounting for any missing values, of each variable you want summarized. Then, when the last record for a person id is encountered, calculate the mean and output. If you want different statistics, you get to do additional coding for each statistic, and some of it can be quite daunting. Proc means/summary takes care of that for you.
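For comparison, the data step approach described above might look roughly like this for a single variable. This is only a sketch: HAVE, PERSONID, and READING are assumed names, and every additional variable would need its own sum and count.

```sas
/* Sketch only: HAVE, PERSONID and READING are assumed names */
proc sort data=have out=have_sorted;
  by personid;
run;

data want (keep=personid mean_reading n_readings);
  set have_sorted;
  by personid;

  /* Reset the accumulators on the first row of each person */
  if first.personid then do;
    sum_reading = 0;
    n_readings  = 0;
  end;

  /* Ignore missing readings so they do not distort the mean */
  if not missing(reading) then do;
    sum_reading + reading;  /* sum statement: value is retained across rows */
    n_readings  + 1;
  end;

  /* On the last row of the person, compute the mean and write one row */
  if last.personid then do;
    if n_readings > 0 then mean_reading = sum_reading / n_readings;
    else mean_reading = .;  /* no non-missing readings for this person */
    output;
  end;
run;
```

The sum statements retain their values from row to row, which is why they are reset at first.personid. Doing this for several variables or several statistics gets long quickly, which is the main argument for proc means/summary.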
Hi,
Thank you so much for recommending proc summary! This works seamlessly for my problem! 🙂