code_blooded
Fluorite | Level 6

Hi,

 

I have a dataset with multiple readings for each person. I need just one row per person: whenever multiple readings are present, I need the average of those readings. Is there a way to do it in the data step?

 

Thanks in advance!

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

Yes, you can do it in a data step, but it would be simpler to use PROC SUMMARY. Can you provide more details?

 

proc summary data=have nway;
    class person;
    var variable1 variable2 variable3; /* Whatever list of variables you need goes here */
    output out=want mean=;
run;

 

--
Paige Miller
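
To make the behaviour of the PROC SUMMARY call above concrete, here is a minimal sketch on made-up data. The data values are invented for illustration, and the names have, want, person and variable1 simply follow the placeholder names used in the answer; they are assumptions, not details from the original question.

```sas
/* Hypothetical sample data: several readings for some persons,
   including one missing reading for person C */
data have;
    input person $ variable1;
    datalines;
A 10
A 20
B 5
C 7
C .
C 9
;
run;

/* NWAY keeps only the per-person rows (no overall total row).
   _TYPE_ and _FREQ_ are dropped here just to keep the output minimal. */
proc summary data=have nway;
    class person;
    var variable1;
    output out=want (drop=_type_ _freq_) mean=;
run;

proc print data=want noobs;
run;
```

With this input, want should contain one row per person: 15 for A, 5 for B, and 8 for C, because the MEAN statistic ignores the missing reading.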


4 REPLIES
code_blooded
Fluorite | Level 6

Hi,

 

Thank you so much for posting this! This syntax worked out perfectly for me!

ballardw
Super User

Yes, but it can be a lot of work. Why do you want to do it in a data step?

If you have any sort of "person" identifier, then PROC MEANS/SUMMARY would be a better way to go.

Dummy code, as you have provided no actual details:

proc sort data=have;
   by personid;
run;

proc summary data=have;
   by personid;
   var _numeric_;
   output out=want (drop=_type_) mean= ;
run;

This creates the Want data set with the mean of every numeric variable, keeping the existing variable names, and adds a variable _freq_ that holds the count of "rows" used to compute each summary. _numeric_ is a special variable-list keyword that SAS accepts in some places to mean "use all numeric variables". If you only want some variables, list their names on the VAR statement instead.

If you want something different then you need to provide more details.
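
As one possible variation on the code above, the sketch below restricts the VAR statement to specific variables, renames _freq_ to something more descriptive, and requests a second statistic. The variable names reading1 and reading2 and the new name n_rows are hypothetical; personid follows the dummy code above.

```sas
proc sort data=have;
   by personid;
run;

proc summary data=have;
   by personid;
   var reading1 reading2;   /* hypothetical variable names */
   /* AUTONAME appends the statistic to each variable name
      (for example reading1_Mean), so that requesting two
      statistics does not produce clashing output names */
   output out=want (drop=_type_ rename=(_freq_=n_rows))
          mean= n= / autoname;
run;
```

Note that the renamed _freq_ counts all rows in each BY group, while the N statistic counts only the nonmissing values of each variable, so the two can differ when readings are missing.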

 

To use a data step, you will need to create variables that hold the sum and the count of each variable you want summarized, taking missing values into account, and then, when the last record for a person ID is encountered, calculate the mean and output. If you want different statistics, you have to do additional coding for each one, and some of it can be quite daunting. PROC MEANS/SUMMARY takes care of that for you.
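
The data step approach described above could look roughly like the following for a single variable. This is only a sketch under assumptions: reading is a hypothetical variable name, personid follows the dummy code earlier in this reply, and every additional variable would need its own sum and count.

```sas
/* BY-group processing requires the data to be sorted by personid */
proc sort data=have;
   by personid;
run;

data want (keep=personid mean_reading);
   set have;
   by personid;

   /* Reset the accumulators at the first row of each person */
   if first.personid then do;
      total = 0;
      n = 0;
   end;

   /* Sum statements retain their values across rows;
      missing readings are skipped so they do not affect the count */
   if not missing(reading) then do;
      total + reading;
      n + 1;
   end;

   /* On the last row of each person, compute the mean and write one row */
   if last.personid then do;
      if n > 0 then mean_reading = total / n;
      else call missing(mean_reading);   /* no nonmissing readings */
      output;
   end;
run;
```

Even for this single statistic on a single variable, the data step is noticeably longer than the PROC SUMMARY version, which is the point being made above.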

 

 

code_blooded
Fluorite | Level 6

Hi,

 

Thank you so much for recommending proc summary! This works seamlessly for my problem! 🙂
