kastafford
Calcite | Level 5

Hi there!  I have repeated measures data in long format.  Several subjects have multiple observations within the same binned time period.  I would like to average the values for that subject within that time period (rather than selecting just one). 

 

For example (I've pasted the data frame below), subject 002E5 has three observations for time point -0.5 (variable name timecat) and three observations for time point 0.5; subject 003E5 has two observations for time point -3.0, etc. I am looking for a more efficient way to get the average value at time point x for subject y with all of the time points in one data set, as opposed to splitting the data by time category, computing the subject-specific averages within each, and then merging it all together again.

 

I know how to get the average by unique ID, but can't figure out how to add the additional condition of timecat. 

 

I've attached a snippet of the data in sas7bdat format.

 

Thanks! 

 

POST  logvl  timecat  ID
0     5.2    -0.5     002E5
0     3.9    -0.5     002E5
0     3.6    -0.5     002E5
0     2.7     0.0     002E5
0     5.5     0.5     002E5
0     3.4     0.5     002E5
0     2.7     0.5     002E5
0     0.0    -3.0     003E5
0     3.5    -3.0     003E5
0     3.0    -2.5     003E5
0     4.9    -1.0     003E5
0     5.1    -0.5     003E5
0     4.8    -0.5     003E5
0     1.7     0.5     003E5
0     0.0     0.5     003E5
0     3.5     1.0     003E5
0     2.1     1.0     003E5
0     0.0     1.0     003E5
0     0.0     1.0     003E5
0     0.0     1.5     003E5

Accepted Solutions
PaigeMiller
Diamond | Level 26

@kastafford wrote:

 

For example (I've pasted the data frame below), subject 002E5 has three observations for time point -0.5 (variable name timecat) and three observations for time point 0.5; subject 003E5 has two observations for time point -3.0, etc. I am looking for a more efficient way to get the average value at time point x for subject y with all of the time points in one data set, as opposed to splitting the data by time category, computing the subject-specific averages within each, and then merging it all together again.

 

I know how to get the average by unique ID, but can't figure out how to add the additional condition of timecat. 

 

 

This is exactly what PROC SUMMARY was designed to do.

 

UNTESTED CODE

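/* NWAY keeps only the rows for each ID x TIMECAT combination */
proc summary data=have nway;
     class id timecat;
     /* logvl is the variable to average (POST is 0 on every sample row) */
     var logvl;
     output out=want mean=;
run;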
--
Paige Miller
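
Since the variable to average is logvl, here is a minimal, untested sketch of how the cell means could also be carried back onto every original row as a new variable. It assumes the input data set is named have, as in the answer above; cell_means, have_sorted, have_with_mean and mean_logvl are illustrative names that do not come from the thread.

```sas
/* Mean of logvl for each ID x TIMECAT cell.                      */
/* DROP= removes the automatic _TYPE_ and _FREQ_ variables.       */
proc summary data=have nway;
    class id timecat;
    var logvl;
    output out=cell_means(drop=_type_ _freq_) mean=mean_logvl;
run;

/* MERGE with BY needs both inputs ordered by the BY variables.   */
/* cell_means is already ordered by the CLASS variables.          */
proc sort data=have out=have_sorted;
    by id timecat;
run;

/* Every original row gains the mean of its own ID x TIMECAT cell */
data have_with_mean;
    merge have_sorted cell_means;
    by id timecat;
run;
```

If one row per subject and time point is all that is needed, the first step alone is enough and cell_means is the result.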


7 REPLIES
Reeza
Super User

@kastafford wrote:

 

I know how to get the average by unique ID, but can't figure out how to add the additional condition of timecat. 


How are you currently doing it? I would assume adding TIMECAT to your BY, CLASS or GROUP statement would work. 

This is a standard proc means/summary type question where you place ALL your grouping variables in the BY or CLASS statement.

 

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas
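
As a rough, untested illustration of the BY-statement route mentioned above: the sketch below assumes the data set is called have and the analysis variable is logvl, and have_sorted, want_by and mean_logvl are made-up names. BY-group processing expects the data to be sorted by the grouping variables first, whereas CLASS does not.

```sas
/* BY processing requires the input to be sorted by the BY variables */
proc sort data=have out=have_sorted;
    by id timecat;
run;

/* One output row per ID x TIMECAT group; NOPRINT suppresses the report */
proc means data=have_sorted noprint;
    by id timecat;
    var logvl;
    output out=want_by mean=mean_logvl;
run;
```

With CLASS instead of BY, the sort step can be skipped, but NWAY is then needed to keep only the rows for the full ID x TIMECAT combinations.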



 

kastafford
Calcite | Level 5

Hi @Reeza,

 

Thanks for your quick reply.  As a procedure, this works to get me the values, but I'm trying to do it in a data step to create a new variable.

 

proc means data=cancer.sample;
class ID timecat;
var logvl;
run;

 

After the by ID; if first.ID, I'm not sure how to add the conditional timecat variable.  I'm thinking either a do loop or array but I don't construct them well. 

 

Thanks again!
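
For anyone who does want the DATA step route, one possible sketch uses BY-group processing with both grouping variables, so that first.timecat and last.timecat mark the start and end of each ID/timecat cell. It is untested and assumes the cancer.sample data set from the post above; sorted, want_ds, sum_logvl, n_logvl and mean_logvl are illustrative names.

```sas
/* Order the rows so each ID x TIMECAT cell is contiguous */
proc sort data=cancer.sample out=sorted;
    by id timecat;
run;

data want_ds;
    set sorted;
    by id timecat;

    /* first.timecat is also 1 whenever a new ID starts */
    if first.timecat then do;
        sum_logvl = 0;
        n_logvl = 0;
    end;

    /* Sum statements retain their values across rows */
    if not missing(logvl) then do;
        sum_logvl + logvl;
        n_logvl + 1;
    end;

    /* Write one row per cell; the mean stays missing if no values were seen */
    if last.timecat then do;
        if n_logvl > 0 then mean_logvl = sum_logvl / n_logvl;
        output;
    end;

    keep id timecat mean_logvl;
run;
```

No do loop or array is needed here: adding timecat to the BY statement is what supplies the second condition.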

 

PaigeMiller
Diamond | Level 26

@kastafford wrote:

 

 

After the by ID; if first.ID, I'm not sure how to add the conditional timecat variable.  I'm thinking either a do loop or array but I don't construct them well. 

 


Hint for future SAS usage: don't try to write your own code to do simple things like means, standard deviations, minimums, maximums and so on. SAS has already done this for you, with built-in error checking and verified results. In fact, even for more complicated analyses, if a SAS PROC exists that does what you want, don't write your own code. Spend some time learning what SAS PROCs are available and what they do.

--
Paige Miller
kastafford
Calcite | Level 5

Hi @PaigeMiller!  Thank you!  That was what I was looking for. 

kastafford
Calcite | Level 5
For clarity, the variable I wanted averaged is logvl, in case someone else wants to practice with the sample data.
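
To practice without the attached file, the sample rows listed in the question can be typed in with a DATA step. This is only a sketch: the data set name have is an assumption, and ID is read as a character variable because a value such as 002E5 would otherwise be interpreted as a number in scientific notation.

```sas
/* Sample rows from the question; ID is read as character ($) */
data have;
    input post logvl timecat id $;
    datalines;
0 5.2 -0.5 002E5
0 3.9 -0.5 002E5
0 3.6 -0.5 002E5
0 2.7 0.0 002E5
0 5.5 0.5 002E5
0 3.4 0.5 002E5
0 2.7 0.5 002E5
0 0.0 -3.0 003E5
0 3.5 -3.0 003E5
0 3.0 -2.5 003E5
0 4.9 -1.0 003E5
0 5.1 -0.5 003E5
0 4.8 -0.5 003E5
0 1.7 0.5 003E5
0 0.0 0.5 003E5
0 3.5 1.0 003E5
0 2.1 1.0 003E5
0 0.0 1.0 003E5
0 0.0 1.0 003E5
0 0.0 1.5 003E5
;
run;
```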

