Solved: Getting an average of multiple values from the same time period within each subject, and time period

kastafford · Posted 07-03-2018 12:33 PM

Hi there! I have repeated measures data in long format. Several subjects have multiple observations within the same binned time period. I would like to average the values for that subject within that time period (rather than selecting just one).

For example, (I've pasted the data frame below) subject 002E5 has three observations for time point -0.5 (variable name timecat), and three observations for time point 0.5; subject 003E5 has two observations for time point -3.0, etc. I am looking for a more efficient way to get the average value at time point x for subject y with all of the time points in one data set as opposed to splitting the data into each time category and getting the time point and subject specific averages and then merging it all together again.

I know how to get the average by unique ID, but can't figure out how to add the additional condition of timecat.

I've attached a snippet of the data in sas7bdat format.

Thanks!

POST	logvl	timecat	ID
0	5.2	-0.5	002E5
0	3.9	-0.5	002E5
0	3.6	-0.5	002E5
0	2.7	0.0	002E5
0	5.5	0.5	002E5
0	3.4	0.5	002E5
0	2.7	0.5	002E5
0	0.0	-3.0	003E5
0	3.5	-3.0	003E5
0	3.0	-2.5	003E5
0	4.9	-1.0	003E5
0	5.1	-0.5	003E5
0	4.8	-0.5	003E5
0	1.7	0.5	003E5
0	0.0	0.5	003E5
0	3.5	1.0	003E5
0	2.1	1.0	003E5
0	0.0	1.0	003E5
0	0.0	1.0	003E5
0	0.0	1.5	003E5

PaigeMiller · Posted 07-03-2018 12:48 PM

@kastafford wrote:

For example, (I've pasted the data frame below) subject 002E5 has three observations for time point -0.5 (variable name timecat), and three observations for time point 0.5; subject 003E5 has two observations for time point -3.0, etc. I am looking for a more efficient way to get the average value at time point x for subject y with all of the time points in one data set as opposed to splitting the data into each time category and getting the time point and subject specific averages and then merging it all together again.

I know how to get the average by unique ID, but can't figure out how to add the additional condition of timecat.

This is exactly what PROC SUMMARY was designed to do.

UNTESTED CODE

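/* HAVE and WANT are placeholder names for the input and output data sets */
proc summary data=have nway;     /* NWAY keeps only the ID-by-TIMECAT rows */
     class id timecat;           /* grouping variables */
     var post;                   /* analysis variable; the poster clarifies later in the thread that logvl is the one to average */
     output out=want mean=;      /* MEAN= with no name stores the mean under the analysis variable's own name */
run;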

--
Paige Miller
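
For anyone who wants to try this on the sample rows, here is a minimal sketch of the same PROC SUMMARY call applied to logvl, the variable the original poster later says should be averaged. The data set names have and want and the output variable name mean_logvl are placeholders. Only the first subject's rows and the three columns needed here are typed in, so treat this as an illustration rather than tested production code.

```sas
/* Sample rows for subject 002E5, copied from the question */
data have;
    input logvl timecat id $;
    datalines;
5.2 -0.5 002E5
3.9 -0.5 002E5
3.6 -0.5 002E5
2.7 0.0 002E5
5.5 0.5 002E5
3.4 0.5 002E5
2.7 0.5 002E5
;
run;

/* One output row per ID x TIMECAT combination */
proc summary data=have nway;
    class id timecat;
    var logvl;
    /* drop the automatic _TYPE_ and _FREQ_ columns; name the mean explicitly */
    output out=want(drop=_type_ _freq_) mean=mean_logvl;
run;

proc print data=want noobs;
run;
```

Worked by hand, the means for 002E5 are about 4.23 at time point -0.5, 2.70 at 0.0, and 3.87 at 0.5, so those are the values to look for in want.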

Reeza · Posted 07-03-2018 12:45 PM

@kastafford wrote:

I know how to get the average by unique ID, but can't figure out how to add the additional condition of timecat.

How are you currently doing it? I would assume adding TIMECAT to your BY, CLASS or GROUP statement would work.

This is a standard proc means/summary type question where you place ALL your grouping variables in the BY or CLASS statement.

https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas
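
To make the BY versus CLASS choice concrete, here is a small sketch using PROC MEANS. It assumes a data set named have with the variables id, timecat and logvl (four rows copied from the question); the output data set and variable names are placeholders. The practical difference is that BY expects the input to be sorted by the grouping variables, while CLASS does not.

```sas
/* Four rows for subject 003E5, copied from the question */
data have;
    input logvl timecat id $;
    datalines;
5.1 -0.5 003E5
4.8 -0.5 003E5
1.7 0.5 003E5
0.0 0.5 003E5
;
run;

/* Option 1: CLASS statement, no sorting required */
proc means data=have nway noprint;
    class id timecat;
    var logvl;
    output out=means_class mean=mean_logvl;
run;

/* Option 2: BY statement, input must be sorted by the BY variables first */
proc sort data=have out=have_sorted;
    by id timecat;
run;

proc means data=have_sorted noprint;
    by id timecat;
    var logvl;
    output out=means_by mean=mean_logvl;
run;
```

Both versions should produce one row per ID and timecat combination with the same mean_logvl values.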

kastafford · Posted 07-03-2018 12:51 PM

Hi @Reeza,

Thanks for your quick reply. The procedure below gets me the values, but I'm trying to do it in a DATA step to create a new variable.

proc means data=cancer.sample;
class ID timecat;
var logvl;
run;

After the by ID; if first.ID, I'm not sure how to add the timecat condition. I'm thinking either a do loop or an array, but I don't construct them well.

Thanks again!

PaigeMiller · Posted 07-03-2018 12:58 PM

@kastafford wrote:

After the by ID; if first.ID, I'm not sure how to add the timecat condition. I'm thinking either a do loop or an array, but I don't construct them well.

Hint for future SAS usage: don't try to write your own code to do simple things like means and standard deviations and minimums and maximums and so on. SAS has already done this for you, plus they have built in error-checking and verified the results. In fact, even for more complicated analyses, if a SAS PROC exists that does what you want, don't write your own code. Spend some time learning what SAS PROCs are available and what they do.

--
Paige Miller
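
If the goal is a new variable on every original row rather than a collapsed table, one possible approach (a sketch, not something posted in this thread) is to let PROC SUMMARY compute the means and then merge them back onto the data by ID and timecat. The names have, have_sorted, means, want and mean_logvl are all placeholders, and the rows are a small subset of the sample data.

```sas
/* A few rows for subject 002E5, copied from the question */
data have;
    input logvl timecat id $;
    datalines;
5.2 -0.5 002E5
3.9 -0.5 002E5
3.6 -0.5 002E5
2.7 0.0 002E5
;
run;

/* Sort so the later MERGE ... BY step is valid */
proc sort data=have out=have_sorted;
    by id timecat;
run;

/* Let the procedure compute the per-group means */
proc summary data=have_sorted nway;
    class id timecat;
    var logvl;
    output out=means(drop=_type_ _freq_) mean=mean_logvl;
run;

/* Attach each group's mean to every row of that group */
data want;
    merge have_sorted means;
    by id timecat;
run;
```

Each row in want keeps its own logvl and gains mean_logvl for its ID and timecat group, so no hand-written first.ID logic is needed.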

kastafford · Posted 07-03-2018 01:02 PM

you bet. thanks again.

kastafford · Posted 07-03-2018 12:55 PM

Hi @PaigeMiller! Thank you! That was what I was looking for.

kastafford · Posted 07-03-2018 12:56 PM

For clarity, the variable I wanted averaged is logvl, in case someone else wants to practice with the sample data.
