I have a long dataset with up to 6000 observations per participant and 121 participants. One of the variables is HR and I need to create average heart rate (variable=HR) for each participant and heart rate in each activity they participated in (Variable=activity with categories 0-5)
I can create the average HR for each participant using the following:
PROC MEANS DATA=mod_data NWAY NOPRINT; CLASS study_id;
VAR HR;
OUTPUT
OUT = SUMS(DROP=_TYPE_ RENAME=(_FREQ_=N_VISITS))
N = N_HR
MEAN = M_HR; RUN;
I am not quite sure how to get the average heart rate for each participant in each activity. SO I would need 6 new variables, one for each activity that gives the participants average heart rate in that activity. Not all participants did all activities and some participants did all of the activities so there are varying amounts of missing data. I need a dataset that looks like the following:
Study_id |
HRM_ACT0 |
HRM_ACT1 |
HRM_ACT2 |
HRM_ACT3 |
HRM_ACT4 |
HRM_ACT5 |
M_HR |
001 |
150 |
. |
155 |
156 |
. |
143 |
148 |
002 |
168 |
130 |
160 |
148 |
123 |
134 |
150 |
003 |
130 |
. |
124 |
. |
170 |
137 |
155 |
004 |
122 |
130 |
178 |
. |
. |
155 |
152 |
I've tried a couple ways to do this unsuccessfully so any help would be appreciated. A sample dataset is attached Thank you in advance.
@rfarmenta wrote:
I have a long dataset with up to 6000 observations per participant and 121 participants. One of the variables is HR and I need to create average heart rate (variable=HR) for each participant and heart rate in each activity they participated in (Variable=activity with categories 0-5)
I can create the average HR for each participant using the following:
PROC MEANS DATA=mod_data NWAY NOPRINT; CLASS study_id; VAR HR; OUTPUT OUT = SUMS(DROP=_TYPE_ RENAME=(_FREQ_=N_VISITS)) N = N_HR MEAN = M_HR; RUN;
I am not quite sure how to get the average heart rate for each participant in each activity. SO I would need 6 new variables, one for each activity that gives the participants average heart rate in that activity. Not all participants did all activities and some participants did all of the activities so there are varying amounts of missing data. I need a dataset that looks like the following:
Study_id
HRM_ACT0
HRM_ACT1
HRM_ACT2
HRM_ACT3
HRM_ACT4
HRM_ACT5
M_HR
001
150
.
155
156
.
143
148 002
168
130
160
148
123
134
150
003
130
.
124
.
170
137
155
004
122
130
178 .
.
155
152
I've tried a couple ways to do this unsuccessfully so any help would be appreciated. A sample dataset is attached Thank you in advance.
Just add Activity to a CLASS or BY statement to get data by Activity.
PROC MEANS is a foundational tool in SAS, I would strongly recommend you read up on it and learn how to use it properly.
PROC MEANS and FREQ should be the most frequently used tools in your toolkit for statistical analysis.
PROC MEANS is a foundational tool in SAS, I would strongly recommend you read up on it and learn how to use it properly.
If I was going to create a list of maxims, this would be one of them!
And I would mention PROC MEANS has an identical twin sister named PROC SUMMARY.
Do you need a data set (further programming uses it) or a report (people read it)?
For a data set add what ever variable you have that indicates the activity to the CLASS statement.
For a report either Proc Tabulate or Report and use the activity as an across variable.
Thank you! I need data and I was able to figure this out using proc means and proc transpose using the following code:
PROC MEANS DATA=mod_data NWAY NOPRINT;
CLASS study_id activity;
VAR HR;
OUTPUT
OUT = SUMS
MEAN = M_HR; RUN;
proc transpose data=SUMS_test(drop=_type_ _freq_)
out=transposed_hrm(drop=activity _label_)
name=activity prefix=ACT
suffix=HRM;
by study_id;
var M_HR;
run;
However, I am running into one issue that I've run into in the past which is why I had initially asked this questions. The transpose creates 6 new variables (act1hrm-act6hrm). When I transpose the data is puts each participants information in consecutive variables. For example, if a participant has HR data for activity 1, 3, and 5, the transpose puts the data in the act1hrm, act2hrm, and act3hrm instead of act1hrm, act3hrm, and act5hrm where the data should be placed. I hope this makes sense.
Previously, when I was doing something similar I used the solution below as suggested by another user to concatenate variables before transposing. However, this does not seem to work when modifying the code to use the HR variable, which is continuous, instead of the hrzone variable, which only has 3 categories. I am likely not doing something correctly and I know I need to read up on proc means more, which I am doing. But I can't seem to figure this out. Any help would be greatly appreciated.
proc means data=mod_data n completetypes;
var indicator;
class study_id activity hrzone;
types study_id*activity*hrzone;
output out=acthr_summary n=acthr;
title 'Time in activity by hrzone level for each participant';
run;
proc contents data=acthr_summary;
run;
* concatenate 'ACTHR', activity, hrzone, and 'T' to use when
* naming variable after transpose ;
data acthr_summary_with_combo;
length combo $ 8;
set acthr_summary(drop=_type_ _freq_);
combo=cat('ACTHR',activity,hrzone,'T');
run;
proc contents data=acthr_summary_with_combo;
run;
proc transpose data=acthr_summary_with_combo(drop=activity hrzone)
out=transposed_acthr(drop=_name_);
by study_id;
var acthr;
id combo;
run;
I figured it was going to be something simple, that worked! Thank you!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.