Hello SAS Users,
I have a long data set with daily data across 6 weeks. I am trying calculate the weekly mean for each ID for weeks that have 3 or more days of data. My data looks like this:
ID | Week | DayofWeek | StudyDay | SleepTime |
1 | 1 | 1 | 1 | 360 |
1 | 1 | 2 | 2 | 450 |
1 | 1 | 3 | 3 | 320 |
1 | 1 | 4 | 4 | 420 |
1 | 1 | 5 | 5 | 220 |
1 | 1 | 6 | 6 | 400 |
1 | 1 | 7 | 7 | 360 |
1 | 2 | 1 | 8 | 320 |
1 | 2 | 2 | 9 | . |
1 | 2 | 3 | 10 | 220 |
1 | 2 | 4 | 11 | . |
1 | 2 | 5 | 12 | 400 |
1 | 2 | 6 | 13 | . |
1 | 2 | 7 | 14 | 450 |
1 | 3 | 1 | 15 | . |
1 | 3 | 2 | 16 | . |
1 | 3 | 3 | 17 | . |
1 | 3 | 4 | 18 | . |
1 | 3 | 5 | 19 | . |
1 | 3 | 6 | 20 | 420 |
1 | 3 | 7 | 21 | 220 |
What I want is this:
ID | Week | DayofWeek | StudyDay | SleepTime | WeeklyAverage |
1 | 1 | 1 | 1 | 360 | 361.4285714 |
1 | 1 | 2 | 2 | 450 | 361.4285714 |
1 | 1 | 3 | 3 | 320 | 361.4285714 |
1 | 1 | 4 | 4 | 420 | 361.4285714 |
1 | 1 | 5 | 5 | 220 | 361.4285714 |
1 | 1 | 6 | 6 | 400 | 361.4285714 |
1 | 1 | 7 | 7 | 360 | 361.4285714 |
1 | 2 | 1 | 8 | 320 | 347.5 |
1 | 2 | 2 | 9 | . | 347.5 |
1 | 2 | 3 | 10 | 220 | 347.5 |
1 | 2 | 4 | 11 | . | 347.5 |
1 | 2 | 5 | 12 | 400 | 347.5 |
1 | 2 | 6 | 13 | . | 347.5 |
1 | 2 | 7 | 14 | 450 | 347.5 |
1 | 3 | 1 | 15 | . | . |
1 | 3 | 2 | 16 | . | . |
1 | 3 | 3 | 17 | . | . |
1 | 3 | 4 | 18 | . | . |
1 | 3 | 5 | 19 | . | . |
1 | 3 | 6 | 20 | 420 | . |
1 | 3 | 7 | 21 | 220 | . |
Here, there was no average calculated for Week 3 because there were not sufficient days of sleep data.
I've tried this code but it is producing duplicate days:
proc means data=work.have;
class week ID ;
var SleepTime;
output out=work.summary mean= n= nmiss= / autoname;
run;
proc sql noprint;
create table work.merge as
select a.* , b.*
from work.have a
,work.summary b
where a.ID = b.ID
order by a.ID, a.SleepTime;
quit;
proc sort data=work.merge;
by Week;
run;
data work.merge;
set merge;
by week;
if last.ID then do;
output;
if SleepTime_N >=3 then SleepTime=SleepTime_Mean;
else SleepTime=.;
ID="average of ID";
output;
end;
if index (ID, 'Avg')=0 then output;
run;
Your insight is greatly appreciated!
Best,
Stephanie
UNTESTED CODE
proc summary nway data=have;
class id week;
var sleeptime;
output out=_means_ mean=weeklyaverage n=n;
run;
data want;
merge have _means_;
by id week;
if n<3 then weeklyaverage=.;
run;
UNTESTED CODE
proc summary nway data=have;
class id week;
var sleeptime;
output out=_means_ mean=weeklyaverage n=n;
run;
data want;
merge have _means_;
by id week;
if n<3 then weeklyaverage=.;
run;
Hi @srobinson5 It's a straight forward SQL with a condition
data have;
input ID Week DayofWeek StudyDay SleepTime;
cards;
1 1 1 1 360
1 1 2 2 450
1 1 3 3 320
1 1 4 4 420
1 1 5 5 220
1 1 6 6 400
1 1 7 7 360
1 2 1 8 320
1 2 2 9 .
1 2 3 10 220
1 2 4 11 .
1 2 5 12 400
1 2 6 13 .
1 2 7 14 450
1 3 1 15 .
1 3 2 16 .
1 3 3 17 .
1 3 4 18 .
1 3 5 19 .
1 3 6 20 420
1 3 7 21 220
;
proc sql;
create table want as
select * ,ifn(n(sleeptime)>=3,mean(sleeptime),.) as Weekly_average
from have
group by id, week
order by id, week ,dayofweek;
quit;
And a Datastep approach:
data have;
input ID Week DayofWeek StudyDay SleepTime;
cards;
1 1 1 1 360
1 1 2 2 450
1 1 3 3 320
1 1 4 4 420
1 1 5 5 220
1 1 6 6 400
1 1 7 7 360
1 2 1 8 320
1 2 2 9 .
1 2 3 10 220
1 2 4 11 .
1 2 5 12 400
1 2 6 13 .
1 2 7 14 450
1 3 1 15 .
1 3 2 16 .
1 3 3 17 .
1 3 4 18 .
1 3 5 19 .
1 3 6 20 420
1 3 7 21 220
;
data want;
do _n_=1 by 1 until (last.week);
set have;
by id week;
_n=sum(_n,n(sleeptime));
_s=sum(sleeptime,_s,0);
end;
if _n>=3 then Weekly_average=_s/_n;
do _n_=1 to _n_;
set have;
output;
end;
drop _:;
run;
This can be done in a single SAS step, although that step reads the dataset twice, once to count and sum, and the second pass to output record including the newly calculated mean.
data have;
input ID Week DayofWeek StudyDay SleepTime;
cards;
1 1 1 1 360
1 1 2 2 450
1 1 3 3 320
1 1 4 4 420
1 1 5 5 220
1 1 6 6 400
1 1 7 7 360
1 2 1 8 320
1 2 2 9 .
1 2 3 10 220
1 2 4 11 .
1 2 5 12 400
1 2 6 13 .
1 2 7 14 450
1 3 1 15 .
1 3 2 16 .
1 3 3 17 .
1 3 4 18 .
1 3 5 19 .
1 3 6 20 420
1 3 7 21 220
;
data want (drop=_:);
set have (in=firstpass)
have (in=secondpass);
by id week;
if first.week then call missing(_n,_sum);
if firstpass=1 and sleeptime^=. then do;
_n+1;
_sum+sleeptime;
end;
if secondpass;
if _n>=3 then weeklyaverage=_sum/_n;
run;
Thank you all! Your solutions were extremely helpful and informative for getting the data I needed.
Best,
Stephanie
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Early bird rate extended! Save $200 when you sign up by March 31.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.