Hello SAS Users,
I have a long data set with daily data across 6 weeks. I am trying calculate the weekly mean for each ID for weeks that have 3 or more days of data. My data looks like this:
ID | Week | DayofWeek | StudyDay | SleepTime |
1 | 1 | 1 | 1 | 360 |
1 | 1 | 2 | 2 | 450 |
1 | 1 | 3 | 3 | 320 |
1 | 1 | 4 | 4 | 420 |
1 | 1 | 5 | 5 | 220 |
1 | 1 | 6 | 6 | 400 |
1 | 1 | 7 | 7 | 360 |
1 | 2 | 1 | 8 | 320 |
1 | 2 | 2 | 9 | . |
1 | 2 | 3 | 10 | 220 |
1 | 2 | 4 | 11 | . |
1 | 2 | 5 | 12 | 400 |
1 | 2 | 6 | 13 | . |
1 | 2 | 7 | 14 | 450 |
1 | 3 | 1 | 15 | . |
1 | 3 | 2 | 16 | . |
1 | 3 | 3 | 17 | . |
1 | 3 | 4 | 18 | . |
1 | 3 | 5 | 19 | . |
1 | 3 | 6 | 20 | 420 |
1 | 3 | 7 | 21 | 220 |
What I want is this:
ID | Week | DayofWeek | StudyDay | SleepTime | WeeklyAverage |
1 | 1 | 1 | 1 | 360 | 361.4285714 |
1 | 1 | 2 | 2 | 450 | 361.4285714 |
1 | 1 | 3 | 3 | 320 | 361.4285714 |
1 | 1 | 4 | 4 | 420 | 361.4285714 |
1 | 1 | 5 | 5 | 220 | 361.4285714 |
1 | 1 | 6 | 6 | 400 | 361.4285714 |
1 | 1 | 7 | 7 | 360 | 361.4285714 |
1 | 2 | 1 | 8 | 320 | 347.5 |
1 | 2 | 2 | 9 | . | 347.5 |
1 | 2 | 3 | 10 | 220 | 347.5 |
1 | 2 | 4 | 11 | . | 347.5 |
1 | 2 | 5 | 12 | 400 | 347.5 |
1 | 2 | 6 | 13 | . | 347.5 |
1 | 2 | 7 | 14 | 450 | 347.5 |
1 | 3 | 1 | 15 | . | . |
1 | 3 | 2 | 16 | . | . |
1 | 3 | 3 | 17 | . | . |
1 | 3 | 4 | 18 | . | . |
1 | 3 | 5 | 19 | . | . |
1 | 3 | 6 | 20 | 420 | . |
1 | 3 | 7 | 21 | 220 | . |
Here, there was no average calculated for Week 3 because there were not sufficient days of sleep data.
I've tried this code but it is producing duplicate days:
proc means data=work.have;
class week ID ;
var SleepTime;
output out=work.summary mean= n= nmiss= / autoname;
run;
proc sql noprint;
create table work.merge as
select a.* , b.*
from work.have a
,work.summary b
where a.ID = b.ID
order by a.ID, a.SleepTime;
quit;
proc sort data=work.merge;
by Week;
run;
data work.merge;
set merge;
by week;
if last.ID then do;
output;
if SleepTime_N >=3 then SleepTime=SleepTime_Mean;
else SleepTime=.;
ID="average of ID";
output;
end;
if index (ID, 'Avg')=0 then output;
run;
Your insight is greatly appreciated!
Best,
Stephanie
UNTESTED CODE
proc summary nway data=have;
class id week;
var sleeptime;
output out=_means_ mean=weeklyaverage n=n;
run;
data want;
merge have _means_;
by id week;
if n<3 then weeklyaverage=.;
run;
UNTESTED CODE
proc summary nway data=have;
class id week;
var sleeptime;
output out=_means_ mean=weeklyaverage n=n;
run;
data want;
merge have _means_;
by id week;
if n<3 then weeklyaverage=.;
run;
Hi @srobinson5 It's a straight forward SQL with a condition
data have;
input ID Week DayofWeek StudyDay SleepTime;
cards;
1 1 1 1 360
1 1 2 2 450
1 1 3 3 320
1 1 4 4 420
1 1 5 5 220
1 1 6 6 400
1 1 7 7 360
1 2 1 8 320
1 2 2 9 .
1 2 3 10 220
1 2 4 11 .
1 2 5 12 400
1 2 6 13 .
1 2 7 14 450
1 3 1 15 .
1 3 2 16 .
1 3 3 17 .
1 3 4 18 .
1 3 5 19 .
1 3 6 20 420
1 3 7 21 220
;
proc sql;
create table want as
select * ,ifn(n(sleeptime)>=3,mean(sleeptime),.) as Weekly_average
from have
group by id, week
order by id, week ,dayofweek;
quit;
And a Datastep approach:
data have;
input ID Week DayofWeek StudyDay SleepTime;
cards;
1 1 1 1 360
1 1 2 2 450
1 1 3 3 320
1 1 4 4 420
1 1 5 5 220
1 1 6 6 400
1 1 7 7 360
1 2 1 8 320
1 2 2 9 .
1 2 3 10 220
1 2 4 11 .
1 2 5 12 400
1 2 6 13 .
1 2 7 14 450
1 3 1 15 .
1 3 2 16 .
1 3 3 17 .
1 3 4 18 .
1 3 5 19 .
1 3 6 20 420
1 3 7 21 220
;
data want;
do _n_=1 by 1 until (last.week);
set have;
by id week;
_n=sum(_n,n(sleeptime));
_s=sum(sleeptime,_s,0);
end;
if _n>=3 then Weekly_average=_s/_n;
do _n_=1 to _n_;
set have;
output;
end;
drop _:;
run;
This can be done in a single SAS step, although that step reads the dataset twice, once to count and sum, and the second pass to output record including the newly calculated mean.
data have;
input ID Week DayofWeek StudyDay SleepTime;
cards;
1 1 1 1 360
1 1 2 2 450
1 1 3 3 320
1 1 4 4 420
1 1 5 5 220
1 1 6 6 400
1 1 7 7 360
1 2 1 8 320
1 2 2 9 .
1 2 3 10 220
1 2 4 11 .
1 2 5 12 400
1 2 6 13 .
1 2 7 14 450
1 3 1 15 .
1 3 2 16 .
1 3 3 17 .
1 3 4 18 .
1 3 5 19 .
1 3 6 20 420
1 3 7 21 220
;
data want (drop=_:);
set have (in=firstpass)
have (in=secondpass);
by id week;
if first.week then call missing(_n,_sum);
if firstpass=1 and sleeptime^=. then do;
_n+1;
_sum+sleeptime;
end;
if secondpass;
if _n>=3 then weeklyaverage=_sum/_n;
run;
Thank you all! Your solutions were extremely helpful and informative for getting the data I needed.
Best,
Stephanie
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.