Hello SAS Users,
I have a long data set with daily data across 6 weeks. I am trying calculate the weekly mean for each ID for weeks that have 3 or more days of data. My data looks like this:
| ID | Week | DayofWeek | StudyDay | SleepTime | 
| 1 | 1 | 1 | 1 | 360 | 
| 1 | 1 | 2 | 2 | 450 | 
| 1 | 1 | 3 | 3 | 320 | 
| 1 | 1 | 4 | 4 | 420 | 
| 1 | 1 | 5 | 5 | 220 | 
| 1 | 1 | 6 | 6 | 400 | 
| 1 | 1 | 7 | 7 | 360 | 
| 1 | 2 | 1 | 8 | 320 | 
| 1 | 2 | 2 | 9 | . | 
| 1 | 2 | 3 | 10 | 220 | 
| 1 | 2 | 4 | 11 | . | 
| 1 | 2 | 5 | 12 | 400 | 
| 1 | 2 | 6 | 13 | . | 
| 1 | 2 | 7 | 14 | 450 | 
| 1 | 3 | 1 | 15 | . | 
| 1 | 3 | 2 | 16 | . | 
| 1 | 3 | 3 | 17 | . | 
| 1 | 3 | 4 | 18 | . | 
| 1 | 3 | 5 | 19 | . | 
| 1 | 3 | 6 | 20 | 420 | 
| 1 | 3 | 7 | 21 | 220 | 
What I want is this:
| ID | Week | DayofWeek | StudyDay | SleepTime | WeeklyAverage | 
| 1 | 1 | 1 | 1 | 360 | 361.4285714 | 
| 1 | 1 | 2 | 2 | 450 | 361.4285714 | 
| 1 | 1 | 3 | 3 | 320 | 361.4285714 | 
| 1 | 1 | 4 | 4 | 420 | 361.4285714 | 
| 1 | 1 | 5 | 5 | 220 | 361.4285714 | 
| 1 | 1 | 6 | 6 | 400 | 361.4285714 | 
| 1 | 1 | 7 | 7 | 360 | 361.4285714 | 
| 1 | 2 | 1 | 8 | 320 | 347.5 | 
| 1 | 2 | 2 | 9 | . | 347.5 | 
| 1 | 2 | 3 | 10 | 220 | 347.5 | 
| 1 | 2 | 4 | 11 | . | 347.5 | 
| 1 | 2 | 5 | 12 | 400 | 347.5 | 
| 1 | 2 | 6 | 13 | . | 347.5 | 
| 1 | 2 | 7 | 14 | 450 | 347.5 | 
| 1 | 3 | 1 | 15 | . | . | 
| 1 | 3 | 2 | 16 | . | . | 
| 1 | 3 | 3 | 17 | . | . | 
| 1 | 3 | 4 | 18 | . | . | 
| 1 | 3 | 5 | 19 | . | . | 
| 1 | 3 | 6 | 20 | 420 | . | 
| 1 | 3 | 7 | 21 | 220 | . | 
Here, there was no average calculated for Week 3 because there were not sufficient days of sleep data.
I've tried this code but it is producing duplicate days:
proc means data=work.have;
class week ID ;
var SleepTime;
output out=work.summary mean= n= nmiss= / autoname;
run;
proc sql noprint;
create table work.merge as
select a.* , b.*
from work.have a
,work.summary b
where a.ID = b.ID
order by a.ID, a.SleepTime;
quit;
proc sort data=work.merge;
by Week;
run;
data work.merge;
set merge;
by week;
if last.ID then do;
output;
if SleepTime_N >=3 then SleepTime=SleepTime_Mean;
else SleepTime=.;
ID="average of ID";
output;
end;
if index (ID, 'Avg')=0 then output;
run;
Your insight is greatly appreciated!
Best,
Stephanie
UNTESTED CODE
proc summary nway data=have;
    class id week;
    var sleeptime;
    output out=_means_ mean=weeklyaverage n=n;
run;
data want;
    merge have _means_;
    by id week;
    if n<3 then weeklyaverage=.;
run;UNTESTED CODE
proc summary nway data=have;
    class id week;
    var sleeptime;
    output out=_means_ mean=weeklyaverage n=n;
run;
data want;
    merge have _means_;
    by id week;
    if n<3 then weeklyaverage=.;
run;Hi @srobinson5 It's a straight forward SQL with a condition
data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;
proc sql;
create table want as
select * ,ifn(n(sleeptime)>=3,mean(sleeptime),.) as Weekly_average
from have
group by id, week
order by id, week ,dayofweek;
quit;
And a Datastep approach:
data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;
data want;
 do _n_=1 by 1 until (last.week);
  set have;
  by id week;
  _n=sum(_n,n(sleeptime));
  _s=sum(sleeptime,_s,0);
 end;
 if _n>=3 then Weekly_average=_s/_n;
 do _n_=1 to _n_;
  set have;
  output;
 end;
 drop _:;
run;
This can be done in a single SAS step, although that step reads the dataset twice, once to count and sum, and the second pass to output record including the newly calculated mean.
data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;
data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by id week;
  if first.week then call missing(_n,_sum);
  if firstpass=1 and sleeptime^=. then do;
    _n+1;
    _sum+sleeptime;
  end;
  if secondpass;
  if _n>=3 then weeklyaverage=_sum/_n;
run;Thank you all! Your solutions were extremely helpful and informative for getting the data I needed.
Best,
Stephanie
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
