BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
srobinson5
Fluorite | Level 6

Hello SAS Users,

 

 I have a long data set with daily data across 6 weeks. I am trying calculate the weekly mean for each ID for weeks that have 3 or more days of data. My data looks like this:

 

IDWeekDayofWeekStudyDaySleepTime
1111360
1122450
1133320
1144420
1155220
1166400
1177360
1218320
1229.
12310220
12411.
12512400
12613.
12714450
13115.
13216.
13317.
13418.
13519.
13620420
13721220

 

What I want is this:

 

IDWeekDayofWeekStudyDaySleepTimeWeeklyAverage
1111360361.4285714
1122450361.4285714
1133320361.4285714
1144420361.4285714
1155220361.4285714
1166400361.4285714
1177360361.4285714
1218320347.5
1229.347.5
12310220347.5
12411.347.5
12512400347.5
12613.347.5
12714450347.5
13115..
13216..
13317..
13418..
13519..
13620420.
13721220.

 

Here, there was no average calculated for Week 3 because there were not sufficient days of sleep data.

 

I've tried this code but it is producing duplicate days:

 

 

proc means data=work.have;
class week ID ;
var SleepTime;
output out=work.summary mean= n= nmiss= / autoname;
run;

proc sql noprint;
create table work.merge as
select a.* , b.*
from work.have a
,work.summary b
where a.ID = b.ID
order by a.ID, a.SleepTime;
quit;


proc sort data=work.merge;
by Week;
run;


data work.merge;
set merge;
by week;
if last.ID then do;
output;
if SleepTime_N >=3 then SleepTime=SleepTime_Mean;
else SleepTime=.;
ID="average of ID";
output;
end;
if index (ID, 'Avg')=0 then output;
run;

 

 

 

Your insight is greatly appreciated!

Best,

Stephanie 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

UNTESTED CODE

 

proc summary nway data=have;
    class id week;
    var sleeptime;
    output out=_means_ mean=weeklyaverage n=n;
run;
data want;
    merge have _means_;
    by id week;
    if n<3 then weeklyaverage=.;
run;
--
Paige Miller

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

UNTESTED CODE

 

proc summary nway data=have;
    class id week;
    var sleeptime;
    output out=_means_ mean=weeklyaverage n=n;
run;
data want;
    merge have _means_;
    by id week;
    if n<3 then weeklyaverage=.;
run;
--
Paige Miller
novinosrin
Tourmaline | Level 20

Hi @srobinson5  It's a straight forward SQL with a condition

 


data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;


proc sql;
create table want as
select * ,ifn(n(sleeptime)>=3,mean(sleeptime),.) as Weekly_average
from have
group by id, week
order by id, week ,dayofweek;
quit;


novinosrin
Tourmaline | Level 20

And a Datastep approach:

 


data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;

data want;
 do _n_=1 by 1 until (last.week);
  set have;
  by id week;
  _n=sum(_n,n(sleeptime));
  _s=sum(sleeptime,_s,0);
 end;
 if _n>=3 then Weekly_average=_s/_n;
 do _n_=1 to _n_;
  set have;
  output;
 end;
 drop _:;
run;
mkeintz
PROC Star

This can be done in a single SAS step, although that step reads the dataset twice, once to count and sum, and the second pass to output record including the newly calculated mean.

 

data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;
data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by id week;
  if first.week then call missing(_n,_sum);
  if firstpass=1 and sleeptime^=. then do;
    _n+1;
    _sum+sleeptime;
  end;
  if secondpass;
  if _n>=3 then weeklyaverage=_sum/_n;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
srobinson5
Fluorite | Level 6

Thank you all! Your solutions were extremely helpful and informative for getting the data I needed.

 

Best,

Stephanie

sas-innovate-white.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.

 

Early bird rate extended! Save $200 when you sign up by March 31.

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1252 views
  • 4 likes
  • 4 in conversation