BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
srobinson5
Fluorite | Level 6

Hello SAS Users,

 

 I have a long data set with daily data across 6 weeks. I am trying calculate the weekly mean for each ID for weeks that have 3 or more days of data. My data looks like this:

 

IDWeekDayofWeekStudyDaySleepTime
1111360
1122450
1133320
1144420
1155220
1166400
1177360
1218320
1229.
12310220
12411.
12512400
12613.
12714450
13115.
13216.
13317.
13418.
13519.
13620420
13721220

 

What I want is this:

 

IDWeekDayofWeekStudyDaySleepTimeWeeklyAverage
1111360361.4285714
1122450361.4285714
1133320361.4285714
1144420361.4285714
1155220361.4285714
1166400361.4285714
1177360361.4285714
1218320347.5
1229.347.5
12310220347.5
12411.347.5
12512400347.5
12613.347.5
12714450347.5
13115..
13216..
13317..
13418..
13519..
13620420.
13721220.

 

Here, there was no average calculated for Week 3 because there were not sufficient days of sleep data.

 

I've tried this code but it is producing duplicate days:

 

 

proc means data=work.have;
class week ID ;
var SleepTime;
output out=work.summary mean= n= nmiss= / autoname;
run;

proc sql noprint;
create table work.merge as
select a.* , b.*
from work.have a
,work.summary b
where a.ID = b.ID
order by a.ID, a.SleepTime;
quit;


proc sort data=work.merge;
by Week;
run;


data work.merge;
set merge;
by week;
if last.ID then do;
output;
if SleepTime_N >=3 then SleepTime=SleepTime_Mean;
else SleepTime=.;
ID="average of ID";
output;
end;
if index (ID, 'Avg')=0 then output;
run;

 

 

 

Your insight is greatly appreciated!

Best,

Stephanie 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

UNTESTED CODE

 

proc summary nway data=have;
    class id week;
    var sleeptime;
    output out=_means_ mean=weeklyaverage n=n;
run;
data want;
    merge have _means_;
    by id week;
    if n<3 then weeklyaverage=.;
run;
--
Paige Miller

View solution in original post

5 REPLIES 5
PaigeMiller
Diamond | Level 26

UNTESTED CODE

 

proc summary nway data=have;
    class id week;
    var sleeptime;
    output out=_means_ mean=weeklyaverage n=n;
run;
data want;
    merge have _means_;
    by id week;
    if n<3 then weeklyaverage=.;
run;
--
Paige Miller
novinosrin
Tourmaline | Level 20

Hi @srobinson5  It's a straight forward SQL with a condition

 


data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;


proc sql;
create table want as
select * ,ifn(n(sleeptime)>=3,mean(sleeptime),.) as Weekly_average
from have
group by id, week
order by id, week ,dayofweek;
quit;


novinosrin
Tourmaline | Level 20

And a Datastep approach:

 


data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;

data want;
 do _n_=1 by 1 until (last.week);
  set have;
  by id week;
  _n=sum(_n,n(sleeptime));
  _s=sum(sleeptime,_s,0);
 end;
 if _n>=3 then Weekly_average=_s/_n;
 do _n_=1 to _n_;
  set have;
  output;
 end;
 drop _:;
run;
mkeintz
PROC Star

This can be done in a single SAS step, although that step reads the dataset twice, once to count and sum, and the second pass to output record including the newly calculated mean.

 

data have;
input ID	Week	DayofWeek	StudyDay	SleepTime;
cards;
1	1	1	1	360
1	1	2	2	450
1	1	3	3	320
1	1	4	4	420
1	1	5	5	220
1	1	6	6	400
1	1	7	7	360
1	2	1	8	320
1	2	2	9	.
1	2	3	10	220
1	2	4	11	.
1	2	5	12	400
1	2	6	13	.
1	2	7	14	450
1	3	1	15	.
1	3	2	16	.
1	3	3	17	.
1	3	4	18	.
1	3	5	19	.
1	3	6	20	420
1	3	7	21	220
;
data want (drop=_:);
  set have (in=firstpass)
      have (in=secondpass);
  by id week;
  if first.week then call missing(_n,_sum);
  if firstpass=1 and sleeptime^=. then do;
    _n+1;
    _sum+sleeptime;
  end;
  if secondpass;
  if _n>=3 then weeklyaverage=_sum/_n;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
srobinson5
Fluorite | Level 6

Thank you all! Your solutions were extremely helpful and informative for getting the data I needed.

 

Best,

Stephanie

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 684 views
  • 4 likes
  • 4 in conversation