Hello
I'm trying to create a week variable that assigns the week number based on when Jan 1 falls for that year.
If there are more December days than January days in the week that contains January 1st, it becomes week 53. If there are more January days, it becomes week 1. For example:
Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Week |
Dec 28 | Dec 29 | Dec 30 | Dec 31 | Jan 1 | Jan 2 | Jan 3 | Week 53 |
Dec 29 | Dec 30 | Dec 31 | Jan 1 | Jan 2 | Jan 3 | Jan 4 | Week 1 |
I've been manually coding this, but now that I have to do this for a longer time frame, I'm hoping there is an easier/faster way to do this.
if case_reported_date >= '21dec2014'd and case_reported_date <= '27dec2014'd then week = '52';
if case_reported_date >= '28dec2014'd and case_reported_date <= '03jan2015'd then week = '53';
if case_reported_date >= '04jan2015'd and case_reported_date <= '10jan2015'd then week = '01';
if case_reported_date >= '11jan2015'd and case_reported_date <= '17jan2015'd then week = '02';
If I interpret the source data right then a week spans from Sunday to Saturday. The week count per year is based on the middle of the week - Wednesday. So week 1 is the one with the first Wednesday of the year.
If I've got that right then below code should do what you're after.
data test;
infile datalines4 truncover dlm=';';
attrib
week_expected informat=best32. start_dt end_dt informat=anydtdte. format=date9.
date date_shift date_fwy format=date9. week_calculated length=8
;
input week_expected start_dt end_dt;
do date=start_dt to end_dt;
weekday=put(date,downame.);
/* Date of Wednesday within a Sunday to Saturday week span */
date_shift=intnx('week.4',date+3,0,'b');
/* Date of first Wednesday of the year */
date_fwy=intnx('year',date_shift,0,'b');
date_fwy=intnx('week.5',date_fwy,0,'e');
/* Week number for week starting on Sunday */
/* First week of year based on date of Wednesday */
week_calculated=intck('week',date_fwy,date_shift)+1;
/* compare expected with calculated week count */
diff=week_expected-week_calculated;
output;
end;
datalines4;
51;December 18, 2011;December 24, 2011
52;December 25, 2011;December 31, 2011
1;January 1, 2012;January 7, 2012
2;January 8, 2012;January 14, 2012
51;December 16, 2012;December 22, 2012
52;December 23, 2012;December 29, 2012
1;December 30, 2012;January 5, 2013
2;January 6, 2013;January 12, 2013
51;December 15, 2013;December 21, 2013
52;December 22, 2013;December 28, 2013
1;December 29, 2013;January 4, 2014
2;January 5, 2014;January 11, 2014
52;December 21, 2014;December 27, 2014
53;December 28, 2014;January 3, 2015
1;January 4, 2015;January 10, 2015
2;January 11, 2015;January 17, 2015
;;;;
run;
Message was edited by: Patrick Matter
Based on comment fixed the narrative to correctly describe what the posted code does (week starting on Sunday and not on Saturday).
How are you calculating week in the first place other than in this scenario?
I've tried using the WEEK function (e.g. cweek = week(date)), but for this particular senario it doesn't work once you get to the end/start of the year. Sometimes the assigned week will have less than 7 days.
Also, becuase I have multiple years of data in my dataset, the weeks will need to 'reset' once it hits 52 or 53 weeks.
Brenda,
If you can use a Monday to Sunday criterion, rather than Sunday to Saturday, you could use the week function 'V' modifier. e.g.:
data want;
do date='28DEC2015'd to '03jan2016'd;
week=week(date,'V');
output;
end;
do date='29DEC1997'd to '04jan1998'd;
week=week(date,'V');
output;
end;
run;
This may do what you want, although I am not sure it follows exactly the same rule that you described.
/*
compute year/week according to this rule:
If Jan 1 falls on a Sunday through Wednesday, then the week that includes Jan 1
is week 1. Otherwise the following week is week 1. Then weeks are numbered consecutively
up to the last week of the year, which may be a 52 or a 53.
*/
data date1(keep=cdcweek dth_date);
length cdcweek $ 6;
do dth_date = '01jan2009'd to '31dec2012'd;
year = year(dth_date);
yyyy = put(year,4.);
pyyy = put(year-1,4.);
first_sat = mdy(1,1,year);
jan01 = mdy(1,1,year);
prevjan01 = mdy(1,1,year-1);
do while(weekday(first_sat) ne 7);
first_sat = first_sat + 1;
end;
if dth_date = . then cdcweek = '';
else do;
if weekday(jan01) le 4 then do;
if dth_date le first_sat then cdcweek = yyyy||"01";
else cdcweek = put(((year*100) + int((dth_date-first_sat-1)/7) + 2),6.);
end;
else do;
if dth_date le first_sat then do;
if (weekday(prevjan01) = 4) or ((weekday(prevjan01) = 3) and
(jan01-prevjan01 = 366)) then cdcweek = pyyy||"53";
else cdcweek = pyyy||"52";
end;
else cdcweek = put(((year*100)+int((dth_date-first_sat-1)/7) + 1),6.);
end; /* weekday(jan01) le 4 */
end; /* date missing */
if cdcweek ge 200901 then output;
end; /* dth_date from 1/1/2009 to 12/31/2012 */
run;
This seems to be doing part of the trick, but it looks like week 1 is getting lost in some instances, and week 53 is being generated when it should end at week 52.
cdcweek | dth_date | epiweek | year |
201152 | 25-Dec-11 | 52 | 2011 |
201152 | 26-Dec-11 | 52 | 2011 |
201152 | 27-Dec-11 | 52 | 2011 |
201152 | 28-Dec-11 | 52 | 2011 |
201152 | 29-Dec-11 | 52 | 2011 |
201152 | 30-Dec-11 | 52 | 2011 |
201152 | 31-Dec-11 | 52 | 2011 |
201202 | 8-Jan-12 | 2 | 2012 |
201202 | 9-Jan-12 | 2 | 2012 |
201202 | 10-Jan-12 | 2 | 2012 |
201202 | 11-Jan-12 | 2 | 2012 |
201202 | 12-Jan-12 | 2 | 2012 |
201202 | 13-Jan-12 | 2 | 2012 |
201202 | 14-Jan-12 | 2 | 2012 |
201252 | 23-Dec-12 | 52 | 2012 |
201252 | 24-Dec-12 | 52 | 2012 |
201252 | 25-Dec-12 | 52 | 2012 |
201252 | 26-Dec-12 | 52 | 2012 |
201252 | 27-Dec-12 | 52 | 2012 |
201252 | 28-Dec-12 | 52 | 2012 |
201252 | 29-Dec-12 | 52 | 2012 |
201253 | 30-Dec-12 | 53 | 2012 |
201253 | 31-Dec-12 | 53 | 2012 |
201302 | 6-Jan-13 | 2 | 2013 |
201302 | 7-Jan-13 | 2 | 2013 |
201302 | 8-Jan-13 | 2 | 2013 |
201302 | 9-Jan-13 | 2 | 2013 |
201302 | 10-Jan-13 | 2 | 2013 |
201302 | 11-Jan-13 | 2 | 2013 |
201302 | 12-Jan-13 | 2 | 2013 |
This is what I get when I run code that I cut-and-pasted from my first post. The missing week 1 suggests that something went missing when you edited the code. I am not sure of the week 52/53 discrepancy, since the rule I used may be different from yours.
Obs cdcweek dth_date
...
1084 | 201151 | 12/23/2011 | |
1085 | 201151 | 12/24/2011 | |
1086 | 201152 | 12/25/2011 | |
1087 | 201152 | 12/26/2011 | |
1088 | 201152 | 12/27/2011 | |
1089 | 201152 | 12/28/2011 | |
1090 | 201152 | 12/29/2011 | |
1091 | 201152 | 12/30/2011 | |
1092 | 201152 | 12/31/2011 | |
1093 | 201201 | 01/01/2012 | |
1094 | 201201 | 01/02/2012 | |
1095 | 201201 | 01/03/2012 | |
1096 | 201201 | 01/04/2012 | |
1097 | 201201 | 01/05/2012 | |
1098 | 201201 | 01/06/2012 |
1099 | 201201 | 01/07/2012 | |
1100 | 201202 | 01/08/2012 | |
1101 | 201202 | 01/09/2012 | |
1102 | 201202 | 01/10/2012 | |
1103 | 201202 | 01/11/2012 | |
1104 | 201202 | 01/12/2012 | |
1105 | 201202 | 01/13/2012 | |
1106 | 201202 | 01/14/2012 |
...
Me thinks that you have to define your rules more clearly. Since 7*52 only equals 364, there will always be 1 or 2 days (depending upon leap years) that still have to be accounted for. From your example, I presume that a week 53 is desired.
I think that the following comes close, but probably needs some tweeking:
data test;
format date weekdate29.;
/* test cases for Sunday January 1st*/
do date='28DEC2005'd to '31dec2006'd;
start="Sunday";
output;
end;
/* test cases for Monday January 1st*/
do date='1jan2007'd to '31dec2007'd;
start="Monday";
output;
end;
/* test cases for Tuesday January 1st*/
do date='1jan2008'd to '31dec2008'd;
start="Tuesday";
output;
end;
/* test cases for Thursday January 1st*/
do date='1jan2009'd to '31dec2009'd;
start="Thursday";
output;
end;
/* test cases for Friday January 1st*/
do date='1jan2010'd to '31dec2010'd;
start="Friday";
output;
end;
/* test cases for Saturday January 1st*/
do date='1jan2011'd to '31dec2011'd;
start="Saturday";
output;
end;
/* test cases for Sunday January 1st*/
do date='1jan2012'd to '31dec2012'd;
start="Sunday";
output;
end;
/* test cases for Tuesday January 1st*/
do date='1jan2013'd to '31dec2013'd;
start="Tuesday";
output;
end;
/* test cases for Wednesday January 1st*/
do date='1jan2014'd to '31dec2014'd;
start="Wednesday";
output;
end;
/* test cases for Thursday January 1st*/
do date='1jan2015'd to '31dec2015'd;
start="Thursday";
output;
end;
run;
data want;
set test;
if mdy(12,28,year(date)) <= date <= mdy(12,31,year(date)) then do;
if weekday(mdy(1,1,year(date)+1)) eq 1 then week=ceil((date-mdy(1,1,year(date))+1)/7);
else if weekday(mdy(1,1,year(date)+1)) eq 2 and day(date) eq 31 then week=53;
else if weekday(mdy(1,1,year(date)+1)) eq 3 and day(date) gt 30 then week=53;
else if weekday(mdy(1,1,year(date)+1)) eq 4 and day(date) gt 29 then week=53;
else week=ceil((date-mdy(1,1,year(date))+1)/7);
end;
else if mdy(1,1,year(date)) <= date <= mdy(1,3,year(date)) then do;
if weekday(mdy(1,1,year(date))) le 4 then week=1;
else week=53;
end;
else week=ceil((date-mdy(1,1,year(date))+1)/7);
run;
From Brenda's example, there does not have to be a week 53 in every year, since up to 3 days at the beginning of the year may be counted in week 53 of the previous year.
Now I see where I'm confused. In your example, which I've repeated below, for the third and fourth lines of code I would have guessed that the rules would have been:
if case_reported_date >= '29dec2013'd and case_reported_date <= '04jan2014'd then week = 1;
if case_reported_date >= '05jan2014'd and case_reported_date <= '11jan2014'd then week = 2;
if case_reported_date >= '11jan2014'd and case_reported_date <= '17jan2014'd then week = 3;
Please let us know if your example was wrong, or if I really am simply confused.
Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Week |
Dec 28 | Dec 29 | Dec 30 | Dec 31 | Jan 1 | Jan 2 | Jan 3 | Week 53 |
Dec 29 | Dec 30 | Dec 31 | Jan 1 | Jan 2 | Jan 3 | Jan 4 | Week 1 |
I've been manually coding this, but now that I have to do this for a longer time frame, I'm hoping there is an easier/faster way to do this.
if case_reported_date >= '21dec2014'd and case_reported_date <= '27dec2014'd then week = '52';
if case_reported_date >= '28dec2014'd and case_reported_date <= '03jan2015'd then week = '53';
if case_reported_date >= '04jan2015'd and case_reported_date <= '10jan2015'd then week = '01';
if case_reported_date >= '11jan2015'd and case_reported_date <= '17jan2015'd then week = '02';
Here is an example of how I'm trying to group my weeks.. I've provided some links that show how some historical years are calculated
http://www.phac-aspc.gc.ca/fluwatch/11-12/11-12cal-eng.php
http://www.phac-aspc.gc.ca/fluwatch/12-13/12-13cal-eng.php
http://www.phac-aspc.gc.ca/fluwatch/14-15/14-15cal-eng.php
35 | August 24, 2014 | August 30, 2014 |
36 | August 31, 2014 | September 6, 2014 |
37 | September 7, 2014 | September 13, 2014 |
38 | September 14, 2014 | September 20, 2014 |
39 | September 21, 2014 | September 27, 2014 |
40 | September 28, 2014 | October 4, 2014 |
41 | October 5, 2014 | October 11, 2014 |
42 | October 12, 2014 | October 18, 2014 |
43 | October 19, 2014 | October 25, 2014 |
44 | October 26, 2014 | November 1, 2014 |
45 | November 2, 2014 | November 8, 2014 |
46 | November 9, 2014 | November 15, 2014 |
47 | November 16, 2014 | November 22, 2014 |
48 | November 23, 2014 | November 29, 2014 |
49 | November 30, 2014 | December 6, 2014 |
50 | December 7, 2014 | December 13, 2014 |
51 | December 14, 2014 | December 20, 2014 |
52 | December 21, 2014 | December 27, 2014 |
53 | December 28, 2014 | January 3, 2015 |
1 | January 4, 2015 | January 10, 2015 |
2 | January 11, 2015 | January 17, 2015 |
3 | January 18, 2015 | January 24, 2015 |
4 | January 25, 2015 | January 31, 2015 |
5 | February 1, 2015 | February 7, 2015 |
6 | February 8, 2015 | February 14, 2015 |
7 | February 15, 2015 | February 21, 2015 |
8 | February 22, 2015 | February 28, 2015 |
9 | March 1, 2015 | March 7, 2015 |
10 | March 8, 2015 | March 14, 2015 |
11 | March 15, 2015 | March 21, 2015 |
12 | March 22, 2015 | March 28, 2015 |
13 | March 29, 2015 | April 4, 2015 |
14 | April 5, 2015 | April 11, 2015 |
15 | April 12, 2015 | April 18, 2015 |
16 | April 19, 2015 | April 25, 2015 |
17 | April 26, 2015 | May 2, 2015 |
18 | May 3, 2015 | May 9, 2015 |
19 | May 10, 2015 | May 16, 2015 |
20 | May 17, 2015 | May 23, 2015 |
21 | May 24, 2015 | May 30, 2015 |
22 | May 31, 2015 | June 6, 2015 |
23 | June 7, 2015 | June 13, 2015 |
24 | June 14, 2015 | June 20, 2015 |
25 | June 21, 2015 | June 27, 2015 |
26 | June 28, 2015 | July 4, 2015 |
27 | July 5, 2015 | July 11, 2015 |
28 | July 12, 2015 | July 18, 2015 |
29 | July 19, 2015 | July 25, 2015 |
30 | July 26, 2015 | August 1, 2015 |
31 | August 2, 2015 | August 8, 2015 |
32 | August 9, 2015 | August 15, 2015 |
33 | August 16, 2015 | August 22, 2015 |
34 | August 23, 2015 | August 29, 2015 |
Brenda,
Too bad that they don't publish more examples, but I think the links you provided answered most of my questions anyway.
Try the following code and see if the results match the desired calendar: They appear to match the three published tables.
data test;
format date weekdate29.;
do date='01JAN2005'd to '31dec2015'd;
output;
end;
run;
data want (drop=year_: last_: first_:);
format first_sunday weekdate29.;
set test;
first_sunday=mdy(1,1,year(date))-weekday(mdy(1,1,year(date)))+1;
if year(first_sunday) lt year(date) and first_sunday lt mdy(12,29,year(date)-1) then
first_sunday+7;
year_start=weekday(mdy(1,1,year(date)));
year_end=weekday(mdy(12,31,year(date)));
last_year_end=weekday(mdy(12,31,year(date)-1));
if year_end le 3 and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=1;
else if year_start eq 4 and (mdy(12,31,year(date))-mdy(1,1,year(date))) eq 365
and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=53;
else if 5<=year_end<=6 and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=52;
else if year_end eq 4 and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=53;
else if last_year_end eq 5 and (mdy(12,31,year(date)-1)-mdy(1,1,year(date)-1)) eq 365
and mdy(1,1,year(date))<=date<=mdy(1,2,year(date)) then week=53;
else if last_year_end le 3 and mdy(1,1,year(date))<=date<=mdy(1,4-last_year_end,year(date)) then week=1;
else if 5<=last_year_end<=6 and mdy(1,1,year(date))<=date<=mdy(1,7-last_year_end,year(date)) then week=52;
else if last_year_end eq 4 and mdy(1,1,year(date))<=date<=mdy(1,7-last_year_end,year(date)) then week=53;
else week=intck('week',first_sunday,date)+1;
run;
If I interpret the source data right then a week spans from Sunday to Saturday. The week count per year is based on the middle of the week - Wednesday. So week 1 is the one with the first Wednesday of the year.
If I've got that right then below code should do what you're after.
data test;
infile datalines4 truncover dlm=';';
attrib
week_expected informat=best32. start_dt end_dt informat=anydtdte. format=date9.
date date_shift date_fwy format=date9. week_calculated length=8
;
input week_expected start_dt end_dt;
do date=start_dt to end_dt;
weekday=put(date,downame.);
/* Date of Wednesday within a Sunday to Saturday week span */
date_shift=intnx('week.4',date+3,0,'b');
/* Date of first Wednesday of the year */
date_fwy=intnx('year',date_shift,0,'b');
date_fwy=intnx('week.5',date_fwy,0,'e');
/* Week number for week starting on Sunday */
/* First week of year based on date of Wednesday */
week_calculated=intck('week',date_fwy,date_shift)+1;
/* compare expected with calculated week count */
diff=week_expected-week_calculated;
output;
end;
datalines4;
51;December 18, 2011;December 24, 2011
52;December 25, 2011;December 31, 2011
1;January 1, 2012;January 7, 2012
2;January 8, 2012;January 14, 2012
51;December 16, 2012;December 22, 2012
52;December 23, 2012;December 29, 2012
1;December 30, 2012;January 5, 2013
2;January 6, 2013;January 12, 2013
51;December 15, 2013;December 21, 2013
52;December 22, 2013;December 28, 2013
1;December 29, 2013;January 4, 2014
2;January 5, 2014;January 11, 2014
52;December 21, 2014;December 27, 2014
53;December 28, 2014;January 3, 2015
1;January 4, 2015;January 10, 2015
2;January 11, 2015;January 17, 2015
;;;;
run;
Message was edited by: Patrick Matter
Based on comment fixed the narrative to correctly describe what the posted code does (week starting on Sunday and not on Saturday).
: I compared your code with mine on a larger date range and realized that I had to revise the code I posted in order to correctly calculate weeks for some leap years which weren't present in my original test.
However, I like your suggested code a lot better! I was concerned by your description, but you are in fact calculating weeks based on Brenda's required Sunday to Saturday spans.
As such, I modified your code to work with my test data, but Brenda should definitely mark your post as being the correct answer.
data test;
format date weekdate29.;
do date='01JAN1900'd to '31dec2200'd;
output;
end;
run;
data want (drop=date_:);
set test;
date_shift=intnx('week.4',date+3,0,'b'); /*1st Wed in week*/
date_fwy=intnx('week.5',intnx('year',date_shift,0,'b'),0,'e'); /*1st Wed in year*/
week=intck('week',date_fwy,date_shift)+1; /*week number*/
run;
My code is longer than Arthu.T's and Patrick's code.
data test; format date weekdate29.; do date='01JAN1900'd to '31dec2200'd; output; end; run; data test; set test; week=week(date); if 1 lt weekday(intnx('year',date,0,'b')) lt 5 then week=week+1; if weekday(date)=1 then n+1; run; data want ; do until(last.n); set test; by n ; if day(date)=1 and month(date)=1 then do; need_change=1; weekday=weekday(date); prev_week=_week; end; _week=week; end; do until(last.n); set test ; by n ; if need_change then week=ifn((1 lt weekday lt 5),1,prev_week); output; end; drop n need_change _week prev_week weekday; run;
Xia Keshan
Message was edited by: xia keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.