## data manipulation question

Solved
Occasional Contributor
Posts: 5

# data manipulation question

Hello

I'm trying to create a week variable that assigns the week number based on when Jan 1 falls for that year.

If there are more December days than January days in the week that contains January 1st, it becomes week 53.  If there are more January days, it becomes week 1. For example:

 Sunday Monday Tuesday Wednesday Thursday Friday Saturday Week Dec 28 Dec 29 Dec 30 Dec 31 Jan 1 Jan 2 Jan 3 Week 53 Dec 29 Dec 30 Dec 31 Jan 1 Jan 2 Jan 3 Jan 4 Week 1

I've been manually coding this, but now that I have to do this for a longer time frame, I'm hoping there is an easier/faster way to do this.

if case_reported_date >= '21dec2014'd and case_reported_date <= '27dec2014'd then week = '52';

if case_reported_date >= '28dec2014'd and case_reported_date <= '03jan2015'd then week = '53';

if case_reported_date >= '04jan2015'd and case_reported_date <= '10jan2015'd then week = '01';

if case_reported_date >= '11jan2015'd and case_reported_date <= '17jan2015'd then week = '02';

Accepted Solutions
Solution
‎11-14-2014 09:38 PM
Posts: 4,736

## Re: data manipulation question

If I interpret the source data right then a week spans from Sunday to Saturday. The week count per year is based on the middle of the week - Wednesday. So week 1 is the one with the first Wednesday of the year.

If I've got that right then below code should do what you're after.

data test;

infile datalines4 truncover dlm=';';

attrib

week_expected informat=best32. start_dt end_dt informat=anydtdte. format=date9.

date date_shift date_fwy format=date9. week_calculated length=8

;

input week_expected start_dt end_dt;

do date=start_dt to end_dt;

weekday=put(date,downame.);

/* Date of Wednesday within a Sunday to Saturday week span */

date_shift=intnx('week.4',date+3,0,'b');

/* Date of first Wednesday of the year */

date_fwy=intnx('year',date_shift,0,'b');

date_fwy=intnx('week.5',date_fwy,0,'e');

/* Week number for week starting on Sunday  */

/* First week of year based on date of Wednesday */

week_calculated=intck('week',date_fwy,date_shift)+1;

/* compare expected with calculated week count */

diff=week_expected-week_calculated;

output;

end;

datalines4;

51;December 18, 2011;December 24, 2011

52;December 25, 2011;December 31, 2011

1;January 1, 2012;January 7, 2012

2;January 8, 2012;January 14, 2012

51;December 16, 2012;December 22, 2012

52;December 23, 2012;December 29, 2012

1;December 30, 2012;January 5, 2013

2;January 6, 2013;January 12, 2013

51;December 15, 2013;December 21, 2013

52;December 22, 2013;December 28, 2013

1;December 29, 2013;January 4, 2014

2;January 5, 2014;January 11, 2014

52;December 21, 2014;December 27, 2014

53;December 28, 2014;January 3, 2015

1;January 4, 2015;January 10, 2015

2;January 11, 2015;January 17, 2015

;;;;

run;

Message was edited by: Patrick Matter

Based on comment fixed the narrative to correctly describe what the posted code does (week starting on Sunday and not on Saturday).

All Replies
Super User
Posts: 23,659

## Re: data manipulation question

How are you calculating week in the first place other than in this scenario?

Occasional Contributor
Posts: 5

## Re: data manipulation question

I've tried using the WEEK function (e.g. cweek = week(date)),  but for this particular senario it doesn't work once you get to the end/start of the year.  Sometimes the assigned week will have less than 7 days.

Also, becuase I have multiple years of data in my dataset, the weeks will need to 'reset' once it hits 52 or 53 weeks.

PROC Star
Posts: 8,163

## Re: data manipulation question

Brenda,

If you can use a Monday to Sunday criterion, rather than Sunday to Saturday, you could use the week function 'V' modifier. e.g.:

data want;

do date='28DEC2015'd to '03jan2016'd;

week=week(date,'V');

output;

end;

do date='29DEC1997'd to '04jan1998'd;

week=week(date,'V');

output;

end;

run;

Occasional Contributor
Posts: 5

## Re: data manipulation question

This may do what you want, although I am not sure it follows exactly the same rule that you described.

/*

compute year/week according to this rule:

If Jan 1 falls on a Sunday through Wednesday, then the week that includes Jan 1

is week 1. Otherwise the following week is week 1. Then weeks are numbered consecutively

up to the last week of the year, which may be a 52 or a 53.

*/

data date1(keep=cdcweek dth_date);

length cdcweek \$ 6;

do dth_date = '01jan2009'd to '31dec2012'd;

year = year(dth_date);

yyyy = put(year,4.);

pyyy = put(year-1,4.);

first_sat = mdy(1,1,year);

jan01     = mdy(1,1,year);

prevjan01 = mdy(1,1,year-1);

do while(weekday(first_sat) ne 7);

first_sat = first_sat + 1;

end;

if dth_date = . then cdcweek = '';

else do;

if weekday(jan01) le 4 then do;

if dth_date le first_sat then cdcweek = yyyy||"01";

else cdcweek = put(((year*100) + int((dth_date-first_sat-1)/7) + 2),6.);

end;

else do;

if dth_date le first_sat then do;

if (weekday(prevjan01) = 4) or ((weekday(prevjan01) = 3) and

(jan01-prevjan01 = 366)) then  cdcweek = pyyy||"53";

else cdcweek = pyyy||"52";

end;

else cdcweek = put(((year*100)+int((dth_date-first_sat-1)/7) + 1),6.);

end;    /* weekday(jan01) le 4 */

end;  /* date missing */

if cdcweek ge 200901 then output;

end; /* dth_date from 1/1/2009 to 12/31/2012 */

run;

Occasional Contributor
Posts: 5

## Re: data manipulation question

This seems to be doing part of the trick, but it looks like week 1 is getting lost in some instances, and week 53 is being generated when it should end at week 52.

 cdcweek dth_date epiweek year
 201152 25-Dec-11 52 2011 201152 26-Dec-11 52 2011 201152 27-Dec-11 52 2011 201152 28-Dec-11 52 2011 201152 29-Dec-11 52 2011 201152 30-Dec-11 52 2011 201152 31-Dec-11 52 2011 201202 8-Jan-12 2 2012 201202 9-Jan-12 2 2012 201202 10-Jan-12 2 2012 201202 11-Jan-12 2 2012 201202 12-Jan-12 2 2012 201202 13-Jan-12 2 2012 201202 14-Jan-12 2 2012
 201252 23-Dec-12 52 2012 201252 24-Dec-12 52 2012 201252 25-Dec-12 52 2012 201252 26-Dec-12 52 2012 201252 27-Dec-12 52 2012 201252 28-Dec-12 52 2012 201252 29-Dec-12 52 2012 201253 30-Dec-12 53 2012 201253 31-Dec-12 53 2012 201302 6-Jan-13 2 2013 201302 7-Jan-13 2 2013 201302 8-Jan-13 2 2013 201302 9-Jan-13 2 2013 201302 10-Jan-13 2 2013 201302 11-Jan-13 2 2013 201302 12-Jan-13 2 2013
Occasional Contributor
Posts: 5

## Re: data manipulation question

This is what I get when I run code that I cut-and-pasted from my first post. The missing week 1 suggests that something went missing when you edited the code. I am not sure of the week 52/53 discrepancy, since the rule I used may be different from yours.

Obs    cdcweek      dth_date

...

 1084 201151 12/23/2011 1085 201151 12/24/2011 1086 201152 12/25/2011 1087 201152 12/26/2011 1088 201152 12/27/2011 1089 201152 12/28/2011 1090 201152 12/29/2011 1091 201152 12/30/2011 1092 201152 12/31/2011 1093 201201 01/01/2012 1094 201201 01/02/2012 1095 201201 01/03/2012 1096 201201 01/04/2012 1097 201201 01/05/2012 1098 201201 01/06/2012
 1099 201201 01/07/2012 1100 201202 01/08/2012 1101 201202 01/09/2012 1102 201202 01/10/2012 1103 201202 01/11/2012 1104 201202 01/12/2012 1105 201202 01/13/2012 1106 201202 01/14/2012

...

PROC Star
Posts: 8,163

## Re: data manipulation question

Me thinks that you have to define your rules more clearly. Since 7*52 only equals 364, there will always be 1 or 2 days (depending upon leap years) that still have to be accounted for. From your example, I presume that a week 53 is desired.

I think that the following comes close, but probably needs some tweeking:

data test;

format date weekdate29.;

/*  test cases for Sunday January 1st*/

do date='28DEC2005'd to '31dec2006'd;

start="Sunday";

output;

end;

/*  test cases for Monday January 1st*/

do date='1jan2007'd to '31dec2007'd;

start="Monday";

output;

end;

/*  test cases for Tuesday January 1st*/

do date='1jan2008'd to '31dec2008'd;

start="Tuesday";

output;

end;

/*  test cases for Thursday January 1st*/

do date='1jan2009'd to '31dec2009'd;

start="Thursday";

output;

end;

/*  test cases for Friday January 1st*/

do date='1jan2010'd to '31dec2010'd;

start="Friday";

output;

end;

/*  test cases for Saturday January 1st*/

do date='1jan2011'd to '31dec2011'd;

start="Saturday";

output;

end;

/*  test cases for Sunday January 1st*/

do date='1jan2012'd to '31dec2012'd;

start="Sunday";

output;

end;

/*  test cases for Tuesday January 1st*/

do date='1jan2013'd to '31dec2013'd;

start="Tuesday";

output;

end;

/*  test cases for Wednesday January 1st*/

do date='1jan2014'd to '31dec2014'd;

start="Wednesday";

output;

end;

/*  test cases for Thursday January 1st*/

do date='1jan2015'd to '31dec2015'd;

start="Thursday";

output;

end;

run;

data want;

set test;

if mdy(12,28,year(date)) <= date <= mdy(12,31,year(date)) then do;

if weekday(mdy(1,1,year(date)+1)) eq 1 then week=ceil((date-mdy(1,1,year(date))+1)/7);

else if weekday(mdy(1,1,year(date)+1)) eq 2 and day(date) eq 31 then week=53;

else if weekday(mdy(1,1,year(date)+1)) eq 3 and day(date) gt 30 then week=53;

else if weekday(mdy(1,1,year(date)+1)) eq 4 and day(date) gt 29 then week=53;

else week=ceil((date-mdy(1,1,year(date))+1)/7);

end;

else if mdy(1,1,year(date)) <= date <= mdy(1,3,year(date)) then do;

if weekday(mdy(1,1,year(date))) le 4 then week=1;

else week=53;

end;

else week=ceil((date-mdy(1,1,year(date))+1)/7);

run;

Occasional Contributor
Posts: 5

## Re: data manipulation question

From Brenda's example, there does not have to be a week 53 in every year, since up to 3 days at the beginning of the year may be counted in week 53 of the previous year.

PROC Star
Posts: 8,163

## Re: data manipulation question

Now I see where I'm confused. In your example, which I've repeated below, for the third and fourth lines of code I would have guessed that the rules would have been:

if case_reported_date >= '29dec2013'd and case_reported_date <= '04jan2014'd then week = 1;

if case_reported_date >= '05jan2014'd and case_reported_date <= '11jan2014'd then week = 2;

if case_reported_date >= '11jan2014'd and case_reported_date <= '17jan2014'd then week = 3;

Please let us know if your example was wrong, or if I really am simply confused.

 Sunday Monday Tuesday Wednesday Thursday Friday Saturday Week Dec 28 Dec 29 Dec 30 Dec 31 Jan 1 Jan 2 Jan 3 Week 53 Dec 29 Dec 30 Dec 31 Jan 1 Jan 2 Jan 3 Jan 4 Week 1

I've been manually coding this, but now that I have to do this for a longer time frame, I'm hoping there is an easier/faster way to do this.

if case_reported_date >= '21dec2014'd and case_reported_date <= '27dec2014'd then week = '52';

if case_reported_date >= '28dec2014'd and case_reported_date <= '03jan2015'd then week = '53';

if case_reported_date >= '04jan2015'd and case_reported_date <= '10jan2015'd then week = '01';

if case_reported_date >= '11jan2015'd and case_reported_date <= '17jan2015'd then week = '02';

Occasional Contributor
Posts: 5

## Re: data manipulation question

Here is an example of how I'm trying to group my weeks.. I've provided some links that show how some historical years are calculated

http://www.phac-aspc.gc.ca/fluwatch/11-12/11-12cal-eng.php

http://www.phac-aspc.gc.ca/fluwatch/12-13/12-13cal-eng.php

http://www.phac-aspc.gc.ca/fluwatch/14-15/14-15cal-eng.php

 35 August 24, 2014 August 30, 2014 36 August 31, 2014 September 6, 2014 37 September 7, 2014 September 13, 2014 38 September 14, 2014 September 20, 2014 39 September 21, 2014 September 27, 2014 40 September 28, 2014 October 4, 2014 41 October 5, 2014 October 11, 2014 42 October 12, 2014 October 18, 2014 43 October 19, 2014 October 25, 2014 44 October 26, 2014 November 1, 2014 45 November 2, 2014 November 8, 2014 46 November 9, 2014 November 15, 2014 47 November 16, 2014 November 22, 2014 48 November 23, 2014 November 29, 2014 49 November 30, 2014 December 6, 2014 50 December 7, 2014 December 13, 2014 51 December 14, 2014 December 20, 2014 52 December 21, 2014 December 27, 2014 53 December 28, 2014 January 3, 2015 1 January 4, 2015 January 10, 2015 2 January 11, 2015 January 17, 2015 3 January 18, 2015 January 24, 2015 4 January 25, 2015 January 31, 2015 5 February 1, 2015 February 7, 2015 6 February 8, 2015 February 14, 2015 7 February 15, 2015 February 21, 2015 8 February 22, 2015 February 28, 2015 9 March 1, 2015 March 7, 2015 10 March 8, 2015 March 14, 2015 11 March 15, 2015 March 21, 2015 12 March 22, 2015 March 28, 2015 13 March 29, 2015 April 4, 2015 14 April 5, 2015 April 11, 2015 15 April 12, 2015 April 18, 2015 16 April 19, 2015 April 25, 2015 17 April 26, 2015 May 2, 2015 18 May 3, 2015 May 9, 2015 19 May 10, 2015 May 16, 2015 20 May 17, 2015 May 23, 2015 21 May 24, 2015 May 30, 2015 22 May 31, 2015 June 6, 2015 23 June 7, 2015 June 13, 2015 24 June 14, 2015 June 20, 2015 25 June 21, 2015 June 27, 2015 26 June 28, 2015 July 4, 2015 27 July 5, 2015 July 11, 2015 28 July 12, 2015 July 18, 2015 29 July 19, 2015 July 25, 2015 30 July 26, 2015 August 1, 2015 31 August 2, 2015 August 8, 2015 32 August 9, 2015 August 15, 2015 33 August 16, 2015 August 22, 2015 34 August 23, 2015 August 29, 2015
PROC Star
Posts: 8,163

## Re: data manipulation question

Brenda,

Too bad that they don't publish more examples, but I think the links you provided answered most of my questions anyway.

Try the following code and see if the results match the desired calendar: They appear to match the three published tables.

data test;

format date weekdate29.;

do date='01JAN2005'd to '31dec2015'd;

output;

end;

run;

data want (drop=year_: last_: first_;

format first_sunday weekdate29.;

set test;

first_sunday=mdy(1,1,year(date))-weekday(mdy(1,1,year(date)))+1;

if year(first_sunday) lt year(date) and first_sunday lt mdy(12,29,year(date)-1) then

first_sunday+7;

year_start=weekday(mdy(1,1,year(date)));

year_end=weekday(mdy(12,31,year(date)));

last_year_end=weekday(mdy(12,31,year(date)-1));

if      year_end le 3 and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=1;

else if year_start eq 4 and (mdy(12,31,year(date))-mdy(1,1,year(date))) eq 365

and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=53;

else if 5<=year_end<=6 and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=52;

else if year_end eq 4 and mdy(12,32-year_end,year(date))<=date<=mdy(12,31,year(date)) then week=53;

else if last_year_end eq 5 and (mdy(12,31,year(date)-1)-mdy(1,1,year(date)-1)) eq 365

and mdy(1,1,year(date))<=date<=mdy(1,2,year(date)) then week=53;

else if last_year_end le 3 and mdy(1,1,year(date))<=date<=mdy(1,4-last_year_end,year(date)) then week=1;

else if 5<=last_year_end<=6 and mdy(1,1,year(date))<=date<=mdy(1,7-last_year_end,year(date)) then week=52;

else if last_year_end eq 4 and mdy(1,1,year(date))<=date<=mdy(1,7-last_year_end,year(date)) then week=53;

else week=intck('week',first_sunday,date)+1;

run;

Solution
‎11-14-2014 09:38 PM
Posts: 4,736

## Re: data manipulation question

If I interpret the source data right then a week spans from Sunday to Saturday. The week count per year is based on the middle of the week - Wednesday. So week 1 is the one with the first Wednesday of the year.

If I've got that right then below code should do what you're after.

data test;

infile datalines4 truncover dlm=';';

attrib

week_expected informat=best32. start_dt end_dt informat=anydtdte. format=date9.

date date_shift date_fwy format=date9. week_calculated length=8

;

input week_expected start_dt end_dt;

do date=start_dt to end_dt;

weekday=put(date,downame.);

/* Date of Wednesday within a Sunday to Saturday week span */

date_shift=intnx('week.4',date+3,0,'b');

/* Date of first Wednesday of the year */

date_fwy=intnx('year',date_shift,0,'b');

date_fwy=intnx('week.5',date_fwy,0,'e');

/* Week number for week starting on Sunday  */

/* First week of year based on date of Wednesday */

week_calculated=intck('week',date_fwy,date_shift)+1;

/* compare expected with calculated week count */

diff=week_expected-week_calculated;

output;

end;

datalines4;

51;December 18, 2011;December 24, 2011

52;December 25, 2011;December 31, 2011

1;January 1, 2012;January 7, 2012

2;January 8, 2012;January 14, 2012

51;December 16, 2012;December 22, 2012

52;December 23, 2012;December 29, 2012

1;December 30, 2012;January 5, 2013

2;January 6, 2013;January 12, 2013

51;December 15, 2013;December 21, 2013

52;December 22, 2013;December 28, 2013

1;December 29, 2013;January 4, 2014

2;January 5, 2014;January 11, 2014

52;December 21, 2014;December 27, 2014

53;December 28, 2014;January 3, 2015

1;January 4, 2015;January 10, 2015

2;January 11, 2015;January 17, 2015

;;;;

run;

Message was edited by: Patrick Matter

Based on comment fixed the narrative to correctly describe what the posted code does (week starting on Sunday and not on Saturday).

PROC Star
Posts: 8,163

## Re: data manipulation question

: I compared your code with mine on a larger date range and realized that I had to revise the code I posted in order to correctly calculate weeks for some leap years which weren't present in my original test.

However, I like your suggested code a lot better! I was concerned by your description, but you are in fact calculating weeks based on Brenda's required Sunday to Saturday spans.

As such, I modified your code to work with my test data, but Brenda should definitely mark your post as being the correct answer.

data test;

format date weekdate29.;

do date='01JAN1900'd to '31dec2200'd;

output;

end;

run;

data want (drop=date_;

set test;

date_shift=intnx('week.4',date+3,0,'b'); /*1st Wed in week*/

date_fwy=intnx('week.5',intnx('year',date_shift,0,'b'),0,'e'); /*1st Wed in year*/

week=intck('week',date_fwy,date_shift)+1; /*week number*/

run;

Super User
Posts: 10,761

## Re: data manipulation question

My code is longer than Arthu.T's and Patrick's code.

```data test;
format date weekdate29.;
do date='01JAN1900'd to '31dec2200'd;
output;
end;
run;
data test;
set test;
week=week(date);
if 1 lt weekday(intnx('year',date,0,'b')) lt 5 then week=week+1;
if weekday(date)=1 then n+1;
run;
data want ;
do until(last.n);
set test;
by n ;
if day(date)=1 and month(date)=1 then do;
need_change=1; weekday=weekday(date); prev_week=_week;
end;
_week=week;
end;
do until(last.n);
set test ;
by n ;
if need_change then week=ifn((1 lt weekday lt 5),1,prev_week);
output;
end;
drop n  need_change _week      prev_week weekday;
run;

```

Xia Keshan

Message was edited by: xia keshan

🔒 This topic is solved and locked.