Hi,
How do I split up data range in weeks without overlap(getting several observations for week number)?
I have a data set looking like this (with many observations):
ID | FROM_DATE | TO_DATE | ATTENDING | EVENT |
1 | 01.JAN 2014 | 31. JAN 2014 | 1 | 05.JAN 2014 |
1 | 01.FEB 2014 | 28.FEB 2014 | 1 | 05.JAN 2014 |
1 | 01.MAR 2014 | 03.MAR 2014 | 0 | 05.JAN 2014 |
1 | 04.MAR | 31.MAR.2014 | 1 | 05.JAN 2014 |
2 | 01.JAN 2014 | 15.JAN 2014 | 1 | 17. MAY 2014 |
2 | 16.JAN 2014 | 1.MAR 2014 | 0 | 17. MAY 2014 |
3… | … | … | … | … |
I need to split up the data in weeks before and after the event. I did figure out how to split the observations into weeks using the week and intnx function, but I get an overlap in weeks for some observations. For example if a date range ends on a Wednesday it only has Monday-Wednesday for the given week. it will create a new variable with week number 7, and then the next date range will create a variable with the Thursday-Sunday also having week number 7. I only need one observation per week. I thought that maybe a good way tom come around the problem would be to choose a weekday, for example Thursday, and then count the occurences of Thursday in week x=week number x, but I can't find a way to do it.
I hope that this makes sense. Thank you very much for any help you can give.
I solved it by dividing the time ranges into days and then deleting all observations that didnt occur on Thursdays, now I only have on observation for each week per ID:
data want;
set want;
do i=0 to intck('day', from_v, yo_v);
variable_day=intnx('day',from_v,i,'b');
output;
end;
FORMAT variable_day WEEKDATX17.;
run;
data want;
set want;
variable_weekday=weekday(variable_day);
run;
data want;
set want;
weeksfromevent=intck('week',event,variable_day);
run;
data se;
set want;
if variable_weekday NE 5 then delete;
run;
Hi Peter_C,
Thanks for you answer. I am not quite sure how to do this as it is a date range. My code looks like this but returns several observations for some weeks, when the days are overlapping:
data mydata ;
set mydata;
by ID FROM TO;
format GFROM
GTO
FROM
TO
WEEKDATX17.;
GFROM=FROM;
GTO=TO;
WEEKFROM=week(GFROM,'v');
WEEKTO=week(GTO,'v');
/* Dividing observations to weeks; */
FROM=intnx('week',GFROM, 0, 'b')+1; /* Monday*/
TO=intnx('week',GFROM, 0, 'e')+1; /* Sunday */
if FROM<GFROM then FROM=GFROM;
if TO>GTO then TO=GTO;
if FROM>GTO then delete;
if TO<GFROM then delete;
WEEK_NR=week(FROM,'v');
days_from_event=intck('day',EVENT,FROM);
weeeks_from_event=intck('week', EVENT,FROM);
output;/* week 1 */
FROM=intnx('week',GFROM, 1, 'b')+1;
TO=intnx('week',GFROM, 1, 'e')+1;
if FROM<GFROM then FROM=GFROM;
if TO>GTO then TO=GTO;
if FROM>GTO then delete;
if TO<GFROM then delete;
WEEK_NR=week(FROM,'v');
days_from_event=intck('day',EVENT,FROM);
weeeks_from_event=intck('week', EVENT,FROM);
output;/* week 2 */
... /*same until week 5 */
run;
Show an example of the desired output for your example data. I am not at all sure what you mean by "split up the data".
A more explicit example of this:For example if a date range ends on a Wednesday it only has Monday-Wednesday for the given week. it will create a new variable with week number 7, and then the next date range will create a variable with the Thursday-Sunday also having week number 7.
is needed.
For example do ALL intervals that end on Wednesday get a week number of 7? if not what creates week 1 or 27 or what have you.
And the bit about "next date range" needs some additional clarification. What if the next date range is for a different ID? Does this "rule" still apply?
I apologize for the confusion. When I run my code, my data look like this:
ID | FROM_DATE | TO_DATE | ATTENDING | EVENT | Week_from_event |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | -1 |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 0 |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 1 |
1 | Wed. | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 2 |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 3 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 3 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 4 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 5 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 6 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 7 |
1 | Sat. 01.MAR 2014 | Mon.03.MAR 2014 | 0 | 08.JAN 2014 | 7 |
1 | … | .. | … | … | … |
What I need is data where there is only 1 of each week number per ID.
The problem is that it makes several observations for the week calculation, as it counts all the days that are normally in a week, if that makes sense.. I have a lot of different ID's and for everyone I want to calculate the weeks before and after the event, so that I can estimate what happens in week 1, week 26 etc.
So something like this:
ID | FROM_DATE | TO_DATE | ATTENDING | EVENT | Week_from_event |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | -1 |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 0 |
1 | Wed. 01.JAN 2014 | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 1 |
1 | Wed. | Fri. 31. JAN 2014 | 1 | 08.JAN 2014 | 2 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 3 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 4 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 5 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 6 |
1 | Sat. 01.FEB 2014 | Fri. 28.FEB 2014 | 1 | 08.JAN 2014 | 7 |
1 | … | .. | … | … | … |
But I am not sure what is the best way to do it. My thought was that maybe it is possible to condition the week number of a weekday(Thursday) being in the observation; for example, the last observation above from sat. 1. Mar to Mon. 3. Mar would then be dropped as there is no Thursday.
Please let me know if it is still too confusing to understand what I need help with.
I solved it by dividing the time ranges into days and then deleting all observations that didnt occur on Thursdays, now I only have on observation for each week per ID:
data want;
set want;
do i=0 to intck('day', from_v, yo_v);
variable_day=intnx('day',from_v,i,'b');
output;
end;
FORMAT variable_day WEEKDATX17.;
run;
data want;
set want;
variable_weekday=weekday(variable_day);
run;
data want;
set want;
weeksfromevent=intck('week',event,variable_day);
run;
data se;
set want;
if variable_weekday NE 5 then delete;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.