BookmarkSubscribeRSS Feed
CFF
Calcite | Level 5 CFF
Calcite | Level 5

Hi there
I have a relatively simple problem, or at least i think so, but after two hours of search, i must admit that i still can't solve it. 
What I have is a dataset looking like: 
id  week_start   week_end   year       x1
1           1              52         1998      0
1          53             79          1999      1
1          78             104        1999      1
2         .                   .            .
And so on for the all my id. So I have that each observation have som characterisk for some interval of time (week_end-week_start).
What I want is to "split each observation by into monthly interval: 
id  week_start   week_end   year       x1
1           1              4         1998      0
1           5              8          1998      0
1           9              12          1998      0
1          13              16          1998      0
1           17              20          1998      0
1           21              24          1998      0
1           25              28          1998      0
1           29              32          1998      0
1           33              36         1998      0
1           37              41          1998      0
1           42              46          1998      0
1           47              52          1998      0
1           53              56          1999     1

And so on...
I do not have a clue, what i am going to do. It could be really nice, if someone could help me.

Thanks 

15 REPLIES 15
ballardw
Super User

Data want;

   set have;

   do week_start = 1 to 47 by 4;

        week_end = week_start + 3;

       output;

   end;

run;

 

Should do it.

CFF
Calcite | Level 5 CFF
Calcite | Level 5

This almost solve my problem. 
But what do i do with the years after that? 
I have tried: 

if year=1998 then do: 

week_start = 1 to 47 by 4;

        week_end = week_start + 3;

       output;

   end;

if year=1999 then do

week_start = 53 to 99 by 4;

        week_end = week_start + 3;

       output;

   end;

run;

But it does not seem work. 

 

run;

ballardw
Super User

I think you may want to back up a bit and actually use DATE values as there are functions to do manipulations and formats for display that work. There are several definitions of WEEK, especially as to how the begining and end of year weeks are treated. Please see for example the online documentation of the formats WEEK, WEEKU, WEEKV and WEEKW, or the WEEK function and the options u, v and w.

 

Choices may well depend on what you are doing next.

Astounding
PROC Star

I think a slight variation on ballardw's program would do it:

 

data want;

   set have;

   original_week_end = week_end;

   do week_start = week_start to original_week_end by 4;

        week_end = min(original_week_end, week_start + 3);

       output;

   end;

   drop original_week_end;

run;

CFF
Calcite | Level 5 CFF
Calcite | Level 5
This almost solve my problem.
But what do i do with the years after that?
I have tried:
if year=1998 then do:
week_start = 1 to 47 by 4;
week_end = week_start + 3;
output;
end;
if year=1999 then do
week_start = 53 to 99 by 4;
week_end = week_start + 3;
output;
end;
run;
But it does not seem work.
Astounding
PROC Star

I'm not sure that you're actually looking at my program.  I haven't hard-coded any part of the loop such as:

 

1 to 47

 

Instead, I start the loop at the value of WEEK_START.

 

The only issue that I didn't try to address was when to keep 5 weeks in a goup instead of 4.  The rules really aren't specified, so I left it as all 4-week groups, and the leftover week(s) go into their own observation even if it's only 1 week that is left over.

 

Try it and see how close it comes to what you need.

CFF
Calcite | Level 5 CFF
Calcite | Level 5

Hi again Astounding 

I'm sorry i got the copy-paste wrong. So my reply to you was a little different. 

I am close to the goal with your code

What i need (and what i have not specified!) is each observation to "dividided" into 12 month - some consistent of 4 others of 5. 

I have not adressed which to consist of 5 because i did not believe that it could be coded? 
But okay: Each year i know which month to consist of 5 weeks. 
Lets say that in first year: 1-52 i know that it is month 5 and 7,8 which should contain 5 weeks, 

and it is different in the next year. 
Is it possible to code that? 

If not. I just want 12 observations per year, and the last month to contain 4 weeks. 

PS: Thanks for the quick reply!!

 

 

 

Astounding
PROC Star

Anything is possible.  But there are a couple of missing ingredients here.

 

(1) How is it possible that your second observation goes from week 53 to week 79?

 

(2) Is the information about which months have 4 weeks and which have 5 in your head, or is it contained in a data set?  (if in a data set, what does the data set look like?)

 

There may be further small pieces to fill in here and there, but those two are the dealbreakers.

CFF
Calcite | Level 5 CFF
Calcite | Level 5

Fantastic response! 
1)It is because there are two observations on id=1 in the following year (1999), which i may forgot to point out. 
2) It is not explicitly in the dataset, the observations per id is now sat so that they "split" each year, and maybe 
if they change their x1-state (as id=1 do in 1999) in a given year. 
I dont know if that makes sense. If not, just say so and i would like to try a formulate it again. 

 

With that being said it is not completely random which month to have 4 and which to have 5, 

it follows the year calender/a specific year calender.

Do you need more information? 
Thank you!!!

 

 

Astounding
PROC Star

I think the best way to proceed would be for you to create a SAS data set that indicates which time periods contain 5 weeks.  After what I hope is enough thought, the structure should be like this:

 

(a) Just two variables:  year and week

(b) Just one observation indicating each 5-week time period

(c) The value for WEEK should be the week that is the 4th week in the 5-week time period

 

It would also simplify things slightly if you could supply the range of years in the data.

CFF
Calcite | Level 5 CFF
Calcite | Level 5

But my dataset still look like: 
id  week_start   week_end   year       x1
1           1              52         1998      0
1          53             79          1999      1
1          78             104        1999      1

2            1              52          1998      1
So how do i create these observations indicating each 5-week time period? 
What if i choose the first 9 period to contain 4 weeks, and the last 3 to contain 5 weeks, 

would that be possible?? 
Im sorry. Im new to coding. I have tried to search on "do statement sas examples", but none seems to match my problem here. 

Astounding
PROC Star

You're jumping a step ahead, worrying about how to program using all the data.  For now, start with getting the 5-week periods defined.  For example (fictitious example):

 

Year   Week

1999    23

1999    34

1999    41

2000    14

2000    25

 

The first record indicates that in 1999, the first five-week grouping is from week 20 through week 24.  (When week=23 in the data, it is the fourth week of a five-week group.)

 

I you can create a data set that defines all the five-week groupings, we'll figure out a way to use it.

CFF
Calcite | Level 5 CFF
Calcite | Level 5

I misunderstood you. 
I have created that now. Should I merge the new dataset with the old? 

Astounding
PROC Star

OK, let's give this a shot.  This is untested code, so I hope it requires a minimum of debugging.

 

data want;

array addaweek {1998:2000, 53};

if _n_=1 then do until (done);

   set _5_week_periods end=done;

   addaweek{year, week}=1;

   retain addaweek:;

end;

set have;

original_week_end = week_end;

do week_start = week_start to original_week_end by 4;

     week_end = min(original_week_end, week_start + 3);

     add1 = 0;

     do week = week_start to week_end;

          add1 + addaweek{year, week};

     end;

    if add1=0 then output;

    else do;

       week_end = min(original_week_end, week_end + 1);

       output;

       week_start = week_start + 1;

   end;

end;

drop original_week_end add1 addaweek:;

run;

 

I think it works, but like I said it's untested.  The idea is that the top DO loop loads the additional data set (the one that indicates the 5-week time periods) into an array.  The rest of the DATA step breaks up the time periods into blocks of 4 (or 5) weeks.  There are still loose ends such as what should happen when a block of 4 weeks ends in the middle of a 5-week month.  But the code at least does make a decision about that.  

 

See how close this comes to what you need.  Good luck.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 1685 views
  • 1 like
  • 4 in conversation