
## cumulative sum

I would like to make 3 variables, start, stop and event.

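| year | Id | deathy | deathm |
|------|----|--------|--------|
| 2001 | 1 | | |
| 2002 | 1 | | |
| 2003 | 1 | | |
| 2004 | 1 | | |
| 2005 | 1 | 2005 | 1 |
| 2001 | 2 | | |
| 2002 | 2 | | |
| 2003 | 2 | | |
| 2004 | 2 | 2004 | 3 |
| 2001 | 3 | | |
| 2002 | 3 | | |
| 2003 | 3 | | |
| 2004 | 3 | | |
| 2005 | 3 | | |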

The table I'd like to see is below. Can I get it with both a DATA step and PROC SQL? Thanks in advance.

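| year | Id | deathy | deathm | start | stop | event |
|------|----|--------|--------|-------|------|-------|
| 2001 | 1 | | | 0 | 52 | 0 |
| 2002 | 1 | | | 52 | 104 | 0 |
| 2003 | 1 | | | 104 | 156 | 0 |
| 2004 | 1 | | | 156 | 208 | 0 |
| 2005 | 1 | 2005 | 1 | 208 | 209 | 1 |
| 2001 | 2 | | | 0 | 52 | 0 |
| 2002 | 2 | | | 12 | 104 | 0 |
| 2003 | 2 | | | 104 | 156 | 0 |
| 2004 | 2 | 2004 | 3 | 156 | 159 | 1 |
| 2001 | 3 | | | 0 | 52 | 0 |
| 2002 | 3 | | | 52 | 104 | 0 |
| 2003 | 3 | | | 104 | 156 | 0 |
| 2004 | 3 | | | 156 | 208 | 0 |
| 2005 | 3 | | | 208 | 260 | 0 |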

Accepted Solutions

## Re: cumulative sum

@asinusdk

The fact that (1) you asked for two techniques to solve the problem, and (2) the apparently artificial use of exactly 52 weeks per year, is what made me think of a homework assignment.

However, put that aside.  Let me repeat the question @Reeza asked.  Is your task of such a nature that exactly 52 weeks/year satisfies what you need to do?  Or would you be better off counting, say, the number of Sundays (or Mondays ... Saturdays) in each incoming year?  Then you would get a lot of 52 "week" years, and an occasional 53 week year, which would be typical for most "real world data" problems.  Once you've answered that, other forum participants can be more helpful to you.  Here's a logical data step structure:

```sas
data want;
  set have;
  by id;
  /* some_function_of(...) and some_other_function_of(...) are placeholders */
  start = ifn(first.id, 0, some_function_of(the_earliest_start));
  stop  = some_other_function_of(start);
  if deathm = . then event = 0;
  else event = 1;
run;
```

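Once you've confirmed how you want weeks counted, we will know how to replace the placeholder expressions above.  In fact, instead of some_function_of(the_earliest_start), it may be some function of the previous row's stop.  Note that lag(stop) will not supply that value here: stop is reset to missing at the top of each iteration, so LAG would only ever queue missing values.  Retaining stop carries the previous value forward instead.  In the example as you put it forth,

retain stop;
start=ifn(first.id,0,stop);

and

stop=start+52;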
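
For what it's worth, here is a minimal sketch of how those pieces could fit together. It assumes fixed 52-week years, and it assumes the death row should stop `deathm` units after its start, which is what the 209 and 159 in the wanted table suggest. The DATA step carries the previous stop forward with RETAIN. The PROC SQL variant counts the earlier years for the same id instead, since SQL has no notion of a previous row. The names `want_sql`, `a` and `b` are only illustrative.

```sas
/* Sample input transcribed from the question. deathy/deathm are
   missing except on the row where the subject dies. */
data have;
  infile datalines truncover;   /* short lines give missing values */
  input year id deathy deathm;
  datalines;
2001 1
2002 1
2003 1
2004 1
2005 1 2005 1
2001 2
2002 2
2003 2
2004 2 2004 3
2001 3
2002 3
2003 3
2004 3
2005 3
;
run;

/* DATA step version. Requires HAVE to be sorted by id (and year). */
data want;
  set have;
  by id;
  retain stop;                            /* keep the previous row's stop   */
  start = ifn(first.id, 0, stop);         /* restart at 0 for each new id   */
  stop  = start + coalesce(deathm, 52);   /* assumption: death row adds deathm */
  event = not missing(deathm);            /* 1 on the death row, otherwise 0 */
run;

/* PROC SQL version. Assumes one row per id and year, with the
   death row as the last row of its id. */
proc sql;
  create table want_sql as
  select a.year, a.id, a.deathy, a.deathm,
         52 * (select count(*)
                 from have as b
                where b.id = a.id and b.year < a.year) as start,
         calculated start + coalesce(a.deathm, 52)     as stop,
         (a.deathm is not null)                        as event
    from have as a
   order by a.id, a.year;
quit;

proc print data=want noobs;
  var year id deathy deathm start stop event;
run;
```

If real calendar weeks are needed rather than a flat 52, only the `coalesce(deathm, 52)` term would have to change.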

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

## Re: cumulative sum

If those are weeks, what about years with 53 weeks? If you're trying to convert a death date to a week number, there are other ways.

## Re: cumulative sum

May I ask why it should  be 53 weeks? Also, could you please let me know the ways?


## Re: cumulative sum

365 days in a year, 7 days in a week.

365/7 ≈ 52.14, so you'll always have a portion of a 'week' left over.

INTNX and INTCK will allow you to manipulate date variables more easily.

The WEEK() function, with its 'U', 'V' and 'W' descriptors (and the matching WEEKU./WEEKV./WEEKW. formats), will also convert a date to a week, with slightly different methodologies. Please check the documentation.
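
As a rough illustration of the date functions mentioned above, the sketch below builds a date from a year and a month, counts the weeks between two dates with INTCK, steps forward with INTNX, and reads the week of the year under each of the three numbering schemes. The dates are made up, and the day of the month and the January 1, 2001 baseline are assumptions, since the question only gives a year and a month. The loop at the end counts the Sundays in each calendar year, which is one way to see where 53-week years come from.

```sas
/* Illustration only: week-related date functions on made-up dates. */
data _null_;
  /* hypothetical death date; the day of the month is assumed to be 1 */
  death_date = mdy(3, 1, 2004);
  baseline   = mdy(1, 1, 2001);   /* assumed start of follow-up */

  /* week boundaries (Sundays, by default) crossed between the dates */
  weeks_elapsed = intck('week', baseline, death_date);

  /* first day of the week that lies 52 week intervals after baseline */
  plus52 = intnx('week', baseline, 52);

  /* week of the year under the U, V and W numbering schemes */
  wk_u = week(death_date, 'u');
  wk_v = week(death_date, 'v');
  wk_w = week(death_date, 'w');

  put death_date= date9. weeks_elapsed= plus52= date9.;
  put wk_u= wk_v= wk_w=;

  /* Sundays falling in each calendar year: always 52 or 53 */
  do year = 2001 to 2006;
    sundays = intck('week', mdy(12, 31, year - 1), mdy(12, 31, year));
    put year= sundays=;
  end;
run;
```

The results are written to the SAS log, so the different week definitions can be compared side by side before settling on one.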


## Re: cumulative sum

This seems like a homework assignment.  Whether or not that's the case, could you show us what you've tried so far?


## Re: cumulative sum

No, it's not homework. I only know basic DATA step programming and PROC SQL (I hadn't come across the INTNX and INTCK functions mentioned above), but I have to work with real-world data. That's all. I also searched for procedures that might help, but I had no clue. I couldn't write any code myself, which is why I posted this. I wonder why you asked me that.
