I would like to make 3 variables, start, stop and event.
year | Id | deathy | deathm |
2001 | 1 | | |
2002 | 1 | | |
2003 | 1 | | |
2004 | 1 | | |
2005 | 1 | 2005 | 1 |
2001 | 2 | | |
2002 | 2 | | |
2003 | 2 | | |
2004 | 2 | 2004 | 3 |
2001 | 3 | | |
2002 | 3 | | |
2003 | 3 | | |
2004 | 3 | | |
2005 | 3 | | |
The table I'd like to see is below. Can this be done with both a DATA step and PROC SQL? Thanks in advance.
year | Id | deathy | deathm | start | stop | event |
2001 | 1 | | | 0 | 52 | 0 |
2002 | 1 | | | 52 | 104 | 0 |
2003 | 1 | | | 104 | 156 | 0 |
2004 | 1 | | | 156 | 208 | 0 |
2005 | 1 | 2005 | 1 | 208 | 209 | 1 |
2001 | 2 | | | 0 | 52 | 0 |
2002 | 2 | | | 12 | 104 | 0 |
2003 | 2 | | | 104 | 156 | 0 |
2004 | 2 | 2004 | 3 | 156 | 159 | 1 |
2001 | 3 | | | 0 | 52 | 0 |
2002 | 3 | | | 52 | 104 | 0 |
2003 | 3 | | | 104 | 156 | 0 |
2004 | 3 | | | 156 | 208 | 0 |
2005 | 3 | | | 208 | 260 | 0 |
Accepted Solutions
The fact that (1) you asked for two techniques to solve the problem, and (2) the apparently artificial use of exactly 52 weeks per year, is what made me think of a homework assignment.
However, put that aside. Let me repeat the question @Reeza asked. Is your task of such a nature that exactly 52 weeks/year satisfies what you need to do? Or would you be better off counting, say, the number of Sundays (or Mondays ... Saturdays) in each incoming year? Then you would get a lot of 52 "week" years, and an occasional 53 week year, which would be typical for most "real world data" problems. Once you've answered that, other forum participants can be more helpful to you. Here's a logical data step structure:
data want;
  set have;
  by id;
  start=ifn(first.id,0,some_function_of(the_earliest_start));
  stop=some_other_function_of(start);
  if deathm=. then event=0;
  else event=1;
run;
Now once you've confirmed how you want weeks counted, we would know how to replace the placeholder expressions above. In fact, instead of some_function_of(the_earliest_start), it may be some_function_of(lag(stop)). In your example as you put forth,
start=ifn(first.id,0,lag(stop));
and
stop=start+52;
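For anyone who wants something to run, here is a minimal sketch of the structure above under the fixed 52-weeks-per-year assumption, with a PROC SQL counterpart since the question asked for both. Two things here are my own choices rather than part of the answer above. First, the previous row's stop is carried forward with RETAIN instead of LAG, because stop is assigned after start and would otherwise be reset to missing at the top of each iteration before LAG could see it. Second, the SQL version assumes the years within each id are consecutive. The dataset names have, want and want_sql are just illustrative.

```sas
/* Sample data from the question (missing death year/month entered as .) */
data have;
  input year id deathy deathm;
  datalines;
2001 1 . .
2002 1 . .
2003 1 . .
2004 1 . .
2005 1 2005 1
2001 2 . .
2002 2 . .
2003 2 . .
2004 2 2004 3
2001 3 . .
2002 3 . .
2003 3 . .
2004 3 . .
2005 3 . .
;
run;

/* BY-group processing needs the data sorted by id, then year */
proc sort data=have;
  by id year;
run;

/* DATA step version */
data want;
  set have;
  by id;
  retain stop;                  /* keep the previous row's stop across iterations */
  if first.id then start = 0;
  else start = stop;            /* stop still holds the previous row's value here */
  stop  = start + 52;           /* assumption: every year is exactly 52 weeks */
  event = (deathm ne .);        /* 1 on the death row, 0 otherwise */
run;

/* PROC SQL version: MIN(year) is remerged within each id,
   so this relies on consecutive years per id (an assumption) */
proc sql;
  create table want_sql as
  select year, id, deathy, deathm,
         (year - min(year)) * 52       as start,
         (year - min(year) + 1) * 52   as stop,
         case when deathm is missing then 0 else 1 end as event
  from have
  group by id
  order by id, year;
quit;
```

Note that the table in the question shows stop values of 209 and 159 on the two death rows, which looks like start + deathm rather than start + 52. The sketch does not reproduce that; if it is what you need, it would take one more conditional on the death rows.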
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
May I ask why it should be 53 weeks? Also, could you please let me know the ways?
There are 365 days in a year and 7 days in a week.
365/7 = 52.14, so you'll always have a portion of a 'week' left over.
INTNX and INTCK will allow you to manipulate date variables more easily.
The WEEK function with the 'U', 'V', or 'W' descriptor (and the matching WEEKU./WEEKV./WEEKW. formats) will also convert a date to a week, with slightly different methodologies. Please check the documentation.
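As a rough illustration of the functions mentioned above, the sketch below loops over the years in the example data and reports a few week counts per year. The variable names are made up for the illustration, and the choice of counting Sundays is only one possible convention.

```sas
data _null_;
  do year = 2001 to 2005;
    jan1  = mdy(1, 1, year);
    dec31 = mdy(12, 31, year);

    /* Default (discrete) INTCK counts week boundaries, i.e. Sundays,
       falling after the first date and on or before the second.
       Starting one day before Jan 1 therefore counts the Sundays in the year. */
    sundays = intck('week', jan1 - 1, dec31);

    /* Continuous method: complete 7-day spans from Jan 1 to the next Jan 1 */
    full_weeks = intck('week', jan1, mdy(1, 1, year + 1), 'c');

    /* WEEK function with the 'U' descriptor: week of year, Sunday as first day */
    last_week_u = week(dec31, 'u');

    /* INTNX moves a date by whole intervals; 'same' keeps the same weekday */
    plus_52_weeks = intnx('week', jan1, 52, 'same');

    put year= sundays= full_weeks= last_week_u= plus_52_weeks= date9.;
  end;
run;
```

Comparing sundays with full_weeks in the log shows where a year picks up a 53rd "week" under a count-the-Sundays rule, which is the kind of difference you would need to decide on before defining start and stop.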
This seems like a homework assignment. Whether or not that's the case, could you show us what you've tried so far?
No, it's not homework. I only know basic DATA step and PROC SQL programming (I hadn't seen the INTNX and INTCK functions mentioned above before), but I have to work with real-world data. That's all. I also searched for some procedures, but I have no clue. I couldn't write any code, which is why I posted this. I wonder why you asked.