HI everyone,
I have a data set that looks like this:
data have;
input nr subject service $ weight encounter :yymmdd10.;
format encounter yymmddd10.;
datalines;
1 1 heart 67 20130101
2 1 lung 67 20130131
3 1 gastro 69 20130302
4 1 heart 68 20130401
5 1 neuro 66 20130501
6 2 neuro 80 20130531
7 2 neuro 82 20130630
8 2 neuro 82 20130730
9 2 neuro 81 20130829
10 2 neuro 83 20130928
11 2 neuro 82 20131028
12 3 lung 55 20131127
13 3 lung 55 20131227
14 3 gastric 57 20140126
15 3 heart 57 20140225
16 3 lung 55 20140327
17 4 neuro 88 20140426
18 4 neuro 88 20140526
19 4 neuro 88 20140625
20 4 lung 86 20140725
21 4 lung 86 20140824
;
run;
I am trying to clean this dataset and use it for survival analysis, basically, I am trying to come up with a data set that contains the variables nr, subject, service,weight and encounter with one row per subject.
I will need to select the row that corresponds to the last encounter with service 'heart' for each subject, for subject that have not had an encounter with service 'heart' I need to select the latest encounter date for that subject.
I also need to have a new variable called 'censored' equal to 0 if the subject had an encounter with hear and 1 if not. SO the output should look like this:
nr | subject | service | weight | encounter | censored |
4 | 1 | heart | 68 | 20130401 | 0 |
11 | 2 | neuro | 82 | 20131028 | 1 |
15 | 3 | heart | 57 | 20140225 | 0 |
21 | 4 | lung | 86 | 20140824 | 1 |
Now this is doable with a few proc sqls but I was wondering if anyone could give an easier solution maybe using a data-- set -- by combined with arrays and a do loop?
Thank you\
Try this
data want(drop=p);
do _N_ = 1 by 1 until (last.subject);
set have;
by subject;
if service = "heart" then p = _N_;
end;
if not p then p=_N_;
do _N_ = 1 to _N_;
set have;
censored = (service = "heart");
if _N_ = p then output;
end;
run;
Try this
data want(drop=p);
do _N_ = 1 by 1 until (last.subject);
set have;
by subject;
if service = "heart" then p = _N_;
end;
if not p then p=_N_;
do _N_ = 1 to _N_;
set have;
censored = (service = "heart");
if _N_ = p then output;
end;
run;
@PeterClemmensen: impressive
Anytime. Thank you for a clear question.
A well-stated question is the fast path to a working answer. Kudos.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.