- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi SAS Programmers,
Could you please help me with this problem.
Here is my data:
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
I need to create a variable for censoring, so that it takes a value of 1 if a person dropped out of the study, i.e. if persons variables 1 and 2 are both missing from now on. Here is what I need to get:
1 | 1 | 3 | 1 | 0 |
1 | 2 | 3 | 1 | 0 |
1 | 3 | 3 | 1 | 0 |
1 | 4 | 2 | 1 | 0 |
1 | 5 | . | . | 1 |
2 | 1 | 3 | 1 | 0 |
2 | 2 | . | . | 0 |
2 | 3 | . | . | 0 |
2 | 4 | 2 | 0 | 0 |
2 | 5 | . | . | 1 |
3 | 1 | 2 | 0 | 0 |
3 | 2 | 2 | 0 | 0 |
3 | 3 | 2 | 1 | 0 |
3 | 4 | . | . | 1 |
3 | 5 | . | . | 1 |
I cannot base the calculation of "cens" on missing values, because there is intermittent missingness, so the the code: "if var1=. and var2=. then cens=1" will produce positive censoring indicator for id 2 at times 2 and 3, which is not correct. I feel like the right code will need some sorting and retain statement, but can't figure it out by myself.
I would appreciate any tips or advice!
Thank you!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
HI @Dinurik It seems pretty straight forward look up problem in my opinion-->Using Hash
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("time") ;
h.definedone () ;
end;
do _n_=1 by 1 until(last.id);
set mydata;
by id ;
if nmiss(var1,var2)=2 then h.add();
else h.clear();
end;
do _n_=1 to _n_;
set mydata;
cens=h.find()=0;
output;
end;
h.clear();
run;
proc print noobs;run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
HI @Dinurik It seems pretty straight forward look up problem in my opinion->Using Array-Key indexing
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
do _n_=1 by 1 until(last.id);
set mydata;
by id ;
array t(9999) _temporary_;
if nmiss(var1,var2)=2 then t(time)=time;
else call missing(of t(*));
end;
do _n_=1 to _n_;
set mydata;
cens=^^t(time);
output;
end;
call missing(of t(*));
run;
proc print noobs;run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
HI @Dinurik It seems pretty straight forward look up problem in my opinion-->Using Hash
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("time") ;
h.definedone () ;
end;
do _n_=1 by 1 until(last.id);
set mydata;
by id ;
if nmiss(var1,var2)=2 then h.add();
else h.clear();
end;
do _n_=1 to _n_;
set mydata;
cens=h.find()=0;
output;
end;
h.clear();
run;
proc print noobs;run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you so much! It is a huge help. It would take me a couple of days if not more to figure it out on my own. It worked!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
do i=1 by 1 until(last.id);
set mydata;
by id;
if not missing(var1) or not missing(var2) then _i=i;
end;
do i=1 by 1 until(last.id);
set mydata;
by id;
flag= i>_i ;output;
end;
drop i _i;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Ksharp Xia, do you remember I often utter my mother is a huge fan of yours. She often bets you are the champ. You never cease to inspire me to go the extra mile.
Anyways, some more fun.
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
proc sql;
create table want as
select *, time> max(time*(n(var1,var2)>0)) as cens
from mydata
group by id
order by id, time;
quit;
id | time | var1 | var2 | cens |
---|---|---|---|---|
1 | 1 | 3 | 1 | 0 |
1 | 2 | 3 | 1 | 0 |
1 | 3 | 3 | 1 | 0 |
1 | 4 | 2 | 1 | 0 |
1 | 5 | . | . | 1 |
2 | 1 | 3 | 1 | 0 |
2 | 2 | . | . | 0 |
2 | 3 | . | . | 0 |
2 | 4 | 2 | 0 | 0 |
2 | 5 | . | . | 1 |
3 | 1 | 2 | 0 | 0 |
3 | 2 | 2 | 0 | 0 |
3 | 3 | 2 | 1 | 0 |
3 | 4 | . | . | 1 |
3 | 5 | . | . | 1 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This is awesome. I checked with my actual data, it works great. Thank you!