Hi SAS Programmers,
Could you please help me with this problem.
Here is my data:
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
I need to create a variable for censoring, so that it takes a value of 1 if a person dropped out of the study, i.e. if persons variables 1 and 2 are both missing from now on. Here is what I need to get:
1 | 1 | 3 | 1 | 0 |
1 | 2 | 3 | 1 | 0 |
1 | 3 | 3 | 1 | 0 |
1 | 4 | 2 | 1 | 0 |
1 | 5 | . | . | 1 |
2 | 1 | 3 | 1 | 0 |
2 | 2 | . | . | 0 |
2 | 3 | . | . | 0 |
2 | 4 | 2 | 0 | 0 |
2 | 5 | . | . | 1 |
3 | 1 | 2 | 0 | 0 |
3 | 2 | 2 | 0 | 0 |
3 | 3 | 2 | 1 | 0 |
3 | 4 | . | . | 1 |
3 | 5 | . | . | 1 |
I cannot base the calculation of "cens" on missing values, because there is intermittent missingness, so the the code: "if var1=. and var2=. then cens=1" will produce positive censoring indicator for id 2 at times 2 and 3, which is not correct. I feel like the right code will need some sorting and retain statement, but can't figure it out by myself.
I would appreciate any tips or advice!
Thank you!
HI @Dinurik It seems pretty straight forward look up problem in my opinion-->Using Hash
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("time") ;
h.definedone () ;
end;
do _n_=1 by 1 until(last.id);
set mydata;
by id ;
if nmiss(var1,var2)=2 then h.add();
else h.clear();
end;
do _n_=1 to _n_;
set mydata;
cens=h.find()=0;
output;
end;
h.clear();
run;
proc print noobs;run;
HI @Dinurik It seems pretty straight forward look up problem in my opinion->Using Array-Key indexing
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
do _n_=1 by 1 until(last.id);
set mydata;
by id ;
array t(9999) _temporary_;
if nmiss(var1,var2)=2 then t(time)=time;
else call missing(of t(*));
end;
do _n_=1 to _n_;
set mydata;
cens=^^t(time);
output;
end;
call missing(of t(*));
run;
proc print noobs;run;
HI @Dinurik It seems pretty straight forward look up problem in my opinion-->Using Hash
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
if _n_=1 then do;
dcl hash H () ;
h.definekey ("time") ;
h.definedone () ;
end;
do _n_=1 by 1 until(last.id);
set mydata;
by id ;
if nmiss(var1,var2)=2 then h.add();
else h.clear();
end;
do _n_=1 to _n_;
set mydata;
cens=h.find()=0;
output;
end;
h.clear();
run;
proc print noobs;run;
Thank you so much! It is a huge help. It would take me a couple of days if not more to figure it out on my own. It worked!
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
data want;
do i=1 by 1 until(last.id);
set mydata;
by id;
if not missing(var1) or not missing(var2) then _i=i;
end;
do i=1 by 1 until(last.id);
set mydata;
by id;
flag= i>_i ;output;
end;
drop i _i;
run;
@Ksharp Xia, do you remember I often utter my mother is a huge fan of yours. She often bets you are the champ. You never cease to inspire me to go the extra mile.
Anyways, some more fun.
data mydata;
input id time var1 var2 @@;
cards;
1 1 3 1
1 2 3 1
1 3 3 1
1 4 2 1
1 5 . .
2 1 3 1
2 2 . .
2 3 . .
2 4 2 0
2 5 . .
3 1 2 0
3 2 2 0
3 3 2 1
3 4 . .
3 5 . .
;
run;
proc sql;
create table want as
select *, time> max(time*(n(var1,var2)>0)) as cens
from mydata
group by id
order by id, time;
quit;
id | time | var1 | var2 | cens |
---|---|---|---|---|
1 | 1 | 3 | 1 | 0 |
1 | 2 | 3 | 1 | 0 |
1 | 3 | 3 | 1 | 0 |
1 | 4 | 2 | 1 | 0 |
1 | 5 | . | . | 1 |
2 | 1 | 3 | 1 | 0 |
2 | 2 | . | . | 0 |
2 | 3 | . | . | 0 |
2 | 4 | 2 | 0 | 0 |
2 | 5 | . | . | 1 |
3 | 1 | 2 | 0 | 0 |
3 | 2 | 2 | 0 | 0 |
3 | 3 | 2 | 1 | 0 |
3 | 4 | . | . | 1 |
3 | 5 | . | . | 1 |
This is awesome. I checked with my actual data, it works great. Thank you!
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.