Hi SAS community,
I hope you are doing well. For a survival analysis, I define each individual's endpoint in one of two ways:
1. If the individual is diagnosed with depression during the study period, the endpoint is the wave of that diagnosis.
2. Otherwise, if the individual has no depression by the end of the study period, the endpoint is their last observed wave.
How can I calculate the survival time for each individual?
Thanks for all your help!
This is the original data:
idauniq | depression | wave |
100052 | 0 | 3 |
100052 | 0 | 4 |
100052 | 0 | 5 |
100052 | 0 | 6 |
100052 | 0 | 7 |
100052 | 0 | 8 |
100052 | 0 | 9 |
100055 | 0 | 3 |
100055 | 0 | 4 |
100055 | 0 | 5 |
100055 | 0 | 6 |
100055 | 0 | 7 |
100055 | 0 | 8 |
100057 | 0 | 3 |
100057 | 0 | 4 |
100057 | 0 | 5 |
100057 | 0 | 6 |
100057 | 0 | 7 |
100057 | 0 | 8 |
100057 | 0 | 9 |
100059 | 0 | 3 |
100059 | 0 | 4 |
100059 | 1 | 5 |
100061 | 0 | 3 |
100061 | 1 | 4 |
100061 | 0 | 5 |
100061 | 0 | 6 |
100068 | 0 | 3 |
100068 | 0 | 5 |
100068 | 0 | 6 |
100068 | 1 | 7 |
100080 | 0 | 3 |
100080 | 0 | 5 |
100081 | 0 | 3 |
100081 | 0 | 4 |
100081 | 0 | 5 |
100081 | 0 | 6 |
100081 | 1 | 7 |
These are the results I expect:
idauniq | depression | wave | Time | censor |
100052 | 0 | 9 | 12 | 0 |
100055 | 0 | 8 | 10 | 0 |
100057 | 0 | 9 | 12 | 0 |
100059 | 1 | 5 | 4 | 1 |
100061 | 0 | 4 | 2 | 0 |
100068 | 1 | 7 | 8 | 1 |
100080 | 0 | 5 | 4 | 0 |
100081 | 1 | 7 | 8 | 1 |
You can follow a SET statement with
by idauniq ;
which allows you to determine if the observation-in-hand is the first (or last) obs for a given idauniq.
You will output one obs per idauniq. It will be either the last obs (if there are no preceding depression obs) or else the first obs with depression=1:
data have;
input idauniq depression wave;
datalines;
100052 0 3
100052 0 4
100052 0 5
100052 0 6
100052 0 7
100052 0 8
100052 0 9
100055 0 3
100055 0 4
100055 0 5
100055 0 6
100055 0 7
100055 0 8
100057 0 3
100057 0 4
100057 0 5
100057 0 6
100057 0 7
100057 0 8
100057 0 9
100059 0 3
100059 0 4
100059 1 5
100061 0 3
100061 1 4
100061 0 5
100061 0 6
100068 0 3
100068 0 5
100068 0 6
100068 1 7
100080 0 3
100080 0 5
100081 0 3
100081 0 4
100081 0 5
100081 0 6
100081 1 7
run;
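data want (drop=n_dep);
  set have;
  by idauniq;                       /* HAVE must already be sorted by idauniq */
  if first.idauniq then n_dep=0;    /* reset the running count for each individual */
  n_dep+depression;                 /* sum statement: n_dep is retained across obs */
  /* keep the first depression=1 obs, or the last obs if depression never occurs */
  if (n_dep=1 and depression=1) or (n_dep=0 and last.idauniq=1);
  time=2*(wave-3);                  /* 2 time units per wave, counted from wave 3 */
run;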
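The expected results also include a censor column, which the step above does not create. A minimal sketch of one way to add it, assuming censor should be 1 when depression is observed and 0 otherwise (this matches the expected table for every ID except 100061, where the input data has depression=1 at wave 4 but the expected table shows 0). The dataset name want_censor is hypothetical. The rule time=2*(wave-3) is itself inferred from the expected Time values, for example wave 9 gives 2*(9-3)=12 and wave 5 gives 2*(5-3)=4.

```sas
/* Assumption: censor = 1 means the event (depression) was observed,  */
/* censor = 0 means the individual was censored at their last wave.   */
data want_censor;                 /* hypothetical dataset name */
  set want;                       /* one obs per idauniq from the step above */
  censor = depression;
run;

proc print data=want_censor noobs;
  var idauniq depression wave time censor;
run;
```

With this coding, 0 is the censoring value, so a later survival procedure would refer to it as time*censor(0).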
And what is the rule by which you calculate the TIME variable?
For instance, for ID 100052, you start with
idauniq | depression | wave |
100052 | 0 | 3 |
100052 | 0 | 4 |
100052 | 0 | 5 |
100052 | 0 | 6 |
100052 | 0 | 7 |
100052 | 0 | 8 |
100052 | 0 | 9 |
From that you get
idauniq | depression | wave | Time | censor |
100052 | 0 | 9 | 12 | 0 |
How did you get time=12?
Also, if an individual has depression=1 in a given wave, does that mean you will ignore subsequent waves for that individual? For instance, see
idauniq | depression | wave |
100061 | 0 | 3 |
100061 | 1 | 4 |
100061 | 0 | 5 |
100061 | 0 | 6 |
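To see how often this situation occurs, here is a rough sketch that lists every record falling after an individual's first depression=1 wave. The names after_event and seen_event are hypothetical, and it assumes HAVE is sorted by idauniq and wave.

```sas
/* List records that come after an individual's first depression=1 wave */
data after_event;                        /* hypothetical dataset name */
  set have;
  by idauniq;
  retain seen_event;                     /* hypothetical flag, kept across obs */
  if first.idauniq then seen_event = 0;  /* reset for each individual */
  if seen_event then output;             /* an earlier wave already had depression=1 */
  if depression = 1 then seen_event = 1; /* set the flag after the output check */
  drop seen_event;
run;
```

For the sample data above, this should keep only waves 5 and 6 of idauniq 100061, which are exactly the records the accepted DATA step ignores.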