Need help on SAS code for calculating survival follow up time (person years)

Contributor
Posts: 28

Hi all,

I'm working with longitudinal data structured as below:

ID  Hbp1  Hbp2  Hbp3  Hbp4  Hbp5
1   0     0     0     0     1
2   0     0     0     .     .
3   0     .     .     0     .
4   0     0     1     1     1
5   0     .     0     .     1

Hbp1-Hbp5: hypertension (0: no, 1: yes) from year 1 to year 5.

I want to calculate person-years (survival time) and the incidence rate for this data, but I have not managed to write the SAS code.

It's very easy to calculate by hand. An incident case is defined as the first time the event appears during the follow-up period.

For example, ID 1 has 4 + 0.5 = 4.5 person-years, ID 2 has 3 + 0.5 = 3.5 person-years, ID 3 has 4 person-years (missing values before the last observation are considered 0), and so on.

But that is impossible to do by hand with a large sample.

Does anybody have experience calculating survival time with this data structure in SAS?

 


Contributor
Posts: 28

Sorry, I have one correction:

Missing values before an observation with value 0 are considered 0.

Missing values before an observation with value 1 stay missing.

Super User
Posts: 7,432

Why does ID 3 have 4 years and not 4.5 or 5.5? And why does ID 2 get 3.5 years when only the last value may be considered 1 when missing?

Nonetheless:

data want;
set have;
array hbp {*} hbp1-hbp5;
i = 1;
do while (i <= dim(hbp) and hbp{i} ne 1);
  i + 1;
end;
person_years = i - .5;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 28

Dear Kurt,

Thank you very much for your quick reply.

You're right, some of my calculations are wrong.

I ran your code and got the error below:

 

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
55
56 data want;
57 set hbp;
58 array hbp {*} hbp1-hbp5;
59 i = 1;
60 do while (i <= dim(hbp) and hbp{i} ne 1);
61 i + 1;
62 end;
63 person_years = i - .5;
64 run;
 
ERROR: Array subscript out of range at line 60 column 29.
id=2 hbp1=0 hbp2=0 hbp3=0 hbp4=. hbp5=. i=6 person_years=. _ERROR_=1 _N_=2
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 2 observations read from the data set WORK.HBP.
WARNING: The data set WORK.WANT may be incomplete. When this step was stopped there were 1 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
 
Do you have the same error?
Super User
Posts: 7,432

Yeah, have to safeguard because SAS does not stop evaluating a boolean "and" when the first false condition is encountered.

Improved code with test data:

data have;
input ID Hbp1 Hbp2 Hbp3 Hbp4 Hbp5;
cards;
1 0 0 0 0 1
2 0 0 0 . .
3 0 . . 0 .
4 0 0 1 1 1
5 0 . 0 . 1
6 0 . 1 . .
;
run;

data want;
set have;
array hbp {*} hbp1-hbp5;
i = 1;
do while (i <= dim(hbp) and person_years = .);
  if hbp{i} = 1 or i = dim(hbp) then person_years = i - .5;
  i + 1;
end;
drop i;
run;

proc print;
run;

Result:

                                                                                         person_
                                    Obs    ID    Hbp1    Hbp2    Hbp3    Hbp4    Hbp5     years

                                     1      1      0       0       0       0       1       4.5  
                                     2      2      0       0       0       .       .       4.5  
                                     3      3      0       .       .       0       .       4.5  
                                     4      4      0       0       1       1       1       2.5  
                                     5      5      0       .       0       .       1       4.5  
                                     6      6      0       .       1       .       .       2.5  
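
One way to sidestep the out-of-range subscript entirely is to let an iterative DO loop own the bounds check and exit with LEAVE, so that hbp{i} is only ever read for a valid i. The following is a minimal sketch of that pattern, not a replacement for the code above. The data set name first_event and the variable first_one are illustrative names that do not appear in the thread, and the step only locates the first visit with value 1.

```sas
/* Test data copied from the post above */
data have;
  input ID Hbp1 Hbp2 Hbp3 Hbp4 Hbp5;
  cards;
1 0 0 0 0 1
2 0 0 0 . .
3 0 . . 0 .
4 0 0 1 1 1
5 0 . 0 . 1
6 0 . 1 . .
;
run;

data first_event;   /* illustrative data set name */
  set have;
  array hbp {*} hbp1-hbp5;
  first_one = .;    /* illustrative name; stays missing if no 1 is found */
  /* The DO bounds keep i within 1..dim(hbp), so hbp{i} is always valid */
  do i = 1 to dim(hbp);
    if hbp{i} = 1 then do;
      first_one = i;
      leave;        /* stop at the first visit with value 1 */
    end;
  end;
  drop i;
run;

proc print data=first_event;
run;
```

With the test data above, first_one should come out as 5, missing, missing, 3, 5 and 3 for IDs 1 to 6.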
Contributor
Posts: 28

Dear Kurt,

I think the results should be like this:

                                    Obs    ID    Hbp1    Hbp2    Hbp3    Hbp4    Hbp5     expect_results   your_results

                                     1      1      0       0       0       0       1       4.5                 4.5
                                     2      2      0       0       0       .       .       3.5                 4.5
                                     3      3      0       .       .       0       .       4.5                 4.5
                                     4      4      0       0       1       1       1       2.5                 2.5
                                     5      5      0       .       0       .       1       4                   4.5
                                     6      6      0       .       1       .       .       2                   2.5

ID 2 can not have the same result as ID 1. It should be 3.5

ID 5 has 1 missing value before the last observation (Hbp5), and this last obs=1, so person_years = 3 years (from Hbp1 to Hbp3) + 1 year (1/2 the length of time from Hbp3 to Hbp5, which is 2 years because of the missing value in year 4, Hbp4) = 4.

ID 6 has 1 missing value before the last observation (Hbp3), and this last obs=1, so person_years = 1 year (Hbp1) + 1 year (1/2 the length of time from Hbp1 to Hbp3, which is 2 years) = 2.

Perhaps I should clarify my rule again:

  • Missing values before an observation with value 0 are considered 0 (so we count this year as 1).
  • Missing values before an observation with value 1 stay missing (so person_years will be 1/2 the length of time from the last obs before the missing value to the first obs with value 1).

Can you help me adjust the code to produce the results for the above conditions? I could not work it out.

Super User
Posts: 7,432


Minhtrang wrote:

Dear Kurt,

I think the results should be like this:

                                    Obs    ID    Hbp1    Hbp2    Hbp3    Hbp4    Hbp5     expect_results   your_results

                                     1      1      0       0       0       0       1       4.5                 4.5
                                     2      2      0       0       0       .       .       3.5                 4.5
                                     3      3      0       .       .       0       .       4.5                 4.5
                                     4      4      0       0       1       1       1       2.5                 2.5
                                     5      5      0       .       0       .       1       4                   4.5
                                     6      6      0       .       1       .       .       2                   2.5

ID 2 can not have the same result as ID 1. It should be 3.5

ID 5 has 1 missing value before the last observation (Hbp5), and this last obs=1, so person_years = 3 years (from Hbp1 to Hbp3) + 1 year (1/2 the length of time from Hbp3 to Hbp5, which is 2 years because of the missing value in year 4, Hbp4) = 4.

ID 6 has 1 missing value before the last observation (Hbp3), and this last obs=1, so person_years = 1 year (Hbp1) + 1 year (1/2 the length of time from Hbp1 to Hbp3, which is 2 years) = 2.

Perhaps I should clarify my rule again:

  • Missing values before an observation with value 0 are considered 0 (so we count this year as 1).
  • Missing values before an observation with value 1 stay missing (so person_years will be 1/2 the length of time from the last obs before the missing value to the first obs with value 1).

Can you help me adjust the code to produce the results for the above conditions? I could not work it out.


Check your rules again. Per your rules, ID 2 would get a virtual "1" in hbp5 and a missing value in hbp4, so the time between hbp3 and hbp5 would be two years; half of that is 1 year, and adding it to the 3 years gives 4 years.

Contributor
Posts: 28

Dear Kurt,

I'm sorry; my English is not good enough, so my explanation was confusing.

For ID 2, he was followed up for 3 years (from hbp1 to hbp3; if hbp2 were missing, I would still consider it as hbp2=0). Information on hbp4 and hbp5 is missing due to dropout or loss to follow-up, so I just add 0.5 year. Finally, the person_years for this ID is 3.5.

Maybe I should adjust my rule as:

Any missing value between 0...0 is considered as 0

Any missing value between 0..1 stays the same as missing, and the person_years for this period=1/2 length of time from 0..to..1 (ex: hbp2...hbp5: 1/2 length of time is 1.5 years)

I hope you can understand my explanation this time and help me adjust the code.

Thank you very much.

Super User
Posts: 7,432

But for ID2, there are TWO years from hbp3 to hbp5 (5 minus 3), and half of that is ONE year. 3 + 1 = 4.

Your value of 3.5 for ID 2 contradicts your example in

"Any missing value between 0..1 stays the same as missing, and the person_years for this period=1/2 length of time from 0..to..1 (ex: hbp2...hbp5: 1/2 length of time is 1.5 years)"

Contributor
Posts: 28

Dear Kurt,

ID 2 has Hbp3 (value 0) Hbp4 (missing value) Hbp5 (missing value): 0 . .

So it's not the case "any missing value between 0..1" but between "0....until the last observation which is also a missing value"

Hbp5 has a missing value, not 1.

Anyway, thank you very much for your help. I will try to write the code myself, maybe as simple but very long code.

I still hope to hear your solution.

Best,

 

Contributor
Posts: 28

Dear Kurt,

I worked it out, thanks to some hints from your code (array and loop). My code is like this:

*hbp1-hbp5: hypertension from visit 1 to 5;

*hp: hypertension after 5 year follow-up;

*year1: last visit with hbp=0 if hp=0;

*incvisit: visit with incidence hypertension;

*lastzero: last visit with hbp=0 before incvisit;

*personyear: person-years;

data hbp;
input id hbp1 hbp2 hbp3 hbp4 hbp5 hp;
datalines;
1 0 0 0 0 1 1
2 0 0 0 . . 0
3 0 . . 0 . 0
4 0 0 1 1 1 1
5 0 . 0 . 1 1
;
run;

data want;
set hbp;
array hbp {*} hbp1-hbp5;
i = 1;
do while (hp=0 and i<=dim(hbp));
if missing(hbp{i})=0 then year1=i;
i+1;
end;
do while (hp=1 and i<=dim(hbp) and incvisit=.);
if hbp{i}=1 then incvisit=i;
i+1;
end;
drop i;
run;

data want1;
set want;
array hbp {*} hbp1-hbp5;
i = 1;
do while (hp=1 and i<=incvisit);
if hbp{i}=0 then lastzero=i;
i+1;
end;
drop i;
run;

data want2;
set want1;
if hp=0 then personyear=year1;
else personyear=lastzero+(incvisit-lastzero)/2;
run;

proc print;
run;

 

The result with variable "personyear":

 

Obs  id  hbp1  hbp2  hbp3  hbp4  hbp5  hp  year1  incvisit  lastzero  personyear
1    1   0     0     0     0     1     1   .      5         4         4.5
2    2   0     0     0     .     .     0   3      .         .         3.0
3    3   0     .     .     0     .     0   4      .         .         4.0
4    4   0     0     1     1     1     1   .      3         2         2.5
5    5   0     .     0     .     1     1   .      5         3         4.0

Solution
‎04-05-2016 05:38 AM
Super User
Posts: 9,867

It is kind of hard to understand what you expect.
Assuming I understand what you mean:



data have;
input ID Hbp1 Hbp2 Hbp3 Hbp4 Hbp5;
cards;
1 0 0 0 0 1
2 0 0 0 . .
3 0 . . 0 .
4 0 0 1 1 1
5 0 . 0 . 1
6 0 . 1 . .
;
run;


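data want;
 /* Create the hash once; it is used only to count distinct values */
 if _n_=1 then do;
  declare hash h();
  h.definekey('k');
  h.definedone();
 end;
 set have;
 array x{*} Hbp:;
 /* Walk backwards to the last visit with value 0, storing each
    distinct value (missing and/or 1) seen after it as a hash key */
 do i=dim(x) to 1 by -1;
  if x{i}=0 then leave;
  k=x{i}; h.replace();
 end;
 /* i = index of the last 0 (0 if none);
    add half a year per distinct trailing value */
 person_year=i+0.5*h.num_items;
 h.clear();
 drop k i;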
run;
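
The hash in the step above counts how many distinct non-zero values (missing and/or 1) follow the last 0, which reproduces the expected_results column for the six test rows. For a longer gap, such as 0 . . 1 ., that count would give 1 + 2 × 0.5 = 2, whereas half the interval from visit 1 to visit 4 gives 1 + 1.5 = 2.5. A more literal sketch of the rules as stated in this thread might look like the following. It rests on assumptions: the names want_explicit, first_one, last_zero and person_year_explicit are illustrative, and the half year added for subjects lost to follow-up follows the expected_results column.

```sas
/* Test data copied from the post above */
data have;
  input ID Hbp1 Hbp2 Hbp3 Hbp4 Hbp5;
  cards;
1 0 0 0 0 1
2 0 0 0 . .
3 0 . . 0 .
4 0 0 1 1 1
5 0 . 0 . 1
6 0 . 1 . .
;
run;

data want_explicit;   /* illustrative data set name */
  set have;
  array x{*} Hbp1-Hbp5;
  first_one = .;      /* first visit with value 1; missing if none */
  last_zero = 0;      /* last visit with value 0 before first_one; 0 if none */
  do i = 1 to dim(x);
    if x{i} = 1 then do;
      first_one = i;
      leave;
    end;
    else if x{i} = 0 then last_zero = i;
  end;

  /* Event: time up to the last 0 plus half the gap to the first 1.
     No event, last 0 before the final visit: assumed lost to follow-up,
     so add half a year.
     No event, 0 at the final visit: full follow-up without an event. */
  if first_one ne . then
    person_year_explicit = last_zero + (first_one - last_zero) / 2;
  else if last_zero < dim(x) then
    person_year_explicit = last_zero + 0.5;
  else
    person_year_explicit = last_zero;
  drop i;
run;

proc print data=want_explicit;
run;
```

For the six test rows this should give 4.5, 3.5, 4.5, 2.5, 4 and 2, the same as the expected_results column. Rows where a 0 follows the first 1 are not covered by the rules in the thread; this sketch simply stops at the first 1.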


Contributor
Posts: 28

Dear Xia,

I'm really surprised by your code! It's so short, and it produces the results I expected.

I have gained more experience now.

Thank you very much. 

Best,
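
The original question also asked for the incidence rate. One common definition is the number of incident cases divided by the total person-years at risk. A minimal sketch under that assumption follows. The data set py and the variable event are illustrative names; person_year is taken from the expected_results column earlier in the thread, and event is set to 1 for IDs with any Hbp value of 1.

```sas
/* Person-years from the expected_results column; event = 1 if any Hbp = 1 */
data py;   /* illustrative data set name */
  input ID person_year event;
  cards;
1 4.5 1
2 3.5 0
3 4.5 0
4 2.5 1
5 4   1
6 2   1
;
run;

/* Incidence rate = incident cases / total person-years */
proc sql;
  select sum(event)                    as cases,
         sum(person_year)              as total_person_years,
         sum(event) / sum(person_year) as incidence_rate
  from py;
quit;
```

For these six rows that is 4 cases over 21 person-years, or about 0.19 cases per person-year.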
