Solved: Re: Need help with coding

yoyong555 · Posted 06-11-2020 06:03 PM

Hi everyone.

I hope you are all doing fine.

I have a large data set which requires calculations. For a small data set (probably less than 30), I can patiently use MS Excel.

So, anyone's help would greatly be appreciated.

Thank you in advance.

And keep safe!

------------------------------------

I have a reference data (Table 1) and two tables (Tables 2 and 3) to produce.

Table 1 (reference data) contains two variables - Age (0 to 100) and Age specific death rates. I use this to populate Tables 2 and 3.

Below is a subset.

Age	Age_spec_dth_rates
71	0.023
72	0.023
73	0.023
74	0.023
75	0.037
76	0.037
77	0.037
78	0.037
79	0.037
80	0.063
81	0.063
82	0.063
83	0.063
84	0.063
85	0.147
86	0.147
87	0.147
88	0.147
89	0.147
90	0.147

Table 2 has three variables - age of the patients, age of the patients at last follow up and cumulative death rate.

I need to calculate the cumulative death rates for each patient using the reference data.

Table 2.

Patient_age	Patient_age_last_follow_up	Cum_dth_rate
74	80
78	84
80	83

Patient 1 joined the study at 74 years old and lived until 80 years old.

Patient 2 joined the study at 78 years old and lived until 84 years old.

Patient 2 joined the study at 80 years old and lived until 83 years old.

The cumulative death rate (cum_dth_rate) is obtained by adding the death rates at each year to the cumulative death rate for the previous year.

Table 2a shows the results. Table 2b shows how the values are generated.

Table 2a.

Patient_age	Patient_age_last_follow_up	Cum_dth_rate
74	80	0.208
78	84	0.326
80	83	0.189

Table 2b.

Age	Age_spec_dth_rates	Cum_dth_rate_pt1	Cum_dth_rate_pt2	Cum_dth_rate_pt3
71	0.023
72	0.023
73	0.023
74	0.023
75	0.037	0.023
76	0.037	0.06
77	0.037	0.097
78	0.037	0.134
79	0.037	0.171	0.037
80	0.063	0.208	0.074
81	0.063		0.137	0.063
82	0.063		0.2	0.126
83	0.063		0.263	0.189
84	0.063		0.326
85	0.147
86	0.147
87	0.147
88	0.147
89	0.147
90	0.147

Table 3 is different. Using again the reference data, I need to populate each patient's cumulative death rate until the end of the longest follow - up time of the study. For example, if the study is conducted for 10 years, the table will look like this:

Table 3.

Study Year	Cum_dth_rate_pt1	Cum_dth_rate_pt2	Cum_dth_rate_pt3
1
2
3
4
5
6
7
8
9
10

Table 3a shows the results. Table 3b shows how the values are generated.

Table 3a.

Study Year	Cum_dth_rate_pt1	Cum_dth_rate_pt2	Cum_dth_rate_pt3
1	0.023	0.037	0.063
2	0.06	0.074	0.126
3	0.097	0.137	0.189
4	0.134	0.2	0.252
5	0.171	0.263	0.315
6	0.208	0.326	0.462
7	0.271	0.389	0.609
8	0.334	0.536	0.756
9	0.397	0.683	0.903
10	0.46	0.83	1.05

Table 3b.

Age	Age_spec_dth_rates	Cum_dth_rate_pt1	Cum_dth_rate_pt2	Cum_dth_rate_pt3
71	0.023
72	0.023
73	0.023
74	0.023
75	0.037	0.023
76	0.037	0.06
77	0.037	0.097
78	0.037	0.134
79	0.037	0.171	0.037
80	0.063	0.208	0.074
81	0.063	0.271	0.137	0.063
82	0.063	0.334	0.2	0.126
83	0.063	0.397	0.263	0.189
84	0.063	0.46	0.326	0.252
85	0.147		0.389	0.315
86	0.147		0.536	0.462
87	0.147		0.683	0.609
88	0.147		0.83	0.756
89	0.147			0.903
90	0.147			1.05

The generated values are an extension of Table 2b where the cumulative death rates are calculated until the end of the entire study period (in this case, 10th year).

-----

Again, thank you very much for your help!

Yoyong.

PGStats · Posted 06-11-2020 07:21 PM

I would use this code:

data rates;
input Age   Age_spec_dth_rate;
datalines;
71    0.023
72    0.023
73    0.023
74    0.023
75    0.037
76    0.037
77    0.037
78    0.037
79    0.037
80    0.063
81    0.063
82    0.063
83    0.063
84    0.063
85    0.147
86    0.147
87    0.147
88    0.147
89    0.147
90    0.147
;

data patients;
input Patient_age Patient_age_last_follow_up;
patient_Id = _n_;
datalines;
74    80   
78    84   
80    83
;

data 
    t2a(keep=patient_id patient_age Patient_age_last_follow_up cum_rate) 
    t3a(keep=patient_id year cum_rate);
array r {0:100} _temporary_;
if _n_ = 1 then do until(done);
    set rates end=done;
    r{age} = Age_spec_dth_rate;
    end;
set patients;
cum_rate = 0;
do year = 1 to 10;
    age = patient_age + year - 1;
    cum_rate = cum_rate + r{min(age, hbound(r))};
    if age = Patient_age_last_follow_up - 1 then output t2a;
    output t3a;
    end;
run;

PG

View solution in original post

smantha · Posted 06-11-2020 07:03 PM

data table_2a(keep=patient_age patient_age_last_follow_up Cum_dth_rate) table_2b(keep=patient age Cum_dth_rate)
table_3a(keep=year patient Cum_dth_rate) table_3b (keep=age patient Cum_dth_rate);
length patient $32.;
array age_death_rates{100} (0.023,
0.023,
0.023,
0.023,
0.037,
0.037,
0.037,
0.037,
0.037,
0.063,
0.063,
0.063,
0.063,
0.063,
0.147,
0.147,
0.147,
0.147,
0.147,
0.147,
0.023,
0.023,
0.023,
0.023,
0.037,
0.037,
0.037,
0.037,
0.037,
0.063,
0.063,
0.063,
0.063,
0.063,
0.147,
0.147,
0.147,
0.147,
0.147,
0.147,
0.023,
0.023,
0.023,
0.023,
0.037,
0.037,
0.037,
0.037,
0.037,
0.063,
0.063,
0.063,
0.063,
0.063,
0.147,
0.147,
0.147,
0.147,
0.147,
0.147,
0.023,
0.023,
0.023,
0.023,
0.037,
0.037,
0.037,
0.037,
0.037,
0.063,
0.063,
0.063,
0.063,
0.063,
0.147,
0.147,
0.147,
0.147,
0.147,
0.147,
0.023,
0.023,
0.023,
0.023,
0.037,
0.037,
0.037,
0.037,
0.037,
0.063,
0.063,
0.063,
0.063,
0.063,
0.147,
0.147,
0.147,
0.147,
0.147,
0.147
);
input patient_age patient_age_last_follow_up;
retain count 1;
Cum_dth_rate=0;
do age=patient_age to patient_age_last_follow_up-1;
Cum_dth_rate = age_death_rates[age] + Cum_dth_rate ;
put age_death_rates[age]=;
end;
output table_2a;
Cum_dth_rate = 0;
do age=patient_age to patient_age_last_follow_up;
patient='patient_cnt'||strip(put(count,12.));
output table_2b;
Cum_dth_rate = age_death_rates[age] + Cum_dth_rate ;
end;
Cum_dth_rate = 0;
do age=patient_age+1 to patient_age+10;
patient='patient_cnt'||strip(put(count,12.));
Cum_dth_rate = age_death_rates[age-1] + Cum_dth_rate ;
year=age-patient_age;
output table_3a;
output table_3b;
end;

count=count+1;
cards;
74 80
78 84
80 83
;;;
run;

proc sort data=table_3b;
by age;

proc transpose data=table_3b out=table_3b_trans(drop=_name_);
by age;
id patient;
var Cum_dth_rate;
;
run;

proc sort data=table_3a;
by year;

proc transpose data=table_3a out=table_3a_trans(drop=_name_);
by year;
id patient;
var Cum_dth_rate;
;
run;

proc sort data=table_2b;
by age;

proc transpose data=table_2b out=table_2b_trans(drop=_name_);
by age;
id patient;
var Cum_dth_rate;
;
run;

I created a random array of cumulative death rates by year. Please verify before you use it.

yoyong555 · Posted 06-11-2020 08:11 PM

Hi again @ballardw.

I have additional questions.

a. Instead of using cards, how do I change the code if I want to use a SAS data that contains the two columns with the same variable names?

b. How do I change the code to produce this output ? Cum_dth_rateYno refers to cumulative death rate from Year 1 to Y 10. It is merging the Table 2 and 3 (but transposing the results).

Patient_age	Patient_age_last_follow_up	Cum_dth_rate	Cum_dth_rate_Y1	Cum_dth_rate_Y2	Cum_dth_rate_Y3	Cum_dth_rate_Y4	Cum_dth_rate_Y5

ballardw · Posted 06-11-2020 07:06 PM

Are the variables Cum_dth_rate_pt1, Cum_dth_rate_pt2 and Cum_dth_rate_pt3 supposed to be for Patient 1, Patient 2 and Patient 3?

If you are going to do further work in SAS that is a very cumbersome file layout. to work with.

yoyong555 · Posted 06-11-2020 07:10 PM

Hi @ballardw.

Yes, the Cum_dth_rate_pt1, Cum_dth_rate_pt2 and Cum_dth_rate_pt3 are supposed to be for Patient 1, Patient 2 and Patient 3.

Thank you very much for your reply and advice.

PGStats · Posted 06-11-2020 07:21 PM

I would use this code:

data rates;
input Age   Age_spec_dth_rate;
datalines;
71    0.023
72    0.023
73    0.023
74    0.023
75    0.037
76    0.037
77    0.037
78    0.037
79    0.037
80    0.063
81    0.063
82    0.063
83    0.063
84    0.063
85    0.147
86    0.147
87    0.147
88    0.147
89    0.147
90    0.147
;

data patients;
input Patient_age Patient_age_last_follow_up;
patient_Id = _n_;
datalines;
74    80   
78    84   
80    83
;

data 
    t2a(keep=patient_id patient_age Patient_age_last_follow_up cum_rate) 
    t3a(keep=patient_id year cum_rate);
array r {0:100} _temporary_;
if _n_ = 1 then do until(done);
    set rates end=done;
    r{age} = Age_spec_dth_rate;
    end;
set patients;
cum_rate = 0;
do year = 1 to 10;
    age = patient_age + year - 1;
    cum_rate = cum_rate + r{min(age, hbound(r))};
    if age = Patient_age_last_follow_up - 1 then output t2a;
    output t3a;
    end;
run;

PG

yoyong555 · Posted 06-11-2020 08:17 PM

Thank you @PGStats. for your reply.

I have additional questions.

a. Instead of using cards, how do I change the code if I want to use a SAS data that contains the two columns with the same variable names?

b. How do I change the code to produce this output ? Cum_dth_rateYno refers to cumulative death rate from Year 1 to Y 10. It is merging the Table 2 and 3 (but transposing the results).

Patient_age	Patient_age_last_follow_up	Cum_dth_rate	Cum_dth_rate_Y1	Cum_dth_rate_Y2	Cum_dth_rate_Y3	Cum_dth_rate_Y4	Cum_dth_rate_Y5

PGStats · Posted 06-11-2020 08:57 PM

a) The code contains 3 data steps. The first two create the datasets RATES and PATIENTS. If you already have those, just skip those data steps.

b) Unless all you want is to return play with Excel, DON'T do this! SAS does not expect wide data. But if you realy have to, do this:

proc sql;
create table t3Long as
select 
    a.patient_id,
    patient_age,
    Patient_age_last_follow_up,
    a.cum_rate,
    year,
    b.cum_rate as cum_rate_y
from t2a as a, t3a as b
where a.patient_id=b.patient_id
order by patient_id, year;
quit;

proc transpose data=t3Long prefix=cum_rate_ out=t3aWide(drop=_name_);
by patient_id patient_age Patient_age_last_follow_up cum_rate;
id year;
var cum_rate_y;
run;For

For SAS work, you would do better with the dataset t3Long.

PG

yoyong555 · Posted 06-11-2020 09:27 PM

I am really learning a lot. Thank you @PGStats !

In the RATES data set, a Gender column is added where each gender will have different sets of death rates for patients from 1 to 100 years old.

How will the code change if the cumulative death rate is both gender and age specific?

Gender	Age	Patient_age_last_follow_up	cum_rate	cum_rate_1	cum_rate2	cum_rate3
M	74	80
M	78	84
F	80	83

PGStats · Posted 06-11-2020 11:11 PM

Make separate datasets for each gender, then concatenate them. You could also work gender through the data step code, which would also be a good way to make sure you understand what the code is doing.

PG

yoyong555 · Posted 06-11-2020 11:43 PM

Thank you @PGStats .

I am already analyzing the data by gender and learning the code.

Here is the result of the last syntax.

I have two more questions.

I want to calculate the survival rate for of each patient for each cum_rate_year.

This means 10 more columns will be added to the table.

patient_id	surv_rate_1	surv_rate_2	surv_rate3	.....
1
2
3

surv_rate_n = exp -(cum_rate_n)

For patient 1, the surv_rate_1 for cum_rate_1 is exp(-0.023) = 0.977.

For patient 2, the surv_rate_1 for cum_rate_1 is exp(-0.037) = 0.963.

For patient 3, the surv_rate_1 for cum_rate_1 is exp(-0.063) = 0.939.

I should do this for each patient for every cum_rate. How do I include this in the syntax?

And lastly, I would like to produce a table that shows the average of surv_rates across all patients for each year.

year	average_surv_rate
1
2
3
4
5
6
7
8
9
10

PGStats · Posted 06-12-2020 12:50 PM

This should do it.

proc sql;
create table t3Long as
select 
    a.patient_id,
    patient_age,
    Patient_age_last_follow_up,
    a.cum_rate,
    year,
    b.cum_rate as cum_rate_y,
    exp(-b.cum_rate) as surv_rate_y
from t2a as a, t3a as b
where a.patient_id=b.patient_id
order by patient_id, year;
create table t3AvgSurv as
select 
    year,
    mean(surv_rate_y) as average_surv_rate
from t3Long
group by year;
quit;

proc transpose data=t3Long prefix=cum_rate_ out=t3aWide(drop=_name_);
by patient_id patient_age Patient_age_last_follow_up cum_rate;
id year;
var cum_rate_y;
run;

proc transpose data=t3Long prefix=surv_rate_ out=t3aSurvWide(drop=_name_);
by patient_id patient_age Patient_age_last_follow_up;
id year;
var surv_rate_y;
run;

Cheers!

PG

Registration is open

SAS Training: Just a Click Away