Question about assessing the assumptions of survival analysis

Musfer · Posted 11-10-2019 10:16 PM

Hello,

I have a question about assessing the proportionality assumption for the Fine and Gray model in SAS. I looked at the available resources and found a SAS tutorial on this, which is great.

https://www.sas.com/content/dam/SAS/en_ca/User%20Group%20Presentations/Vancouver-User-Group/Gondara_...

However, when I tried to reproduce the results by following the same steps as the tutorial, I found one step that I could not understand. The code looks like this:

/**Checking PH assumption:Export Schoenfeld residuals from PHREG**/

proc phreg data=follic plots(overlay=stratum)=cif
covs(aggregate) out=estimates;
model dftime*cens(0)=agedecade hgb clinstg chemo /
eventcode=1;
output out=test ressch=WSR_agedecade WSR_hgb WSR_clinstg
WSR_chemo;

run;

This step was very clear to me. However, the following step was not:

/**Checking PH assumption: Merge estimates with residuals and create an adjusted estimate(beta(t))**/
data schoenfeld_data;
merge test(keep=dftime by agedecade2 hgb2
clinstg2 chemo2) estimates;
by by;
rescaled_WSR_agedecade=agedecade2+agedecade;
rescaled_WSR_hgb=hgb2+hgb;
rescaled_WSR_clinstg=clinstg2+clinstg;
rescaled_WSR_chemo=chemo2+chemo;
ldftime=log(dftime+1);
label rescaled_WSR_agedecade="beta(t) of age per decade"
rescaled_WSR_hgb="beta(t) of haemoglobin"
rescaled_WSR_clinstg="beta(t) of stage"
rescaled_WSR_chemo="beta(t) of chemotherapy"
ldftime="log of time";
run;

The data set used can be accessed at:

https://support.sas.com/documentation/onlinedoc/stat/ex_code/143/liftcrsk.html

My questions:

1/ What is "by" in the merge statement?

2/ What are these new variables in the keep statement?

3/ Why is "by" written twice ("by by") in the by statement?

4/ Are there any other ways to modify the code to calculate the rescaled residuals?

5/ Are there any other ways to assess the proportionality assumption for the Fine and Gray model?

Thanks in advance for your help.

jarg · Posted 11-15-2019 12:40 AM

of your 5 questions i can answer 3 - i'm not much of a stats person, sorry

1) when multiple datasets are match-merged in a DATA step, a "by" statement names the key variable(s) to join them on - it's much the same as saying table1.variable = table2.variable when joining in SQL/PROC SQL
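
to make the comparison concrete, here is a minimal sketch using two made-up tables (table1, table2 and the variables id, x, y are hypothetical, not from the tutorial). it match-merges them with a "by" statement and then builds a roughly equivalent full join in PROC SQL:

```sas
/* Hypothetical toy tables, invented for illustration only */
data table1;
  input id x;
  datalines;
1 10
2 20
3 30
;
run;

data table2;
  input id y;
  datalines;
1 100
2 200
4 400
;
run;

/* A match-merge expects each input to be sorted (or indexed) by the BY variable */
proc sort data=table1; by id; run;
proc sort data=table2; by id; run;

/* DATA step match-merge: rows are paired where id is equal */
data merged;
  merge table1 table2;
  by id;
run;

/* Roughly the same result as a full join in PROC SQL */
proc sql;
  create table joined as
  select coalesce(a.id, b.id) as id, a.x, b.y
  from table1 as a
  full join table2 as b
  on a.id = b.id;
quit;
```

with no IN= filtering, the DATA step keeps ids that appear in only one table, which is why the SQL version is written as a full join rather than an inner join.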

2) i haven't looked at the prior code but these will be variables that already exist in the TEST dataset (the naming standards in this PDF aren't great) - which leads us to the next question:

3) it looks like the authors have not followed one of the golden rules of programming - never name a variable the same thing as an existing statement/function. the two datasets are being merged together using a common key named "BY" - which you can see in the KEEP statement for the TEST dataset. using the SQL comparison again, it's the same as saying table1.by = table2.by
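
a small sketch of why "by by" is legal, assuming two made-up datasets (one, two, and the variables a, b are hypothetical) that share a key variable which happens to be named "by":

```sas
/* Hypothetical toy datasets whose key variable is literally named BY */
data one;
  input by a;
  datalines;
1 5
2 6
;
run;

data two;
  input by b;
  datalines;
1 7
2 8
;
run;

data both;
  merge one(keep=by a) two;  /* "by" inside KEEP= is the variable */
  by by;                     /* first BY: the statement; second BY: the variable */
run;
```

SAS reads the first "by" as the statement keyword and the second as the name of the variable to merge on, so this runs, but renaming the key to something like id would make the step much easier to read.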
