Varrelle

Dear SAS community:

 

I am using the sample data publicly available at:

 

https://stats.idre.ucla.edu/wp-content/uploads/2016/02/whas500.sas7bdat

 

The tutorial is found here along with a description of the data set: https://stats.idre.ucla.edu/sas/seminars/sas-survival/

 

  • lenfol: length of followup, terminated either by death or censoring. The outcome in this study.
  • fstat: the censoring variable, loss to followup=0, death=1
  • age: age at hospitalization
  • bmi: body mass index
  • hr: initial heart rate
  • gender: males=0, females=1

 

I have attached the data to this post for convenience.

 

I believe the units of follow-up are days, but for the sake of my question, let's instead assume that they are years. If this were the case, the minimum follow-up time captured by the LENFOL variable would be 1 year and the maximum would be 2358 years.

 

My understanding of Cox PH regression is that while the hazard function may vary over time, the hazard ratio (HR) is assumed to remain constant. Please correct me if I am wrong, but this implies that the HR at year 1 is equal to the HR at year 2358 when the HR is estimated from the entire length of follow-up (2358 years in this study).
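To make my understanding concrete, here is the usual Cox model written for a single binary covariate such as gender. The symbols $h_0(t)$ (baseline hazard), $\beta$ (log hazard ratio), and $x$ (covariate value, 0 or 1) are notation I am introducing for illustration; they do not come from the PHREG output.

```latex
\begin{align}
h(t \mid x) &= h_0(t)\,\exp(\beta x) \\
\mathrm{HR}(t) &= \frac{h(t \mid x = 1)}{h(t \mid x = 0)}
               = \frac{h_0(t)\,e^{\beta}}{h_0(t)}
               = e^{\beta}
\end{align}
```

The baseline hazard cancels, so under the model the ratio does not depend on $t$. This is an assumption of the model rather than something the data guarantee.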

 

If I wanted to estimate the 5-year HR (i.e., assuming the study ended at year 5), could the PHREG procedure return the HR as if follow-up had ended at year 5 instead of at the actual end of the study (2358 years in this case)? For example, to estimate the association between death and gender, I used the following SAS code:

 

libname ucla "C:\<FILEPATH>";

data ucla_surv;
set ucla.whas500;
run;


proc phreg data=ucla_surv;
model lenfol*fstat(0) = gender/ties=efron;
run;

This results in an HR estimate over the entire length of follow-up. Could my code be modified to estimate the 5-year HR as described above (with the study artificially ending at year 5)?

 

Relatedly, would it be appropriate to create new follow-up and event variables that truncate the data at year 5, and then fit the model with these new variables, as follows?

 

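data ucla_surv_5yr;
	set ucla_surv;

	label
		lenfol5="5-year follow-up"
		fstat5="Event indicator for 5-year FU; 1=death,0=censor"
	;

	/* Follow-up ended before year 5: keep the observed time and status. */
	/* Note: the strict < means a death at exactly lenfol=5 is censored; */
	/* use <= instead if such deaths should count as events.             */
	if lenfol < 5 then do;
		fstat5=fstat;
		lenfol5=lenfol;
	end;
	/* Otherwise censor administratively at year 5. */
	else do;
		fstat5=0;
		lenfol5=5;
	end;
run;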

proc phreg data=ucla_surv;
model lenfol*fstat(0) = gender/ties=efron;
title "HR over entire study FU";
ods select ParameterEstimates;
run;
title;

proc phreg data=ucla_surv_5yr;
model lenfol5*fstat5(0) = gender/ties=efron;
title "HR over 5 years of FU";
ods select ParameterEstimates;
run;
title;
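In symbols, the truncation in the data step above amounts to administrative censoring at a cutoff $c = 5$. Here $T_i$ stands for lenfol and $\delta_i$ for fstat of subject $i$; the starred quantities correspond to lenfol5 and fstat5. The notation is mine and is only meant as a sketch of what the code does.

```latex
\begin{align}
T_i^{*} &= \min(T_i,\, c), \qquad c = 5 \\
\delta_i^{*} &= \delta_i \cdot \mathbf{1}\{T_i < c\}
\end{align}
```

The strict inequality mirrors the `lenfol < 5` condition in the data step. If the proportional hazards assumption held over the whole follow-up, I would expect both models to target the same log hazard ratio, with the truncated one simply using fewer events.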

The output shows that the HR estimate has changed: over the entire follow-up period, the HR for death by gender (females vs. males) was 1.465, whereas with follow-up truncated at 5 years the estimate was 1.363. Because truncation leaves fewer observed events, the 5-year estimate is also less precise.
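Since the two estimates differ, one check I could run is whether the gender effect is really constant over follow-up, by adding a time-by-covariate interaction inside PROC PHREG. This is only a sketch: the variable name gender_t and the choice of log(lenfol) as the time function are my own assumptions, and it reuses the ucla_surv data set from above.

```sas
proc phreg data=ucla_surv;
	/* gender_t is a hypothetical time-dependent covariate (name is an assumption) */
	model lenfol*fstat(0) = gender gender_t / ties=efron;
	/* Programming statement: re-evaluated at each event time */
	gender_t = gender*log(lenfol);
	title "Check of PH for gender via a time interaction";
	ods select ParameterEstimates;
run;
title;
```

A coefficient on gender_t close to zero would be consistent with a constant HR. A clearly nonzero one would suggest the HR changes over time, in which case a single HR from either follow-up window is only an average over that window.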

 

I welcome any thoughts about my approach from the SAS community. 

 

Thanks very much.

 

 

 

 

 

 

 

 

 
