BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nchesnaye
Calcite | Level 5

Hi,

I would like to plot hazard curves using proc phreg. I know proc lifetest is capable of this, but i need the 'entry=' option (for left-truncated data) which is provided only by proc phreg. Is there anyway I can get the hazard function using phreg (smoothed if possible)?

Thanks!

Kind regards,

Nick

1 ACCEPTED SOLUTION

Accepted Solutions
JacobSimonsen
Barite | Level 11

An example of a kernel smoother to get the hazard function;

*simulate a dataset with survivial times;
*the baseline hazard, lamda0, is t+1;
data simulation2;
  call streaminit(1234567);
  do i=1 to 10000;
    x=(i>5000);
    censur_time=-2*log(rand('uniform',0,1));
    eventt=sqrt(-2*log(rand('uniform',0,1))/(exp(x*0.25)) +1)-1;
    event=(eventt<censur_time);
    t=min(eventt,censur_time);
    output;
  end;
  keep t  x event ;
run;

proc phreg data=simulation2;
  class x(ref="0")/param=ref;
  model t*event(0)=x;
  baseline out=baseline  cumhaz=cumhaz;
run;

*from the cumulated baseline I smooth out the increments to get an estimate of the hazard-function itself;

*here I calculate the increments;
data baseline2;
  set baseline;
  retain last;
  delta=cumhaz-last;
  last=cumhaz;
  keep t delta;
run;

*delta = length of the interval over which average is taken;

*dellta should be choosen larger if you have a longer time scale;


%let delta=0.1;

*Here construct a dataset of time-values where the estimated hazardfunction will be calculated;
data template;
  do t=0.01 to 2 by 0.01;
  output;
  end;
run;

*Here I smooth out the increments;

*the kernel function I use is the triangulare function;

*Notice, I have not taken care of the boundary problem at zero. therefore I dont estimate lambda(t) for t<&delta.;

proc fcmp outlib=work.function.kernel;
  function k(t);
     k= (-t**2+1)*3/4;
     *k= 1-abs(t);
return (k);
  endsub;
quit;

option cmplib=(work.function);
data smoother;
  set baseline2(in=a keep=t) template(in=b );
  by t;
  retain dsid;
  if _N_=1 then do;
    obsid=0;
    dsid=open('baseline2');
  end;
  if a then obsid+1;
  if b;
  *forward;
  hazard=0;
  obsid=max(obsid,1);

  do i=obsid to attrn(dsid,'nobs') until (runningt>t+&delta.);
    obs=fetchobs(dsid,i);
    runningt=getvarn(dsid,1);
     delta=getvarn(dsid,2);
   * put 'a' i= runningt=;
    if t-&delta.<runningt<t+&delta. then do;
      if t>=&delta. then hazard+delta*k((t-runningt)/&delta.)/&delta.;
   else if runningt<2*t then hazard+delta*k((t-runningt)/t)/t;
end;
*output;
  end;

  *backward;
  do i=obsid-1 to 1 by -1 until (runningt<t-&delta.);
    *put i=;
    *put 'b' i= runningt=;
    obs=fetchobs(dsid,i);
    runningt=getvarn(dsid,1);
    delta=getvarn(dsid,2);
if runningt=0 then delta=0;
    if t-&delta.<runningt<t+&delta. then do;
      if t>=&delta. then hazard+delta*k((t-runningt)/&delta.)/&delta.;
   else if runningt<2*t then hazard+delta*k((t-runningt)/t)/t;

end;
*output;
  end;
  true_lambda0=(t+1);
run;

*then it has to be plotted;

symbol1 i=join v=none;
symbol2 i=join v=none;
proc gplot data=smoother;
  plot (hazard true_lambda0)*t/overlay;
run;

I changed a bit: a function now defines the kernel smoother, I use the Epanechnikov, and uncommented the Triangular kernel. Further, I changed the bandwidth for t close to zero, such the smoother now can start at zero instead. Message was edited by: Jacob Simonsen

View solution in original post

7 REPLIES 7
Ksharp
Super User

I am afraid you need to save the data , and plot it by proc gplot , here is an example.

proc phreg data=hero ;
model time*status(0)=prison dose / rl ;
strata clinic ;
baseline out=phout loglogs=lls /method=ch ;
run;

proc gplot data=phout ;
 plot lls*time=clinic ;
run;

Xia Keshan

nchesnaye
Calcite | Level 5

Hi Xia,

Thanks for your reply. Your syntax provides the log of negative log of survival, which looks very much like the cumulative hazard. What I need though is to plot the 'instantaneous' hazard against time. In other words, the risk of the event occuring / hazard rate / failure rate at any given time. I am also not sure how to translate the log of negative log of survival scale to hazard on the y-axis.

JacobSimonsen
Barite | Level 11

Hi Nick,

It is not possible to get the smoothed hazard-function directly out of phreg. It can be that you can use some kernel smoothing technique to smooth the increments of the cumulated hazard function, but in that case it will be difficult to get confidence intervals of the hazard-function.

If the hazard-function is the main goal of the analysis then I will suggest you to use an accelerated failure time model instead (with PROC LIFEREG), where you can get a hazard-function with a full parametric form.

Or alternative, you can assume piecewise constant hazard-rates and perform a poisson-regression with proc genmod. Here the hazard-function will also have a full parametric form.

Jacob

JacobSimonsen
Barite | Level 11

An example of a kernel smoother to get the hazard function;

*simulate a dataset with survivial times;
*the baseline hazard, lamda0, is t+1;
data simulation2;
  call streaminit(1234567);
  do i=1 to 10000;
    x=(i>5000);
    censur_time=-2*log(rand('uniform',0,1));
    eventt=sqrt(-2*log(rand('uniform',0,1))/(exp(x*0.25)) +1)-1;
    event=(eventt<censur_time);
    t=min(eventt,censur_time);
    output;
  end;
  keep t  x event ;
run;

proc phreg data=simulation2;
  class x(ref="0")/param=ref;
  model t*event(0)=x;
  baseline out=baseline  cumhaz=cumhaz;
run;

*from the cumulated baseline I smooth out the increments to get an estimate of the hazard-function itself;

*here I calculate the increments;
data baseline2;
  set baseline;
  retain last;
  delta=cumhaz-last;
  last=cumhaz;
  keep t delta;
run;

*delta = length of the interval over which average is taken;

*dellta should be choosen larger if you have a longer time scale;


%let delta=0.1;

*Here construct a dataset of time-values where the estimated hazardfunction will be calculated;
data template;
  do t=0.01 to 2 by 0.01;
  output;
  end;
run;

*Here I smooth out the increments;

*the kernel function I use is the triangulare function;

*Notice, I have not taken care of the boundary problem at zero. therefore I dont estimate lambda(t) for t<&delta.;

proc fcmp outlib=work.function.kernel;
  function k(t);
     k= (-t**2+1)*3/4;
     *k= 1-abs(t);
return (k);
  endsub;
quit;

option cmplib=(work.function);
data smoother;
  set baseline2(in=a keep=t) template(in=b );
  by t;
  retain dsid;
  if _N_=1 then do;
    obsid=0;
    dsid=open('baseline2');
  end;
  if a then obsid+1;
  if b;
  *forward;
  hazard=0;
  obsid=max(obsid,1);

  do i=obsid to attrn(dsid,'nobs') until (runningt>t+&delta.);
    obs=fetchobs(dsid,i);
    runningt=getvarn(dsid,1);
     delta=getvarn(dsid,2);
   * put 'a' i= runningt=;
    if t-&delta.<runningt<t+&delta. then do;
      if t>=&delta. then hazard+delta*k((t-runningt)/&delta.)/&delta.;
   else if runningt<2*t then hazard+delta*k((t-runningt)/t)/t;
end;
*output;
  end;

  *backward;
  do i=obsid-1 to 1 by -1 until (runningt<t-&delta.);
    *put i=;
    *put 'b' i= runningt=;
    obs=fetchobs(dsid,i);
    runningt=getvarn(dsid,1);
    delta=getvarn(dsid,2);
if runningt=0 then delta=0;
    if t-&delta.<runningt<t+&delta. then do;
      if t>=&delta. then hazard+delta*k((t-runningt)/&delta.)/&delta.;
   else if runningt<2*t then hazard+delta*k((t-runningt)/t)/t;

end;
*output;
  end;
  true_lambda0=(t+1);
run;

*then it has to be plotted;

symbol1 i=join v=none;
symbol2 i=join v=none;
proc gplot data=smoother;
  plot (hazard true_lambda0)*t/overlay;
run;

I changed a bit: a function now defines the kernel smoother, I use the Epanechnikov, and uncommented the Triangular kernel. Further, I changed the bandwidth for t close to zero, such the smoother now can start at zero instead. Message was edited by: Jacob Simonsen

nchesnaye
Calcite | Level 5

I played around a bit to get the right bandwidth and now it's perfect! Thank you for your help!

JacobSimonsen
Barite | Level 11

If you need then you have here the reference for the applied techique:

Ramlau-Hansen, Henrik. Smoothing Counting Process Intensities by Means of Kernel Functions.  The Annals of Statistics 11 (1983), no. 2, 453--466. doi:10.1214/aos/1176346152.

http://projecteuclid.org/euclid.aos/1176346152.


pauldallison
Calcite | Level 5

I wrote a macro called SMOOTH that implements this method.  It can be downloaded at http://www.statisticalhorizons.com/resources/macros

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 3512 views
  • 4 likes
  • 4 in conversation