<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to output residuals, in phreg, for a model with time dependent variables and left truncation? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-output-residuals-in-phreg-for-a-model-with-time-dependent/m-p/919740#M362275</link>
    <description>&lt;P&gt;Hello!&lt;BR /&gt;I am trying to perform survival analysis on a sample with 100,000 observations.&amp;nbsp;&lt;BR /&gt;The sample is 90% censored, so there are around 10,000 events.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(The reason for choosing such a large sample was to ensure an adequate number of events, given a 90% censoring rate in the population.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The survival time is measured in days from date of birth, and the event is death.&lt;/P&gt;&lt;P&gt;Left truncation is accounted for by including the ENTRY= option in the MODEL statement.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have used PROC PHREG to fit a semi-parametric Cox proportional hazards model and followed this procedure:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Model with two variables: education level (4 levels) and sex (2 levels).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc phreg data=model_data;
	class sex(ref='M') edu_n(ref='3');
	model surv_t_dob*event(0) =  edu_n sex /entry=surv_t_till_s; 
        output ressch = _all_;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;2. The PH assumption is violated for both variables, as verified by inspecting log cumulative hazard plots, Schoenfeld residuals, and the significance of time-dependent interactions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;3. To remedy the PH violation, I stratified on sex and included three time interactions (education level i * survival time), one for each level of education except the reference level.&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc phreg data=model_data;
	class sex(ref='M') edu_n(ref='3');
	strata sex;
	model surv_t_dob*event(0) = 
		edu_n edu_nt1 edu_nt2 edu_nt4 / entry=surv_t_till_s; 
	edu_nt1 = (edu_n=1)*surv_t_dob;
	edu_nt2 = (edu_n=2)*surv_t_dob;
	edu_nt4 = (edu_n=4)*surv_t_dob;
	output ressch = _all_;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Now I want to know whether I can validly interpret the hazard ratios of this extended Cox model. My initial plan was to look at the model fit statistics and the Schoenfeld residuals.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The fit statistics indicate that the model fits better than the null model:&amp;nbsp;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD colspan="3"&gt;&lt;P&gt;Model Fit Statistics&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Criterion&lt;/TD&gt;&lt;TD&gt;Without Covariates&lt;/TD&gt;&lt;TD&gt;With Covariates&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;-2 LOG L&lt;/TD&gt;&lt;TD&gt;188952.79&lt;/TD&gt;&lt;TD&gt;188658.3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;AIC&lt;/TD&gt;&lt;TD&gt;188952.8&lt;/TD&gt;&lt;TD&gt;188670.3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;SBC&lt;/TD&gt;&lt;TD&gt;188952.8&lt;/TD&gt;&lt;TD&gt;188714.6&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;The problem, however, is that PROC PHREG does not create an output data set when time-dependent covariates are defined with programming statements.&lt;/P&gt;&lt;P&gt;I tried creating the time-dependent variables beforehand in a DATA step, but according to this discussion: &lt;A href="https://support.sas.com/kb/24/554.html" target="_blank" rel="noopener"&gt;Link&lt;/A&gt;, that method is incorrect.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Another option was to use the counting process style of input, but according to this note:&amp;nbsp;&lt;A href="https://support.sas.com/kb/20/857.html" target="_blank" rel="noopener"&gt;Link&lt;/A&gt;, the survival estimates are wrong when counting process style input is used with time-dependent covariates, and there is no workaround. Hence, I assume that the residuals will also be incorrect.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;BR /&gt;My questions are:&lt;/P&gt;&lt;P&gt;Q1: How do I validate this extended Cox model in SAS? When can I appropriately interpret the hazard ratios?&amp;nbsp;&lt;/P&gt;&lt;P&gt;Q2: Is it possible to look at the residuals of such a model?&amp;nbsp;Why does SAS not create an output data set when time-dependent covariates are included?&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 11 Mar 2024 11:14:12 GMT</pubDate>
    <dc:creator>rahulkunte</dc:creator>
    <dc:date>2024-03-11T11:14:12Z</dc:date>
    <item>
      <title>How to output residuals, in phreg, for a model with time dependent variables and left truncation?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-output-residuals-in-phreg-for-a-model-with-time-dependent/m-p/919740#M362275</link>
      <description>&lt;P&gt;Hello!&lt;BR /&gt;I am trying to perform survival analysis on a sample with 100,000 observations.&amp;nbsp;&lt;BR /&gt;The sample is 90% censored, so there are around 10,000 events.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;(The reason for choosing such a large sample was to ensure an adequate number of events, given a 90% censoring rate in the population.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The survival time is measured in days from date of birth, and the event is death.&lt;/P&gt;&lt;P&gt;Left truncation is accounted for by including the ENTRY= option in the MODEL statement.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have used PROC PHREG to fit a semi-parametric Cox proportional hazards model and followed this procedure:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Model with two variables: education level (4 levels) and sex (2 levels).&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc phreg data=model_data;
	class sex(ref='M') edu_n(ref='3');
	model surv_t_dob*event(0) =  edu_n sex /entry=surv_t_till_s; 
        output ressch = _all_;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;2. The PH assumption is violated for both variables, as verified by inspecting log cumulative hazard plots, Schoenfeld residuals, and the significance of time-dependent interactions.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;3. To remedy the PH violation, I stratified on sex and included three time interactions (education level i * survival time), one for each level of education except the reference level.&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc phreg data=model_data;
	class sex(ref='M') edu_n(ref='3');
	strata sex;
	model surv_t_dob*event(0) = 
		edu_n edu_nt1 edu_nt2 edu_nt4 / entry=surv_t_till_s; 
	edu_nt1 = (edu_n=1)*surv_t_dob;
	edu_nt2 = (edu_n=2)*surv_t_dob;
	edu_nt4 = (edu_n=4)*surv_t_dob;
	output ressch = _all_;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Now I want to know whether I can validly interpret the hazard ratios of this extended Cox model. My initial plan was to look at the model fit statistics and the Schoenfeld residuals.&amp;nbsp;&lt;/P&gt;&lt;P&gt;The fit statistics indicate that the model fits better than the null model:&amp;nbsp;&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD colspan="3"&gt;&lt;P&gt;Model Fit Statistics&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Criterion&lt;/TD&gt;&lt;TD&gt;Without Covariates&lt;/TD&gt;&lt;TD&gt;With Covariates&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;-2 LOG L&lt;/TD&gt;&lt;TD&gt;188952.79&lt;/TD&gt;&lt;TD&gt;188658.3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;AIC&lt;/TD&gt;&lt;TD&gt;188952.8&lt;/TD&gt;&lt;TD&gt;188670.3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;SBC&lt;/TD&gt;&lt;TD&gt;188952.8&lt;/TD&gt;&lt;TD&gt;188714.6&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;The problem, however, is that PROC PHREG does not create an output data set when time-dependent covariates are defined with programming statements.&lt;/P&gt;&lt;P&gt;I tried creating the time-dependent variables beforehand in a DATA step, but according to this discussion: &lt;A href="https://support.sas.com/kb/24/554.html" target="_blank" rel="noopener"&gt;Link&lt;/A&gt;, that method is incorrect.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Another option was to use the counting process style of input, but according to this note:&amp;nbsp;&lt;A href="https://support.sas.com/kb/20/857.html" target="_blank" rel="noopener"&gt;Link&lt;/A&gt;, the survival estimates are wrong when counting process style input is used with time-dependent covariates, and there is no workaround. Hence, I assume that the residuals will also be incorrect.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;BR /&gt;My questions are:&lt;/P&gt;&lt;P&gt;Q1: How do I validate this extended Cox model in SAS? When can I appropriately interpret the hazard ratios?&amp;nbsp;&lt;/P&gt;&lt;P&gt;Q2: Is it possible to look at the residuals of such a model?&amp;nbsp;Why does SAS not create an output data set when time-dependent covariates are included?&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 11 Mar 2024 11:14:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-output-residuals-in-phreg-for-a-model-with-time-dependent/m-p/919740#M362275</guid>
      <dc:creator>rahulkunte</dc:creator>
      <dc:date>2024-03-11T11:14:12Z</dc:date>
    </item>
  </channel>
</rss>

