I am doing survival analysis with proc phreg looking at a continuous nutrient exposure and colorectal cancer as the outcome. I have never performed restricted cubic splines analysis before and tried to find similar SAS posts and incorporate them into my model which I have listed here:
proc phreg data = want;
title "Colorectal-cancer incidence attempt";
effect resid_total_polys = spline(resid_total_poly / basis=tpf(noint) NATURALCUBIC details knotmethod=percentiles(4));
class categorical_covariate (ref="0");
model eof_age_CRC*Inc_CRC(0) = resid_total_polys
categorical_covariate continuous_covariate
/ entry=enrollment_agemonths rl=wald;
run;
First, I am not sure how to best compare the results of the spline models with the model I have done separately by quintiles. Should I compare the -2 Log L of the Global Fit Statistics or just use the P value (Pr >ChiSq) of testing the global null hypothesis output?
Second, I would like to be able to plot this data in a meaningful way, but am not sure how to appropriately plot this data or include 95% confidence bars if possible. I would hope it might look something like this:
I did some digging and was able to use this code:
proc phreg data = want;
title "Colorectal-cancer incidence attempt";
effect resid_total_polys = spline(resid_total_poly / basis=tpf(noint) NATURALCUBIC details knotmethod=percentiles(4));
class categorical_covariate (ref="0");
model eof_age_CRC*Inc_CRC(0) = resid_total_polys
categorical_covariate continuous_covariate
/ entry=enrollment_agemonths rl=wald;
store work.coxr;
run;
%macro est(ref=212, start=0, end=3000, by=1);
%Do i = 1 %To %eval(%SysFunc( Ceil( %SysEvalF( ( &End - &Start ) / &By ) ) ) +1) ;
%Let value=%SysEvalF( ( &Start - &By ) + ( &By * &I ) ) ;
estimate "&value." resid_total_polys [-1, &ref] [1, &value] / exp cl;
%end;
%mend est;
ods exclude all;
ods dataset Estimates=Estimates;
proc plm restore=work.coxr;
%est(ref=212, start=0, end=3000, by=1);
run;
ods exclude none;
data estimates;
set estimates;
resid_total_poly=label*1;
run;
proc sgplot data=estimates NOAUTOLEGEND Noborder;
Series y=ExpEstimate x=resid_total_poly / LINEATTRS=(color=black thickness=3px);
Series y=LowerExp x=resid_total_poly / LINEATTRS=(pattern=ShortDash color=Black THICKNESS=1);
Series y=UpperExp x=resid_total_poly / LINEATTRS=(pattern=ShortDash color=Black THICKNESS=1);
band x=resid_total_poly upper=UpperExp lower=LowerExp / fillattrs=(color=graydd) transparency=.50;
REFLINE 1 / axis=y;
REFLINE 3000 / Axis=X LINEATTRS=(pattern=ThinDot color=Black THICKNESS=1);;
yaxis Values=(0.5 0.8 1 1.2 1.5) Label="Hazard ratio" Type=LOG LABELATTRS=(weight=BOLD);
xaxis min=0 VALUES=(0 to 3000 by 100) LABELATTRS=(weight=BOLD);
run;
which created this graph:
My remaining question would be what output from proc PHREG I should use to compare this to the model by quintiles or the linear model to determine how much better the cubic splines model is. Do I then use this to decide how many knots to use?
I did some digging and was able to use this code:
proc phreg data = want;
title "Colorectal-cancer incidence attempt";
effect resid_total_polys = spline(resid_total_poly / basis=tpf(noint) NATURALCUBIC details knotmethod=percentiles(4));
class categorical_covariate (ref="0");
model eof_age_CRC*Inc_CRC(0) = resid_total_polys
categorical_covariate continuous_covariate
/ entry=enrollment_agemonths rl=wald;
store work.coxr;
run;
%macro est(ref=212, start=0, end=3000, by=1);
%Do i = 1 %To %eval(%SysFunc( Ceil( %SysEvalF( ( &End - &Start ) / &By ) ) ) +1) ;
%Let value=%SysEvalF( ( &Start - &By ) + ( &By * &I ) ) ;
estimate "&value." resid_total_polys [-1, &ref] [1, &value] / exp cl;
%end;
%mend est;
ods exclude all;
ods dataset Estimates=Estimates;
proc plm restore=work.coxr;
%est(ref=212, start=0, end=3000, by=1);
run;
ods exclude none;
data estimates;
set estimates;
resid_total_poly=label*1;
run;
proc sgplot data=estimates NOAUTOLEGEND Noborder;
Series y=ExpEstimate x=resid_total_poly / LINEATTRS=(color=black thickness=3px);
Series y=LowerExp x=resid_total_poly / LINEATTRS=(pattern=ShortDash color=Black THICKNESS=1);
Series y=UpperExp x=resid_total_poly / LINEATTRS=(pattern=ShortDash color=Black THICKNESS=1);
band x=resid_total_poly upper=UpperExp lower=LowerExp / fillattrs=(color=graydd) transparency=.50;
REFLINE 1 / axis=y;
REFLINE 3000 / Axis=X LINEATTRS=(pattern=ThinDot color=Black THICKNESS=1);;
yaxis Values=(0.5 0.8 1 1.2 1.5) Label="Hazard ratio" Type=LOG LABELATTRS=(weight=BOLD);
xaxis min=0 VALUES=(0 to 3000 by 100) LABELATTRS=(weight=BOLD);
run;
which created this graph:
My remaining question would be what output from proc PHREG I should use to compare this to the model by quintiles or the linear model to determine how much better the cubic splines model is. Do I then use this to decide how many knots to use?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.