When I run PROC TRANSREG and specify the spline or sspline transformation for a variable in the MODEL statement, the output data set always contains a new column whose name is the variable name with the prefix t. How is this tVariable series derived during the spline or sspline transformation of the variable?
For example, work._fcst contains a series tX. Is there a closed-form formula that describes the relation between tX and X? Many thanks in advance!
proc transreg data=work._0;
ID timestamp;
model identity(Y)=spline(X);
/*or sspline transformation*/
/*model identity(Y)=sspline(X / sm=20);*/
output out=work._fcst;
run;
The variable tx contains the transformed x values. The PROC TRANSREG documentation describes the spline basis that is used to compute the transformed x.
The following sample program illustrates how the predicted values for a spline model in PROC TRANSREG are computed, including how the transformed tx values are computed. tx is a function of the spline basis.
data test;
do x = 1 to 10;
y = x + log(x) + 2 * sin(x) + 10 * normal(7);
output;
end;
run;
* TRANSREG does not output the B-spline basis here, so save the spline coefficients
  with ODS for use with the B-spline basis output by the next TRANSREG run;
* save the regression coefficients using the OUTPUT statement;
title2 'Fit Curve, Get Predicted Values for All Observations';
proc transreg data=test detail ;
model ide(y) = spline(x / degree=3) / detail nomiss;
output out=spline p mrc;
ods output details=sdetails;
run;
proc print data=spline;
title 'TRANSREG OUT= data set using SPLINE transformation';
run;
proc print data=sdetails;
title 'TRANSREG DETAILS= data set using SPLINE transformation';
run;
*compare regression with transreg after transformation on x;
*note that parameters here agree with last obs of out=spline data set;
proc reg data=spline outest=outest;
model y=tx;
title 'REG output for Y vs. Transformed X variable';
quit;
*get the regression coefficients (only) from transreg run;
data regcoeff;
set spline(where=(upcase(_name_)='TY'));
keep tintercept tx;
rename tintercept=regb0 tx=regb1;
run;
proc print data=regcoeff;
title 'TRANSREG coefficients using SPLINE transformation';
run;
*** so the model is Py = regb0 + regb1*tx = -1.98554 + 1.54593*tx;
*** Tx is the transformed variable that comes from the spline basis, which is illustrated next;
*** the following program illustrates how tx is computed;
/*-------------------get TX manually-----------------------*/
proc transreg data=test; /* get the B-spline basis */
model ide(y) = bspline(x) / detail nomiss;
output out=manually;
run;
proc print data=manually;
title 'TRANSREG OUT= data set using BSPLINE transformation';
run;
*get transformation coefficients and restructure data set;
data getcoeff;
set sdetails(where=(term=2 & part>=9));
part=9-part;
keep numericvalue part;
run;
proc print data=getcoeff;
run;
proc transpose data=getcoeff out=splinecoef prefix=bx;
id part;
var numericvalue;
run;
proc print data=regcoeff;
title 'TRANSREG regression coefficients';
run;
proc print data=splinecoef;
title 'Subset of SDETAILS output, transposed';
run;
data manually; /* Plug in the coefficients from DETAIL */
set manually;
if _n_=1 then set splinecoef;
if _n_=1 then set regcoeff;
/* PART values 0, -1, -2, -3 become bx0, bx_1, bx_2, bx_3 after PROC TRANSPOSE */
tx = bx0*x_0 + bx_1*x_1 + bx_2*x_2 + bx_3*x_3;
predicty=regb0+regb1*tx;
run;
**compare predicty with Py from out=spline;
data compare;
merge spline(keep=Py _name_ where=(upcase(_name_)^='TY'))
manually(keep=predicty);
run;
proc print data=compare;
title 'Compare resulting computations';
run;
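In formula terms, the manual computation above can be summarized as follows. This is a sketch based only on the sample program: $B_j(X)$ stands for the B-spline basis columns x_0 to x_3 produced by bspline(x), $b_j$ for the transformation coefficients taken from the DETAIL table (bx0, bx_1, bx_2, bx_3), and $\beta_0$, $\beta_1$ for regb0 and regb1. The symbols are notation chosen here for illustration, not TRANSREG output names.

$$
t_X = \sum_{j=0}^{3} b_j \, B_j(X), \qquad P_Y = \beta_0 + \beta_1 \, t_X
$$

So tX is a linear combination of the basis columns, and each basis column is a polynomial in X of the spline degree. With degree=3 and no interior knots, as in this example, tX is therefore a single cubic polynomial in X. With knots it would be a different cubic on each interval between knots. The coefficients $b_j$ are estimated from the data, so the closed form is specific to the fitted model rather than a fixed formula. The sample program covers the spline case only and does not show whether the same plug-in computation applies to sspline.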