I fit a linear regression to a set of training data using both PROC REG and PROC GLM. When I score the test dataset with PROC PLM, I only get confidence intervals from the stored PROC GLM model; the stored PROC REG model produces missing values, even though it is the same model.
My question is simply whether PROC REG is incompatible with PROC PLM for generating confidence intervals on test data.
The code below is runnable on any machine (it generates dummy data to regress on):
/* the original data; fit model to these values */
data A;
input x y @@;
datalines;
1 4 2 9 3 20 4 25 5 1 6 5 7 -4 8 12
;
/* the scoring data; evaluate model on these values */
%let NumPts = 200;
data ScoreX(keep=x);
min=1; max=8;
do i = 0 to &NumPts-1;
x = min + i*(max-min)/(&NumPts-1); /* evenly spaced values */
output; /* no Y variable; only X */
end;
run;
proc reg data=A outest=RegOut tableout;
model y = x; /* name of model is used by PROC SCORE */
store work.proc_reg_model;
quit;
ods output ParameterEstimates=Pi_Parameters FitStatistics=Pi_Summary;
proc glm data=A;
model y = x;
store work.proc_glm_model; /* store the model */
quit;
proc plm restore=work.proc_glm_model;
score data=ScoreX out=Pred predicted=yhat
      lcl=lower_pred_int lclm=lower_confidence_int
      ucl=upper_pred_int uclm=upper_confidence_int; /* evaluate the model on new data */
run;
proc plm restore=work.proc_reg_model;
score data=ScoreX out=Pred_lin_reg predicted=yhat
      lcl=lower_pred_int lclm=lower_confidence_int
      ucl=upper_pred_int uclm=upper_confidence_int; /* evaluate the model on new data */
run;
I expected identical output datasets from PROC PLM for both models. Instead, PROC PLM on the PROC REG model produces missing values for the confidence and prediction intervals. The two output datasets of interest are Pred_lin_reg (from the PROC REG model; missing values for the confidence and prediction intervals) and Pred (from the PROC GLM model; populated values for the confidence and prediction intervals).
Is my conclusion correct that PROC REG cannot be used to calculate confidence intervals on test data, and can only produce them for the training data?