I built a linear regression model with Proc Reg, output the parameter estimates for use in Proc Score, and also wrote the predicted values to my output table. However, when I ran Proc Score on a data set that includes the rows used to build the model, the predicted values for those rows differ from the ones Proc Reg produced. Can anyone explain this?
(EG 7.1)
Target | P_Reg | P_Score |
88806 | 89207.13 | 89300.56 |
89051 | 89286.64 | 89421.16 |
89397 | 89364.52 | 89541.75 |
90354 | 90080.69 | 90134.00 |
90984 | 90552.01 | 90541.30 |
91283 | 90979.03 | 90948.60 |
They should be the same, so my assumption at this point would be user error.
Here's an example of how it should work.
data Fitness;
input Age Weight Oxygen RunTime RestPulse RunPulse @@;
datalines;
44 89.47 44.609 11.37 62 178 40 75.07 45.313 10.07 62 185
44 85.84 54.297 8.65 45 156 42 68.15 59.571 8.17 40 166
38 89.02 49.874 9.22 55 178 47 77.45 44.811 11.63 58 176
40 75.98 45.681 11.95 70 176 43 81.19 49.091 10.85 64 162
44 81.42 39.442 13.08 63 174 38 81.87 60.055 8.63 48 170
44 73.03 50.541 10.13 45 168 45 87.66 37.388 14.03 56 186
;
proc reg data=Fitness outest=RegOut;
/* The model label OxyHat becomes the name of the score variable in PROC SCORE */
OxyHat: model Oxygen=Age Weight RunTime RunPulse RestPulse;
/* Name the predicted and residual variables explicitly */
output out=p1 p=P_Reg r=Resid;
title 'Regression Scoring Example';
run;quit;
proc print data=RegOut;
title2 'OUTEST= Data Set from PROC REG';
run;
proc score data=Fitness score=RegOut out=P2 type=parms;
var Age Weight RunTime RunPulse RestPulse;
run;
proc print data=P2;
title2 'Predicted Scores for Regression';
run;
/* One-to-one merge: p1 and P2 both keep the row order of Fitness */
data check;
merge p1(keep=P_Reg) P2(keep=OxyHat);
/* Compare the PROC REG prediction with the PROC SCORE prediction */
diff = round(P_Reg - OxyHat, 0.0001);
run;
title 'Check of differences, should be 0';
proc print data=check;
run;
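For intuition about what Proc Score does with TYPE=PARMS, the same prediction can be rebuilt by hand from the OUTEST= data set: the intercept plus each coefficient times its regressor. The sketch below assumes RegOut holds a single model row, as in the example above. The b_ prefixed names and the Manual variable are introduced here for illustration only.

```sas
/* Rebuild the prediction by hand from the PROC REG estimates.
   Assumes RegOut has exactly one row (one MODEL statement). */
data manual;
   /* Read the coefficients once; variables read by SET are retained. */
   if _n_ = 1 then set RegOut(keep=Intercept Age Weight RunTime RunPulse RestPulse
                              rename=(Age=b_Age Weight=b_Weight RunTime=b_RunTime
                                      RunPulse=b_RunPulse RestPulse=b_RestPulse));
   set Fitness;
   /* Linear predictor: intercept plus coefficient times regressor */
   Manual = Intercept + b_Age*Age + b_Weight*Weight + b_RunTime*RunTime
          + b_RunPulse*RunPulse + b_RestPulse*RestPulse;
   drop b_: ;
run;
```

If Manual disagrees with the Proc Reg prediction for a row, the regressor values in that row are probably not the ones the model was fit on.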
You'd need to show us your code, and show us a reasonable portion of your data (as a SAS data step, see: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...)
Thanks. On checking, I found that the file imported from Excel had become corrupted, so some of the values were incorrect.
The issue is now resolved.
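For anyone who hits the same symptom, one way to spot this kind of input problem is to compare the regressors in the table used to fit the model against the table passed to Proc Score. This is only a sketch: model_data, score_data, obs_id, and x1-x3 are hypothetical names standing in for your own tables, a unique row key, and your regressors.

```sas
/* Hypothetical names: model_data went into PROC REG, score_data goes
   into PROC SCORE, obs_id uniquely identifies a row, x1-x3 are regressors. */
proc sort data=model_data out=m_sorted; by obs_id; run;
proc sort data=score_data out=s_sorted; by obs_id; run;

/* Report rows whose regressor values differ between the two tables.
   LISTALL also lists rows and variables found in only one table. */
proc compare base=m_sorted compare=s_sorted
             out=input_diffs outnoequal outbase outcomp outdif
             method=absolute criterion=1e-8 listall;
   id obs_id;
   var x1 x2 x3;
run;
```

Rows that land in input_diffs are the ones where the two procedures were given different inputs, so their predictions would not be expected to match.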