Question about the difference between manually calculated factor scores and the ones using PROC SCORE

Contributor
Posts: 40

Hi All, I came across a factor analysis issue. When I used PROC FACTOR and PROC SCORE to get factor scores, I noticed that the factor scores in the SAS output do not match the ones I calculated by hand. Specifically, I took the "standardized scoring coefficients" from the SAS output, multiplied them by the standardized variables, and summed the products. The two sets of factor scores are fairly close but not identical. I am not sure whether this is caused by rounding or whether the manual calculation needs one more step.

Your comments and help are much appreciated.


Accepted Solutions
Solution
08-21-2015 12:34 PM
Grand Advisor
Posts: 17,464

You mixed up the order of your variables in your scoring: the last two variables are flipped.

You listed them in a different order in PROC FACTOR than they appear in the data set. Once the order is corrected, the correct values are derived.

array vars(5) Age Weight RunTime RunPulse RestPulse; *CORRECT - order of the VAR statement in PROC FACTOR;

array vars_order(5) Age Weight RunTime RestPulse RunPulse; *INCORRECT - order of the columns in the data set;

data manual_score;
   *Temporary arrays hold the scoring coefficients and are retained across rows;
   array f1(5) _temporary_;
   array f2(5) _temporary_;

   *Load the coefficients from the ODS output data set once, on the first iteration;
   if _n_=1 then do i=1 to 5;
      set ttt;
      f1(i)=factor1;
      f2(i)=factor2;
   end;

   *Load standardized data;
   set stdfit;

   *Initialize factor scores to 0;
   factor_score1=0;
   factor_score2=0;
   factor_score1_wrong=0;
   factor_score2_wrong=0;

   *Set arrays for variables - NOTE ORDER;
   array vars(5) Age Weight RunTime RunPulse RestPulse;
   array vars_order(5) Age Weight RunTime RestPulse RunPulse;

   *Calculate correct factor scores;
   do i=1 to 5;
      factor_score1=sum(factor_score1, f1(i)*vars(i));
      factor_score2=sum(factor_score2, f2(i)*vars(i));
   end;

   *Calculate incorrect factor scores (last two variables flipped);
   do i=1 to 5;
      factor_score1_wrong=sum(factor_score1_wrong, f1(i)*vars_order(i));
      factor_score2_wrong=sum(factor_score2_wrong, f2(i)*vars_order(i));
   end;
run;
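
One possible way to confirm the match without eyeballing printed output is PROC COMPARE. The sketch below is not part of the original answer. It assumes that FScore is the PROC SCORE output data set from the code posted in the question, that manual_score comes from the data step above, and that both hold the same observations in the same order. The tolerance of 1e-8 is an arbitrary choice.

```sas
/* Sketch only: check the manual scores against the PROC SCORE output.   */
/* Assumes FScore and manual_score contain the same rows, in the same    */
/* order, so that observations can be matched by position.               */
proc compare base=FScore compare=manual_score
             method=absolute criterion=1e-8;
   var  Factor1 Factor2;              /* scores computed by PROC SCORE   */
   with factor_score1 factor_score2;  /* manual scores, correct order    */
run;
```

Replacing the WITH list with factor_score1_wrong and factor_score2_wrong should make the report flag unequal values, which is the mismatch described in the question.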


All Replies
Grand Advisor
Posts: 10,251

If you took the values from the procedure output tables in the Results window rather than from a data set, some decimal rounding is likely.

You may want to use ODS Output to send the table to a data set and calculate with that.

Add

ODS Output StdScoreCoef = CoefficentDataSetname;

to the proc code.

Contributor
Posts: 40

Hi ballardw,

Thank you very much for your comments and help. I tried the ODS output, but its values exactly match the "standardized scoring coefficients" shown by PROC FACTOR.

I used the example from SAS: SAS/STAT(R) 9.2 User's Guide, Second Edition

Here is the screenshot for factor proc output.

And here is the result from ODS:

Contributor
Posts: 40

Hi All,

To provide more details about the issue, I have included the slightly modified SAS code and output screenshots for your reference.

The SAS code:

/* This data set contains only the first 12 observations   */
/* from the full data set used in the chapter on PROC REG. */
data Fitness;
   input Age Weight Oxygen RunTime RestPulse RunPulse @@;
   datalines;
   44 89.47  44.609 11.37 62 178     40 75.07  45.313 10.07 62 185
   44 85.84  54.297  8.65 45 156     42 68.15  59.571  8.17 40 166
   38 89.02  49.874  9.22 55 178     47 77.45  44.811 11.63 58 176
   40 75.98  45.681 11.95 70 176     43 81.19  49.091 10.85 64 162
   44 81.42  39.442 13.08 63 174     38 81.87  60.055  8.63 48 170
   44 73.03  50.541 10.13 45 168     45 87.66  37.388 14.03 56 186
   ;

proc factor data=Fitness outstat=FactOut
            method=prin rotate=varimax score;
   var Age Weight RunTime RunPulse RestPulse;
   title 'FACTOR SCORING EXAMPLE';
   ODS output StdScoreCoef=ttt;
run;

proc print data=ttt;
   title 'ODS Output Table';
run;

***User added this proc to get standardized variables;
proc stdize data=Fitness method=std out=stdfit;
   var Age Weight Oxygen RunTime RestPulse RunPulse;
run;

Title 'Standardized Data';
proc print data=stdfit;
run;

proc print data=FactOut;
   title2 'Data Set from PROC FACTOR';
run;

proc score data=Fitness score=FactOut out=FScore;
   var Age Weight RunTime RunPulse RestPulse;
run;

proc print data=FScore;
   title2 'Data Set from PROC SCORE';
run;

Title;

Part of PROC FACTOR output:

The ODS output table:

The standardized data:

The factor scores (last two columns) calculated by SAS:

The factor score calculated by hand (Excel 2010):

Grand Advisor
Posts: 10,251

As soon as you bring in Excel I get nervous. HOW did you get the values into Excel? If you entered the values as shown in your output labeled "Part of PROC FACTOR output:", then you rounded the values. If you look at the value in the TTT data set, in the row labeled score, you will see that the score for Age is -0.178464537 with a best12. format and -0.1784645370142 with a best16. format. I think your print defaulted to the best8. format, which does round the data. Copying and pasting from the ODS output does not carry the additional decimals unless you print it with a longer format.
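
A minimal sketch of printing the coefficients with a longer format, so that copy and paste carries more digits. It assumes the ODS output data set is named ttt, as in the code posted earlier in the thread, and that its coefficient columns are named Factor1 and Factor2.

```sas
/* Sketch only: print the scoring coefficients with more digits.         */
/* Assumes the data set ttt was created with                             */
/* ODS OUTPUT StdScoreCoef=ttt and has columns Factor1 and Factor2.      */
proc print data=ttt;
   format Factor1 Factor2 best16.;
run;
```

The FORMAT statement here only affects how PROC PRINT displays the values. The values stored in ttt are unchanged.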

Contributor
Posts: 40

Re: Question about the difference between manually calculcated factor scores and the ones using PROC SCORE

Thanks, ballardw. Good suggestion. However, I also tried the best12. format to show all the digits after the decimal point, and the results did not change: my calculation still does not match the SAS output. The largest difference is about 109% [(mycalc-sas)/sas*100%], so I do not believe this is a rounding issue.

Contributor
Posts: 40

Thanks so much! I had indeed overlooked the order of the variables in the output. I corrected that in the Excel file, and now the manual calculation matches the SAS output! Much appreciated!

☑ This topic is SOLVED.
