Programming the statistical procedures from SAS

PROC SCORE from PROC PLS method = RRR

New Contributor
Posts: 3

Please forgive the long-winded explanation; I am a new SAS user.

I am using RRR to create dietary patterns at a baseline time point, and then using this pattern to look at diet over a 10-year period, which requires repeated scores.

Using the xweight ODS output from PROC PLS RRR, I am able to use PROC SCORE to produce a score based on each individual's food intake. However, I have hit a problem. To check my methodology, I applied PROC SCORE to the same data used in PROC PLS to create the RRR score. The hypothesis was that the applied and natural scores would be the same; however, they are not. They are systematically different by 11.18%. The same ratio appears if I use the same method in a completely different data set with different x groups.

Does anyone have any ideas why this very consistent error keeps cropping up? Does anyone have information on the way in which PROC PLS method = RRR applies its xweights to the data to create the score?

Thanks for any help, code below.

Our code:

*KEEP CENTRED AND SCALED PREDICTOR (FOOD GROUP) VARIABLES & NATURAL DP
SCORE PRODUCED BY EXPL RRR
& REMOVE RAW DATA;
data scaled;
set pattern10;
keep cid_477a qlet $foods2
pred10score1 ;
run;

************************************************************************
CONFIRMATORY RRR USING CENTRED AND SCALED DATA;

*MAKE XWEIGHTS (SCORING FILE) SUITABLE FOR PROC SCORE;

data scores;
set rrr10xweights;

if Numberoffactors > 1 then delete;*only interested in 1st pattern;
drop Numberoffactors;

_TYPE_="SCORE";
_NAME_="Factor1";

/* rename scoring variables to match scaled predictor variable names*/
rename $foods = $foods2;
run;

*RE-SCORE SCALED AND CENTRED PREDICTOR VARIABLES using scoring
coefficients to test confirmatory RRR;

proc score data=scaled out=pattern10_1 score=scores type="SCORE"
nostd;
var $foods2;
run;

***************************************************************************
COMPARE 'NATURAL' AND 'APPLIED' SCORES;

*check correlation between natural and applied scores;
proc corr data=pattern10_1;
var pred10score1 factor1;
run;

*calculate differences and ratio b/w natural and applied scores;

proc rank data=pattern10_1 out=ranks;
ranks rankpred10 rankfact1;
var pred10score1 factor1;
run;

proc sort data=ranks;
by pred10score1;
run;

data rankdiff;
set ranks;
difpat1=factor1 - pred10score1;
ratiopat1=factor1/pred10score1;
difrank=rankpred10 - rankfact1;
run;

proc means data=rankdiff;
var difpat1 ratiopat1 difrank;
run;

Super Contributor
Posts: 281

Re: PROC SCORE from PROC PLS method = RRR

If they are systematically different by 11.18%, then you have a scaling issue. Somehow, somewhere, in either PROC PLS or PROC SCORE, things are not being scaled properly. I would check the scaling options in both procedures.
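
One way to pin down the size of the scale factor, as a rough sketch only: it assumes the data set and variable names from the original post (pattern10_1, pred10score1, factor1), and the data set name ratiocheck is made up for illustration. If the two scores differ purely by a multiplicative constant, a no-intercept regression should return that constant as the slope with essentially zero residual error, and the per-observation ratio should have a standard deviation close to zero.

```sas
/* Sketch only: names come from the original post; ratiocheck is hypothetical. */

/* Regress the applied score on the natural score with no intercept.   */
/* If the difference is purely one of scale, the slope estimate is the */
/* constant ratio and the residuals are essentially zero.              */
proc reg data=pattern10_1;
   model factor1 = pred10score1 / noint;
run;
quit;

/* Per-observation ratio, skipping rows where the natural score is 0. */
data ratiocheck;
   set pattern10_1;
   if pred10score1 ne 0 then ratio = factor1 / pred10score1;
run;

/* A standard deviation near zero confirms a single constant factor. */
proc means data=ratiocheck n mean std min max;
   var ratio;
run;
```

Knowing the exact factor can help when comparing it against candidate quantities, such as a standard deviation or the norm of the weight vector, to see where the rescaling comes from.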
Contributor
Posts: 34

Re: PROC SCORE from PROC PLS method = RRR

I noticed the same problem and am sure the data is properly centered and scaled by the PLS procedure. I also used the centering/scaling output table from PROC PLS and applied it to the new data in PROC SCORE. See the attached sample code to reproduce the issue. The results of the last PROC MEANS show a ~20% difference between the original factor generated by RRR and the factor generated by PROC SCORE (and the % difference is the same for all observations). If anyone has any thoughts on what I may be missing or assuming incorrectly, please advise.
