Jonison
Fluorite | Level 6

Hello experts, I have built a PLS model with multiple responses and would like to use PROC SCORE for prediction.

I use the code below:

 

/* store score from pls*/
PROC PLS DATA=WORK.training NFAC=4 missing=EM(MAXITER=3) METHOD=PLS NOCENTER NOSCALE  plots=(vip dmodx) VARSS NOCVSTDIZE details;
 MODEL &Y1 = &X1 &X2 &X3 &X4 &X5 &X6 &X7 &X8 &X9 &X10 &X11 &X12 &X13/solution;
 ods output ParameterEstimates=WORK.Parm;
 OUTPUT OUT=PRED1 PREDICTED=p_Y1 p_Y2 P_Y3 p_Y4 p_Y5;
RUN;

proc transpose data=WORK.Parm(rename=(RowName=_NAME_)) out=WORK.tParm;

data WORK.tParm; set tParm;
      _TYPE_ = 'PARMS';
RUN; 
QUIT;

/* use score to get prediction*/
proc score data=WORK.prediction (drop=Y1 Y2 Y3 Y4 Y5)
score=WORK.tParm out=ScoreOuttest type=parms nostd predict;
var &X1 &X2 &X3 &X4 &X5 &X6 &X7 &X8 &X9 &X10 &X11 &X12 &X13;
run;

 

The issue is that the predictions from PROC PLS and from PROC SCORE differ on the training dataset. Can anyone explain why, or suggest how to fix it?

 

 

6 REPLIES
PaigeMiller
Diamond | Level 26

"Different" is very vague. Show us a portion of the comparison where you say it is different. Are there any ERRORs or WARNINGs in the log?

 

You are running PROC PLS on work.training, but you are running PROC SCORE on work.prediction. These are not the same data set, and they should be if you want to compare the predictions.

 

Also, your code shows only a single Y variable (&Y1) in the MODEL statement, but you are asking for predicted values for 5 different Y variables. That doesn't seem right.

 

Lastly, although there are rare situations where you might want to run PROC PLS with the NOCENTER and NOSCALE options, generally you don't. The predicted values and the variables identified as important will change depending on how the original data are centered and scaled.

 
 
--
Paige Miller
PaigeMiller
Diamond | Level 26

Example using SASHELP.CARS

 

/* Fit a two-response PLS model and save the raw-data coefficients */
proc pls data=sashelp.cars(keep=msrp invoice enginesize--mpg_highway) nfac=3;
    ods output parameterestimates=coeffs;
    model msrp invoice = enginesize--mpg_highway/solution;
    output out=alligator predicted=msrp_pred invoice_pred;
run;
/* One row per response, one column per coefficient */
proc transpose data=coeffs out=coeffs_t;
    id rowname;
run;
/* PROC SCORE needs _TYPE_='PARMS' to treat the rows as regression coefficients */
data coeffs_t;
    set coeffs_t;
    _type_='PARMS';
run;
/* Score the same data that was used to fit the model */
proc score data=sashelp.cars(keep=enginesize--mpg_highway) out=alligator2 score=coeffs_t type=parms;
    var enginesize--mpg_highway;
run;
/* Put the two sets of predictions side by side and take the differences */
data compare;
    merge alligator(keep=invoice_pred msrp_pred) alligator2(keep=msrp invoice rename=(msrp=msrp_pred2 invoice=invoice_pred2));
    delta_msrp=msrp_pred-msrp_pred2;
    delta_invoice=invoice_pred-invoice_pred2;
run;
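
To check the comparison without scrolling through every row, one option is to summarize the two delta variables. This is only a sketch that assumes the COMPARE data set created above. If the PROC PLS and PROC SCORE predictions agree, the minimum and maximum of each delta should be zero apart from floating-point rounding. Rows with missing predictor values will have missing deltas and show up in the NMISS column.

```sas
/* Summarize the differences between the PROC PLS and PROC SCORE predictions */
proc means data=compare n nmiss min max;
    var delta_msrp delta_invoice;
run;
```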
--
Paige Miller
Ksharp
Super User
PROC SCORE is for PROC REG (Y = X*beta), not for PROC PLS.
They use different scoring algorithms, so I think the only way to score PLS is your first way.
PaigeMiller
Diamond | Level 26

But I just showed code that uses PROC SCORE to predict from PROC PLS output.

--
Paige Miller
Ksharp
Super User
Great. I learned something new: the SOLUTION option.
I didn't realize that the PLS coefficients are for the raw data, not for the extracted components.
PaigeMiller
Diamond | Level 26

You can get the loadings for each factor as well.
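
A rough sketch of how that might look, reusing the SASHELP.CARS model from the earlier example. It assumes the DETAILS option on the PROC PLS statement and the ODS table names XLoadings, XWeights and YWeights as I recall them from the PROC PLS documentation, so check them with ODS TRACE ON if they don't match. The output data set names (xload, xwts, ywts) are just placeholders.

```sas
/* DETAILS requests the per-factor loadings and weights;
   the ODS table names below are assumptions, verify with ODS TRACE ON */
proc pls data=sashelp.cars(keep=msrp invoice enginesize--mpg_highway) nfac=3 details;
    ods output xloadings=xload xweights=xwts yweights=ywts;
    model msrp invoice = enginesize--mpg_highway;
run;
```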

--
Paige Miller
