greveam
Quartz | Level 8

Hi, I have a dataset with some variables imputed using proc mi. Now I want to determine which explanatory variables are most important for an exposure of interest (SBP). My idea was to use proc pls and rank the variables by their variable importance (VIP) scores. However, I am not sure how to output those scores in a format that proc mianalyze can use to account for the imputed values. Any help would be greatly appreciated. Thanks!

 

Here is the code I've tried:

 

proc pls data=have nfac=4 method=pls (algorithm=nipals maxiter=300) plot=XLoadingProfiles details CENSCALE outmodel=est1 cv=one varss plot=(ParmProfiles VIP);
class female;
model sbp = age female bmi dbp gfr / solution;
output out=plsscored XSCORE=xscore STDY=STDY STDX=STDX;
id subject;
by _imputation_;
run;

 

proc mianalyze parms(classvar=full)=want;
modeleffects intercept "VIP";
ods output ParameterEstimates=info;
run;

PaigeMiller
Diamond | Level 26

Is the question "How to output the variable importances?" or is the question "How to use the variable importances in PROC MIANALYZE?" 

--
Paige Miller
greveam
Quartz | Level 8

Both questions. Thanks, Anders

PaigeMiller
Diamond | Level 26

How to output the variable importance values?

 

There is a macro at sas.com which computes these and stores them in a SAS data set. You will have to search for it.

 

How to use variable importance values in PROC MIANALYZE?

 

I don't think you can. From the documentation: "The MIANALYZE procedure reads parameter estimates and associated standard errors or covariance matrix that are computed by the standard statistical procedure for each imputed data set." Variable importances from PROC PLS are not parameter estimates, they are not associated standard errors, they are not covariance matrix ...
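For context, here is a sketch of the standard combining rules for multiple imputation (Rubin's rules), which is the kind of pooling the quoted documentation describes. The notation is an assumption introduced here, not taken from the thread: $\hat{Q}_i$ is the point estimate and $U_i$ its squared standard error from imputed data set $i$, with $m$ imputations in total.

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
```

The total variance $T$ needs the within-imputation term $\bar{U}$, so a statistic that comes without a standard error (or covariance matrix) cannot be pooled this way as it stands.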

--
Paige Miller
greveam
Quartz | Level 8

1. Thanks, I found the macro.

 

2. The documentation states that the definition of parameter estimates is: "...also called coefficients, are the change in the response associated with a one-unit change of the predictor, all other predictors being held constant." 

 

So when you say VIP is not a parameter estimate, is that correct? The way I read the SAS documentation, VIP is a scaled model estimate of how much Y changes for a standardized change in the predictor variables (i.e. a parameter estimate that can be used to rank the "importance" of both continuous and class variables)? The problem for proc mianalyze is that there is no standard error associated with the VIP estimate. Maybe it could be bootstrapped?

 

Cheers,

Anders

 

 

PaigeMiller
Diamond | Level 26

@greveam wrote:

 

So when you say VIP is not a parameter estimate, is that correct?


That's correct. It is a diagnostic statistic, intended to help you understand the PLS model. It is not the PLS model itself.
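For reference, a sketch of the VIP definition as it is commonly given in the PLS literature (Wold's variable importance in projection), written for a single response. The symbols are assumptions introduced here: $p$ is the number of predictors, $A$ the number of extracted factors, $w_{ja}$ the weight of predictor $j$ on factor $a$, and $SS_a$ the response sum of squares explained by factor $a$.

```latex
\mathrm{VIP}_j = \sqrt{\, p \;
  \frac{\sum_{a=1}^{A} SS_a \,\bigl(w_{ja}/\lVert w_a \rVert\bigr)^{2}}
       {\sum_{a=1}^{A} SS_a} }
```

Under this definition VIP is never negative and the squared values average to 1 across predictors, so it summarizes how much a predictor contributes to the extracted factors. It carries no sign and is not a change in Y per unit change in X.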

 

The PLS model itself, which is what I think you need in PROC MIANALYZE, is obtained via the SOLUTION option in the MODEL statement.

 

The way I read the SAS documentation, VIP is a scaled model estimate of how much Y changes for a standardized change in the predictor variables (i.e. a parameter estimate that can be used to rank the "importance" of both continuous and class variables)?

 

 I don't see anything like this in the SAS documentation for PROC PLS. Can you share a link to where it says this?

--
Paige Miller
greveam
Quartz | Level 8

Ok, now I understand - Thanks.

 

Have you tried the pls macro? I copied the code from: https://support.sas.com/rnd/app/stat/papers/plsex.pdf

 

It seems to be working fine until I submit %get_wts(est1,dsxwts=xwts), which produces an error message:

 

NOTE: There were 15 observations read from the data set WORK.DSOUT.
NOTE: There were 15 observations read from the data set WORK.PLTANNO.
NOTE: PROCEDURE GPLOT used (Total process time):
real time 14.57 seconds
cpu time 1.04 seconds

 

ERROR: The variable age in the DROP, KEEP, or RENAME list has never been referenced.

ERROR: The variable dbp in the DROP, KEEP, or RENAME list has never been referenced.

ERROR: The variable female in the DROP, KEEP, or RENAME list has never been referenced.

ERROR: The variable htn in the DROP, KEEP, or RENAME list has never been referenced.

ERROR: The variable bmi in the DROP, KEEP, or RENAME list has never been referenced.

ERROR: The variable chol in the DROP, KEEP, or RENAME list has never been referenced.

ERROR: The variable hscrp in the DROP, KEEP, or RENAME list has never been referenced.

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.XWTS may be incomplete. When this step was stopped there were 0
observations and 1 variables.
WARNING: Data set WORK.XWTS was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds

 

The macros %get_bpls and %get_vip also produce error messages:

 

2041 %get_bpls(dsoutmod,dsout=bpls);

NOTE: There were 16 observations read from the data set WORK.DSOUTMOD.
NOTE: The data set WORK.EST_WB has 4 observations and 24 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds

 

NOTE: There were 16 observations read from the data set WORK.DSOUTMOD.
NOTE: The data set WORK.EST_PQ has 4 observations and 24 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds


NOTE: IML Ready
ERROR: AGE is not in the scope of variables for the data set.

 

 

*/-----------------------------------------------------------------------*/

 

As far as I can see in the macro, there is no renaming of x-variables so I don't understand the problem. Could it be related to the number of characters in the names of the x variables? Here is the code I submitted:

 

data data_a; set data;
if _N_ <= 15;
n=_N_;
run;

%global xvars yvars predname resname xscrname yscrname
num_x num_y lv;
%global xvars yvars;
%let xvars= age dbp female htn bmi chol hscrp;
%let yvars= sbp;
%let ypred=yhat1;
%let yres=yres1;
%let predname=yhat;
%let resname=res;
%let xscrname=xscr;
%let yscrname=yscr;
%let num_y=1;
%let num_x=15;

 

proc pls data=data_a method=pls outmodel=dsoutmod lv=4;
class female htn;
model &yvars = &xvars;
output out=outpls p=yhat1 yresidual=yres1
xresidual=xres1-xres15 xscore=xscr yscore=yscr
stdy=stdy stdx=stdx h=h press=press t2=t2
xqres=xqres yqres=yqres;
run;

 

%let lv=4;

%plot_scr(outpls);
%plotxscr(outpls,max_lv=4);
%get_wts(est1,dsxwts=xwts);
%plot_wt(xwts,max_lv=4);
%getxload(est1,dsxload=xloads);
%pltxload(xloads,max_lv=4);

%get_bpls(dsoutmod,dsout=bpls);
%get_vip(dsoutmod,dsvip=vip_data);

 

data eval;
merge bpls vip_data;
run;
proc print data=eval;
run;

 

PaigeMiller
Diamond | Level 26

Re-run the macros, after turning on the following option:

 

options mprint;

Then show us the SASLOG for the entire macro %get_wts

--
Paige Miller
greveam
Quartz | Level 8

Thanks for spending time on this. I really appreciate it.

 

Code: 

 

data data_a; set data;
if _N_ <= 15;
n=_N_;
run;

 

%global xvars yvars predname resname xscrname yscrname
num_x num_y lv;
%let xvars= age female htn hscrp;
%let yvars= bmi;
%let ypred=yhat1;
%let yres=yres1;
%let predname=yhat;
%let resname=res;
%let xscrname=xscr;
%let yscrname=yscr;
%let num_y=1;
%let num_x=4;

 

proc pls data=data_a method=pls outmodel=est1 lv=2;
class female baseline_mi dm_all m5 htn inactive;
model &yvars = &xvars;
output out=outpls p=yhat1 yresidual=yres1
xresidual=xres1-xres4 xscore=xscr yscore=yscr
stdy=stdy stdx=stdx h=h press=press t2=t2
xqres=xqres yqres=yqres;
run;

 

%let lv=2;

 

options mprint;
%plot_scr(outpls);
%plotxscr(outpls,max_lv=2);

%get_wts(est1,dsxwts=xwts); /* see log below */
%plot_wt(xwts,max_lv=2);
%getxload(est1,dsxload=xloads);
%pltxload(xloads,max_lv=2);

%get_bpls(est1,dsout=bpls);
%get_vip(est1,dsvip=vip_data);

 

data eval;
merge bpls vip_data;
run;
proc print data=eval;
run;

 

Log for %get_wts with option mprint:

 

1864 %get_wts(est1,dsxwts=xwts);

NOTE: There were 15 observations read from the data set WORK.DSOUT.
NOTE: There were 15 observations read from the data set WORK.PLTANNO.
NOTE: PROCEDURE GPLOT used (Total process time):
real time 17.50 seconds
cpu time 1.79 seconds


MPRINT(GET_WTS): data xwts;
MPRINT(GET_WTS): set est1(keep=_TYPE_ _LV_ age female htn hscrp);
ERROR: The variable age in the DROP, KEEP, or RENAME list has never been referenced.
ERROR: The variable female in the DROP, KEEP, or RENAME list has never been referenced.
ERROR: The variable htn in the DROP, KEEP, or RENAME list has never been referenced.
ERROR: The variable hscrp in the DROP, KEEP, or RENAME list has never been referenced.
MPRINT(GET_WTS): if _TYPE_='WB' then output;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.XWTS may be incomplete. When this step was stopped there were 0
observations and 1 variables.
WARNING: Data set WORK.XWTS was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


MPRINT(GET_WTS): proc transpose data=xwts out=xwts;
MPRINT(GET_WTS): run;

NOTE: There were 0 observations read from the data set WORK.XWTS.
NOTE: The data set WORK.XWTS has 1 observations and 1 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


MPRINT(GET_WTS): data xwts;
MPRINT(GET_WTS): set xwts;
MPRINT(GET_WTS): if _NAME_='_LV_' then delete;
MPRINT(GET_WTS): n=_n_-1;
MPRINT(GET_WTS): run;

NOTE: There were 1 observations read from the data set WORK.XWTS.
NOTE: The data set WORK.XWTS has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.03 seconds


MPRINT(GET_WTS): data xwts;
MPRINT(GET_WTS): set xwts;
MPRINT(GET_WTS): rename col1=w1;
MPRINT(GET_WTS): run;

WARNING: The variable col1 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 1 observations read from the data set WORK.XWTS.
NOTE: The data set WORK.XWTS has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


MPRINT(GET_WTS): data xwts;
MPRINT(GET_WTS): set xwts;
MPRINT(GET_WTS): rename col2=w2;
MPRINT(GET_WTS): run;

WARNING: The variable col2 in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 1 observations read from the data set WORK.XWTS.
NOTE: The data set WORK.XWTS has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds

 

--------------------------------------

 

Looking at work.est1 (see attached) produced by proc pls, there are indeed no &xvars variables, only col1-col6. It seems to me that something is not adding up.

 

 

PaigeMiller
Diamond | Level 26

I can't read your PDF file. I don't know if the problem is my firewall, or something else. 

 

You need to examine all the data sets leading up to this error (in all of the macros) to see if there is a place where the naming gets screwed up or changes somehow.

--
Paige Miller
greveam
Quartz | Level 8

The problem was that proc pls names the X variables col1, col2, ... in the outmodel data set. I added a rename= data set option to outmodel=, making the names compatible with &xvars, and now the macro runs without errors.

 

proc pls data=data_a method=pls outmodel=est1(rename=(col1=age
col2=male col3=female col4=nohtn col5=htn col6=hscrp)) lv=2;
class female baseline_mi dm_all m5 htn inactive;
model &yvars = &xvars;
output out=outpls p=yhat1 yresidual=yres1
xresidual=xres1-xres4 xscore=xscr yscore=yscr
stdy=stdy stdx=stdx h=h press=press t2=t2
xqres=xqres yqres=yqres;
run;
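Before hard-coding a rename list like the one above, it can help to check which names proc pls actually wrote to the outmodel data set, since the mapping from col1, col2, ... to predictors depends on the class variables and their levels. A minimal sketch, assuming the outmodel data set is called est1 as in the code above and was created without the rename= option:

```sas
/* List the variables in the OUTMODEL= data set in creation order */
proc contents data=est1 varnum;
run;

/* Print the first rows to see the _TYPE_ values and the colN columns */
proc print data=est1(obs=10);
run;
```

Comparing this listing with the model effects shows which colN column belongs to which predictor or class level before the names are passed to the macros.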
