I'm using PROC PLS, and with the following code I was able to generate a VIP plot, but I have so many X variables (57) that they're hard to distinguish in the plot. Is there a way to generate a VIP table so I can sort through my X variables and remove those with VIP values < 0.8 and low (in absolute value) parameter estimates, as recommended in the documentation? I already have the Parameter Estimates table. Thanks.
Proc pls data=t_hsisubset_sort cv=split cvtest varss plots=(diagnostics dmod xyscores ParmProfiles VIP XLoadingProfiles);
@Ksharp refers to the dataset behind the VIP graph which you can get via ODS OUTPUT:
data pentaTrain;
input obsnam $ S1 L1 P1 S2 L2 P2
S3 L3 P3 S4 L4 P4
S5 L5 P5 log_RAI @@;
n = _n_;
datalines;
VESSK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
1.9607 -1.6324 0.5746 1.9607 -1.6324 0.5746
2.8369 1.4092 -3.1398 0.00
VESAK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
1.9607 -1.6324 0.5746 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 0.28
VEASK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
0.0744 -1.7333 0.0902 1.9607 -1.6324 0.5746
2.8369 1.4092 -3.1398 0.20
VEAAK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 0.51
VKAAK -2.6931 -2.5271 -1.2871 2.8369 1.4092 -3.1398
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 0.11
VEWAK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
-4.7548 3.6521 0.8524 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 2.73
VEAAP -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
-1.2201 0.8829 2.2253 0.18
VEHAK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
2.4064 1.7438 1.1057 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 1.53
VAAAK -2.6931 -2.5271 -1.2871 0.0744 -1.7333 0.0902
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 -0.10
GEAAK 2.2261 -5.3648 0.3049 3.0777 0.3891 -0.0701
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 -0.52
LEAAK -4.1921 -1.0285 -0.9801 3.0777 0.3891 -0.0701
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 0.40
FEAAK -4.9217 1.2977 0.4473 3.0777 0.3891 -0.0701
0.0744 -1.7333 0.0902 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 0.30
VEGGK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
2.2261 -5.3648 0.3049 2.2261 -5.3648 0.3049
2.8369 1.4092 -3.1398 -1.00
VEFAK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
-4.9217 1.2977 0.4473 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 1.57
VELAK -2.6931 -2.5271 -1.2871 3.0777 0.3891 -0.0701
-4.1921 -1.0285 -0.9801 0.0744 -1.7333 0.0902
2.8369 1.4092 -3.1398 0.59
;
proc pls data=pentaTrain nfac=2 plot=(VIP);
model log_RAI = S1-S5 L1-L5 P1-P5;
ods output VariableImportancePlot=vip;
run;
proc print data=vip noobs; run;
Label    VIP
S1       0.61108
S2       0.50482
S3       1.57775
S4       1.22255
S5       0.21288
L1       0.31822
L2       0.27123
L3       2.43480
L4       1.17994
L5       0.21288
P1       0.75127
P2       0.35927
P3       1.13222
P4       0.88380
P5       0.21288
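To get from this data set to the sortable table asked for in the question, something like the following might work. It is only a sketch: it assumes the `vip` data set has the `Label` and `VIP` columns shown in the printed output above, and the names `vip_sorted`, `vip_flagged`, and `low_vip` are made up for illustration. The 0.8 cutoff is the guideline quoted in the question.

```sas
/* Sort the captured VIP values so the most important predictors come first */
proc sort data=vip out=vip_sorted;   /* vip_sorted is a hypothetical name */
   by descending VIP;
run;

/* Flag predictors that fall below the 0.8 guideline from the question */
data vip_flagged;                    /* vip_flagged is a hypothetical name */
   set vip_sorted;
   low_vip = (VIP < 0.8);            /* 1 = candidate for removal, 0 = keep */
run;

proc print data=vip_flagged noobs;
   var Label VIP low_vip;
run;
```

The flag only covers the VIP half of the recommendation. The parameter estimates would still need to be checked alongside it before dropping a variable.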
You can use the %get_vip macro at https://support.sas.com/rnd/app/stat/papers/plsex.pdf
/* Pick up variable by PROC PLS */
ods output VariableImportancePlot= VariableImportancePlot;
proc pls data=class missing=em nfac=2 plots=(ParmProfiles VIP);
class sex;
model age=weight height sex;
run;
Then check the data set VariableImportancePlot.
@Ksharp have you actually tried this to see if it works?
There is no such mention of an ODS OUTPUT option called VariableImportancePlot in the documentation at https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_pls_details11.htm&docsetVersion=1...
Odd that this is not mentioned in the documentation.
The documentation mentions that you can get the plot via the PLOTS=VIP option, and this particular plot is called VariableImportancePlot in case you want to select it specifically. However, it is not in the list of ODS table names for PROC PLS at the link I provided earlier.
I will submit a request to SAS Technical Support to update their documentation.
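Independently of the documentation, one way to see which ODS object names a step actually produces is ODS TRACE, which writes the name, label, and path of each output object to the log. A minimal sketch, assuming the pentaTrain data set from the example earlier in the thread and that ODS Graphics is available:

```sas
ods graphics on;

/* Write the name of every ODS object (tables and graphs) to the SAS log */
ods trace on;

proc pls data=pentaTrain nfac=2 plots=(VIP);
   model log_RAI = S1-S5 L1-L5 P1-P5;
run;

/* Stop tracing so later steps do not clutter the log */
ods trace off;
```

Any name reported in the log can then be used in an ODS OUTPUT or ODS SELECT statement.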
@Reeza wrote:
In general, you can get the data behind any plot the same way you would for any ODS table, by using the plot's ODS name. It's just not common knowledge.
I have never heard of this. I will give this a try on other plots tomorrow.
Hi Paige,
It has nothing to do with PROC PLS. It is a general SAS-ism that you can use ODS OUTPUT to get the data object that underlies any graph. I wrote about it back in 2012 because I, too, didn't think enough people knew this trick. One place it is mentioned is the SAS/STAT documentation, in the section "Statistical Graphics Using ODS," where a bullet says "ODS OUTPUT statements, which create SAS data sets from the data object that is used to make the plot. See the section Specifying an ODS Destination for Graphics for an example." When you follow the link, you see an example of a FitPlot in PROC REG that is written to a data set.
For a cool application of capturing the data object, see Kuhfeld's article "Processing ODS OUTPUT data sets from PROC SGPLOT," and many of his papers and books.
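The PROC REG case mentioned above can be sketched roughly as follows. This is not the example from the documentation itself, just an illustration of the same idea: it assumes the SASHELP.CLASS sample data set is available, and `fitdata` is a made-up data set name.

```sas
ods graphics on;

/* Request only the fit plot and capture the data object behind it */
proc reg data=sashelp.class plots(only)=fitplot;
   model weight = height;
   ods output FitPlot=fitdata;   /* fitdata is a hypothetical name */
run;
quit;

/* Inspect which variables the graph's data object contains */
proc contents data=fitdata;
run;
```

The variable names inside such a data set are chosen by the procedure, so it is worth looking at the PROC CONTENTS listing before writing code against them.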
Yes, I tried it before. It works.