Hi everyone! I'm new to LASSO regression and unsure how to interpret the output from this procedure. Here is the code I ran:
proc hpgenselect data=data;
   class var1 (ref="0") var2 (ref="0")
         var3 (ref="0") var4 (ref="0")
         var5 (ref="0") var6 (ref="0") var7 (ref="0")
         var8 (ref="0") var9 (ref="0");
   model varx (ref="0") = var1 var2 var3 var4 var5 var6 var7 var8 var9
         / dist=mult link=glogit;
   selection method=lasso(choose=aicc) details=all;
   performance details;
run;
I'm confused about the Selection Details and Parameter Estimates tables. My understanding is that PROC HPGENSELECT is the right procedure for LASSO regression here because the dependent variable is multinomial (rather than continuous), and that LASSO is useful for determining which independent variables influence the dependent variable the most.
Any help will be greatly appreciated! *Var 3 for some reason doesn't come up in the output*
Step | Description | Effects In Model | Lambda | AIC | AICC | BIC |
---|---|---|---|---|---|---|
0 | Initial Model | 1 | 1 | 9524.699 | 9524.706 | 9543.186 |
1 | var1 | 3 | 0.8 | 9515.842 | 9515.883 | 9565.142 |
 | var7 | 3 | 0.8 | 9515.842 | 9515.883 | 9565.142 |
2 | var6 | 4 | 0.64 | 9507.382 | 9507.458 | 9575.170 |
3 | var2 | 5 | 0.512 | 9492.006 | 9492.224 | 9609.094 |
4 | var5 | 7 | 0.4096 | 9484.026 | 9484.526 | 9662.739 |
 | var9 | 7 | 0.4096 | 9484.026 | 9484.526 | 9662.739 |
5 | var8 | 8 | 0.3277 | 9451.726 | 9452.536 | 9679.739 |
6 | var4 | 9 | 0.2621 | 9422.729 | 9423.822 | 9687.717 |
7 | | 9 | 0.2097 | 9391.901 | 9392.994 | 9656.890 |
8 | | 9 | 0.1678 | 9374.869 | 9376.118 | 9658.345 |
9 | | 9 | 0.1342 | 9363.952 | 9365.427 | 9672.077 |
10 | | 9 | 0.1074 | 9351.942 | 9353.600 | 9678.556 |
11 | | 9 | 0.0859 | 9345.010 | 9346.927 | 9696.274 |
12 | | 9 | 0.0687 | 9333.868 | 9335.785 | 9685.132 |
13 | | 9 | 0.055 | 9329.730 | 9331.784 | 9693.319 |
14 | | 9 | 0.044 | 9323.719 | 9325.773 | 9687.307 |
15 | | 9 | 0.0352 | 9319.290 | 9321.344 | 9682.878 |
16 | | 9 | 0.0281 | 9315.928 | 9317.982 | 9679.517 |
17 | | 9 | 0.0225 | 9313.381 | 9315.435 | 9676.970 |
18 | | 9 | 0.018 | 9311.495 | 9313.549 | 9675.084 |
19 | | 9 | 0.0144 | 9312.130 | 9314.254 | 9681.881 |
20 | | 9 | 0.0115 | 9311.125 | 9313.249* | 9680.876 |
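Two things in the Selection Details table can be checked with simple arithmetic. The Lambda column looks like a geometric grid (each step is 0.8 times the previous one, starting at 1), and with choose=aicc the chosen model is the step with the smallest AICC, which is the value flagged with an asterisk. The Python sketch below is only an illustration: the AICC values are copied from the table, and the 0.8 ratio is inferred from the printed lambdas rather than taken from the SAS documentation.

```python
# Lambda grid inferred from the Selection Details table: lambda_k = 0.8 ** k.
lambdas = [0.8 ** step for step in range(21)]

# AICC by step, copied from the table (steps 0 through 20).
aicc = [
    9524.706, 9515.883, 9507.458, 9492.224, 9484.526, 9452.536, 9423.822,
    9392.994, 9376.118, 9365.427, 9353.600, 9346.927, 9335.785, 9331.784,
    9325.773, 9321.344, 9317.982, 9315.435, 9313.549, 9314.254, 9313.249,
]

# choose=aicc keeps the step with the smallest AICC.
best_step = min(range(len(aicc)), key=aicc.__getitem__)

print(f"lambda at step 5:  {lambdas[5]:.4f}")
print(f"lambda at step 20: {lambdas[20]:.4f}")
print(f"step with minimum AICC: {best_step}")
```

The minimum falls on the last step of the grid, so the selected model keeps all eight effects that entered (plus the intercept) and leaves out only var3. Because AICC was still close to its lowest value when the grid ended, a longer or finer lambda grid might settle on a slightly different penalty.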
Parameter Estimates
Parameter | Varx Group | DF | Estimate |
---|---|---|---|
Intercept | 1 | 1 | -0.923848 |
Intercept | 3 | 1 | -0.090371 |
Intercept | 2 | 1 | -0.475460 |
var1 1 | 1 | 1 | 0.208376 |
var1 1 | 3 | 1 | 0.085181 |
var1 1 | 2 | 1 | -0.033519 |
var1 2 | 1 | 1 | 0.564515 |
var1 2 | 3 | 1 | 0.614004 |
var1 2 | 2 | 1 | 0.208413 |
var1 3 | 1 | 1 | 0.775994 |
var1 3 | 3 | 1 | 0.824564 |
var1 3 | 2 | 1 | 0.306114 |
var2 1 | 1 | 1 | -0.258637 |
var2 1 | 3 | 1 | -0.344585 |
var2 1 | 2 | 1 | -0.420374 |
var2 2 | 1 | 1 | -0.183059 |
var2 2 | 3 | 1 | -0.006305 |
var2 2 | 2 | 1 | -0.034611 |
var2 3 | 1 | 1 | -0.656491 |
var2 3 | 3 | 1 | -0.643286 |
var2 3 | 2 | 1 | -0.339439 |
var4 1 | 1 | 1 | 0.216545 |
var4 1 | 3 | 1 | 0.176734 |
var4 1 | 2 | 1 | -0.017829 |
var5 1 | 1 | 1 | 0.049015 |
var5 1 | 3 | 1 | -0.122792 |
var5 1 | 2 | 1 | 0.374545 |
var5 2 | 1 | 1 | -0.018575 |
var5 2 | 3 | 1 | -0.220216 |
var5 2 | 2 | 1 | 0.117048 |
var6 1 | 1 | 1 | 0.576625 |
var6 1 | 3 | 1 | 0.338929 |
var6 1 | 2 | 1 | 0.416339 |
var6 2 | 1 | 1 | 0.998560 |
var6 2 | 3 | 1 | 0.743957 |
var6 2 | 2 | 1 | 0.502328 |
var6 3 | 1 | 1 | 1.339983 |
var6 3 | 3 | 1 | 0.982288 |
var6 3 | 2 | 1 | 0.665514 |
var7 1 | 1 | 1 | -0.287458 |
var7 1 | 3 | 1 | -0.587046 |
var7 1 | 2 | 1 | -0.274003 |
var7 2 | 1 | 1 | -0.346579 |
var7 2 | 3 | 1 | -0.053197 |
var7 2 | 2 | 1 | 0.017640 |
var8 1 | 1 | 1 | -0.059514 |
var8 1 | 3 | 1 | -0.033493 |
var8 1 | 2 | 1 | 0.072684 |
var8 2 | 1 | 1 | 0.006447 |
var8 2 | 3 | 1 | 0.132097 |
var8 2 | 2 | 1 | 0.037160 |
var9 1 | 1 | 1 | 0.051825 |
var9 1 | 3 | 1 | 0.129466 |
var9 1 | 2 | 1 | 0.359189 |
var9 2 | 1 | 1 | 0.322085 |
var9 2 | 3 | 1 | 0.201435 |
var9 2 | 2 | 1 | 0.571104 |
var9 3 | 1 | 1 | -0.118308 |
var9 3 | 3 | 1 | -0.155053 |
var9 3 | 2 | 1 | 0.346671 |
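With link=glogit, each row of the Parameter Estimates table is a log-odds coefficient for one Varx group relative to the reference group (varx = 0). That means exp(estimate) can be read as an odds ratio against the reference, and class probabilities come from a softmax over the per-group linear predictors. The Python sketch below works through one case under stated assumptions: varx has levels 0 to 3 with 0 as the reference, and every predictor other than var6 sits at its reference level, so only the intercepts and the var6 3 rows contribute. LASSO estimates are shrunk toward zero, so the numbers are best treated as rough.

```python
import math

# Intercepts for Varx groups 1, 2, 3, copied from the Parameter Estimates table.
# Group 0 is the reference level, so its linear predictor is fixed at 0.
intercepts = {0: 0.0, 1: -0.923848, 2: -0.475460, 3: -0.090371}


def class_probabilities(linear_predictors):
    """Softmax over the generalized-logit linear predictors."""
    exps = {g: math.exp(eta) for g, eta in linear_predictors.items()}
    total = sum(exps.values())
    return {g: e / total for g, e in exps.items()}


# Baseline profile: every predictor at its reference level ("0").
baseline = class_probabilities(intercepts)

# Same profile but with var6 = 3: add the "var6 3" coefficient for each group.
var6_level3 = {1: 1.339983, 2: 0.665514, 3: 0.982288}
shifted = class_probabilities(
    {g: eta + var6_level3.get(g, 0.0) for g, eta in intercepts.items()}
)

# Odds ratio for group 1 versus group 0 when var6 moves from 0 to 3.
odds_ratio = math.exp(var6_level3[1])

for g in sorted(baseline):
    print(f"P(varx={g}): baseline {baseline[g]:.3f}, var6=3 {shifted[g]:.3f}")
print(f"odds ratio (group 1 vs 0, var6 3 vs 0): {odds_ratio:.2f}")
```

The same pattern extends to any other profile: add the coefficients of the levels that apply to each group's linear predictor before taking the softmax.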
@chester2018 wrote:
Any help will be greatly appreciated! *Var 3 for some reason doesn't come up in the output*
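LASSO is used as a variable selection method. Your var3 never enters the model at any step in the Selection Details table, so its coefficients were shrunk to zero: it was not found relevant for improving the multi-class classification once the other variables were in the model.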
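One way to see why LASSO can drop a variable entirely: in the simplest textbook setting (squared-error loss with a single standardized predictor, or orthonormal predictors), the LASSO solution is the unpenalized estimate passed through a soft-thresholding operator, which returns exactly zero whenever the estimate is smaller in magnitude than lambda. The Python sketch below is a toy illustration of that operator with made-up coefficients; it is not the algorithm HPGENSELECT runs for a multinomial model.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients (illustrative values, not from the output).
unpenalized = {"strong_effect": 0.90, "moderate_effect": -0.30, "weak_effect": 0.05}

for lam in (0.5, 0.1, 0.01):
    shrunk = {name: soft_threshold(z, lam) for name, z in unpenalized.items()}
    kept = [name for name, value in shrunk.items() if value != 0.0]
    print(f"lambda={lam}: kept {kept}")
```

As lambda decreases, more effects survive the threshold, which mirrors the order in which effects enter in the Selection Details table. var3 stayed at zero even at the smallest lambda on the grid (0.0115).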
See also here (it also covers overall "variable importance"):
Last update: 01-17-2024
Updated by: SAS Employee AndyRavenna
Koen