I'm taking a Coursera course that gave example code to produce a lasso regression. In their code, they used the LARS algorithm to get a lasso multiple regression:

* lasso multiple regression with LARS algorithm, k=10 fold validation;
proc glmselect data=traintest plots=all seed=123;
  partition role=selected(train='1' test='0');
  model schconn1 = male hispanic white black namerican asian alcevr1 marever1 cocever1
        inhever1 cigavail passist expel1 age alcprobs1 deviant1 viol1 dep1 esteem1 parpres paractv
        famconct gpa1 / selection=lar(choose=cv stop=none) cvmethod=random(10);
run;

On the LAR documentation page, it does say "Not only does this algorithm provide a selection method in its own right, but with one additional modification it can be used to efficiently produce LASSO solutions." However, nowhere can I find an explanation of what that "additional modification" is, i.e. what makes these results LASSO results rather than LAR results. When I run the code on the same data with selection=lasso instead, I get the same best-fit model, but the plots and tables show variables being removed from the model (after the best-fit model has been chosen in an earlier iteration) as their coefficients are shrunk to 0, as I would expect with a LASSO regression. That is not the case in the output from the example code above. Can anyone shed some light on how I am using LARS to get LASSO results? Thanks!