1. Does GLMSELECT with LASSO selection assume, by default, that the response variable is continuous and approximately normally distributed? My code:
proc glmselect data=lasso_allsample plots=coefficients seed=123;
   partition role=SELECTED(TRAIN='1' TEST='0');
   model return = "list of predictors" / selection=lasso(choose=cv stop=none) cvmethod=random(10);
run;
2. A key assumption of traditional linear regression is that the residuals (the differences between the observed and predicted values) are normally distributed. This allows for statistical inference and hypothesis testing. Can this assumption be relaxed when doing LASSO? And if the answer to question 1 is that GLMSELECT does assume a normal distribution, how can a non-normal error distribution be implemented in GLMSELECT? Thank you.