I am running a regularized regression on several traits using the following code:
Proc glmselect data = DalReg1 plots(stepaxis=normb)=coefficients;
Model TW = Protein TGW SGD GL GW Size Shape / selection = LASSO(stop=none choose = cvex);
run;
The output is great. However, I am wondering how to obtain standard errors for each coefficient. Any suggestions on this, please?
Thanks,
Dalitso
The standard errors for the coefficients appear in the parameter estimates table, which should be in the output by default. Do you mean to ask how to get that information into a data set?
Ballardw: Here is the output. There is no SE.
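SAS Output

Analysis of Variance
Source | DF | Sum of Squares | Mean Square | F Value
Model | 5 | 4523.88515 | 904.77703 | 208.11
Error | 1271 | 5525.89163 | 4.34767 |
Corrected Total | 1276 | 10050 | |

Fit statistics
Root MSE | 2.08511
Dependent Mean | 58.92247
R-Square | 0.4501
Adj R-Sq | 0.4480
AIC | 3161.71694
AICC | 3161.80520
SBC | 1913.63055
Cross-validation criterion | 4.85730

Parameter Estimates (parameter names were not preserved in the pasted output)
DF | Estimate
1 | 19.141464
1 | -0.221838
1 | 0.192900
1 | 15.111831
1 | 0.040392
1 | 0.195827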
No SEs are provided when variable selection is performed with LASSO, and there is probably a good reason for that: standard errors computed for a model that came out of a variable selection method do not account for the uncertainty of the selection itself. You can get parameter SEs for the chosen model, conditional on that choice, with other regression procedures, such as GLM, GENMOD or GLIMMIX.
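A minimal sketch of that refit, assuming the data set and variables from the original post. It relies on the `_GLSIND` macro variable, which PROC GLMSELECT sets to the list of selected effects, and uses PROC REG for the refit (one choice among the procedures named above; all predictors here are continuous, so no CLASS statement is needed). The resulting SEs are ordinary least squares SEs, conditional on the selected model. They ignore both the selection step and the LASSO shrinkage, so the refitted coefficients will differ from the LASSO estimates.

```sas
/* Step 1: LASSO selection, as in the original post */
proc glmselect data=DalReg1;
   model TW = Protein TGW SGD GL GW Size Shape
         / selection=lasso(stop=none choose=cvex);
run;

/* Show which effects were selected */
%put NOTE: Selected effects: &_GLSIND;

/* Step 2: refit the selected model by ordinary least squares.  */
/* The SEs and confidence limits (CLB) are conditional on the   */
/* selected model and do not reflect selection uncertainty.     */
proc reg data=DalReg1;
   model TW = &_GLSIND / clb;
run;
quit;
```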
Thanks PG. I agree. I thought there must be a good reason for not having SEs in LASSO procedure. I might have to do some more literature review on this. I chose LASSO because I have multicollinearity in my data but I am curious what SEs would be, if it is possible to generate them. Thanks again!
You might consider doing LASSO selection via PROC NLMIXED instead as illustrated in this note.
Hi Dave,
I am not familiar with NLMIXED. Is there any other way with PROC GLMSELECT? If not, I might just have a go at NLMIXED and see.
Thanks Dave.
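One option that stays within PROC GLMSELECT is its MODELAVERAGE statement, which repeats the selection on bootstrap resamples and reports the mean and standard deviation of each parameter estimate across samples. The sketch below is hedged: the `seed=` and `nsamples=` values are arbitrary illustrations, and `choose=SBC` replaces the original `choose=cvex` because that is the combination shown in the SAS documentation example; whether `choose=cvex` works with MODELAVERAGE in a given release is an assumption to check. The reported standard deviation is a bootstrap measure of variability that includes selection uncertainty, not a classical standard error.

```sas
/* Bootstrap model averaging around the LASSO selection.        */
/* seed= and nsamples= are illustrative values, not from the    */
/* original post; choose=SBC is swapped in for choose=cvex.     */
proc glmselect data=DalReg1 seed=12345;
   model TW = Protein TGW SGD GL GW Size Shape
         / selection=lasso(stop=none choose=sbc);
   /* ParmEst(all) lists every effect, including those that     */
   /* were rarely selected across the bootstrap samples.        */
   modelaverage nsamples=500 tables=(ParmEst(all));
run;
```

The "Average Parameter Estimates" table in the output gives, for each parameter, how often it was nonzero, its mean estimate, and the standard deviation of the estimates over the resamples.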
proc hpgenselect data=sashelp.class ;
class sex;
model weight = sex height age/ CL ;
selection method=Lasso(choose=SBC) details=all;
performance details;
run;
You will see this in the log:
NOTE: The CL option is not available for the LASSO method.
NOTE: The HPGENSELECT procedure is executing in single-machine mode.
NOTE: * Optimal Value of Criterion
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
Ksharp,
This is what I have seen:
NOTE: The CL option is not available for the LASSO method.
NOTE: The HPGENSELECT procedure is executing in single-machine mode.
NOTE: * Optimal Value of Criterion
NOTE: There were 1496 observations read from the data set WORK.DALREG1.
NOTE: PROCEDURE HPGENSELECT used (Total process time):
real time 1.07 seconds
cpu time 0.51 seconds
I just read about Bayesian LASSO that has the ability to generate SE. However, it requires a macro, an area I am, sadly, not competent with. Any help from anybody please?
Thanks,
DNY