10-03-2012 12:21 PM
I'm running a multivariate GLM insurance model in proc nlmixed and would like to use the lasso to select variables and shrink the coefficients of correlated variables. Park & Hastie (2007) (http://www.stanford.edu/~hastie/Papers/JRSSB.69.4%20(2007)%20659-677%20Park.pdf) discuss the lasso technique for GLMs.
Using the "off the shelf" options already available in proc nlmixed, is there a way to conduct a lasso? For example, using the practice education dataset from the UCLA ATS website located at http://www.ats.ucla.edu/stat/data/hsbdemo.sas7bdat, I tried the following code for running a Poisson regression:
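/* Standardize predictor variables. */
proc standard data=hsbdemo mean=0 std=1 out=hsbdemo_zscores;
var read female science write;
run;
/* Lasso Poisson regression. */
proc nlmixed data=hsbdemo_zscores;
parms b0=0.1 b1=0.1 b2=0.1 b3=0.1 b4=0.1 lambda=0.01;
/* Linear predictor and L1 norm of the slopes (coefficient-to-variable order assumed). */
xb = b0 + b1*read + b2*female + b3*science + b4*write;
sum_abs_betas = abs(b1) + abs(b2) + abs(b3) + abs(b4);
mu = exp(xb);
/* Poisson log-likelihood minus the L1 penalty. */
ll = awards*log(mu) - mu - lgamma(awards+1) - lambda*sum_abs_betas;
model awards ~ general(ll);
run;
Here lambda*sum_abs_betas is the penalty term from Equation (2) in Park & Hastie's paper. It is subtracted from the log-likelihood because proc nlmixed maximizes, whereas Equation (2) is written as a minimization. Running this code produces the following warning: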
WARNING: The final Hessian matrix is not positive definite, and therefore the estimated covariance matrix is not full rank and may be unreliable. The variance of some parameter estimates is zero or some parameters are linearly related to other parameters.
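For reference, this is the criterion I am trying to reproduce, written out as I understand it from Park & Hastie's Equation (2) and specialized to the Poisson case. The symbols y_i (the awards count), x_i (the standardized predictors) and p (the number of slopes, 4 here) are my own notation, not theirs:

```latex
% Penalized criterion (minimization form), lambda >= 0 fixed in advance:
\hat{\beta}(\lambda) = \arg\min_{\beta}
  \left\{ -\log L(y;\beta) + \lambda \lVert \beta \rVert_1 \right\},
\qquad
\lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert .

% Poisson log-likelihood with log link, \mu_i = \exp(\beta_0 + x_i^{\top}\beta):
\log L(y;\beta) = \sum_{i=1}^{n}
  \left[ y_i \log \mu_i - \mu_i - \log(y_i!) \right].

% Equivalent maximization form (the penalty enters with a minus sign):
\hat{\beta}(\lambda) = \arg\max_{\beta}
  \left\{ \log L(y;\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}.
```

Note that the intercept is left out of the penalty, and that lambda is a tuning constant in this formulation rather than a quantity estimated from the likelihood.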
Am I attempting too much within the proc nlmixed framework? Park & Hastie use a predictor-corrector algorithm to converge to a solution, a technique not included among the proc nlmixed TECH= options. Instead, I'm introducing lambda as an additional parameter to be estimated.
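For comparison, here is a minimal sketch of the more conventional setup, with lambda held fixed rather than estimated. The macro variable name, the value 0.01, and the assignment of b1 to b4 to read, female, science and write are illustrative assumptions on my part. Because abs() is not differentiable at zero, I would not expect a gradient-based TECH= optimizer to return coefficients that are exactly zero, so at best this approximates the lasso:

```sas
/* Hypothetical fixed penalty weight; in practice it would be tuned,
   for example over a grid of values. */
%let lambda = 0.01;

proc nlmixed data=hsbdemo_zscores;
   /* lambda is no longer listed in PARMS, so it is not estimated. */
   parms b0=0.1 b1=0.1 b2=0.1 b3=0.1 b4=0.1;
   xb = b0 + b1*read + b2*female + b3*science + b4*write;
   mu = exp(xb);
   /* ll is evaluated once per observation, so the penalty below is
      effectively counted n times across the dataset. */
   ll = awards*log(mu) - mu - lgamma(awards+1)
        - &lambda*(abs(b1) + abs(b2) + abs(b3) + abs(b4));
   model awards ~ general(ll);
run;
```

Since the penalty is applied per observation here, a given lambda in this sketch corresponds to a penalty of n*lambda in the formulation from the paper.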
I've received some advice on SAS-L, where the consensus was that the lasso cannot be fit in proc nlmixed, but I thought I'd check here as well.