Programming the statistical procedures from SAS

Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Frequent Contributor
Posts: 102


I've not been able to find a way to do LASSO or LAR variable selection in SAS for a binary outcome.  Please let me know if I am missing it.

If it does not exist, I am curious about your thoughts on which is the better alternative: using GLMSELECT with a 0/1 variable as an outcome, or running a LOGISTIC regression with a set of strong predictors for the 0/1 variable and then using GLMSELECT on the residuals from the LOGISTIC?  How far off would a GLM solution for a LOGISTIC problem be with sample sizes in the 100,000s?

Thanks,

Haris

Valued Guide
Posts: 3,206

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Did you google it?

SAS/STAT(R) 13.1 User's Guide  (lasso lar in glmselect)

https://support.sas.com/resources/papers/proceedings09/259-2009.pdf

---->-- ja karman --<-----
Valued Guide
Posts: 678

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

There are no LAR or LASSO selection options for generalized linear models, such as logistic regression. There is the new HPGENSELECT procedure for distributions in the exponential family (such as binomial, binary), but this only has the more traditional stepwise selection methods (which I do not recommend). As an ad hoc method, you could take your first approach (direct analysis on the binary observations) using GLMSELECT, with LASSO or LAR. Then you could refit the model in GENMOD using just the LASSO/LAR-selected variables from GLMSELECT. I am sure there are all kinds of theoretical issues with this, but I have heard others recommend this in talks. I would not do your second suggestion (based on residuals).
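To make the two-stage idea above concrete, here is a minimal sketch in plain Python (not SAS, and not what GLMSELECT or GENMOD do internally). Stage 1 runs a coordinate-descent lasso on the 0/1 outcome as if it were continuous; stage 2 refits an ordinary logistic model on the surviving columns. The function names (`simulate`, `lasso_select`, `logistic_refit`), the simulated data, and the penalty value 0.1 are all assumptions made up for illustration.

```python
import math
import random

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def soft_threshold(z, t):
    """Lasso shrinkage: move z toward zero by t, clipping at zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def simulate(n, seed):
    """Hypothetical data: 6 predictors, only columns 0 and 1 drive the outcome."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(6)]
        p = sigmoid(1.5 * row[0] - 1.5 * row[1])
        X.append(row)
        y.append(1.0 if rng.random() < p else 0.0)
    return X, y

def lasso_select(X, y, lam, sweeps=50):
    """Stage 1: least-squares lasso on the 0/1 outcome (coordinate descent).
    Minimizes (1/2n)*||y - Xb||^2 + lam*||b||_1 on centered data and
    returns the indices of the nonzero coefficients."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    ybar = sum(y) / n
    resid = [yi - ybar for yi in y]
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes column j
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    return [j for j in range(p) if beta[j] != 0.0]

def logistic_refit(X, y, cols, steps=400, lr=0.5):
    """Stage 2: unpenalized logistic fit (intercept + selected columns)
    by plain gradient descent on the mean log-loss."""
    n = len(X)
    w = [0.0] * (len(cols) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, yi in zip(X, y):
            feats = [1.0] + [row[j] for j in cols]
            err = sigmoid(sum(wk * fk for wk, fk in zip(w, feats))) - yi
            for k, fk in enumerate(feats):
                grad[k] += err * fk / n
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w

X, y = simulate(600, seed=1)
selected = lasso_select(X, y, lam=0.1)
weights = logistic_refit(X, y, selected)
print("selected columns:", selected)
print("refit intercept and slopes:", [round(wk, 2) for wk in weights])
```

The stage 1 coefficients are only used to pick columns; they are shrunken linear-probability slopes, not log-odds, which is why the refit step is needed before interpreting anything.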

Frequent Contributor
Posts: 102

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Thanks for the input.  Why not residuals?  That's basically what LAR does.  For a further complication, in my case I am actually using GLIMMIX because my data are clustered.  The first run in GLIMMIX will account for most of the cluster effects as well as give me a true continuous variable in the residual which I can plug into GLMSELECT with the remaining set of predictors.

Frequent Contributor
Posts: 102

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Yes, I have seen this paper, Jaap.  Does not really answer my question, does it?

Valued Guide
Posts: 3,206

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

No, it does not really answer your question. But it would have been clearer if you had mentioned what you had already found. Your question was buried in the middle of your thoughts.

The result is that people who work with this daily will react. See, LVM came in...    I am also hoping for some of the stat specialists at SAS.

---->-- ja karman --<-----
Frequent Contributor
Posts: 81

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Yesterday I attended a presentation by Robert Rodriguez at the SAS Global Forum on the latest version (SAS/STAT 13.1) of the HPGENSELECT procedure.

The lasso variable selection procedure is available for logistic regression (in fact that was one of the examples in his slides), although I can't speak for least angle regression.

SAS Employee
Posts: 232

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Hi there - the Proceedings for 2015 are now live. Here is a link to all the papers: SAS Global Forum Proceedings 2015. Here is a link to Robert's paper: http://support.sas.com/resources/papers/proceedings15/SAS1742-2015.pdf. If you would like me to try to get a copy of the PPT, let me know (it has not yet been posted). You can email Robert, as his contact info is in the paper. Thanks!  (logged in as Community Admin)

New Contributor
Posts: 2

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Thanks for sharing the link to Robert's paper, very useful! In his paper he says that the LASSO method is only available in SAS/STAT 14.1. Is this correct? How can I access this version of SAS/STAT? I am currently on SAS/STAT 13.2 and thought this was the most up-to-date release.

A swift reply would be appreciated, as I am trying to see if I can use SAS for this work or whether I will need to resort to R.

Many thanks in advance.

Frequent Contributor
Posts: 81

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

You may have to purchase an updated SAS 9.4 license to obtain SAS/STAT 14.1. I'd contact the SAS licensing dept.

I've run a lasso on logistic regression models in R if you need help.

If you're dead set on using SAS (or your data is too big for R to handle in memory), I wrote a short program in Base SAS 9.3 that runs a logistic regression lasso & presented at the SAS Global Forum last week. If you're interested I can send you the link.

Valued Guide
Posts: 3,206

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

@robf Needing an updated license? That would be a very new approach. It is normal to have SAS licensed, and you can get the new versions with that.

What is needed is a new installation SID / setinit to install it into an ICT-managed environment. That is normal life-cycle management.
The release numbering of Base/Foundation SAS (9.3, 9.4) has been made different from that of SAS/STAT, as they can have different life cycles.
You need a Chief Versions Officer to understand it.  

---->-- ja karman --<-----
Frequent Contributor
Posts: 81

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Jaap - if you've already installed SAS 9.3, do you have to purchase a 9.4 license to upgrade to SAS/STAT 14.1? I haven't a clue - the process is very opaque to me.

Valued Guide
Posts: 3,206

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Buy SAS® Software: Frequently Asked Questions

What will I receive with my online order?

Your license for the product includes software, technical support, online documentation and software upgrades.
This same kind of agreement is seen on all the contracts: you are allowed to run the newest version at no additional cost.

The installation itself is initially done with a SID file, which reflects the license order. One part of that is the setinit code, which is applied to core files to allow the system to run.
That setinit is something like a key for starting your car. You can see the currently active settings with "Proc setinit;run;". There is a yearly purchase order, as that is your payment.

Getting a new SID/setinit means calling your sales office. That is the rather easy part.

The real problem is getting the software you have got to run on managed servers in line with in-house business data and IT/data/business policies.  That is a problem because SAS is not aligned with IT departments.

The same issue exists with a lot of other tools. Installing R or Python by yourself is not compliant, as you could be the cause of data breaches and other similarly big problems.

There is an IT/business gap. So what now?    

---->-- ja karman --<-----
New Contributor
Posts: 2

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Hi Rob,

Thanks for the reply. I have since contacted SAS, and SAS/STAT 14.1 won't be released until Q3/Q4 this year.

I would be very much interested in the program you have written in SAS, and also perhaps to discuss your experience with LASSO in R?

Kind regards,

Emma

Frequent Contributor
Posts: 81

Re: Does SAS have LASSO or LAR variable selection implemented for Logistic Regression?

Emma,

My paper is available at this link:

http://support.sas.com/resources/papers/proceedings15/3297-2015.pdf

Please let me know if you have any questions.

Lasso can be run in R using the glmnet package, which may be freely downloaded (along with the R language itself at http://www.r-project.org/) from a number of online sites. The glmnet package is very fast and reliable. However, if you're working with large datasets that do not fit in your computer's memory (RAM), then glmnet may not be able to execute properly; this is where SAS shines. If you'd like me to email you examples of my R code I can do that. (I don't know if this is the appropriate forum to discuss R.)
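As a rough illustration of what an L1-penalized logistic fit does, here is a minimal sketch in plain Python rather than R or SAS. It uses proximal gradient descent (a gradient step on the log-loss followed by soft-thresholding), which is a swapped-in textbook technique and not the coordinate-descent algorithm that glmnet implements, nor the program from my paper. The names (`simulate`, `lasso_logistic`), the simulated data, and the penalty value 0.1 are assumptions for illustration only.

```python
import math
import random

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def soft_threshold(z, t):
    """Lasso shrinkage: move z toward zero by t, clipping at zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def simulate(n, seed):
    """Hypothetical data: 6 predictors, only columns 0 and 1 drive the outcome."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(6)]
        p = sigmoid(1.5 * row[0] - 1.5 * row[1])
        X.append(row)
        y.append(1.0 if rng.random() < p else 0.0)
    return X, y

def lasso_logistic(X, y, lam, steps=300, lr=0.5):
    """Minimize mean log-loss + lam * ||beta||_1 by proximal gradient.
    The intercept b0 is left unpenalized."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(steps):
        g0, g = 0.0, [0.0] * p
        for row, yi in zip(X, y):
            err = sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - yi
            g0 += err / n
            for j in range(p):
                g[j] += err * row[j] / n
        b0 -= lr * g0
        # Gradient step, then shrink each slope toward zero by lr * lam
        beta = [soft_threshold(b - lr * gj, lr * lam) for b, gj in zip(beta, g)]
    return b0, beta

X, y = simulate(600, seed=1)
b0, beta = lasso_logistic(X, y, lam=0.1)
print("intercept:", round(b0, 3))
print("slopes:", [round(b, 3) for b in beta])
print("nonzero columns:", [j for j, b in enumerate(beta) if b != 0.0])
```

The loop only ever touches one row at a time, so the same update could in principle be driven by a pass over data on disk, which is the situation where an in-memory package runs into trouble.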
