Hi:
How do I set an offset variable in the GLMSELECT procedure?
For example, the GENMOD procedure has an OFFSET= option for Poisson regression.
Thank you!
This is going to sound really weird, but the only PROC I found with both model selection and the ability to specify an offset variable was PHREG, and I don't particularly like any of the methods available there (no LAR or LASSO). I suppose you could divide your response variable by the exposure (the quantity whose log you would otherwise pass as the offset) and model the resulting rate with GLMSELECT.
Steve Denham
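The workaround above can be sketched roughly as follows. This is only an illustration, assuming the offset would have been the log of an exposure variable; the data set `work.claims` and the variables `n_claims`, `exposure`, `region`, `age`, and `income` are hypothetical names, not anything from the thread. Note that this fits a linear model to a rate, which is not equivalent to a Poisson regression with an offset.

```sas
/* Hypothetical input: work.claims with a count (n_claims) and an exposure. */
data work.claims_rate;
   set work.claims;
   /* Turn the count into a rate; skip rows with no usable exposure. */
   if exposure > 0 then rate = n_claims / exposure;
run;

/* Linear model selection on the rate, using LASSO as mentioned above. */
proc glmselect data=work.claims_rate;
   class region;
   model rate = region age income / selection=lasso;
run;
```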
Try the HPGENSELECT procedure; its MODEL statement has an OFFSET= option.
Below is some information about the procedure, and a link to a short YouTube video.
The new HPGENSELECT procedure, available with SAS/STAT 12.3 (which runs on Base 9.4), performs model selection for generalized linear models (GLMs) such as Poisson regression, negative binomial regression, and any other GLM. Designed for the distributed computing of SAS High-Performance Statistics, PROC HPGENSELECT also works in single-machine mode. It provides forward, backward, and stepwise selection (LASSO-type methods are still in progress) and includes the AIC, SBC, and AICC selection criteria.
http://www.youtube.com/watch?v=RV6lkXNpoKA
Funda
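A minimal sketch of what the suggestion above might look like for a Poisson model with an offset. The data set `work.claims` and the variables `n_claims`, `exposure`, `region`, `age`, and `income` are hypothetical, and the offset is assumed to be the log of an exposure variable; check the options against the HPGENSELECT documentation for your release.

```sas
/* Hypothetical input: work.claims with a count (n_claims) and an exposure. */
data work.claims_off;
   set work.claims;
   /* The offset enters the linear predictor on the log scale. */
   if exposure > 0 then log_exposure = log(exposure);
run;

proc hpgenselect data=work.claims_off;
   class region;
   /* Poisson regression with a log link and the offset variable. */
   model n_claims = region age income
         / distribution=poisson link=log offset=log_exposure;
   /* Stepwise selection with the procedure's default criteria. */
   selection method=stepwise;
run;
```

Rows where `log_exposure` is missing are expected to be dropped from the fit, so it is worth checking the number of observations used in the output.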
For all of the other reasons for not using forward, backward, or stepwise methods, please read "Stopping Stepwise" by Peter Flom and David Cassell. It is available at http://www.nesug.org/proceedings/nesug07/sa/sa07.pdf. Whatever problems are seen with normally distributed data are extended with non-normally distributed data, especially skewed distributions such as the Poisson or negative binomial.
Steve Denham
Thanks, Funda, that's good news. Sincere thanks to Steve as well; I'll read the paper you recommended carefully.
But I have to mention that the situation has changed. Building a model used to be like cooking dinner: done patiently and in strict accordance with statistical practice. Now, facing big data and thousands of product categories, I have to build thousands of models at the same time, with no time to check each one. I need a simple and efficient method, and at present I can't find a better replacement for STEPWISE.
Read David and Peter's paper carefully, and it will help you explain why the predictive model you came up with using STEPWISE performed so badly when presented with new data.
In big data, you would be as wise to ask a five-year-old to pick out important variables based on how cool the name sounded as to use STEPWISE methods. Between bias of the estimators and poor control of multiple testing, you have a recipe for a poor predictive model. Clustering and classification trees will do much better.
Steve Denham