## Logistic Regression : 10 events per predictor rule

Regular Contributor
Posts: 185

I am building a marketing model based on logistic regression. It is a customer attrition model. The event rate is very low, i.e. 0.1%. I have more than 1000 predictors. I know there is a rule: a minimum of 10 events per predictor. Does this rule apply before dimensionality reduction (feature extraction) with PCA and Information Value? Should I apply it to my original 1500 variables, or does it apply only to the significant variables that remain after variable selection techniques such as stepwise regression, PCA, etc.?

Super User
Posts: 20,730

## Re: Logistic Regression : 10 events per predictor rule

This is a commonly asked question on here. Have none of the search results been useful?

Regular Contributor
Posts: 185

## Re: Logistic Regression : 10 events per predictor rule

I understand this is a commonly asked question, but no one has clarified the background of this rule. Does this rule account for correlated predictors? Should it be applied before removing multicollinearity, or after removing collinearity and performing feature extraction?

Super User
Posts: 20,730

## Re: Logistic Regression : 10 events per predictor rule

What's your source for the 'rule'?

Regular Contributor
Posts: 185

Super User
Posts: 20,730

## Re: Logistic Regression : 10 events per predictor rule

My nickel - and it's my [somewhat] educated opinion.

I would argue that the 10-per-predictor rule of thumb isn't always valid; it depends on the variability of the variables being measured.

If you're not using the event rate in your dimensionality reduction and variable selection, I would argue the 'rule' applies to the variables that remain after reduction.

If you are using it at that stage, then the rule applies to the original variables.

Super User
Posts: 10,205

## Re: Logistic Regression : 10 events per predictor rule

According to some statistical experts, you need to run exact logistic regression. Check the EXACT statement in PROC LOGISTIC, if I remember correctly.

SAS Super FREQ
Posts: 3,837

## Re: Logistic Regression : 10 events per predictor rule

Most variable selection techniques start by evaluating all one-variable models, then trying to add a second variable, a third variable, and so forth.  If you want to conform to the 10-events-per-predictor rule, then you should not try to build models that have more than NumEvents / 10 predictors. For example, if you have 51 events, you could limit the selection algorithm to consider only models that have up to 5 continuous variables.
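As a rough illustration of that bound, here is a minimal Python sketch. The function name `max_predictors` and its `events_per_predictor` parameter are made up for this example; it only does the integer division described above and is not a substitute for a proper sample-size assessment.

```python
def max_predictors(n_events: int, events_per_predictor: int = 10) -> int:
    """Upper bound on model size under an events-per-predictor rule of thumb.

    Hypothetical helper for illustration: it floors n_events / events_per_predictor.
    """
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    if events_per_predictor <= 0:
        raise ValueError("events_per_predictor must be positive")
    return n_events // events_per_predictor


if __name__ == "__main__":
    # The example from the post: 51 events allow at most 5 predictors.
    print(max_predictors(51))
```

You would pass the result as the cap on model size to whatever selection procedure you use.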

Regular Contributor
Posts: 185

## Re: Logistic Regression : 10 events per predictor rule

Thanks Xia and Rick for your reply. I am aware of Exact and Firth Logistic Regression. I am curious to know the background of this rule.

@Rick - Suppose I have 1000 predictors in my model. Do you mean that it requires at least 10k events before correcting for multicollinearity and performing feature extraction? I understand I can ignore this rule if I apply unsupervised learning (e.g. PCA or PROC VARCLUS), since those methods do not use the dependent variable. I am more curious about supervised methods for extracting important variables. By supervised methods, I mean the 'Information Value' and 'Chi-Square' methods. Does the model need sufficient events for feature extraction? Otherwise the feature extraction would be biased. Correct?

Posts: 2,655

## Re: Logistic Regression : 10 events per predictor rule

Model building from 1000 predictors using 'supervised methods' will be biased. The question is how biased, and whether the model will adequately predict future data. It is well known that naive selection methods (stepwise, backward, forward, all possible subsets) lead to problematic results with standard regression models. See Flom and Cassell's paper on Stopping Stepwise: http://www.lexjansen.com/pnwsug/2008/DavidCassell-StoppingStepwise.pdf

The problem is exacerbated for logistic regression. However, PROC HPGENSELECT in SAS/STAT 14.1 does offer selection=LASSO, which gets around a lot of the difficulties with the other methods. Still, consider the result of putting things on a logit link, and what might happen with fewer than 10 events per predictor. You are going to have some points with very small logits that have a lot of influence on the fit.
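To get a feel for the screening bias described above, here is a small simulation sketch in Python (standard library only). Everything in it is an assumption made for illustration: the predictors are pure Gaussian noise, the sizes (1000 records, 10 events, 500 predictors) are chosen so that it runs quickly rather than to match the 0.1% event rate in this thread, and `max_noise_z` is a hypothetical helper name. It screens each noise predictor by the standardized difference in means between events and non-events and reports the largest one.

```python
import math
import random


def max_noise_z(n_records=1000, n_events=10, n_predictors=500, seed=0):
    """Largest |z| for event vs. non-event means across pure-noise predictors.

    No predictor is related to the outcome, so any large value returned
    here is produced by the screening step alone.
    """
    rng = random.Random(seed)
    n_nonevents = n_records - n_events
    # Standard error of a difference in means for unit-variance noise.
    se = math.sqrt(1 / n_events + 1 / n_nonevents)
    best = 0.0
    for _ in range(n_predictors):
        event_mean = sum(rng.gauss(0, 1) for _ in range(n_events)) / n_events
        nonevent_mean = sum(rng.gauss(0, 1) for _ in range(n_nonevents)) / n_nonevents
        best = max(best, abs(event_mean - nonevent_mean) / se)
    return best


if __name__ == "__main__":
    # The "best" of 500 useless predictors, judged by univariate screening.
    print(round(max_noise_z(), 2))
```

With hundreds of candidates and only a handful of events, the top-ranked noise variable will usually clear a conventional significance cut-off, which is why supervised screening on the same data overstates variable importance.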

Steve Denham

SAS Super FREQ
Posts: 3,837

## Re: Logistic Regression : 10 events per predictor rule

No, I said that if you apply this rule, then in going from 1,000 potential explanatory variables to the k that you want in your final model, (number of events)/10 bounds the value of k.

Regular Contributor
Posts: 185

## Re: Logistic Regression : 10 events per predictor rule

Thanks Steve and Rick. @Rick - "then in going from 1,000 potential explanatory variables to the k that you want in your final model, that the (number of events)/10 will bound the value for k." Would each of these 1000 variables have enough events to establish its variable importance? I suspect univariate analysis of these variables against the dependent variable would fail. I am sorry to bug you again.

Posts: 2,655

## Re: Logistic Regression : 10 events per predictor rule

Here is a concrete example.  Suppose in your training dataset you have 10,000 records with an event rate of 0.1%.  That would be 10 events.  Using the bounded value for k of events/10, you could adequately fit 1 variable to the data.  If you had 20,000 records with the same event rate, you could adequately fit 2 variables, and so forth.

Of course, you will need additional records to validate your model against.
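The arithmetic in that example can be sketched in Python as follows. The helper names `expected_events` and `records_needed` are invented for this illustration, and the event rate is passed as a string so that `fractions.Fraction` can keep the calculation exact. The figures cover the training data only, not the additional validation records.

```python
import math
from fractions import Fraction


def expected_events(n_records: int, event_rate: str) -> Fraction:
    """Expected number of events in n_records at the given event rate."""
    return n_records * Fraction(event_rate)


def records_needed(n_predictors: int, event_rate: str,
                   events_per_predictor: int = 10) -> int:
    """Training records needed to satisfy the events-per-predictor rule."""
    rate = Fraction(event_rate)
    if rate <= 0:
        raise ValueError("event_rate must be positive")
    # Required events divided by the event rate, rounded up to whole records.
    return math.ceil(n_predictors * events_per_predictor / rate)


if __name__ == "__main__":
    print(expected_events(10_000, "0.001"))  # events available in 10,000 records
    print(records_needed(1, "0.001"))        # records needed for 1 predictor
    print(records_needed(2, "0.001"))        # records needed for 2 predictors
```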

Steve Denham

Regular Contributor
Posts: 185

## Re: Logistic Regression : 10 events per predictor rule

Thank you so much Steve for being so patient in replying to this thread. :-) My question still lies in your explanation. I understand I can fit only 2 variables with 20k records at an event rate of 0.1%. My question: can I perform INITIAL feature extraction (selecting important variables with supervised methods) to arrive at 2 FINAL significant variables? Or do I need more events to perform the initial feature extraction step?

Super User
Posts: 20,730

## Re: Logistic Regression : 10 events per predictor rule

I think my original comment stands: if the feature extraction doesn't depend on the outcome, you can use derived features as your variables, so you can use 2 derived features with 20K records and an event rate of 0.1%.
