Operations Research topics: SAS/OR, SAS Optimization, and SAS Simulation Studio

speed of the logistic regression

N/A
Posts: 1

I'm running a logistic regression with PROC LOGISTIC on a sample of approximately 150,000 people described by 1,500 variables. The analysis takes about 8 hours. Do you know of any systematic way to speed it up? Or is it more of a software/hardware problem?

Thanks a lot.

Regards
Iryna
Super Contributor
Posts: 260

Re: speed of the logistic regression

Hi Iryna.
I don't think you really need all 1,500 of these variables in the model, do you?
So I'd use both SELECTION=FORWARD and STOP=50 to see which (at most) fifty variables contribute most to your model, and then rerun the model with just those.

Regards
Olivier
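
For readers who want to try the suggestion above, here is a minimal sketch of the two steps. The data set name work.people, the binary response y, and the predictor names x1-x1500 are assumptions for illustration only; the thread does not give the real names, and the variables listed in the second step are placeholders for whatever the first step selects.

```sas
/* Step 1: forward selection, stopping once 50 effects are in the model. */
/* work.people, y and x1-x1500 are hypothetical names.                   */
proc logistic data=work.people;
   model y(event='1') = x1-x1500
         / selection=forward stop=50;
run;

/* Step 2: refit using only the selected variables.                      */
/* x3, x17 and x42 are placeholders for the variables chosen in step 1.  */
proc logistic data=work.people;
   model y(event='1') = x3 x17 x42;
run;
```

STOP=50 caps the number of effects in the final model. The default entry significance level (SLENTRY=0.05) may stop the selection earlier than that.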
Frequent Contributor
Posts: 95

Re: speed of the logistic regression

Are any of these 1,500 variables highly correlated? If so, you might be able to select one among a group of highly correlated variables or use a small number of principal components (from a Principal Components Analysis) for your logistic regression.
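
A rough sketch of the principal components route, assuming the same hypothetical names as before (work.people, response y, numeric predictors x1-x1500). The choice of 20 components is arbitrary and only for illustration; in practice it would be guided by the eigenvalue output.

```sas
/* Compute the first 20 principal components of the predictors.        */
/* OUT= keeps the original variables and adds scores Prin1-Prin20.     */
proc princomp data=work.people out=work.scores n=20 noprint;
   var x1-x1500;
run;

/* Fit the logistic regression on the component scores instead of      */
/* the 1,500 original variables.                                       */
proc logistic data=work.scores;
   model y(event='1') = prin1-prin20;
run;
```

The trade-off is interpretability: each component is a linear combination of all the original variables, so the coefficients no longer map to single predictors.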
Contributor
Posts: 35

Re: speed of the logistic regression

Iryna, I think you don't need all 150,000 records/observations either.

For example, if you are interested in variables that capture respondents' ratings of certain job attributes, you may want to use the data for employed respondents only.
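
One way to restrict the analysis to such a subgroup is a WHERE statement inside the procedure. This is only a sketch: the indicator variable employed (1 = employed) is hypothetical, as are work.people, y, and x1-x1500.

```sas
/* Fit the model on employed respondents only.                         */
/* employed is a hypothetical 0/1 indicator; other names are assumed.  */
proc logistic data=work.people;
   where employed = 1;
   model y(event='1') = x1-x1500;
run;
```

The WHERE statement filters observations as they are read, so no separate subset data set is needed.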
SAS Employee
Posts: 48

Re: speed of the logistic regression

Is this question related to Mathematical Optimization and Operations Research with SAS? If not, this is the wrong forum.