Hi All,
I have built a logistic regression model and get the results below. Should the variables with a high Wald Chi-Square (Var 1 and Var 6) be removed from the model?
Thank You
Analysis of Maximum Likelihood Estimates
Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq | Standardized Estimate | Exp(Est) |
Intercept | 1 | -2.8145 | 0.004 | 495,784.70 | <.0001 | | 0.06 |
Var 1 | 1 | 0.0758 | 0.000357 | 45,168.02 | <.0001 | 0.2661 | 1.079 |
Var 2 | 1 | 0.3646 | 0.00753 | 2,345.72 | <.0001 | 0.0403 | 1.44 |
Var 3 | 1 | -0.0912 | 0.00186 | 2,407.66 | <.0001 | -0.052 | 0.913 |
Var 4 | 1 | 0.7891 | 0.00809 | 9,506.31 | <.0001 | 0.0981 | 2.201 |
Var 5 | 1 | 0.1089 | 0.00334 | 1,060.29 | <.0001 | 0.0339 | 1.115 |
Var 6 | 1 | 1.098 | 0.00339 | 104,610.00 | <.0001 | 0.7095 | 2.998 |
Var 7 | 1 | 0.153 | 0.00239 | 4,092.80 | <.0001 | 0.062 | 1.165 |
I think the statistically sound way would be to run the regression again without Var 1 and Var 6 and compare the log-likelihoods of the two models. If they don't differ much, you don't need the 2 variables. This is the likelihood-ratio test: compute LR = 2 * (log-likelihood with Var 1 and Var 6 - log-likelihood without Var 1 and Var 6) and compare it to the chi-squared distribution with 2 degrees of freedom. If the resulting p-value is almost zero (e.g. <0.01), the 2 variables should not be excluded.
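To make the arithmetic of that test concrete, here is a minimal Python sketch. It is not SAS code, and the two log-likelihood values are hypothetical numbers made up for illustration; they do not come from the model above. The function name lr_test_2df is also just an illustrative choice. For exactly 2 degrees of freedom the chi-squared upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def lr_test_2df(loglik_full, loglik_reduced):
    """Likelihood-ratio test for dropping exactly two parameters.

    Returns (statistic, p_value). With 2 degrees of freedom the
    chi-squared survival function is exp(-x / 2).
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        # A nested (reduced) model cannot fit better than the full model.
        raise ValueError("full model should have the larger log-likelihood")
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical log-likelihoods, for illustration only.
loglik_full = -1200.0      # model with Var 1 and Var 6
loglik_reduced = -1215.0   # model without Var 1 and Var 6

stat, p = lr_test_2df(loglik_full, loglik_reduced)
print(f"LR statistic = {stat:.2f}, p-value = {p:.3g}")

# A small p-value means the two variables improve the fit significantly,
# so they should stay in the model.
keep_variables = p < 0.01
```

If your output reports -2 Log L for each model (as the Model Fit Statistics table in PROC LOGISTIC does), the same statistic is simply the reduced model's -2 Log L minus the full model's -2 Log L.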