I am running Proc Logistic in SAS with stepwise selection. My results show the Intercept with an estimate of -3290.3, while the other parameter estimates vary, with values such as 0.5337, 3.11E-6, 0.000088, etc.
This is the first time I've had such widely differing values. Would anyone be able to explain in layman's terms what this means? And possibly what fixes I could apply, or what else I should look into?
Unfortunately, I am the only analyst within the business, so I have no support from the others who work with me, as they don't understand the methodology.
Thanks in advance!
The intercept is where the regression line crosses the y-axis, i.e. the predicted value (here, the log-odds) when all the x-variables are set to zero. The other numbers are the slopes of the individual variables.
I don't know what there is to fix here; it's not clear that anything needs to be fixed, or that anything is wrong.
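To see why a large negative intercept next to very small slopes is not automatically a problem, here is a minimal sketch. Only the intercept (-3290.3) and the 3.11E-6 slope come from the estimates you quoted; the predictor value is made up. If an x-variable takes values in the hundreds of millions, its slope will be tiny and the intercept can be huge, yet they combine to a perfectly sensible prediction:

```sas
/* Minimal sketch: a very negative intercept plus a very small slope can
   still give a sensible prediction. The x value below is hypothetical;
   only the intercept and the 3.11E-6 slope come from the question. */
data intercept_demo;
    intercept = -3290.3;               /* log-odds when every x is zero   */
    slope     = 3.11E-6;               /* a very small coefficient        */
    x         = 1.058E9;               /* hypothetical predictor value    */
    logit     = intercept + slope*x;   /* linear predictor (log-odds)     */
    p         = 1 / (1 + exp(-logit)); /* back-transform to a probability */
    put logit= p=;
run;
```

The intercept by itself describes an observation with every x equal to zero, which may be nowhere near your actual data, so its size alone doesn't tell you much.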
Have you checked for outliers in the x-variables?
Sorry, it's just that in the past my intercept value has been more similar to the parameter estimates. This is the first time they've been extremely different, and I was wondering if I needed to fix something.
Are outliers output as part of Proc Logistic?
It might be worth noting that I'm just building a quick scorecard for Segmentation Analysis before I start the actual scorecard builds.
@manonlyn wrote:
Sorry, it's just that in the past my intercept value has been more similar to the parameter estimates. This is the first time they've been extremely different, and I was wondering if I needed to fix something.
Are outliers output as part of Proc Logistic?
It might be worth noting that I'm just building a quick scorecard for Segmentation Analysis before I start the actual scorecard builds.
You can check for outliers in the x-variables via a histogram or PROC UNIVARIATE, or just about any other plotting method. There are also a number of goodness-of-fit statistics computed by PROC LOGISTIC, such as the c statistic (area under the ROC curve), percent concordance and Somers' D.
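If it helps, here is a rough sketch of those checks; the data set name (score_data), response variable (bad) and predictor names are placeholders, so substitute your own:

```sas
/* Sketch only -- data set, response and predictor names are placeholders. */

/* Look at the distribution and extreme values of the x-variables */
proc univariate data=score_data;
    var var1 var2 var12;        /* list the predictors you want to check */
    histogram;
run;

/* PROC LOGISTIC already prints percent concordant/discordant, Somers' D
   and c (the area under the ROC curve) in the "Association of Predicted
   Probabilities and Observed Responses" table */
proc logistic data=score_data plots(only)=roc;
    model bad(event='1') = var1 var2 var12 / selection=stepwise;
run;
```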
Thank you, I'll start with those. This has been really helpful, thanks again.
Sorry, it's just that in the past my intercept value has been more similar to the parameter estimates. This is the first time they've been extremely different, and I was wondering if I needed to fix something.
That may have been coincidental. I'm not aware of any rules or logic that would indicate that should happen.
I didn't see any problem in your output, so what is your problem?
For those variables that have very small coefficients (3.11E-6, 0.000088, etc.), you can ignore/remove them because they are not significant.
@Ksharp wrote:
I didn't see any problem in your output, so what is your problem?
For those variables that have very small coefficients (3.11E-6, 0.000088, etc.), you can ignore/remove them because they are not significant.
But they could be statistically significant; the statistical significance of these coefficients wasn't mentioned.
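For what it's worth, the size of an estimate mainly reflects the units of the predictor, not its significance: a slope of 3.11E-6 per unit is the same effect as 3.11 per million units. A hypothetical sketch (data set and variable names are placeholders) that shows this:

```sas
/* Hypothetical sketch: rescaling a predictor rescales its estimate but
   leaves the Wald chi-square and p-value unchanged. Names are placeholders. */
data rescaled;
    set score_data;
    var12_millions = var12 / 1e6;   /* same information, different units */
run;

/* Refit the same model with Var12 swapped for the rescaled version; its
   estimate becomes 1e6 times larger while its p-value stays the same. */
proc logistic data=rescaled;
    model bad(event='1') = var1 var2 var12_millions;
run;
```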
I guess these P values must be >0.05, and I hope the OP posts these P values to confirm my guess.
The P value for the variable with the estimate of 3.11E-6 is <.0001
Can you post the whole parameter estimates table?
Analysis of Maximum Likelihood Estimates

| Parameter | | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|---|
| Intercept | | 1 | -3290.3 | 703.3 | 21.8867 | <.0001 |
| Var1 | | 1 | -0.00163 | 0.000226 | 52.2701 | <.0001 |
| Var2 | | 1 | -0.00526 | 0.000466 | 127.441 | <.0001 |
| Var3 | | 1 | 0.000958 | 0.00027 | 12.608 | 0.0004 |
| Var4 | | 1 | -0.0001 | 0.000038 | 7.1978 | 0.0073 |
| Var5 | | 1 | -0.3936 | 0.0369 | 113.85 | <.0001 |
| Var6 | | 1 | 0.000088 | 0.000016 | 31.0853 | <.0001 |
| Var7 | | 1 | -0.002 | 0.000298 | 45.2192 | <.0001 |
| Var8 | | 1 | 0.0126 | 0.00317 | 15.871 | <.0001 |
| Var9 | | 1 | -0.00046 | 0.000152 | 9.2876 | 0.0023 |
| Var10 | | 1 | -0.00307 | 0.000654 | 22.0835 | <.0001 |
| Var11 | Y | 1 | -0.4135 | 0.0961 | 18.5191 | <.0001 |
| Var12 | | 1 | 3.11E-06 | 7.86E-07 | 15.6228 | <.0001 |
| Var13 | | 1 | 0.000163 | 0.000035 | 21.8936 | <.0001 |
| Var14 | HI | 1 | -0.276 | 0.0988 | 7.8036 | 0.0052 |
| Var14 | LO | 1 | 0.0772 | 0.0714 | 1.1689 | 0.2796 |
| Var14 | VH | 1 | -0.4861 | 0.0891 | 29.767 | <.0001 |
| Var14 | VL | 1 | 0.5337 | 0.1432 | 13.8923 | 0.0002 |
| Var15 | | 1 | 0.0193 | 0.00414 | 21.7022 | <.0001 |
| Var16 | | 1 | 0.00149 | 0.000291 | 26.3576 | <.0001 |
| Var17 | | 1 | -0.2849 | 0.0557 | 26.1488 | <.0001 |
| Var18 | | 1 | 0.000088 | 0.000018 | 23.3451 | <.0001 |
| Var19 | | 1 | 0.00291 | 0.000628 | 21.5079 | <.0001 |
At the moment I'm building quick models to analyse whether it's worth segmenting. I'm not sure how I would standardize the variables; what does this usually include? Thank you.
@manonlyn wrote:
At the moment I'm building quick models to analyse whether it's worth segmenting. I'm not sure how I would standardize the variables; what does this usually include? Thank you.
Even though @Reeza says it is good practice to standardize the variables, I don't think standardizing is required in any way, and your models are fine without standardizing.
But the answer about whether you should standardize really depends on what "building quick models to analyse if it's worth segmenting" means; I don't know what that means, and I don't know how you intend to do the analysis to determine whether it's worth segmenting. Please explain further.
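If you do decide to standardize at some point, one common approach in SAS is PROC STDIZE (PROC STANDARD works too), which rescales the numeric predictors to mean 0 and standard deviation 1 before you refit. A sketch with placeholder data set and variable names:

```sas
/* Sketch of standardizing the numeric predictors before refitting.
   Data set, response and variable names are placeholders. */
proc stdize data=score_data method=std out=score_data_std;
    var var1-var10 var12 var13 var15-var19;   /* numeric predictors only */
run;

proc logistic data=score_data_std;
    class var11 var14 / param=ref;            /* categorical predictors  */
    model bad(event='1') = var1-var19 / selection=stepwise;
run;
```

Standardizing changes the size of the estimates (they become "per standard deviation") but not the model's predictions or fit statistics.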