I would like to detect outliers and multicollinearity in my regression analysis (both linear and logistic). I would appreciate it if someone could guide me through the options/procs for that.
Thanks in advance!
For PROC REG:
  Outliers - check Cook's distance
  Multicollinearity - check VIF ( model y=x / vif )

For PROC LOGISTIC:
  Outliers - use the INFLUENCE option and look at the chi-square diagnostics (the Delta Chi-Square column in the table below):

    proc logistic data=sashelp.class;
      model sex=age height / influence;
    run;

  Regression Diagnostics (Age and Height are the covariates; "CI Displ." = Confidence Interval Displacement)

  Case    Age      Height   Pearson   Deviance  Hat Matrix  Intercept  Age      Height   CI Displ.  CI Displ.  Delta     Delta
  Number                    Residual  Residual  Diagonal    DfBeta     DfBeta   DfBeta   C          CBar       Deviance  Chi-Square
  1       14.0000  69.0000  -0.3158   -0.4361   0.1443       0.1096     0.0994  -0.1265  0.0197     0.0168     0.2070    0.1166
  2       13.0000  56.5000   0.3559    0.4883   0.1644       0.1391     0.1271  -0.1562  0.0298     0.0249     0.2634    0.1515
  3       13.0000  65.3000   2.4687    1.9796   0.1484      -0.6318    -0.9210   0.9706  1.2470     1.0620     4.9807    7.1566
  4       14.0000  62.8000   0.8089    1.0034   0.0999       0.0598     0.1630  -0.1330  0.0807     0.0726     1.0794    0.7270

  Multicollinearity - There is no check for it in PROC LOGISTIC, but SAS will automatically remove a variable if it is collinear with (linearly dependent on) the other variables, because PROC LOGISTIC uses maximum likelihood estimation (MLE).
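The PROC REG checks mentioned above can be sketched roughly as follows. This is only an illustration: the choice of `weight`, `height`, and `age` from `sashelp.class`, the dataset names `diag` and `flagged`, the variable name `cooksd`, and the 4/n cutoff (a common rule of thumb, not something stated in this thread) are all assumptions.

```sas
/* Linear regression with variance inflation factors (VIF option)   */
/* and Cook's distance written to an output data set (COOKD=).      */
proc reg data=sashelp.class;
   model weight = height age / vif;
   output out=diag cookd=cooksd;   /* diag and cooksd are hypothetical names */
run;
quit;

/* Keep only observations whose Cook's D exceeds 4/n.               */
/* The 4/n cutoff is an assumed rule of thumb; adjust as needed.    */
data flagged;
   set diag nobs=n;                /* n = number of rows in diag    */
   if cooksd > 4/n;
run;

proc print data=flagged;
run;
```

The VIF values appear in the Parameter Estimates table; the printed `flagged` data set lists the observations worth a closer look.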
What's your definition of an outlier?
Your questions are too broad. They're chapters in text books.
If you're trying to learn statistical theory and SAS, have you taken the first statistics e-course from SAS? It's free.
There's also a ton of videos on topics related to specific statistical procedures.
http://support.sas.com/training/tutorial/
Suggestions:
For linear regression you can use the ROBUSTREG procedure. The procedure has algorithms that automatically flag outliers. The documentation contains several Getting Started examples. I suggest you start with the examples and then move on to the "Details" section if you want to understand the details about how an observation is classified as an outlier.
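As a rough sketch of that suggestion (not taken from the documentation examples), the following fits a robust regression and saves the outlier and leverage flags. The model variables from `sashelp.class` and the names `robout`, `is_outlier`, and `is_leverage` are assumptions made for illustration.

```sas
/* Robust regression using M estimation (the default method).       */
/* DIAGNOSTICS prints the flagged observations; LEVERAGE adds the   */
/* leverage-point diagnostics.                                      */
proc robustreg data=sashelp.class method=m;
   model weight = height age / diagnostics leverage;
   /* Write 0/1 indicator variables to a hypothetical output data set */
   output out=robout outlier=is_outlier leverage=is_leverage;
run;

/* List the observations that the procedure flagged */
proc print data=robout;
   where is_outlier = 1 or is_leverage = 1;
run;
```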
There is not an analogous "robust" procedure for logistic regression. However, there are still techniques for detecting potential outliers in almost every SAS procedure. The technique is to use regression diagnostic plots.
For example, in PROC REG you can use the INFLUENCE option on the MODEL statement and look at the ODS graphics to assess observations that are highly influential in the model. See the section of the doc titled "Influence Statistics".
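A minimal sketch of that approach, assuming the same illustrative model on `sashelp.class`; the particular set of plots requested here is one reasonable choice, not the only one.

```sas
ods graphics on;

/* INFLUENCE prints the influence statistics table;                 */
/* PLOTS= requests the matching diagnostic graphs, and LABEL marks  */
/* the extreme points so they can be identified.                    */
proc reg data=sashelp.class
         plots(only label)=(CooksD RStudentByLeverage DFFITS DFBETAS);
   model weight = height age / influence;
run;
quit;
```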
You can do something similar for logistic regression. The LOGISTIC procedure contains many diagnostic plots. As Reeze says, a full explanation is lengthy, but start with the doc example "Logistic Regression diagnostics", which shows how to use the INFLUENCE option and the diagnostic plots.
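For the logistic case, something along these lines should produce the influence plots. This is a sketch using the `sashelp.class` model that appears elsewhere in this thread, not the documentation example itself.

```sas
ods graphics on;

/* INFLUENCE prints the regression diagnostics table;               */
/* PLOTS= requests the influence and DFBETAS panels, and LABEL      */
/* displays case numbers on the diagnostic plots.                   */
proc logistic data=sashelp.class plots(only label)=(influence dfbetas);
   model sex = age height / influence;
run;
```

Observations that stand apart from the rest in these panels are the ones to examine first.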