Outlier detection

02-18-2010 09:59 PM

Hi,

Could you suggest some useful commands, and how to interpret their output, for finding influential observations or outliers in my data?

Kind regards,

markc

02-18-2010 10:17 PM

Hi:

To investigate and explore your data, PROC UNIVARIATE is a good overall procedure to start with:

http://support.sas.com/documentation/cdl/en/procstat/63032/HTML/default/procstat_univariate_sect008....

http://support.sas.com/documentation/cdl/en/procstat/63032/HTML/default/procstat_univariate_sect003....

cynthia
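
As a starting point, a PROC UNIVARIATE call along these lines lists the most extreme values and draws distribution plots. This is only a sketch: the SASHELP.CLASS data set and the variables HEIGHT, WEIGHT and NAME are stand-ins for your own data, and NEXTROBS=5 is an arbitrary choice.

```sas
/* Sketch only: sashelp.class, height, weight and name are placeholders */
proc univariate data=sashelp.class nextrobs=5 plots;
   var height weight;                /* numeric variables to examine          */
   id name;                          /* label extreme observations by name    */
   histogram height weight / normal; /* histogram with a fitted normal curve  */
run;
```

The "Extreme Observations" table shows the five lowest and five highest values of each variable, and the PLOTS option adds a box plot and a normal probability plot where isolated points stand out.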

02-18-2010 10:28 PM

Whether an observation is an outlier or an influential observation (the two are often treated differently) depends on your putative model. The UNIVARIATE procedure that Cynthia mentioned examines the distribution of each variable separately.

The regression-type procedures in SAS 9.2 have very good ODS Graphics output to assist in outlier detection. The documentation for each procedure has a section describing the diagnostic plots available through ODS Graphics.
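
For example, with PROC REG the diagnostic plots can be requested roughly as follows. The model (WEIGHT regressed on HEIGHT and AGE in SASHELP.CLASS) is purely illustrative, and the particular plots chosen are one reasonable selection rather than a recommended set.

```sas
/* Sketch only: the data set and model are placeholders for your own */
ods graphics on;

proc reg data=sashelp.class
         plots(label)=(cooksd rstudentbyleverage dffits);
   id name;                     /* used to label flagged points in the plots */
   model weight = height age;
run;
quit;

ods graphics off;
```

The RSTUDENTBYLEVERAGE plot separates outliers (large studentized residuals) from high-leverage points, which is one way to see why the two ideas are treated differently.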

02-19-2010 08:07 AM

For the "interpretations" part of your question, I use the mean and standard deviation to identify thresholds for univariate outliers.

For n < 500, outliers are values outside [mean + or - 1.96 standard deviations]. For normally distributed data, this range covers about 95% of values.

If 500 <= n <= 5,000, [mean + or - 2.576 standard deviations]... about 99% of values.

For n > 5,000, [mean + or - 3.291 standard deviations]... about 99.9% of values.
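
The thresholds above could be applied in SAS roughly like this. It is a minimal sketch: SASHELP.CLASS and WEIGHT stand in for your own data set and variable, the names STATS, FLAGGED, K, LOWER, UPPER and OUTLIER are made up for the example, and the boundary cases n = 500 and n = 5,000 are assigned to the middle band as an assumption.

```sas
/* Step 1: n, mean and standard deviation of the variable of interest */
proc means data=sashelp.class noprint;
   var weight;
   output out=stats n=n mean=mean std=sd;
run;

/* Step 2: attach the statistics to every row and flag values outside the band */
data flagged;
   if _n_ = 1 then set stats(keep=n mean sd);  /* values are retained across rows */
   set sashelp.class;

   /* multiplier chosen by sample size, following the rule of thumb above */
   if n < 500 then k = 1.96;
   else if n <= 5000 then k = 2.576;
   else k = 3.291;

   lower = mean - k*sd;
   upper = mean + k*sd;

   /* missing values are not flagged as outliers */
   outlier = (not missing(weight)) and (weight < lower or weight > upper);
run;

/* Step 3: list the flagged observations */
proc print data=flagged;
   where outlier = 1;
   var name weight lower upper;
run;
```

Keep in mind that both the mean and the standard deviation are themselves pulled toward extreme values, so a flagged list from this rule is a prompt for inspection rather than a verdict.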

For multivariate outliers, I look at Cook's D and DFFITS most often.
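
One way to get these two statistics into a data set is the OUTPUT statement of PROC REG, sketched below. The model is illustrative, and the cutoffs used (Cook's D above 4/n, absolute DFFITS above 2*sqrt(p/n)) are commonly quoted rules of thumb, not something stated in this thread, so treat them as assumptions.

```sas
/* Sketch only: the data set and model are placeholders */
proc reg data=sashelp.class;
   model weight = height age;
   output out=diag cookd=cooksd dffits=dffits;  /* save influence statistics */
run;
quit;

data influential;
   set diag nobs=n;   /* n = rows in DIAG; assumes no rows were dropped for missing values */
   p = 3;             /* parameters in the model: intercept + 2 regressors */
   flag_cooksd = (cooksd > 4/n);
   flag_dffits = (abs(dffits) > 2*sqrt(p/n));
   if flag_cooksd or flag_dffits;  /* keep only flagged observations */
run;

proc print data=influential;
   var name weight height age cooksd dffits flag_cooksd flag_dffits;
run;
```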

Good luck,

Parker

02-19-2010 12:24 PM

Just as an addition to the answers above: to find unexpected values in character data, you can use PROC FREQ to list every value of a particular character variable.

To find outliers in numeric data, you can use PROC MEANS.
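
A quick sketch of both checks, again using SASHELP.CLASS as a placeholder; the choice of statistics in PROC MEANS is just one possibility.

```sas
/* Character data: list every distinct value, most frequent first */
proc freq data=sashelp.class order=freq;
   tables sex / missing;   /* MISSING treats blanks as a level of their own */
run;

/* Numeric data: compare the extremes with the bulk of the distribution */
proc means data=sashelp.class n nmiss min p1 median p99 max;
   var age height weight;
run;
```

Rare or misspelled categories appear at the bottom of the PROC FREQ table, and a minimum or maximum that sits far from the 1st or 99th percentile is worth a closer look.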