Outlier Detection using Studentized Residuals in Different Alphas


Posted 01-23-2019 11:18 PM
(1771 views)

Good day.

I watched videos on detecting outliers by using studentized residuals in PROC REG, which has a default alpha=0.05 (95% confidence level). The videos say that if the studentized residual is greater than 3, the observation is considered an outlier.

If I change alpha to 0.01 (99% confidence level), what is the smallest studentized residual that I should consider an outlier?

Many thanks!

3 REPLIES


By definition, a Studentized residual is formed by dividing each residual by an estimate of its standard error. Therefore the Studentized residuals are scaled to have mean 0 and approximately unit variance. Under the usual OLS assumption that the errors are normally distributed, the two-sided cutoff for a significance level alpha is the normal quantile, which is computed as

alpha = 0.05;

q = quantile("Normal", 1 - alpha/2);

which gives 1.96, usually rounded to 2.

If you want alpha=0.01, then the analogous computation is

alpha = 0.01;

q = quantile("Normal", 1 - alpha/2);

which gives 2.58.

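```
data Student;
   /* two-sided normal cutoff for alpha = 0.05 */
   alpha = 0.05;
   q = quantile("Normal", 1 - alpha/2);   /* 1.96 */
   output;
   /* two-sided normal cutoff for alpha = 0.01 */
   alpha = 0.01;
   q = quantile("Normal", 1 - alpha/2);   /* 2.58 */
   output;
run;

proc print data=Student noobs; run;
```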


Thanks Rick.

Here's the data I'm trying to studentize:

| ASSET |
|---|
| -4.506 |
| 5.169 |
| -2.57 |
| 3.068 |
| -0.703 |
| -7.037 |
| 0.329 |
| -2.602 |
| -0.969 |
| 9.217 |
| -0.495 |
| -1.608 |
| 1.808 |
| 2.643 |
| -0.19 |
| -0.853 |
| 0.688 |
| 1.209 |
| 0.796 |
| -0.632 |
| -0.139 |
| 1.1 |
| -1.653 |
| -0.178 |

Here's the code I'm using:

proc reg data=data_A1;

model ASSET = SEQ /r;

output out=Data_A2 student=studASSETS;

by TYPE;

run;

(Does this code use alpha=0.05 or not? I was just going by the videos I watched: https://youtu.be/IiGPEPDyC4I)

This produces the following results in the studentized residual column:

| studASSETS |
|---|
| -1.301 |
| 1.787 |
| -0.744 |
| 1.094 |
| -0.14 |
| -2.207 |
| 0.192 |
| -0.766 |
| -0.236 |
| 3.075 |
| -0.087 |
| -0.451 |
| 0.656 |
| 0.703 |
| -0.23 |
| -0.451 |
| 0.053 |
| 0.222 |
| 0.084 |
| -0.39 |
| -0.23 |
| 0.176 |
| -0.736 |
| -0.252 |

I tried specifying the alpha into 0.01 and it produced the same outputs.

I'm trying to determine the following: if this set of results represents alpha=0.05 and it gives one value greater than 3, then if I use alpha=0.01, what is the smallest value I should consider an outlier?

Thanks in advance.


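The Studentized residuals are standardized. You said, "I tried specifying the alpha into 0.01 and it produced the same outputs." That statement is correct: the ALPHA= option changes only the significance level used for the confidence limits that PROC REG reports, not the residuals themselves.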
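One way to check this directly is to refit the model with ALPHA=0.01 and compare the residuals from the two runs. This is a minimal sketch, assuming that data_A1 is sorted by TYPE and that Data_A2 already exists from the earlier run; the data set name Data_A2_alpha01 is introduced here only for illustration.

```sas
/* Refit the same model with ALPHA=0.01 on the PROC REG statement */
proc reg data=data_A1 alpha=0.01 noprint;
   by TYPE;                      /* assumes data_A1 is sorted by TYPE */
   model ASSET = SEQ;
   output out=Data_A2_alpha01 student=studASSETS;   /* hypothetical name */
run;
quit;

/* Compare the studentized residuals from the two fits */
proc compare base=Data_A2 compare=Data_A2_alpha01;
   var studASSETS;
run;
```

If the residuals do not depend on alpha, PROC COMPARE should report no unequal values for studASSETS.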

Your question appears to be related to detecting outliers. A studentized residual (SR) represents the residual in units of the standard deviation of the residuals. If |SR| > 3, then the residual is more than 3 SD away from the OLS line.
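A cutoff rule like this can be applied to the output data set with a short DATA step. The following is only a sketch: it assumes the data set Data_A2 and the variable studASSETS from the PROC REG code posted earlier, while the macro variable cutoff, the data set Flagged, and the variable isOutlier are names made up for this example.

```sas
%let cutoff = 3;   /* hypothetical threshold; change to suit your rule */

data Flagged;
   set Data_A2;
   /* 1 if the studentized residual exceeds the cutoff in absolute value */
   isOutlier = (abs(studASSETS) > &cutoff);
run;

/* List only the flagged observations */
proc print data=Flagged;
   where isOutlier = 1;
   var TYPE SEQ ASSET studASSETS;
run;
```

Among the residuals posted above, only 3.075 exceeds 3 in absolute value.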

In terms of confidence intervals, the following is true:

If the data satisfy the assumptions of OLS and the sample size is large, then the studentized residuals are approximately normally distributed. Therefore you would expect 95% of the studentized residuals to have absolute values less than 1.96. You would expect 99% of the studentized residuals to have absolute values less than 2.58. You would expect about 99.7% of the studentized residuals to have absolute values less than 3.
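The coverage for any cutoff can be computed from the normal CDF. The sketch below assumes the normal approximation holds; the data set name Coverage and the variable names are chosen only for this example.

```sas
data Coverage;
   do cutoff = 1.96, 2.58, 3;
      /* P(|Z| < cutoff) for a standard normal Z */
      coverage = 1 - 2*(1 - cdf("Normal", cutoff));
      output;
   end;
run;

proc print data=Coverage noobs;
   format coverage 8.4;
run;
```

Under the normal approximation, the three cutoffs correspond to coverage of roughly 0.950, 0.990, and 0.997. So with alpha=0.01 the cutoff is about 2.58, and among the residuals posted above only 3.075 exceeds it in absolute value.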
