Calcite | Level 5

## Outlier Detection using Studentized Residuals in Different Alphas

Good day.

I watched videos on detecting outliers using studentized residuals in PROC REG, which has a default alpha=0.05 (95% confidence level). The videos say that if the studentized residual is greater than 3, the observation is considered an outlier.

If I were to change alpha to 0.01 (99% confidence level), what is the smallest studentized residual value that I should consider an outlier?

Many thanks!

SAS Super FREQ

## Re: Outlier Detection using Studentized Residuals in Different Alphas

By definition, a Studentized residual is formed by dividing each residual by an estimate of its standard error. Therefore the Studentized residuals are scaled to have mean 0 and approximately unit variance. Under the usual OLS assumption that the errors are normally distributed, the two-sided cutoff for a given alpha is the normal quantile, computed as

alpha = 0.05;
q = quantile("Normal", 1 - alpha/2);

which is 1.96, which is usually rounded to 2.

If you want alpha=0.01, then the analogous computation is

alpha = 0.01;
q = quantile("Normal", 1 - alpha/2);

which gives 2.58.

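```
data Student;
   /* two-sided normal cutoff for alpha = 0.05 */
   alpha = 0.05;
   q = quantile("Normal", 1 - alpha/2);
   output;
   /* two-sided normal cutoff for alpha = 0.01 */
   alpha = 0.01;
   q = quantile("Normal", 1 - alpha/2);
   output;
run;

proc print data=Student; run;
```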

Calcite | Level 5

## Re: Outlier Detection using Studentized Residuals in Different Alphas

Thanks Rick.

Here's the data I'm trying to studentize:

ASSET:

 -4.506 5.169 -2.57 3.068 -0.703 -7.037 0.329 -2.602 -0.969 9.217 -0.495 -1.608 1.808 2.643 -0.19 -0.853 0.688 1.209 0.796 -0.632 -0.139 1.1 -1.653 -0.178

here's the code I'm using:

proc reg data=data_A1;
model ASSET = SEQ /r;
output out=Data_A2 student=studASSETS;
by TYPE;
run;

(Does this code use an alpha of 0.05 or not? I was just going by the videos I watched: https://youtu.be/IiGPEPDyC4I)

This gives these results in the studentized residuals column:

 -1.301 1.787 -0.744 1.094 -0.14 -2.207 0.192 -0.766 -0.236 3.075 -0.087 -0.451 0.656 0.703 -0.23 -0.451 0.053 0.222 0.084 -0.39 -0.23 0.176 -0.736 -0.252

I tried specifying the alpha into 0.01 and it produced the same outputs.

I'm trying to determine: if this set represents alpha=0.05 and it produced one result greater than 3, what is the smallest value I should treat as an outlier if I use alpha=0.01?

SAS Super FREQ

## Re: Outlier Detection using Studentized Residuals in Different Alphas

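The Studentized residuals are standardized, so they do not depend on alpha. You said, "I tried specifying the alpha into 0.01 and it produced the same outputs." That is the expected behavior: the ALPHA= option in PROC REG only sets the level for the confidence limits that the procedure reports.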

Your question appears to be related to detecting outliers. A studentized residual (SR) represents the residual in units of the standard deviation of the residuals. If |SR| > 3, then the residual is more than 3 SD away from the OLS line.
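As a rough sketch of how such a cutoff could be applied, the DATA step below flags observations in the `Data_A2` output data set from the PROC REG call posted earlier. The data set and the `studASSETS` variable come from that code. The `Flagged` data set and the `alpha`, `cutoff`, and `outlier` variables are names made up for illustration, and the normal quantile is only an approximation when the sample is small.

```
data Flagged;
   set Data_A2;
   alpha = 0.01;                               /* assumed significance level */
   cutoff = quantile("Normal", 1 - alpha/2);   /* two-sided normal cutoff */
   outlier = (abs(studASSETS) > cutoff);       /* 1 if flagged, 0 otherwise */
run;

/* list only the flagged observations */
proc print data=Flagged;
   where outlier = 1;
run;
```

Changing `alpha` in this step changes which observations are flagged. It does not change the studentized residuals themselves.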

In terms of confidence intervals, the following is true:

If the data satisfy the assumptions of OLS and the sample size is large, then the studentized residuals are approximately normally distributed. Therefore you would expect 95% of the studentized residuals to have absolute values less than 1.96. You would expect 99% of the studentized residuals to have absolute values less than 2.58. You would expect about 99.7% of the studentized residuals to have absolute values less than 3.
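If you want to check these percentages yourself, a minimal sketch is to compute the normal probability P(|Z| < cutoff) for each cutoff. The `Coverage` data set and its variable names are hypothetical, and the calculation assumes the large-sample normal approximation described above.

```
data Coverage;
   do cutoff = 1.96, 2.58, 3;
      /* P(|Z| < cutoff) for a standard normal Z */
      coverage = 2*cdf("Normal", cutoff) - 1;
      output;
   end;
run;

proc print data=Coverage; run;
```

The same loop can be extended with any other cutoff you are considering, which makes it easy to see which alpha a given rule of thumb roughly corresponds to.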
