J6
Calcite | Level 5

Good day.

 

I watched videos on detecting outliers using studentized residuals from PROC REG. They say the default is alpha=0.05 (95% confidence level) and that an observation is considered an outlier if its studentized residual is greater than 3.

 

If I change alpha to 0.01 (99% confidence level), what is the smallest studentized residual that I should consider an outlier?

 

Many thanks!

3 REPLIES
Rick_SAS
SAS Super FREQ

By definition, a Studentized residual is formed by dividing each residual by an estimate of its standard error. Therefore the Studentized residuals are normalized to have mean 0 and approximately unit variance. Under the usual OLS assumption that the errors are normally distributed, the two-sided critical value for a given alpha is a normal quantile, computed as 

 

alpha = 0.05;
q = quantile("Normal", 1 - alpha/2);

which is 1.96, which is usually rounded to 2.  

 

If you want alpha=0.01, then the analogous computation is 

 

alpha = 0.01;
q = quantile("Normal", 1 - alpha/2);

 

which gives 2.58.

 

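/* Two-sided normal critical values for alpha=0.05 and alpha=0.01 */
data Student;
alpha = 0.05;
q = quantile("Normal", 1 - alpha/2);   /* 1.96 */
output;
alpha = 0.01;
q = quantile("Normal", 1 - alpha/2);   /* 2.58 */
output;
run;

proc print data=Student; run;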
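
The same computation extends to any list of alpha levels. The following is a minimal sketch, not a definitive recipe; the data set name Cutoffs and the particular alpha values are assumptions chosen for illustration.

```sas
/* Sketch: two-sided normal cutoffs for several alpha levels.
   Cutoffs is a hypothetical data set name. */
data Cutoffs;
   do alpha = 0.10, 0.05, 0.01, 0.001;
      q = quantile("Normal", 1 - alpha/2);   /* two-sided critical value */
      output;
   end;
run;

proc print data=Cutoffs noobs; run;
```

For alpha=0.001 the cutoff is about 3.29, so the familiar rule of thumb |SR| > 3 falls between the alpha=0.01 and alpha=0.001 levels.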

 

J6
Calcite | Level 5

Thanks Rick.

Here's the data I'm trying to studentize:

ASSET:

-4.506
5.169
-2.57
3.068
-0.703
-7.037
0.329
-2.602
-0.969
9.217
-0.495
-1.608
1.808
2.643
-0.19
-0.853
0.688
1.209
0.796
-0.632
-0.139
1.1
-1.653
-0.178

 

here's the code I'm using:

 

proc reg data=data_A1;
model ASSET = SEQ /r;
output out=Data_A2 student=studASSETS;
by TYPE;
run;

 

(Does this code use an alpha of 0.05 or not? I am just going by the videos I watched: https://youtu.be/IiGPEPDyC4I)

 

This gives the following results in the student residuals column:

-1.301
1.787
-0.744
1.094
-0.14
-2.207
0.192
-0.766
-0.236
3.075
-0.087
-0.451
0.656
0.703
-0.23
-0.451
0.053
0.222
0.084
-0.39
-0.23
0.176
-0.736
-0.252

 

I tried specifying the alpha into 0.01 and it produced the same outputs.

 

Here is what I'm trying to determine: if this set of results corresponds to alpha=0.05, and it has one value greater than 3, then what is the smallest value I should consider an outlier if I use alpha=0.01?

 

thanks in advance

Rick_SAS
SAS Super FREQ

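The Studentized residuals are standardized, so they do not depend on alpha. You said, "I tried specifying the alpha into 0.01 and it produced the same outputs." That statement is correct: in PROC REG, the ALPHA= option sets the level of the confidence limits, not the scale of the residuals.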
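
To see what ALPHA= does change, one option is to request confidence limits alongside the residuals. This is only a sketch: data_A1, ASSET, SEQ, and TYPE come from the code posted above, while Data_A3, lowerMean, and upperMean are hypothetical names, and the BY statement assumes data_A1 is sorted by TYPE.

```sas
/* Sketch: ALPHA= affects confidence limits, not studentized residuals.
   Data_A3, lowerMean, and upperMean are hypothetical names.
   Assumes data_A1 is already sorted by TYPE. */
proc reg data=data_A1 alpha=0.01;
   by TYPE;
   model ASSET = SEQ / r clm;              /* CLM: limits for the mean response */
   output out=Data_A3
          student=studASSETS               /* same values for any alpha */
          lclm=lowerMean uclm=upperMean;   /* these change with alpha */
run;
quit;
```

Rerunning with alpha=0.05 should leave studASSETS unchanged while lowerMean and upperMean move closer together.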

 

Your question appears to be related to detecting outliers. A studentized residual (SR) represents the residual in units of the standard deviation of the residuals. If |SR| > 3, then the residual is more than 3 SD away from the OLS line.
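
As a rough sketch of how such a rule could be applied to the output data set, the step below flags every observation whose studentized residual exceeds a chosen cutoff in absolute value. Data_A2 and studASSETS come from the PROC REG step posted above; Flagged, cutoff, and outlier are hypothetical names, and the cutoff of 3 is just the rule of thumb under discussion.

```sas
/* Sketch: flag observations with |studentized residual| above a cutoff.
   Flagged, cutoff, and outlier are hypothetical names. */
data Flagged;
   set Data_A2;
   cutoff  = 3;                             /* change to suit your rule */
   outlier = (abs(studASSETS) > cutoff);    /* 1 = flagged, 0 = not flagged */
run;

proc print data=Flagged;
   where outlier = 1;                       /* list only flagged observations */
run;
```

To use a cutoff tied to a specific alpha instead, replace the constant with quantile("Normal", 1 - alpha/2).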

 

In terms of confidence intervals, the following is true:

If the data satisfy the assumptions of OLS and the sample size is large, then the studentized residuals are approximately normally distributed. Therefore you would expect 95% of the studentized residuals to have absolute value less than 1.96. You would expect 99% of the studentized residuals to have absolute value less than 2.58. You would expect about 99.7% of the studentized residuals to have absolute value less than 3. 
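
The expected coverage for any cutoff can be checked directly from the standard normal CDF. This is a minimal sketch under the same large-sample normal approximation; Coverage, cutoff, and coverage are hypothetical names.

```sas
/* Sketch: fraction of a standard normal distribution within +/- cutoff.
   Coverage, cutoff, and coverage are hypothetical names. */
data Coverage;
   do cutoff = 1.96, 2.58, 3;
      coverage = 2*cdf("Normal", cutoff) - 1;   /* P(|Z| < cutoff) */
      output;
   end;
   format coverage percent8.2;
run;

proc print data=Coverage noobs; run;
```

Going the other way, from a coverage level to a cutoff, uses the QUANTILE function as in the earlier reply.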

 
