Hi,
Using PROC UNIVARIATE, I found that the following observations are outliers:
proc univariate;
var resids;
qqplot resids;
run;
SAS Output
Extreme Observations
Lowest Value | Obs | Highest Value | Obs |
-8.40972 | 188 | 5.12990 | 691 |
-7.62763 | 211 | 5.12990 | 695 |
-7.46829 | 570 | 5.12990 | 810 |
-7.38851 | 367 | 6.26658 | 612 |
-6.79448 | 588 | 7.25340 | 610 |
I want to remove all 10 of these observations from the data set. Is there any handy code for outlier removal? Thank you.
There isn't a quick way, but you could save the output from PROC UNIVARIATE to a data set, then use it to remove those values:
proc sql;
  delete from have
  where value_obs in (select value_obs from univariate_output);
quit;
That deletes every row of have whose value_obs matches a value saved in univariate_output (have, value_obs and univariate_output are placeholder names).
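A fuller sketch of that idea, matching on observation number rather than on value, might look like the following. It assumes the data set is called work.have (a hypothetical name) and that the ODS table ExtremeObs exposes the observation numbers in columns named LowObs and HighObs; run PROC CONTENTS on the captured table to confirm the names in your SAS release.

```sas
/* Assumption: the residuals live in WORK.HAVE (hypothetical name). */
/* Capture the "Extreme Observations" table as a data set. */
ods output ExtremeObs=work.extremes;
proc univariate data=work.have;
  var resids;
run;

/* Stack the low and high observation numbers into one column. */
/* LowObs and HighObs are the column names assumed here.       */
data work.drop_list;
  set work.extremes;
  obs_num = LowObs;  output;
  obs_num = HighObs; output;
  keep obs_num;
run;

/* Attach the row number to the original data. */
data work.numbered;
  set work.have;
  obs_num = _n_;
run;

/* Keep only the rows that are not on the drop list. */
proc sql;
  create table work.want as
  select *
  from work.numbered
  where obs_num not in (select obs_num from work.drop_list);
quit;
```

Writing the result to a new data set (work.want) rather than deleting in place leaves the original data untouched, which makes the step easy to undo.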
Are you sure that you want to remove observations? Removing an observation also removes the values of every other variable on that record. Are the other variables on those records still useful for other purposes? You might be better served by adding a flag variable that means "do not use variable x when the flag is 1" (or 0, your choice) and filtering on it with a WHERE= data set option. Alternatively, create a new data set in which these values are set to missing.
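Both alternatives can be sketched roughly as follows. The data set name work.have and the flag name exclude_resid are hypothetical; the observation numbers are the ten listed in the question.

```sas
/* Option 1: keep every row, but flag the ones to exclude. */
/* WORK.HAVE and exclude_resid are hypothetical names.     */
data work.flagged;
  set work.have;
  /* 1 when the row number is one of the ten listed, otherwise 0 */
  exclude_resid = (_n_ in (188, 211, 570, 367, 588,
                           691, 695, 810, 612, 610));
run;

/* Use the flag through a WHERE= data set option. */
proc univariate data=work.flagged(where=(exclude_resid = 0));
  var resids;
run;

/* Option 2: new data set with only resids set to missing. */
data work.masked;
  set work.have;
  if _n_ in (188, 211, 570, 367, 588,
             691, 695, 810, 612, 610) then resids = .;
run;
```

Either way the other variables on those records stay available for other analyses.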
Also, PROC UNIVARIATE by default always shows the five smallest and five largest values. They are not automatically "outliers". You may very well have values such as -6.79200 remaining in your data. Is that an outlier?
Please run this example data and tell me if you actually think the five smallest and largest values are "outliers".
data work.dummy;
  do x=1 to 10;
    y=1;
    output;
  end;
run;
proc univariate data=work.dummy;
  var y;
run;
Yeah, that's not a good rule for identifying outliers.
Use a different logic.
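As one illustration of a different logic, here is a minimal sketch that flags values lying more than 1.5 interquartile ranges outside the quartiles. Nobody in this thread proposed this rule; it is only a common convention, and the right criterion depends on the model behind the residuals. The data set name work.have and the variable name outlier are hypothetical.

```sas
/* Assumption: the residuals are in WORK.HAVE (hypothetical name). */
/* Compute the first and third quartiles of resids. */
proc means data=work.have noprint;
  var resids;
  output out=work.stats q1=q1 q3=q3;
run;

/* Flag values outside the 1.5 * IQR fences. */
data work.flagged;
  /* Read the one-row quartile table once; its values are retained. */
  if _n_ = 1 then set work.stats(keep=q1 q3);
  set work.have;
  iqr = q3 - q1;
  /* Missing sorts below every number in SAS, so exclude it explicitly. */
  outlier = (not missing(resids)) and
            (resids < q1 - 1.5*iqr or resids > q3 + 1.5*iqr);
run;
```

Unlike the extreme-observations table, this rule can flag no rows at all, which is the expected result for data like the dummy example where every value is the same.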