Hello,
Good afternoon. I am trying to write a program to automate outlier detection. I need to produce a list of standardized values greater than 3 or less than -3, along with each value's variable name and observation number.
I used proc standard to standardize my variables:
data work.prepstandard;
set work.dataset (drop= id x1 x2 x3 x4 x5);
run;
PROC STANDARD DATA=work.prepstandard MEAN=0 STD=1 OUT=zstandards;
VAR x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16;
run;
Then tried running the following array, but it doesn't work:
DATA work.outliers;
SET zstandards;
ARRAY x
DO i=1 TO DIM(x);
IF x > 3 or x <-3 THEN DO;
obsNum= _N_;
OUTPUT;
END;
END;
run;
Did I do something wrong in the array, or is this the wrong way to go about it? Thanks for helping me out... I sincerely appreciate it!
-Charles
It should work, but you don't explain how it doesn't work so I can't comment beyond that.
However, it won't identify which variable is the outlier, and it won't distinguish between multiple outliers in the same observation, though I suppose if you're automating you may not care too much about that.
Here's a sample that does what you're asking, using SASHELP.CARS. I didn't drop the leading variables, though you easily could.
If you have SAS/STAT licensed, you can also look into PROC STDIZE.
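/* Standardize every variable from msrp through length to mean 0, std 1 */
proc standard data=sashelp.cars mean=0 std=1 out=zstandards;
  var msrp--length;
run;

data outliers;
  set zstandards;
  /* All numeric variables read from zstandards; i, obsnum and value are created later, so they are not included */
  array x(*) _numeric_;
  do i=1 to dim(x);
    /* abs() catches both tails: true when x(i) > 3 or x(i) < -3 */
    if abs(x(i))-3>0 then do;
      obsnum=_n_;
      variable=vname(x(i));   /* name of the variable behind this array element */
      value=x(i);
      output;                 /* one row per outlying value */
    end;
  end;
  keep obsnum variable value;
run;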
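For the PROC STDIZE route mentioned above, a rough sketch might look like the following. It assumes SAS/STAT is licensed, and the dataset names zstd and outliers_stdize are placeholders I made up for illustration. METHOD=STD uses the mean for location and the standard deviation for scale, so the result should be comparable to PROC STANDARD with MEAN=0 STD=1. The array here lists the standardized columns explicitly instead of using _numeric_, which is one way to keep any other numeric variables out of the scan.

```sas
/* Sketch only: assumes SAS/STAT is licensed; zstd and outliers_stdize are placeholder names */
proc stdize data=sashelp.cars method=std out=zstd;
  var msrp--length;
run;

data outliers_stdize;
  set zstd;
  length variable $32;        /* room for the longest valid SAS variable name */
  array x(*) msrp--length;    /* only the standardized columns */
  do i = 1 to dim(x);
    if abs(x(i)) > 3 then do;
      obsnum = _n_;
      variable = vname(x(i));
      value = x(i);
      output;                 /* one row per outlying value */
    end;
  end;
  keep obsnum variable value;
run;
```

For your own data you would swap in your dataset and replace msrp--length with your variable list (x6-x16 in your case).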
It worked, thank you very much. I really like the way you set up the IF-THEN statement with the abs() to capture both the negative and positive values-- nicely done!!
I appreciate your help!
-Charles