Hello,
Good afternoon-- I am trying to write a program to automate outlier detection. I need to produce a list of standardized values greater than 3 or less than -3, along with each value's variable name and observation number.
I used proc standard to standardize my variables:
data work.prepstandard;
set work.dataset (drop= id x1 x2 x3 x4 x5);
run;
PROC STANDARD DATA=work.prepstandard MEAN=0 STD=1 OUT=zstandards;
VAR x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16;
run;
Then I tried running the following array step, but it doesn't work:
DATA work.outliers;
SET zstandards;
ARRAY x[*] x6-x16;
DO i=1 TO DIM(x);
IF x[i] > 3 or x[i] < -3 THEN DO;
obsNum= _N_;
OUTPUT;
END;
END;
run;
Did I do something wrong in the array, or is this the wrong way to go about it? Thanks for helping me out... I sincerely appreciate it!
-Charles
It should work, but you don't explain how it doesn't work so I can't comment beyond that.
However, it won't identify which variable holds the outlier, which matters when a single observation contains more than one, though I suppose if you're automating you may not care too much about that.
Here's a sample that does what you're asking, using SASHELP.CARS. I didn't drop the leading variables, though you easily could.
If you have SAS/STAT licensed, you can also look into PROC STDIZE.
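/* Standardize every variable from MSRP through Length to mean 0, std 1 */
proc standard data=sashelp.cars mean=0 std=1 out=zstandards;
  var msrp--length;
run;

data outliers;
  set zstandards;
  /* _numeric_ covers only the numeric variables defined so far, i.e. those read by SET */
  array x(*) _numeric_;
  do i=1 to dim(x);
    /* abs() catches both tails: z > 3 or z < -3 */
    if abs(x(i))-3>0 then do;
      obsnum=_n_;              /* observation number in zstandards */
      variable=vname(x(i));    /* name of the variable holding the outlier */
      value=x(i);              /* the standardized value */
      output;                  /* one row per outlier */
    end;
  end;
  keep obsnum variable value;
run;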
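If you want to try the PROC STDIZE route instead, a minimal sketch might look like the following. It assumes SAS/STAT is licensed and reuses the same SASHELP.CARS variables and the same output name (zstandards) as the sample above, so the outliers data step should run unchanged against it. I haven't tailored it to your data.

```sas
/* Assumes SAS/STAT is licensed. */
/* METHOD=STD subtracts the mean and divides by the standard deviation, */
/* which gives z-scores comparable to PROC STANDARD with MEAN=0 STD=1. */
proc stdize data=sashelp.cars method=std out=zstandards;
  var msrp--length;
run;
```

PROC STDIZE also offers other location and scale measures through METHOD=, which may be worth a look if the mean and standard deviation are themselves being pulled around by the outliers you are trying to find.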
It worked, thank you very much. I really like the way you set up the IF-THEN statement with the abs() to capture both the negative and positive values-- nicely done!!
I appreciate your help!
-Charles