CharlesC
Calcite | Level 5

Hello,

Good afternoon. I am trying to write a program to automate outlier detection. I need to produce a list of standardized values > 3 or < -3, along with each value's variable name and observation number.

I used proc standard to standardize my variables:

data work.prepstandard;
    set work.dataset (drop= id x1 x2 x3 x4 x5);
run;

PROC STANDARD DATA=work.prepstandard MEAN=0 STD=1 OUT=zstandards;

VAR x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16;

run;

Then I tried running the following DATA step with an array, but it doesn't work:

DATA work.outliers;
    SET zstandards;
    ARRAY x[*] _NUMERIC_;
    DO i=1 TO DIM(x);
        IF x[i] > 3 or x[i] < -3 THEN DO;
            obsNum= _N_;
            OUTPUT;
        END;
    END;
run;

    Did I do something wrong in the array, or is this the wrong way to go about it?  Thanks for helping me out...  I sincerely appreciate it!

    -Charles

    Accepted Solutions
    Reeza
    Super User

    It should work, but you don't explain how it doesn't work so I can't comment beyond that.

    However, it won't identify which variable holds the outlier, or tell you when a single observation contains several outliers, though I suppose if you're automating you may not care much about that.

    Here's a sample that does what you're asking, using SASHELP.CARS. I didn't drop the leading variables, though you easily could.

    If you have SAS/STAT licensed, you can also look into PROC STDIZE.

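    /* Standardize the numeric variables to mean 0, standard deviation 1 */
    proc standard data=sashelp.cars mean=0 std=1 out=zstandards;
        var msrp--length;
    run;

    data outliers;
        set zstandards;
        /* x holds every numeric variable read from zstandards */
        array x(*) _numeric_;
        do i=1 to dim(x);
            /* flag z-scores beyond 3 in either direction */
            if abs(x(i)) > 3 then do;
                obsnum=_n_;
                variable=vname(x(i));
                value=x(i);
                output;
            end;
        end;
        /* one output row per outlier: where it is, which variable, what value */
        keep obsnum variable value;
    run;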
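
For anyone who wants to try the PROC STDIZE route mentioned above, a minimal sketch might look like the following. It assumes SAS/STAT is licensed and reuses the same SASHELP.CARS variables and the same output data set name, so the outlier DATA step can run on the result unchanged. METHOD=STD centers on the mean and scales by the standard deviation, which should match what MEAN=0 STD=1 does in PROC STANDARD.

```sas
/* Sketch only: assumes SAS/STAT is available.                     */
/* METHOD=STD subtracts the mean and divides by the std deviation. */
proc stdize data=sashelp.cars method=std out=zstandards;
    var msrp--length;
run;
```

PROC STDIZE also offers other location and scale choices through METHOD=, which may be worth a look if the mean and standard deviation are themselves being distorted by the outliers.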


    CharlesC
    Calcite | Level 5

    It worked, thank you very much.  I really like the way you set up the IF-THEN statement with the abs() to capture both the negative and positive values-- nicely done!!

    I appreciate your help!

    -Charles
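
As a possible follow-up, the OUTLIERS data set built in the accepted solution can be summarized to show which variables are flagged most often and which observations carry more than one outlier. This is only a sketch; it assumes OUTLIERS still has the OBSNUM and VARIABLE columns kept by that DATA step.

```sas
/* Count flagged values per variable, most frequent first */
proc freq data=outliers order=freq;
    tables variable / nocum nopercent;
run;

/* Count flagged values per observation; a count above 1 means */
/* that observation has outliers in more than one variable     */
proc freq data=outliers order=freq;
    tables obsnum / nocum nopercent;
run;
```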
