BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
CharlesC
Calcite | Level 5

Hello,

  Good afternoon-- I am trying to write a program to automate outlier detection.  I need to produce a list of values > 3 or < -3 along with its variable name and obs number.

I used proc standard to standardize my variables:

data=work.prepstandard;

set work.dataset (drop= id x1 x2 x3 x4 x5);

run;

PROC STANDARD DATA=work.prepstandard MEAN=0 STD=1 OUT=zstandards;

VAR x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16;

run;

Then tried running the following array, but it doesn't work:

DATA work.outliers;

    SET zstandards;

       ARRAY x

  • _NUMERIC_;
  •          DO i=1 TO DIM(x);

               IF x > 3 or x <-3 THEN DO;

                   obsNum= _N_;

                 OUTPUT;

              END;

           END;

    run;

    Did I do something wrong in the array, or is this the wrong way to go about it?  Thanks for helping me out...  I sincerely appreciate it!

    -Charles

    1 ACCEPTED SOLUTION

    Accepted Solutions
    Reeza
    Super User

    It should work, but you don't explain how it doesn't work so I can't comment beyond that.

    However, it won't identify the variable that is the outlier and if there are multiple outliers in a specific observation, though I suppose if you're automating then you don't care too much about that.

    Here's a sample that does what you're asking using SASHELP.CARS.  I didn't drop the lead variables though you could easily.

    If you have SAS/STAT licensed you can also look into proc stdize.

    proc standard data=sashelp.cars mean=0 std=1 out=zstandards;

    var msrp--length;

    run;

    data outliers;

        set zstandards;

        array x(*) _numeric_;

        do i=1 to dim(x);

        if abs(x(i))-3>0 then do;

            obsnum=_n_;

            variable=vname(x(i));

            value=x(i);

            output;

        end;

        end;

        keep obsnum variable value;

    run;

    View solution in original post

    2 REPLIES 2
    Reeza
    Super User

    It should work, but you don't explain how it doesn't work so I can't comment beyond that.

    However, it won't identify the variable that is the outlier and if there are multiple outliers in a specific observation, though I suppose if you're automating then you don't care too much about that.

    Here's a sample that does what you're asking using SASHELP.CARS.  I didn't drop the lead variables though you could easily.

    If you have SAS/STAT licensed you can also look into proc stdize.

    proc standard data=sashelp.cars mean=0 std=1 out=zstandards;

    var msrp--length;

    run;

    data outliers;

        set zstandards;

        array x(*) _numeric_;

        do i=1 to dim(x);

        if abs(x(i))-3>0 then do;

            obsnum=_n_;

            variable=vname(x(i));

            value=x(i);

            output;

        end;

        end;

        keep obsnum variable value;

    run;

    CharlesC
    Calcite | Level 5

    It worked, thank you very much.  I really like the way you set up the IF-THEN statement with the abs() to capture both the negative and positive values-- nicely done!!

    I appreciate your help!

    -Charles

    SAS Innovate 2025: Save the Date

     SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

    Save the date!

    How to connect to databases in SAS Viya

    Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

    Find more tutorials on the SAS Users YouTube channel.

    Discussion stats
    • 2 replies
    • 1188 views
    • 0 likes
    • 2 in conversation