SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How to read a dataset for a value range and output the var name, obs num and value

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

How to read a dataset for a value range and output the var name, obs num and value

Hello,

  Good afternoon-- I am trying to write a program to automate outlier detection.  I need to produce a list of values > 3 or < -3 along with its variable name and obs number.

I used proc standard to standardize my variables:

data=work.prepstandard;

set work.dataset (drop= id x1 x2 x3 x4 x5);

run;

PROC STANDARD DATA=work.prepstandard MEAN=0 STD=1 OUT=zstandards;

VAR x6 x7 x8 x9 x10 x11 x12 x13 x14 x15 x16;

run;

Then tried running the following array, but it doesn't work:

DATA work.outliers;

    SET zstandards;

       ARRAY x

  • _NUMERIC_;
  •          DO i=1 TO DIM(x);

               IF x > 3 or x <-3 THEN DO;

                   obsNum= _N_;

                 OUTPUT;

              END;

           END;

    run;

    Did I do something wrong in the array, or is this the wrong way to go about it?  Thanks for helping me out...  I sincerely appreciate it!

    -Charles


    Accepted Solutions
    Solution
    ‎10-08-2013 06:15 PM
    Super User
    Posts: 17,829

    Re: How to read a dataset for a value range and output the var name, obs num and value

    It should work, but you don't explain how it doesn't work so I can't comment beyond that.

    However, it won't identify the variable that is the outlier and if there are multiple outliers in a specific observation, though I suppose if you're automating then you don't care too much about that.

    Here's a sample that does what you're asking using SASHELP.CARS.  I didn't drop the lead variables though you could easily.

    If you have SAS/STAT licensed you can also look into proc stdize.

    proc standard data=sashelp.cars mean=0 std=1 out=zstandards;

    var msrp--length;

    run;

    data outliers;

        set zstandards;

        array x(*) _numeric_;

        do i=1 to dim(x);

        if abs(x(i))-3>0 then do;

            obsnum=_n_;

            variable=vname(x(i));

            value=x(i);

            output;

        end;

        end;

        keep obsnum variable value;

    run;

    View solution in original post


    All Replies
    Solution
    ‎10-08-2013 06:15 PM
    Super User
    Posts: 17,829

    Re: How to read a dataset for a value range and output the var name, obs num and value

    It should work, but you don't explain how it doesn't work so I can't comment beyond that.

    However, it won't identify the variable that is the outlier and if there are multiple outliers in a specific observation, though I suppose if you're automating then you don't care too much about that.

    Here's a sample that does what you're asking using SASHELP.CARS.  I didn't drop the lead variables though you could easily.

    If you have SAS/STAT licensed you can also look into proc stdize.

    proc standard data=sashelp.cars mean=0 std=1 out=zstandards;

    var msrp--length;

    run;

    data outliers;

        set zstandards;

        array x(*) _numeric_;

        do i=1 to dim(x);

        if abs(x(i))-3>0 then do;

            obsnum=_n_;

            variable=vname(x(i));

            value=x(i);

            output;

        end;

        end;

        keep obsnum variable value;

    run;

    Occasional Contributor
    Posts: 14

    Re: How to read a dataset for a value range and output the var name, obs num and value

    It worked, thank you very much.  I really like the way you set up the IF-THEN statement with the abs() to capture both the negative and positive values-- nicely done!!

    I appreciate your help!

    -Charles

    ☑ This topic is SOLVED.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 2 replies
    • 336 views
    • 0 likes
    • 2 in conversation