Help using Base SAS procedures

Retain values for evaluation throughout the dataset

Occasional Contributor
Posts: 9

Hi, I have the following question about the attached dataset. Basically, I want to normalize the variables Xtest_1-Xtest_8 using their respective means (mean1-mean8) and standard deviations (std1-std8).

I know I can use IML for vector manipulation, but how should I proceed if I choose to do this in a DATA step instead?

Question 1:

I managed to "gather" mean1-mean8 and std1-std8 together with Xtest_1-Xtest_8 in the same dataset (see attached file).

I tried the following code to populate the Normal array, but it produces a result for only 1 obs, because mean1-mean8 and std1-std8 are missing for the rest of the obs.

Maybe: how can I declare them as constants so that I can refer to them as SAS sweeps through the dataset?

Question 2:

Or, taking one step back to before I "gathered" the statistics: how can I use the means (mean1-mean8) and standard deviations (std1-std8) produced by PROC MEANS? I still cannot figure out how to "apply" the output statistics from PROC MEANS to another dataset.

Thanks

----------------------------------------------------------------------------------------------------------------

DATA New_ex;
    SET ex;
    RETAIN mean1-mean8;
    RETAIN std1-std8;
    ARRAY MeanStat[8] mean1-mean8;
    ARRAY StdStat[8] std1-std8;
    ARRAY X[*] Xtest_1-Xtest_8;
    ARRAY Normal[8] Normal1-Normal8;
    DO i = 1 TO 8;
        Normal[i] = (X[i] - MeanStat[i]) / StdStat[i];
    END;
    OUTPUT;
RUN;


    All Replies
    Super User
    Posts: 17,819

    Re: Retain values for evaluation throughout the dataset

    There's a proc for that:

    PROC STANDARD

    OR

    PROC STDIZE

    Occasional Contributor
    Posts: 9

    Re: Retain values for evaluation throughout the dataset

    I am aware of PROC STANDARD, which takes only one mean and one standard deviation at a time and applies them to the many-observation dataset.

    Here is my situation: the variables Xtest_1-Xtest_8 all reside in one dataset, and each should be normalized by its own mean and std. PROC STANDARD won't work unless I split the dataset into 8 individual datasets (i.e., dataset1 has only Xtest_1, dataset2 has only Xtest_2, etc.) and then use PROC STANDARD on each of them.

    Super User
    Posts: 17,819

    Re: Retain values for evaluation throughout the dataset

    PROC STDIZE will, though, and it calculates the mean/std as well.

    proc stdize data=sashelp.class out=check;

    var weight height age;

    run;

    Super User
    Posts: 5,081

    Re: Retain values for evaluation throughout the dataset

    As Reeza notes, you may find it easier to use an existing procedure for this particular problem.  Just for the record, though, there is an easy DATA step technique to combine a one-observation data set (your means and standard deviations) with a many-observation data set (your original line-by-line values):

    data want;

       if _n_=1 then set one_observation;

       set many_observations;

       *** array processing, no retain needed;

    run;

    Variables that come in from a SAS data set are automatically retained.  The trick is to keep the DATA step going instead of having it end prematurely.  That's why there's an IF/THEN statement.  Good luck.
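To make the "array processing" step concrete, here is a minimal sketch of the same DATA step with the arrays filled in. It assumes that one_observation holds only mean1-mean8 and std1-std8, and that many_observations does not also contain variables with those names (otherwise the second SET would overwrite the retained statistics). The output names Normal1-Normal8 follow the original question; the zero/missing guard is an addition for illustration.

```sas
data want;
   /* Read the single row of statistics once; variables read by SET are retained */
   if _n_ = 1 then set one_observation;
   set many_observations;

   array MeanStat[8] mean1-mean8;
   array StdStat[8]  std1-std8;
   array X[8]        Xtest_1-Xtest_8;
   array Z[8]        Normal1-Normal8;   /* new variables holding the standardized values */

   do i = 1 to 8;
      /* Leave the result missing when the standard deviation is zero or missing */
      if StdStat[i] > 0 then Z[i] = (X[i] - MeanStat[i]) / StdStat[i];
      else Z[i] = .;
   end;

   drop i;
run;
```

The step stops when the second SET runs out of observations, so the output has one row per row of many_observations.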

    Trusted Advisor
    Posts: 1,204

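    Re: Retain values for evaluation throughout the dataset

    Hi,

    This will populate the missing values in the ex dataset so that every observation has the means and standard deviations. With REPONLY, PROC STDIZE only replaces missing values with the location measure (here the median) and does not standardize; because each statistic has a single non-missing value, its median is that value.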

    proc stdize data=imp.ex out=want reponly method=median;
    var m: s: ;
    run;

    proc print data=want;
    run;

    Occasional Contributor
    Posts: 9

    Re: Retain values for evaluation throughout the dataset

    Thanks, I was looking to fill the missing values before, using an ad-hoc approach. This makes it easier.

    Solution
    ‎08-12-2014 03:01 PM
    Occasional Contributor
    Posts: 9

    Re: Retain values for evaluation throughout the dataset

    I just found a much easier way to standardize Xtest_1-Xtest_8, using PROC STDIZE.

    Given:

    1) The means (mean1-mean8) and standard deviations (std1-std8), produced as the output dataset of PROC MEANS, reside in a one-observation dataset.

    2) The variables Xtest_1-Xtest_8 reside in a many-observation dataset.

    I can standardize Xtest_1-Xtest_8 according to their respective means and standard deviations using:

    ----------------------------------------------------------------------

    /* many_observations and one_observation stand for the two datasets described above */
    PROC STDIZE DATA=many_observations OUT=Want PSTAT METHOD=IN(one_observation);
          VAR  Xtest_1-Xtest_8;
          LOCATION  mean1-mean8;
          SCALE std1-std8;
    RUN;
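For completeness, the one-observation dataset assumed in point 1 could be produced along these lines. This is only a sketch: the dataset name reference is hypothetical and stands for whatever data the statistics are computed from, and it is assumed to carry the same variable names Xtest_1-Xtest_8.

```sas
/* Hypothetical source data: "reference" is an assumed name, not from the thread */
proc means data=reference noprint;
   var Xtest_1-Xtest_8;
   /* One output row; drop the automatic _TYPE_ and _FREQ_ columns so that
      only the sixteen statistics remain */
   output out=one_observation(drop=_type_ _freq_)
          mean=mean1-mean8
          std=std1-std8;
run;
```

The MEAN= and STD= lists are matched to the VAR list by position, so mean3 and std3 belong to Xtest_3, which is the pairing the LOCATION and SCALE statements rely on.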

    Super User
    Posts: 17,819

    Re: Retain values for evaluation throughout the dataset

    The default method is STD, which automatically sets the location to the mean and the scale to the standard deviation.

    Don't work too hard.

    This should give you the same results.

    PROC STDIZE DATA=many_observations OUT=Want PSTAT;

          VAR  Xtest_1-Xtest_8;

    RUN;

    Occasional Contributor
    Posts: 9

    Re: Retain values for evaluation throughout the dataset

    I see your point. But in my case, the means (mean1-mean8) and standard deviations (std1-std8) are not generated from Xtest_1-Xtest_8. They are computed from another dataset.

    Thanks
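When the statistics come from a different dataset, as in this case, another option is to let PROC STDIZE save them with OUTSTAT= and read them back with METHOD=IN. The sketch below assumes a hypothetical dataset named reference that holds the source data under the same variable names Xtest_1-Xtest_8; the names ref_stats and reference_z are likewise made up for illustration.

```sas
/* Step 1: compute mean (location) and std (scale) on the reference data
   and save them in an OUTSTAT= dataset */
proc stdize data=reference method=std outstat=ref_stats out=reference_z;
   var Xtest_1-Xtest_8;
run;

/* Step 2: standardize the other dataset using the saved statistics */
proc stdize data=many_observations method=in(ref_stats) out=want;
   var Xtest_1-Xtest_8;
run;
```

Because the statistics are matched by variable name here, no LOCATION or SCALE statement is needed; if the two datasets use different names, the LOCATION/SCALE form from the accepted solution is the better fit.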

    ☑ This topic is SOLVED.
