Standard Deviation of All Observations of Multiple Variables

Contributor
Posts: 27

Hi folks,

I need to compute Standard Deviation of ALL Observations of MULTIPLE Variables.

PROC MEANS can compute std of EACH variable.

STD function can compute std of EACH observation.

Is there a way to compute std of all observations (data points) of multiple variables?

For example, I have 3 variables, var1, var2, var3 and 3 observations.

var1 var2 var3

1      2       3

4      5       6

7      8       9

PROC MEANS can compute std of EACH variable and return std1 of (1, 4, 7), std2 of (2, 5, 8), std3 of (3, 6, 9).

STD function in data step can compute std of EACH observation and return std_1_ of (1,2,3), std_2_ of (4, 5, 6) and std_3_ of (7, 8, 9).

But I want to get ONE single standard deviation of all 9 data points, std_all of (1,2,3,4,5,6,7,8,9).

Thanks.


Accepted Solutions
Solution
08-04-2011 08:14 AM
PROC Star
Posts: 7,474

Re: Standard Deviation of All Observations of Multiple Variables

There are a number of ways to do it.  I, personally, would use:

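data have;
  input var1-var3;
  recnum=_n_;   /* row id, used as the BY variable below */
  cards;
1 2 3
4 5 6
7 8 9
;

/* one output row per original value; the values land in COL1 */
proc transpose data=have out=need;
  by recnum;
run;

/* std of COL1 pools all nine values */
proc means data=need std;
  var col1;
run;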



All Replies
PROC Star
Posts: 7,474

Standard Deviation of All Observations of Multiple Variables

Posted in reply to richard_hu2003

Richard,

I may or may not understand what you are trying to do.  Does the following do what you want?:

proc means data=sashelp.class std;
  var _numeric_;
run;

Contributor
Posts: 27

Re: Standard Deviation of All Observations of Multiple Variables

art297, thanks. But I think your approach will return a separate std for EACH numeric variable, not ONE std of ALL elements of ALL numeric variables.

I just revised my original post to clarify my question. thanks.

Super User
Posts: 10,035

Re: Standard Deviation of All Observations of Multiple Variables

Posted in reply to richard_hu2003

PROC MEANS cannot pool several variables into one std directly. You need to make a long (stacked) variable that contains all the values of the variables. Such as

data want;
set have;
var=var1;output;
var=var2;output;
/* ..... repeat for each remaining variable */
drop var1-var3;
run;

If you have a lot of variables, then use an array.

Ksharp
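
The array version mentioned above might look like the following. This is a minimal sketch, assuming the variables are named var1-var3 as in the original post; the names long, v, i and value are illustrative and not from the thread.

```sas
/* Stack var1-var3 into one column, one output row per value */
data long;
  set have;
  array v{*} var1-var3;     /* assumed variable names, from the example data */
  do i = 1 to dim(v);
    value = v{i};
    output;
  end;
  keep value;
run;

/* A single std over the stacked column pools every data point */
proc means data=long std;
  var value;
run;
```

For the nine sample values 1 through 9 this should report a std of about 2.7386 (the square root of 7.5).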

Valued Guide
Posts: 765

Re: Standard Deviation of All Observations of Multiple Variables

Posted in reply to richard_hu2003

Hi .. another idea ...

* 4,000 values;
data x;
do j=1 to 1000;
   a = 100*ranuni(999);
   b = 100*ranuni(999);
   c = 100*ranuni(999);
   d = 100*ranuni(999);
   output;
end;
drop j;
run;

* macro variable can hold up to 64K characters;
proc sql noprint;
select catx(',', a, b, c, d) into :nnn  separated by ',' from x;
quit;

data _null_;
std = std(&nnn);
put "STANDARD DEVIATION:  " std;
run;

Valued Guide
Posts: 765

Re: Standard Deviation of All Observations of Multiple Variables

Hi ... got a tip from a friend, down to one PROC ...

data x;
do j=1 to 1000;
   a = 100*ranuni(999);
   b = 100*ranuni(999);
   c = 100*ranuni(999);
   d = 100*ranuni(999);
   output;
end;
drop j;
run;

proc sql noprint;
select catx(',', a, b, c, d) into :nnn  separated by ',' from x;
reset print ;
select std(&nnn) "STANDARD DEVIATION" from x(obs=1) ;
quit;

PROC Star
Posts: 7,474

Re: Standard Deviation of All Observations of Multiple Variables

Mike,

Glad you posted this as I hadn't realized that one could create and use a macro variable within one proc sql run.  However, I believe that the OP wanted the sd per record, not for the entire file.

Art

Regular Contributor
Posts: 184

Standard Deviation of All Observations of Multiple Variables

Posted in reply to richard_hu2003

If you don't want to reshape the data, a roll-your-own approach with a Double DoW can do it. Something like

data have;
  input var1-var3;
  cards;
1 2 3
4 5 6
7 8 9
;

data _null_ ;
  /* pass 1: count and sum of all nonmissing values */
  do until (lastobs) ;
    set have end=lastobs ;
    n_v   + n(of var:) ;
    sum_v + sum(of var:) ;
  end ;
  lastobs = 0 ;
  /* pass 2: squared deviations from the overall mean */
  do until (lastobs) ;
    set have end=lastobs ;
    array vv[*] var: ;
    do j = 1 to dim(vv) ;
      sumsq + ( vv[j] - (sum_v / n_v) )**2 ;
    end ;
  end ;
  std = sqrt( sumsq/(n_v-1) ) ;
  put std= ;
run ;

Respected Advisor
Posts: 3,799

Standard Deviation of All Observations of Multiple Variables

Howles wrote:

If you don't want to reshape the data, a roll-your-own approach with a Double DoW can do it.

I think one pass will suffice.

data _null_;
   do until(lastobs);
      set have end=lastobs;
      /* running uncorrected sum of squares, sum, and count */
      uss = sum(uss, uss(of var:));
      sum = sum(sum, of var:);
      n   = sum(n, n(of var:));
   end;
   std = sqrt((uss-(sum**2/n))/(n-1));
   put std=;
   stop;
run;
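
The one-pass formula above can be checked by hand on the nine sample values: n = 9, sum = 45 and uss = 285, so std = sqrt((285 - 45**2/9)/8) = sqrt(7.5), roughly 2.7386. A minimal sketch of that check follows; the variable names by_formula and by_function are illustrative and not from the thread.

```sas
data _null_;
   /* one-pass formula with the totals worked out by hand */
   by_formula  = sqrt((285 - 45**2/9) / (9 - 1));
   /* the STD function applied directly to all nine values */
   by_function = std(1, 2, 3, 4, 5, 6, 7, 8, 9);
   put by_formula= by_function=;
run;
```

Both values should agree. Note that the one-pass form subtracts two large, nearly equal numbers when the mean is large relative to the spread, so the two-pass version may be the safer choice for such data.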
