richard_hu2003
Calcite | Level 5

Hi folks,

I need to compute Standard Deviation of ALL Observations of MULTIPLE Variables.

PROC MEANS can compute std of EACH variable.

STD function can compute std of EACH observation.

Is there a way to compute std of all observations (data points) of multiple variables?

For example, I have 3 variables, var1, var2, var3 and 3 observations.

var1  var2  var3
   1     2     3
   4     5     6
   7     8     9

PROC MEANS can compute std of EACH variable and return std1 of (1, 4, 7), std2 of (2, 5, 8), std3 of (3, 6, 9).

STD function in data step can compute std of EACH observation and return std_1_ of (1, 2, 3), std_2_ of (4, 5, 6) and std_3_ of (7, 8, 9).

But I want to get ONE single standard deviation of all 9 data points, std_all of (1,2,3,4,5,6,7,8,9).

Thanks.

ACCEPTED SOLUTION
art297
Opal | Level 21

There are a number of ways to do it.  I, personally, would use:

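data have;
  input var1-var3;
  recnum=_n_;   /* row id, used as the BY variable below */
  cards;
1 2 3
4 5 6
7 8 9
;

/* one output row per (recnum, variable); the values land in col1 */
proc transpose data=have out=need;
  by recnum;
run;

proc means data=need std;
  var col1;
run;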


9 REPLIES 9
art297
Opal | Level 21

Richard,

I may or may not understand what you are trying to do.  Does the following do what you want?:

proc means data=sashelp.class std;
  var _numeric_;
run;

richard_hu2003
Calcite | Level 5

art297, thanks. But I think your approach will return a separate std for EACH numeric variable, not ONE std of ALL elements of ALL numeric variables.

I just revised my original post to clarify my question. thanks.

Ksharp
Super User

PROC MEANS cannot pool the values of several variables into a single std, so you need to create one long variable that holds the values of all the variables. For example:

data want;
  set have;
  var=var1; output;
  var=var2; output;
  .....
  drop var1-var4;
run;

If you have a lot of variables, then use an array.

Ksharp
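
The array suggestion above could look like the following sketch. It assumes the three-variable data set have from the original question; the names v, i and value are illustrative and do not come from the thread.

```sas
/* Stack var1-var3 into one long variable, one output row per data point */
data want;
  set have;
  array v{*} var1-var3;   /* assumption: the variables are var1-var3 */
  do i = 1 to dim(v);
    value = v{i};
    output;
  end;
  keep value;
run;

/* A single std over the stacked values */
proc means data=want std;
  var value;
run;
```

With more variables, only the list in the ARRAY statement should need to change.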

MikeZdeb
Rhodochrosite | Level 12

Hi .. another idea ...

* 4,000 values;
data x;
  do j=1 to 1000;
    a = 100*ranuni(999);
    b = 100*ranuni(999);
    c = 100*ranuni(999);
    d = 100*ranuni(999);
    output;
  end;
  drop j;
run;

* macro variable can hold up to 64K characters;
proc sql noprint;
  select catx(',', a, b, c, d) into :nnn separated by ',' from x;
quit;

data _null_;
  std = std(&nnn);
  put "STANDARD DEVIATION:  " std;
run;

MikeZdeb
Rhodochrosite | Level 12

Hi ... got a tip from a friend, down to one PROC ...

data x;
  do j=1 to 1000;
    a = 100*ranuni(999);
    b = 100*ranuni(999);
    c = 100*ranuni(999);
    d = 100*ranuni(999);
    output;
  end;
  drop j;
run;

proc sql noprint;
  select catx(',', a, b, c, d) into :nnn separated by ',' from x;
  reset print ;
  select std(&nnn) "STANDARD DEVIATION" from x(obs=1) ;
quit;

art297
Opal | Level 21

Mike,

Glad you posted this as I hadn't realized that one could create and use a macro variable within one proc sql run.  However, I believe that the OP wanted the sd per record, not for the entire file.

Art

Howles
Quartz | Level 8

If you don't want to reshape the data, a roll-your-own approach with a Double DoW can do it. Something like

data have;
  input var1-var3;
  cards;
1 2 3
4 5 6
7 8 9
;

data _null_ ;
  /* first pass: count and total of all nonmissing values */
  do until (lastobs) ;
    set have end=lastobs ;
    n_v   +   n(of var:) ;
    sum_v + sum(of var:) ;
  end ;
  lastobs = 0 ;
  /* second pass: sum of squared deviations from the overall mean */
  do until (lastobs) ;
    set have end=lastobs ;
    array vv[*] var: ;
    do j = 1 to dim(vv) ;
      sumsq + ( vv[j] - (sum_v / n_v) )**2 ;
    end ;
  end ;
  std = sqrt( sumsq/(n_v-1) ) ;
  put std= ;
run ;

data_null__
Jade | Level 19

Howles wrote:

If you don't want to reshape the data, a roll-your-own approach with a Double DoW can do it.

I think one pass will suffice.

data _null_;
   do until(lastobs);
      set have end=lastobs;
      uss = sum(uss,uss(of var:));   /* running uncorrected sum of squares */
      sum = sum(sum,of var:);        /* running total */
      n   = sum(n,n(of var:));       /* running count of nonmissing values */
   end;
   std = sqrt((uss-(sum**2/n))/(n-1));
   put std=;
   stop;
run;
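
As a check on the one-pass formula, here is the arithmetic for the nine values 1 through 9 from the original question. The symbols n, S and U are shorthand introduced here for the accumulated n, sum and uss in the step above.

```latex
\begin{aligned}
n &= 9, \qquad S = \sum_i x_i = 45, \qquad U = \sum_i x_i^2 = 285 \\
s &= \sqrt{\frac{U - S^2/n}{n-1}}
   = \sqrt{\frac{285 - 2025/9}{8}}
   = \sqrt{\frac{60}{8}}
   = \sqrt{7.5} \approx 2.7386
\end{aligned}
```

The transpose plus PROC MEANS approach should report the same value, since PROC MEANS also divides by n-1 by default.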
