Hi folks,
I need to compute Standard Deviation of ALL Observations of MULTIPLE Variables.
PROC MEANS can compute std of EACH variable.
STD function can compute std of EACH observation.
Is there a way to compute std of all observations (data points) of multiple variables?
For example, I have 3 variables, var1, var2, var3 and 3 observations.
var1 var2 var3
1 2 3
4 5 6
7 8 9
PROC MEANS can compute the std of EACH variable and return std1 of (1, 4, 7), std2 of (2, 5, 8), and std3 of (3, 6, 9).
The STD function in a data step can compute the std of EACH observation and return std_1_ of (1, 2, 3), std_2_ of (4, 5, 6), and std_3_ of (7, 8, 9).
But I want to get ONE single standard deviation of all 9 data points, std_all of (1, 2, 3, 4, 5, 6, 7, 8, 9).
Thanks.
Accepted Solutions
There are a number of ways to do it. I, personally, would use:
data have;
input var1-var3;
recnum=_n_;
cards;
1 2 3
4 5 6
7 8 9
;
proc transpose data=have out=need;
by recnum;
run;
proc means data=need std;
var col1;
run;
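As a sanity check on whichever method you pick, the nine sample values can be worked by hand. This assumes the default sample standard deviation (divisor n - 1), which is what PROC MEANS and the STD function report unless told otherwise.

```latex
\bar{x} = \frac{1+2+\cdots+9}{9} = 5, \qquad
\sum_{i=1}^{9}(x_i-\bar{x})^2 = 16+9+4+1+0+1+4+9+16 = 60, \qquad
s = \sqrt{\frac{60}{9-1}} = \sqrt{7.5} \approx 2.7386
```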
Richard,
I may or may not understand what you are trying to do. Does the following do what you want?:
proc means data=sashelp.class std;
var _numeric_;
run;
art297, thanks. But I think your approach will return a separate std for EACH numeric variable, not ONE std of ALL elements of ALL numeric variables.
I just revised my original post to clarify my question. thanks.
PROC MEANS cannot pool several variables into one std directly. You need to stack all the values into a single long-format variable first, such as:
data want;
set have;
var=var1;output;
var=var2;output;
/* ..... one assignment and OUTPUT per remaining variable */
drop var1-var3;
run;
Then run PROC MEANS on the stacked variable. If you have a lot of variables, use an array.
Ksharp
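The array approach mentioned above might look like the following sketch. It assumes the input data set is named have and holds var1-var3 as in the original question; the names long, v, i, and value are hypothetical and only used for illustration.

```sas
/* Stack var1-var3 into one long variable (hypothetical name: value) */
data long;
  set have;
  array v[*] var1-var3;   /* assumes the variables are var1-var3 */
  do i = 1 to dim(v);
    value = v[i];
    output;               /* one output row per original data point */
  end;
  keep value;
run;

/* One standard deviation across all stacked values */
proc means data=long std;
  var value;
run;
```

Widening the variable list in the ARRAY statement is the only change needed for more variables.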
Hi .. another idea ...
* 4,000 values;
data x;
do j=1 to 1000;
a = 100*ranuni(999);
b = 100*ranuni(999);
c = 100*ranuni(999);
d = 100*ranuni(999);
output;
end;
drop j;
run;
* macro variable can hold up to 64K characters;
proc sql noprint;
select catx(',', a, b, c, d) into :nnn separated by ',' from x;
quit;
data _null_;
std = std(&nnn);
put "STANDARD DEVIATION: " std;
run;
Hi ... got a tip from a friend, down to one PROC ...
data x;
do j=1 to 1000;
a = 100*ranuni(999);
b = 100*ranuni(999);
c = 100*ranuni(999);
d = 100*ranuni(999);
output;
end;
drop j;
run;
proc sql noprint;
select catx(',', a, b, c, d) into :nnn separated by ',' from x;
reset print ;
select std(&nnn) "STANDARD DEVIATION" from x(obs=1) ;
quit;
Mike,
Glad you posted this as I hadn't realized that one could create and use a macro variable within one proc sql run. However, I believe that the OP wanted the sd per record, not for the entire file.
Art
If you don't want to reshape the data, a roll-your-own approach with a Double DoW can do it. Something like
data have;
input var1-var3;
cards;
1 2 3
4 5 6
7 8 9
;
data _null_ ;
do until (lastobs) ;
set have end=lastobs ;
n_v + n(of var:) ;
sum_v + sum(of var:) ;
end ;
lastobs = 0 ;
do until (lastobs) ;
set have end=lastobs ;
array vv[*] var: ;
do j = 1 to dim(vv) ;
sumsq + ( vv[j] - sum_v/n_v )**2 ;
end ;
end ;
std = sqrt( sumsq/(n_v-1) ) ;
put std= ;
run ;
Howles wrote:
If you don't want to reshape the data, a roll-your-own approach with a Double DoW can do it.
I think one pass will suffice.
data _null_;
do until(lastobs);
set have end=lastobs;
uss = sum(uss,uss(of var:));
sum = sum(sum,of var:);
n = sum(n,n(of var:));
end;
std = sqrt((uss-(sum**2/n))/(n-1));
put std=;
stop;
run;
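The one-pass version relies on the usual computational identity for the sample variance. In the sketch below, S_1 is the running sum of the values, S_2 is the uncorrected sum of squares (what the USS function accumulates), and n is the count of non-missing values; these symbols are labels introduced here for illustration, not names from the code.

```latex
s = \sqrt{\frac{S_2 - S_1^2/n}{n-1}}, \qquad
S_1 = 45,\; S_2 = 285,\; n = 9 \;\Rightarrow\;
s = \sqrt{\frac{285 - 2025/9}{8}} = \sqrt{7.5} \approx 2.7386
```

One caveat: this identity subtracts two large numbers, so it can lose precision when the mean is large relative to the spread. The two-pass approach avoids that at the cost of reading the data twice.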