Hello,
As I understand it, a new topic should be opened for this BY-group question.
If we wanted to add a grouping variable (gr) and calculate the cumulative
std within each value of gr in the data below,
what would be the best solution? Thanks a lot.
data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
run ;
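data wanted ;
input gr dt y std ;
/* std is missing (.) on the first row of each group; without the explicit */
/* period, list input would jump to the next line to read std */
cards ;
1 1 80 .
1 2 20 42.426406871
1 3 40 30.550504633
1 4 60 25.819888975
2 1 80 .
2 2 20 42.426406871
2 3 40 30.550504633
2 4 50 25
;
run ;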
data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
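proc freq data=available;
  tables gr/noprint out=_a_; /* _a_ holds COUNT = number of rows per gr */
run;
data intermediate;
  merge available _a_;
  by gr;
  if first.gr then seq=0;
  seq+1; /* position of this row within its group */
  do group=seq to count; /* row seq belongs to the cumulative windows seq..count */
    thisy=y;
    output;
  end;
run;
proc summary data=intermediate nway;
  class gr group;
  var thisy;
  output out=want stddev=; /* thisy now holds the std of the first "group" rows of gr */
run;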
Did you try my code?
data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
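data want;
  set available;
  by gr;
  array x{999999} _temporary_; /* must be at least as large as the biggest group */
  if first.gr then do; n=0; call missing(of x{*}); end; /* reset at each new group */
  n+1;
  x{n}=y; /* store the current value in the next free slot */
  std=std(of x{*}); /* std of the values stored so far in this group */
  drop n;
run;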
Thanks a lot it works
super
Hello @J111,
My solution from the previous thread can be generalized to BY groups as well:
data want(drop=_:);
_s=0; _v=0;
do _n=1 by 1 until(last.gr);
set available;
by gr;
_s+y; /* cumulative sum */
_m=_s/_n; /* cumulative mean */
_d=dif(_m); /* mean change */
_q=(y-_m)**2; /* new term in sum of squares */
if _n>1 then do;
std=sqrt(_v+_d**2+_q/(_n-1)); /* cumulative standard deviation */
_v=((_n-1)*(_v+_d**2)+_q)/_n; /* cumulative population variance */
end;
output;
end;
run;
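For readers following the comments in the DATA step above, the update it appears to rely on can be written out as below. The symbols m_n, v_n, d_n, q_n and s_n are labels chosen here to mirror _m, _v, _d, _q and std; they are not part of the original post. Note that std is computed before _v is updated, so _v in that line plays the role of v_{n-1}.

```latex
\begin{align*}
m_n &= \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad
d_n = m_n - m_{n-1}, \qquad
q_n = (y_n - m_n)^2, \\
\sum_{i=1}^{n-1} (y_i - m_n)^2 &= (n-1)\,\bigl(v_{n-1} + d_n^2\bigr), \\
v_n &= \frac{(n-1)\,\bigl(v_{n-1} + d_n^2\bigr) + q_n}{n}, \\
s_n &= \sqrt{\frac{n}{n-1}\,v_n}
     = \sqrt{v_{n-1} + d_n^2 + \frac{q_n}{n-1}} \qquad (n > 1).
\end{align*}
```

As a check against the data in this thread: for gr=1 and n=2, m_1=80, v_1=0, m_2=50, d_2=-30 and q_2=900, so s_2 = sqrt(0 + 900 + 900/1) = sqrt(1800), about 42.4264, which matches the wanted value.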
Thanks it works !!
Great
It would be nice to see a solution using the logic below (from P. Miller),
but adapted to BY groups.
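data _NULL_;
  if 0 then set available nobs=n; /* never executes; nobs= is set at compile time */
  call symputx('nrows',n); /* store the row count in macro variable nrows */
  stop;
run;
data intermediate;
  set available;
  do group=_n_ to &nrows; /* row _n_ belongs to the cumulative windows _n_..nrows */
    thisy=y;
    output;
  end;
run;
proc summary data=intermediate nway;
  class group;
  var thisy;
  output out=want stddev=; /* thisy now holds the std of the first "group" rows */
run;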
Using a BY-group version of that logic:
data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
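proc freq data=available;
  tables gr/noprint out=_a_; /* _a_ holds COUNT = number of rows per gr */
run;
data intermediate;
  merge available _a_;
  by gr;
  if first.gr then seq=0;
  seq+1; /* position of this row within its group */
  do group=seq to count; /* row seq belongs to the cumulative windows seq..count */
    thisy=y;
    output;
  end;
run;
proc summary data=intermediate nway;
  class gr group;
  var thisy;
  output out=want stddev=; /* thisy now holds the std of the first "group" rows of gr */
run;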
Thanks a lot.
May I point out that this solution performs much better than the array-based post.
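If performance is the main concern, a single pass with running sums may also be worth trying. The step below is only a sketch and has not been benchmarked here: the dataset name want_sums and the helper variables _n, _s and _ss are made up for illustration, y is assumed to be nonmissing, and available is assumed to be sorted by gr. It uses the textbook sum-of-squares shortcut, which can lose precision when the values are large relative to their spread.

```sas
data want_sums(drop=_:);
  set available;
  by gr;
  /* reset the running totals at the start of each BY group */
  if first.gr then do;
    _n=0; _s=0; _ss=0;
  end;
  _n+1;       /* rows seen so far in this group (sum statement, so retained) */
  _s+y;       /* running sum of y */
  _ss+y*y;    /* running sum of y squared */
  /* sample std from the running sums; stays missing on the first row of a group */
  if _n>1 then std=sqrt(max(0,(_ss-_s*_s/_n)/(_n-1)));
run;
```

With the eight rows posted in this thread, the second row of each group works out to sqrt((6800 - 100*100/2)/1) = sqrt(1800), about 42.4264, in line with the wanted data.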