BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
J111
Quartz | Level 8

Hello,

How can we calculate std on data that has an additional row each time.

It is not a regular moving std , as the data window size increases each time.


data available ;
input dt y ;
cards ;
1 80
2 20
3 40
4 60
;
run ;

data wanted ;
input dt std ;
cards ;
1 80 .
2 20 42.426406871
3 40 30.550504633
4 60 25.819888975
;
run ;

data calc ;
x1 = 80 ;
x2 = 20 ;
x3 = 40 ;
x4 = 60 ;
new1 = std(x1,x2) ; *42.426406871 ;
new2 = std(x1,x2,x3) ; *30.550504633 ;
new3 = std(x1,x2,x3,x4) ;* 25.819888975 ;
run ;

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Thanks for providing data as a SAS data step. Very helpful. I wish more people would do that.

 

This seems to work. If you have thousands or millions of observations, this will note scale up efficiently.

 

data available ;
input dt y ;
cards ;
1 80
2 20
3 40
4 60
;

/* Compute number of observations in data set available */
/* Code from https://communities.sas.com/t5/SAS-Communities-Library/Determining-the-number-of-observations-in-a-SAS-data-set/ta-p/475174 */
data _NULL_;
	if 0 then set available nobs=n;
	call symputx('nrows',n);
	stop;
run;


data intermediate;
    set available;
    do group=_n_ to &nrows;
        thisy=y;
        output;
    end;
run;
        
proc summary data=intermediate nway;
    class group;
    var thisy;
    output out=want stddev=;
run;
--
Paige Miller

View solution in original post

6 REPLIES 6
J111
Quartz | Level 8

oops..

 

data available ;
input dt y ;
cards ;
1 80
2 20
3 40
4 60
;
run ;

data wanted ;
input dt y std ;* <---three input variables ;
cards ;
1 80 .
2 20 42.426406871
3 40 30.550504633
4 60 25.819888975
;
run ;

data calc ;
x1 = 80 ;
x2 = 20 ;
x3 = 40 ;
x4 = 60 ;
new1 = std(x1,x2) ; *42.426406871 ;
new2 = std(x1,x2,x3) ; *30.550504633 ;
new3 = std(x1,x2,x3,x4) ;* 25.819888975 ;
run ;

PaigeMiller
Diamond | Level 26

Thanks for providing data as a SAS data step. Very helpful. I wish more people would do that.

 

This seems to work. If you have thousands or millions of observations, this will note scale up efficiently.

 

data available ;
input dt y ;
cards ;
1 80
2 20
3 40
4 60
;

/* Compute number of observations in data set available */
/* Code from https://communities.sas.com/t5/SAS-Communities-Library/Determining-the-number-of-observations-in-a-SAS-data-set/ta-p/475174 */
data _NULL_;
	if 0 then set available nobs=n;
	call symputx('nrows',n);
	stop;
run;


data intermediate;
    set available;
    do group=_n_ to &nrows;
        thisy=y;
        output;
    end;
run;
        
proc summary data=intermediate nway;
    class group;
    var thisy;
    output out=want stddev=;
run;
--
Paige Miller
J111
Quartz | Level 8

All 3 replies to my question are good solutions

Thanks so much 🙂 

J111
Quartz | Level 8

Hello,

 

Next, if we wanted to add another field such as group by and

calculate the std for each value of gr in the data below:-

what would be the best solution ? Thanks  a lot.


data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
run ;

data wanted ;
input gr dt y std ;
cards ;
1 1 80
1 2 20 42.426406871
1 3 40 30.550504633
1 4 60 25.819888975
2 1 80
2 2 20 42.426406871
2 3 40 30.550504633
2 4 50 25
;
run ;

Ksharp
Super User
data available ;
input dt y ;
cards ;
1 80
2 20
3 40
4 60
;
data want;
 set available;
 array x{999999} _temporary_;
 x{_n_}=y;
 std=std(of x{*});
run;
FreelanceReinh
Jade | Level 19

Hello @J111,

 

You can also compute the cumulative standard deviation using Steiner's theorem:

data want(drop=_:);
set available;
retain _v 0;
_s+y; /* cumulative sum */
_m=_s/_n_; /* cumulative mean */
_d=dif(_m); /* mean change */
_q=(y-_m)**2; /* new term in sum of squares */
if _n_>1 then do;
  std=sqrt(_v+_d**2+_q/(_n_-1)); /* cumulative standard deviation */
  _v=((_n_-1)*(_v+_d**2)+_q)/_n_; /* cumulative population variance */
end;
run;

This calculation should take less than one second for a million observations.

Catch up on SAS Innovate 2026

Dive into keynotes, announcements and breakthroughs on demand.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 3295 views
  • 3 likes
  • 4 in conversation