J111
Quartz | Level 8

Hello,

My understanding is that a new topic should be opened for this BY-group question.

Next, suppose we want to add a grouping field and calculate the std of y, cumulatively in dt order, within each value of gr in the data below. What would be the best solution? Thanks a lot.


data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
run ;

data wanted ;
input gr dt y std ;
cards ;
1 1 80 .
1 2 20 42.426406871
1 3 40 30.550504633
1 4 60 25.819888975
2 1 80 .
2 2 20 42.426406871
2 3 40 30.550504633
2 4 50 25
;
run ;
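
For reference, the values in the wanted data set appear to be the running sample standard deviation (divisor n-1) of y within each gr, taken in dt order. This reading is inferred from the numbers above, not stated in the question. A worked check of the second row (the same in both groups) and the last row of gr=2:

```latex
% Rows 1-2 of either group: y = 80, 20, mean = 50
s_2 = \sqrt{\frac{(80-50)^2 + (20-50)^2}{2-1}} = \sqrt{1800} \approx 42.4264

% All four rows of gr = 2: y = 80, 20, 40, 50, mean = 47.5
s_4 = \sqrt{\frac{32.5^2 + (-27.5)^2 + (-7.5)^2 + 2.5^2}{4-1}}
    = \sqrt{\frac{1875}{3}} = \sqrt{625} = 25
```

The first row of each group has no standard deviation because a single observation leaves the divisor n-1 equal to zero.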

Accepted Solution
PaigeMiller
Diamond | Level 26
data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;

proc freq data=available;
    tables gr/noprint out=_a_;
run;

data intermediate;
    merge available _a_;
    by gr;
    if first.gr then seq=0;
    seq+1;
    do group=seq to count;
        thisy=y;
        output;
    end;
run;

proc summary data=intermediate nway;
    class gr group;
    var thisy;
    output out=want stddev=;
run;
--
Paige Miller
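
The expansion above pairs each row with every later cumulative window in the same gr. As a sketch of the same pairing idea written differently (this is not PaigeMiller's code), a PROC SQL self-join can match each row with all rows of its group up to and including its own dt. It assumes dt is unique and ascending within each gr; the table name want_sql and the aliases a and b are made up for this illustration.

```sas
proc sql;
    create table want_sql as
    select a.gr, a.dt, a.y,
           std(b.y) as std      /* sample std over the rows matched to this row */
    from available as a, available as b
    where a.gr = b.gr           /* stay inside the same BY group */
      and b.dt <= a.dt          /* rows up to and including the current one */
    group by a.gr, a.dt, a.y
    order by a.gr, a.dt;
quit;
```

Like the expanded intermediate data set, the join grows roughly with the square of the group size, so it is mainly of interest for small groups.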


7 Replies
Ksharp
Super User

Did you try my code?

data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;
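data want;
 set available;
 by gr;
 /* Temporary array keeps the y values seen so far in the current BY group;
    999999 is an upper bound on the number of rows per group */
 array x{999999} _temporary_;
 if first.gr then do;
   n=0;
   call missing(of x{*}); /* clear the array at the start of each group */
 end;
 n+1;
 x{n}=y;
 std=std(of x{*}); /* STD ignores the elements that are still missing */
 drop n;
run;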
J111
Quartz | Level 8

Thanks a lot, it works. Super!

FreelanceReinh
Jade | Level 19

Hello @J111,

 

My solution from the previous thread can be generalized to BY groups as well:

data want(drop=_:);
_s=0; _v=0;
do _n=1 by 1 until(last.gr);
  set available;
  by gr;
  _s+y; /* cumulative sum */
  _m=_s/_n; /* cumulative mean */
  _d=dif(_m); /* mean change */
  _q=(y-_m)**2; /* new term in sum of squares */
  if _n>1 then do;
    std=sqrt(_v+_d**2+_q/(_n-1)); /* cumulative standard deviation */
    _v=((_n-1)*(_v+_d**2)+_q)/_n; /* cumulative population variance */
  end;
  output;
end;
run;
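
The update formulas in the code comments can be checked with a short derivation. The symbols below are shorthand introduced here to mirror the code variables (m for _m, v for _v, d for _d, q for _q); they are not notation from the post.

```latex
% m_n: mean of the first n values, v_n: their population variance
% d_n = m_n - m_{n-1}, \quad q_n = (y_n - m_n)^2

% Because \sum_{i<n} (y_i - m_{n-1}) = 0, shifting the centre from m_{n-1} to m_n gives
\sum_{i=1}^{n-1} (y_i - m_n)^2
  = \sum_{i=1}^{n-1} (y_i - m_{n-1})^2 + (n-1)\, d_n^2
  = (n-1)\left(v_{n-1} + d_n^2\right)

% Adding the new term q_n yields the full sum of squares about m_n
SS_n = (n-1)\left(v_{n-1} + d_n^2\right) + q_n

% Sample standard deviation (assigned to std) and updated population variance (assigned to _v)
s_n = \sqrt{\frac{SS_n}{n-1}} = \sqrt{v_{n-1} + d_n^2 + \frac{q_n}{n-1}},
\qquad
v_n = \frac{SS_n}{n} = \frac{(n-1)\left(v_{n-1} + d_n^2\right) + q_n}{n}
```

With y = 80, 20 this gives m_1 = 80, v_1 = 0, m_2 = 50, d_2 = -30, q_2 = 900, so s_2 = sqrt(0 + 900 + 900) ≈ 42.4264, matching the wanted output.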
J111
Quartz | Level 8

Thanks, it works! Great.

J111
Quartz | Level 8

It would be nice to see a solution that uses the logic below (from PaigeMiller), but adapted to the BY group.

 

data _NULL_;
	if 0 then set available nobs=n;
	call symputx('nrows',n);
	stop;
run;


data intermediate;
    set available;
    do group=_n_ to &nrows;
        thisy=y;
        output;
    end;
run;
        
proc summary data=intermediate nway;
    class group;
    var thisy;
    output out=want stddev=;
run;


PaigeMiller
Diamond | Level 26
data available ;
input gr dt y ;
cards ;
1 1 80
1 2 20
1 3 40
1 4 60
2 1 80
2 2 20
2 3 40
2 4 50
;

proc freq data=available;
    tables gr/noprint out=_a_;
run;

data intermediate;
    merge available _a_;
    by gr;
    if first.gr then seq=0;
    seq+1;
    do group=seq to count;
        thisy=y;
        output;
    end;
run;

proc summary data=intermediate nway;
    class gr group;
    var thisy;
    output out=want stddev=;
run;
--
Paige Miller
J111
Quartz | Level 8

Thanks a lot.

May I point out that this solution performs much better than the array-based post.

 
