@Reeza is right that this is a typical job for the DoW loop, however the latter requires explicit sorting by VAR2. If you don't mind sorting, then:
data have ;
input id var1 var2 ;
cards ;
1 23 1
2 12 1
3 14 2
4 33 1
5 18 3
6 20 2
7 45 3
run ;
proc sort data = have out = havesort ;
by var2 id ;
run ;
data want_dow (drop = _:) ;
do _n_ = 1 by 1 until (last.var2) ;
set havesort (keep = var:) ;
by var2 ;
_s = sum (_s, var1) ;
end ;
newvar = divide (_s, _n_) ;
do _n_ = 1 to _n_ ;
set havesort ;
output ;
end ;
run ;
If you want to preserve the original data order without sorting forth and back, you can first aggregate the sums and counts using the hash object, then look it up for each record from HAVE and compute the means:
data want_hash (drop = _:) ;
if _n_ = 1 then do ;
dcl hash h () ;
h.definekey ("var2") ;
h.definedata ("_s", "_q") ;
h.definedone () ;
do until (z) ;
set have (keep = var:) end = z ;
if h.find() ne 0 then call missing (_s, _q) ;
_s = sum (_s, var1) ;
_q = sum (_q, 1) ;
h.replace() ;
end ;
end ;
set have ;
h.find() ;
newvar = divide (_s, _q) ;
run ;
However, perhaps the simplest way (also suggested by @Reeza) is just to use the self-merging property of SQL:
proc sql ;
create table want_sql as
select id, var2, var1, avg (var1) as newvar
from have
group var2
order var2, id
;
quit ;
Kind regards
Paul D.
... View more