@morichard wrote:
Hi,
How can i add count rows by groups through proc sql AND THEN (this is the key thing I can't find instructions on) create a new "total" variable that is the same for each member of the group -- NOT just the Last member of the group. In conceptual terms, I want each group member to know how many members of its group have x=1.
Have dataset
x y n
0 a 0
1 a 1
0 b 0
1 b 1
1 b 2
0 c 0
1 c 1
1 c 2
1 c 3
wanted output
x y n n2
0 a 0 1
1 a 1 1
0 b 0 2
1 b 1 2
1 b 2 2
0 c 0 3
1 c 1 3
1 c 2 3
1 c 3 3
Grouping variable is y, the variable to be counted/totaled is x.
i know how to do it with data step:
Proc sort data = have ;
by y;
run;
data want;
set have;
by y;
if first.y then n =0;
n + x;
run;
For future reference please specify which variable(s) define a group. It really isn't nice to make us guess.
A possible solution if the values of X are only ever 0, 1 and possibly missing:
data have;
input x y $ n;
datalines;
0 a 0
1 a 1
0 b 0
1 b 1
1 b 2
0 c 0
1 c 1
1 c 2
1 c 3
;
proc sql;
create table want as
select x,y,n, sum(x) as n2
from have
group by y
;
quit;
There will be a note in the log:
NOTE: The query requires remerging summary statistics back with the original data.
which is because it is doing exactly what you request: merging a single summary statistic (the sum of X) to all of the rows of the group.
... View more