- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
How can i add count rows by groups through proc sql AND THEN (this is the key thing I can't find instructions on) create a new "total" variable that is the same for each member of the group -- NOT just the Last member of the group. In conceptual terms, I want each group member to know how many members of its group have x=1.
Have dataset
x y n
0 a 0
1 a 1
0 b 0
1 b 1
1 b 2
0 c 0
1 c 1
1 c 2
1 c 3
wanted output
x y n n2
0 a 0 1
1 a 1 1
0 b 0 2
1 b 1 2
1 b 2 2
0 c 0 3
1 c 1 3
1 c 2 3
1 c 3 3
Grouping variable is y, the variable to be counted/totaled is x.
i know how to do it with data step:
Proc sort data = have ;
by y;
run;
data want;
set have;
by y;
if first.y then n =0;
n + x;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Since you are bothering to sort the data set, here is a single data step solution to use that sorted data set (changed "n+x" to "n2+x" below):
data want;
set have;
by y;
if first.y then n2 =0;
n2 + x;
if last.y;
do until (last.y);
set have;
by y;
output;
end;
run;
There are two SET plus BY pairs. So the first. and last. dummy variables appear to be paired with the most-recently-used SET/BY combination.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@morichard wrote:
Hi,
How can i add count rows by groups through proc sql AND THEN (this is the key thing I can't find instructions on) create a new "total" variable that is the same for each member of the group -- NOT just the Last member of the group. In conceptual terms, I want each group member to know how many members of its group have x=1.
Have dataset
x y n
0 a 0
1 a 1
0 b 0
1 b 1
1 b 2
0 c 0
1 c 1
1 c 2
1 c 3
wanted output
x y n n2
0 a 0 1
1 a 1 1
0 b 0 2
1 b 1 2
1 b 2 2
0 c 0 3
1 c 1 3
1 c 2 3
1 c 3 3
Grouping variable is y, the variable to be counted/totaled is x.
i know how to do it with data step:
Proc sort data = have ;
by y;
run;
data want;
set have;
by y;
if first.y then n =0;
n + x;
run;
For future reference please specify which variable(s) define a group. It really isn't nice to make us guess.
A possible solution if the values of X are only ever 0, 1 and possibly missing:
data have; input x y $ n; datalines; 0 a 0 1 a 1 0 b 0 1 b 1 1 b 2 0 c 0 1 c 1 1 c 2 1 c 3 ; proc sql; create table want as select x,y,n, sum(x) as n2 from have group by y ; quit;
There will be a note in the log:
NOTE: The query requires remerging summary statistics back with the original data.
which is because it is doing exactly what you request: merging a single summary statistic (the sum of X) to all of the rows of the group.