Hi,
I'm having trouble creating a new variable.
Imagine this fictive dataset:
ID X Y Z
1 23 3 1
2 12 2 0
3 10 3 0
1 80 6 1
2 09 2 0
1 78 7 1
4 14 6 0
5 67 9 1
6 12 5 1
1 89 4 0
I want to create a new variable named 'count' that counts how many times the same ID number appears. Besides that, the new variable 'count' should only count those ID numbers with the variable Z = 1. So in this dataset the variable 'Count' would look like:
ID X Y Z COUNT
1 23 3 1 3
2 12 2 0 0
3 10 3 0 0
1 80 6 1 3
2 09 2 0 0
1 78 7 1 3
4 14 6 0 0
5 67 9 1 1
6 12 5 1 1
1 89 4 0 3
Thanks!
data have;
input ID X Y Z;
datalines;
1 23 3 1
2 12 2 0
3 10 3 0
1 80 6 1
2 09 2 0
1 78 7 1
4 14 6 0
5 67 9 1
6 12 5 1
1 89 4 0
;
data want (drop = rc);
if _N_ = 1 then do;
declare hash h ();
h.definekey ('ID');
h.definedata ('ID', 'count');
h.definedone();
do until (lr);
set have end=lr;
if Z = 1 then do;
if h.find () ne 0 then count = 1;
else count + 1;
h.replace ();
end;
end;
end;
set have;
count = 0;
rc = h.find();
run;
Result:
ID X Y Z count 1 23 3 1 3 2 12 2 0 0 3 10 3 0 0 1 80 6 1 3 2 9 2 0 0 1 78 7 1 3 4 14 6 0 0 5 67 9 1 1 6 12 5 1 1 1 89 4 0 3
data have;
input ID X Y Z;
datalines;
1 23 3 1
2 12 2 0
3 10 3 0
1 80 6 1
2 09 2 0
1 78 7 1
4 14 6 0
5 67 9 1
6 12 5 1
1 89 4 0
;
data want (drop = rc);
if _N_ = 1 then do;
declare hash h ();
h.definekey ('ID');
h.definedata ('ID', 'count');
h.definedone();
do until (lr);
set have end=lr;
if Z = 1 then do;
if h.find () ne 0 then count = 1;
else count + 1;
h.replace ();
end;
end;
end;
set have;
count = 0;
rc = h.find();
run;
Result:
ID X Y Z count 1 23 3 1 3 2 12 2 0 0 3 10 3 0 0 1 80 6 1 3 2 9 2 0 0 1 78 7 1 3 4 14 6 0 0 5 67 9 1 1 6 12 5 1 1 1 89 4 0 3
Also using a hash, but with a preliminary step to create the sums:
data have;
input ID $ X Y Z;
datalines;
1 23 3 1
2 12 2 0
3 10 3 0
1 80 6 1
2 09 2 0
1 78 7 1
4 14 6 0
5 67 9 1
6 12 5 1
1 89 4 0
;
proc summary data=have (where=(z = 1));
class id;
output out=sum (drop=_type_ rename=(_freq_=count));
run;
data want;
set have;
if 0 then set sum;
if _n_ = 1
then do;
declare hash h (dataset:"sum");
h.definekey("id");
h.definedata("count");
h.definedone();
end;
if h.find() ne 0 then count = 0;
run;
Note that the rather complicated algorithms are only necessary to keep the original order. If that is not the case, a simple SQL will do it:
proc sql;
create table want as
select *, sum(case when z = 1 then 1 else 0 end) as count
from have
group by id;
quit;
And you can keep the original order if you create a column to order by first:
data have;
input ID $ X Y Z;
ord = _n_;
datalines;
1 23 3 1
2 12 2 0
3 10 3 0
1 80 6 1
2 09 2 0
1 78 7 1
4 14 6 0
5 67 9 1
6 12 5 1
1 89 4 0
;
proc sql;
create table want as
select id, x, y, z, sum(case when z = 1 then 1 else 0 end) as count
from have
group by id
order by ord;
quit;
data have;
input ID X Y Z;
datalines;
1 23 3 1
2 12 2 0
3 10 3 0
1 80 6 1
2 09 2 0
1 78 7 1
4 14 6 0
5 67 9 1
6 12 5 1
1 89 4 0
;
proc sql;
create table want as
select *,
coalesce((select count(*) from have where z=1 and id=a.id),0) as count
from have as a;
quit;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.