I'm pretty new to proc sql, and I'm trying to count the number of people. Each unique person can have multiple rows (multiple application submissions). An example of the table is below:

Person_ID  Category  Ref_type  Total_amt
100        Green     2         350
100        Blue      2         300
100        Red       3         100
200        Green     1         20
200        Black     3         500
300        Blue      2         200

I want to count the number of people in each Category*Ref_type combination, so the result would be:

Category  Ref_type  No_ppl
Green     1         1
Green     2         2
Blue      2         2
Red       3         1
Black     3         1

I'm not at home right now so I can't check, but I remember that I tried doing:

proc sql;
create table want as
select Category
,Ref_type
,count(unique Person_ID)
from have
group by Category, Ref_type
;quit;

This did not give me exactly what I wanted (note that this was for a large dataset, probably 2 million rows). When I removed the "unique" keyword, I got what I wanted. Can someone explain what the code does when I use the "unique" keyword inside the count function together with a group by clause, and also what the code does without the "unique" keyword when I'm counting a variable that's not in the group by clause?
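
To make the comparison concrete, here is a minimal sketch that loads the sample rows into a dataset named have and computes both counts side by side. The output table name compare and the column names n_people and n_rows are hypothetical, chosen only for illustration, and the sketch uses the standard "distinct" keyword rather than "unique". It is a sketch under those assumptions, not a diagnosis of what happened on the 2-million-row dataset.

```sas
/* Sample rows from the question, loaded into a dataset named have */
data have;
    input Person_ID Category $ Ref_type Total_amt;
    datalines;
100 Green 2 350
100 Blue 2 300
100 Red 3 100
200 Green 1 20
200 Black 3 500
300 Blue 2 200
;
run;

proc sql;
    /* compare, n_people and n_rows are hypothetical names */
    create table compare as
    select Category
          ,Ref_type
          /* each Person_ID is counted at most once per group */
          ,count(distinct Person_ID) as n_people
          /* every row with a non-missing Person_ID is counted */
          ,count(Person_ID)          as n_rows
    from have
    group by Category, Ref_type
    ;
quit;
```

The two columns can differ only in groups where the same Person_ID appears on more than one row for a given Category and Ref_type. In the six sample rows no person repeats within a group, so n_people and n_rows should agree here.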