I would like to count numbers of distinct cases that meet different criteria in my large dataset. The example data looks like below:
| obsid | var |
| 1 | a |
| 1 | b |
| 1 | b |
| 2 | b |
| 2 | b |
| 2 | c |
| 2 | c |
| 3 | a |
| 3 | a |
| 4 | c |
I need to see how many distinct obsids that contain "a", "b" or "c" in var.
I could do for different criteria one after another like this:
proc sql; select count(distinct obsid) as N_a from project.dswi_nodebride where var="a"; run;
proc sql; select count(distinct obsid) as N_b from project.dswi_nodebride where var="b"; run;
proc sql; select count(distinct obsid) as N_c from project.dswi_nodebride where var="c"; run;
I have many criteria, so is there a simpler way to do for all different but similar criteria?
Thanks a lot!
SQL is ideal for this:
data have;
input obsid var$;
datalines;
1 a
1 b
1 b
2 b
2 b
2 c
2 c
3 a
3 a
4 c
;
proc sql;
create table want as
select var, count(distinct obsid) as n
from have
group by var;
select * from want;
quit;
PG
proc sql;
select var,count(distinct obsid) as N from have
group by var;
quit;
SQL is ideal for this:
data have;
input obsid var$;
datalines;
1 a
1 b
1 b
2 b
2 b
2 c
2 c
3 a
3 a
4 c
;
proc sql;
create table want as
select var, count(distinct obsid) as n
from have
group by var;
select * from want;
quit;
PG
This is somewhat like counting subjects with adverse events. This counts events and obsids for each level of var.

Thanks so much to you all!
Lizi
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.