I would like to count numbers of distinct cases that meet different criteria in my large dataset. The example data looks like below:
obsid | var |
1 | a |
1 | b |
1 | b |
2 | b |
2 | b |
2 | c |
2 | c |
3 | a |
3 | a |
4 | c |
I need to see how many distinct obsids that contain "a", "b" or "c" in var.
I could do for different criteria one after another like this:
proc sql; select count(distinct obsid) as N_a from project.dswi_nodebride where var="a"; run;
proc sql; select count(distinct obsid) as N_b from project.dswi_nodebride where var="b"; run;
proc sql; select count(distinct obsid) as N_c from project.dswi_nodebride where var="c"; run;
I have many criteria, so is there a simpler way to do for all different but similar criteria?
Thanks a lot!
SQL is ideal for this:
data have;
input obsid var$;
datalines;
1 a
1 b
1 b
2 b
2 b
2 c
2 c
3 a
3 a
4 c
;
proc sql;
create table want as
select var, count(distinct obsid) as n
from have
group by var;
select * from want;
quit;
PG
proc sql;
select var,count(distinct obsid) as N from have
group by var;
quit;
SQL is ideal for this:
data have;
input obsid var$;
datalines;
1 a
1 b
1 b
2 b
2 b
2 c
2 c
3 a
3 a
4 c
;
proc sql;
create table want as
select var, count(distinct obsid) as n
from have
group by var;
select * from want;
quit;
PG
This is somewhat like counting subjects with adverse events. This counts events and obsids for each level of var.
Thanks so much to you all!
Lizi
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.