08-04-2016 08:58 AM
Have you tried it? There are several possible answers to that depending on the data and the code used. Here is the SAS documentation on aggregate functions:
If you want unique values within the group in question, then you put distinct
08-04-2016 08:59 AM
I use county (distinct ID), I get smaller numbers
then I use another way
proc sort data=have nodupkey;
by id date;
by id date;
if first.id then index=1; else index=2;
proc freq data=have2;
But the numbers I got are smaller than I use count disinct id. can any body tell why? which is correct? Thanks.
08-04-2016 09:02 AM
count(var) will increment for every non-missing value in the dataset.
If you want to count a group as one if it has at least one non-missing value, you will have to pre-process the dataset:
data have; input id value; cards; 1 3 1 2 1 . 2 . 2 . 3 5 3 6 3 7 ; run; proc sort data=have (where=(value ne .)) out=int nodupkey ; by id; run; proc print data=int;run; proc sql; select count(value) from int ; quit;
Obs id value 1 1 3 2 3 5 -------- 2
08-04-2016 09:25 AM
Thanks. But the example you list, the ID is not missing
for me, as long as ID is not missing, I do need to count
If an ID only has missing values in var, it would always count as 0.
You might consider to rephrase your requirement and provide "have" and "want" data as examples.
08-04-2016 10:09 AM
Just test it
data have; do var=1,2,3,4,4,5; output; end; call missing(var); output; run; proc sql; select count(*) as n_all_rows, count(var) as n_var_rows, count(distinct var) as n_distinct_var_rows from have ; quit;