when I use proc sql; count (var) as
it might have duplicate records for a single ID
i want to know will they count duplicate records as once, or count more times?
Thanks
Have you tried it? There are several possible answers to that depending on the data and the code used. Here is the SAS documentation on aggregate functions:
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473699.htm
If you want unique values within the group in question, then you put distinct
count(distinct var)...
I use county (distinct ID), I get smaller numbers
then I use another way
proc sort data=have nodupkey;
by id date;
run;
data have2;
set have;
by id date;
if first.id then index=1; else index=2;
run;
proc freq data=have2;
tables cat1*cat2;
where index=1;
run;
But the numbers I got are smaller than I use count disinct id. can any body tell why? which is correct? Thanks.
Do you have missing values? If so, check how the Procs handle them.
count(var) will increment for every non-missing value in the dataset.
If you want to count a group as one if it has at least one non-missing value, you will have to pre-process the dataset:
data have;
input id value;
cards;
1 3
1 2
1 .
2 .
2 .
3 5
3 6
3 7
;
run;
proc sort
data=have (where=(value ne .))
out=int
nodupkey
;
by id;
run;
proc print data=int;run;
proc sql;
select count(value)
from int
;
quit;
Output:
Obs id value 1 1 3 2 3 5 -------- 2
Thanks. But the example you list, the ID is not missing
for me, as long as ID is not missing, I do need to count
@Bal23 wrote:
Thanks. But the example you list, the ID is not missing
for me, as long as ID is not missing, I do need to count
If an ID only has missing values in var, it would always count as 0.
You might consider to rephrase your requirement and provide "have" and "want" data as examples.
Just test it
data have;
do var=1,2,3,4,4,5;
output;
end;
call missing(var);
output;
run;
proc sql;
select
count(*) as n_all_rows,
count(var) as n_var_rows,
count(distinct var) as n_distinct_var_rows
from have
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.