DATA Step, Macro, Functions and more

proc sql count

Reply
Super Contributor
Posts: 345

proc sql count

when I use proc sql;  count (var) as

 

it might have duplicate records for a single ID

i want to know will they count duplicate records as once, or count more times?

Thanks

Super User
Super User
Posts: 7,971

Re: proc sql count

Have you tried it?  There are several possible answers to that depending on the data and the code used.  Here is the SAS documentation on aggregate functions:

http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473699.htm

 

If you want unique values within the group in question, then you put distinct

count(distinct var)...

Super Contributor
Posts: 345

Re: proc sql count

I use county (distinct ID), I get smaller numbers

then I use another way

 

 

proc sort data=have nodupkey;

by id date;

run;

data have2;

set have;

by id date;

if first.id then index=1; else index=2;

run;

proc freq data=have2;

tables cat1*cat2;

where index=1;

run;

 

But the numbers I got are smaller than I use count disinct id. can any body tell why? which is correct? Thanks.

Super User
Posts: 19,822

Re: proc sql count

Do you have missing values? If so, check how the Procs handle them. 

 

Super User
Posts: 7,821

Re: proc sql count

count(var) will increment for every non-missing value in the dataset.

If you want to count a group as one if it has at least one non-missing value, you will have to pre-process the dataset:

data have;
input id value;
cards;
1 3
1 2
1 .
2 .
2 .
3 5
3 6
3 7
;
run;

proc sort
  data=have (where=(value ne .))
  out=int
  nodupkey
;
by id;
run;

proc print data=int;run;

proc sql;
select count(value)
from int
;
quit;

Output:

Obs    id    value

 1      1      3  
 2      3      5  
                  

             
     --------
            2

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super Contributor
Posts: 345

Re: proc sql count

Posted in reply to KurtBremser

Thanks. But the example you list, the ID is not missing

for me, as long as ID is not missing, I do need to count

Super User
Posts: 7,821

Re: proc sql count


wenling wrote:

Thanks. But the example you list, the ID is not missing

for me, as long as ID is not missing, I do need to count


If an ID only has missing values in var, it would always count as 0.

 

You might consider to rephrase your requirement and provide "have" and "want" data as examples.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Respected Advisor
Posts: 4,173

Re: proc sql count

Just test it

data have;
  do var=1,2,3,4,4,5;
    output;
  end;
  call missing(var);
  output;
run;

proc sql;
  select 
    count(*) as n_all_rows,
    count(var) as n_var_rows,
    count(distinct var) as n_distinct_var_rows
  from have
  ;
quit;

Capture.PNG

Ask a Question
Discussion stats
  • 7 replies
  • 557 views
  • 1 like
  • 5 in conversation