DATA Step, Macro, Functions and more

How to count distinct values of a Character type field in a SAS dataset?

Reply
Contributor
Posts: 22

How to count distinct values of a Character type field in a SAS dataset?

Hello,

 

Could anyone help, on how to count distinct values for a character type field in a SAS dataset? Like, I have a field of email addresses in a dataset, and I want to count the distinct email addresses only. I could get the desired results using PROC SQL, but I wanted to know if there is any other way to do the same.

 

Thanks in advance!

Super User
Super User
Posts: 7,942

Re: How to count distinct values of a Character type field in a SAS dataset?

Contributor
Posts: 22

Re: How to count distinct values of a Character type field in a SAS dataset?

Thanks for your help.

Could you please give a small example for this?

 

Thank you!

Super User
Posts: 5,426

Re: How to count distinct values of a Character type field in a SAS dataset?

With the help of the links provided, try to use them. If you run into obstacles, then get back to us.

Data never sleeps
Contributor
Posts: 22

Re: How to count distinct values of a Character type field in a SAS dataset?

Proc means- does not give any results for Character type variable.

 

Proc freq- I did not get the desired output. Like, I have a field of email address in the dataset, using proc freq displayed all email addresses with count as 1 in front of each.

 

Proc summary- I have never used, I am trying and it runs into errors.

 

 

Trusted Advisor
Posts: 1,117

Re: How to count distinct values of a Character type field in a SAS dataset?

Hello @PoojaP1,

 

PROC SORT with NODUPKEY and OUT= options would be another method (in addition to the approaches listed by RW9).

 

Please note that some approaches differ in how missing values (and special missing values) are handled. For example, a COUNT(DISTINCT ...) in PROC SQL excludes missing values, whereas one missing value (of each kind in case of special missing values), if any, would remain after a NODUPKEY sort.

 

The optimal method will depend on the number of observations in the dataset, the expected number of distinct values and other considerations, e.g., whether the dataset is already sorted by that character variable.

Contributor
Posts: 50

Re: How to count distinct values of a Character type field in a SAS dataset?

Hello,

 

Another technique is using Hash tables and SUMINC

 

http://www.pharmasug.org/proceedings/2014/CC/PharmaSUG-2014-CC02.pdf

 

Best regards,

Super User
Posts: 10,023

Re: How to count distinct values of a Character type field in a SAS dataset?

1)

proc freq data=sashelp.class nlevels;
 table sex;
run;



2) data _null_; if 0 then set sashelp.class; declare hash h(dataset:'sashelp.class'); h.definekey('sex'); h.definedone(); nlevels=h.num_items; putlog "NOTE:" nlevels; stop; run;
Ask a Question
Discussion stats
  • 7 replies
  • 830 views
  • 1 like
  • 6 in conversation