## How to count no. of unique values only

Solved
Occasional Contributor
Posts: 12

# How to count no. of unique values only

Suppose I have a dataset with 100 char and 100 numeric variables and I want to count the no. of unique observations in char variables only ...

I know we could use the count distinct in Proc SQL but I wanted to know if there is any method using data step approach...

Thank you,

Rockerd.

Accepted Solutions
Solution
‎03-03-2017 08:00 AM
Posts: 3,852

## Re: How to count no. of unique values only

> The method suggested by Patrick works for me.

More easily obtained with PROC FREQ LEVELS

[pre]
Number of Variable Levels

Variable Levels
--------------------
Name 19
Sex 2
[/pre]

All Replies
SAS Employee
Posts: 183

## Re: How to count no. of unique values only

[pre]data test;
input char1 \$ char2 \$ num1 num2;
datalines;
A B 1 2
A B 1 2
A C 1 2
A C 1 2
D B 4 2
D B 1 2
;
run;

proc sort data=work.test
out=char nodupkey;
by _char_;
run;

proc sort data=work.test
out=num nodupkey;
by _numeric_;
run;
Not applicable
Posts: 0

## Re: How to count no. of unique values only

You could sort the dataset by the variable you want to count and then use the retain function to count every time you find a new value.

proc sort data=x;
by CharVar1;
run;

data x1(keep=count);
set x end=eof;
by CharVar1;

retain count 0;

if first.CharVar1 then count=count+1;

if eof then output;
run;
Valued Guide
Posts: 653

## Re: How to count no. of unique values only

Ron Fehd (the Macro Maven) has written a couple of macros that do what you are asking. Especially his %showall. See http://www2.sas.com/proceedings/sugi22/POSTERS/PAPER236.PDF
Occasional Contributor
Posts: 12

## Re: How to count no. of unique values only

Do I have to sort every one individually and then follow this procedure of nodupkey for every variable? I feel that would be time consuming.

Is there any way where I could do it without having to sort the data?

Like in a Proc print you could specify :
sum var_name;
Is there any method similar to this in SAS where in I could say count(distinct (var_name)) without using proc SQL?
Posts: 3,852

## Re: How to count no. of unique values only

I think PROC FREQ is what you need. TABLES _CHAR_;
Occasional Contributor
Posts: 12

## Re: How to count no. of unique values only

actually I was trying for proc freq ...

but when I say out = dataset_name .. it outputs only the last char variable it has encountered ... so counting for distinct values is a little difficult with proc freq.
Posts: 3,852

## Re: How to count no. of unique values only

[pre]ods output onewayfreqs=freq;[/pre]
Posts: 4,736

## Re: How to count no. of unique values only

Hi

As I understand your question you're actually not looking for a count of unique observations but rather for a count of unique values per character variable.

If so then you have to scan the data set per variable. Luckily there are PROC's doing this for us - and they do it normally more efficient than we would code it.

Below approach uses Proc Freq to get a "record" per character variable and distinct value of this variable (missings included) and then uses ODS Output to collect the result and write it to a data set.

ods _all_ close;
ods output OneWayFreqs=work.Freqs(keep=table) ;

proc freq data=sashelp.class;
table _character_ /nocum nopercent missing;
run;

ods output close;
ods listing;

title "Unique Values per Character Variable";
proc sql;
select substr(Table,6) as Variable_Name format=\$32. ,count(*) as N_UniqueValues format=8.
from work.Freqs
group by Table
;
quit;
title;

HTH
Patrick
Super User
Posts: 10,761

## Re: How to count no. of unique values only

Hmmmmm...
I think there will be a way to get the result you want.try it.

[pre]
proc transpose data=sashelp.class out=class;
by _character_;
var _character_;
run;
proc sort data=class(keep= _name_ col1) out=temp;
by _name_ col1;
run;
data freq;
set temp;
by _name_ col1;
retain count;
if first._name_ then count=0;
if last.col1 then count+1;
if last._name_ then output;
keep _name_ count;
run;
proc print data=freq(rename=( _name_ = var_name)) noobs;
run;
[/pre]

Ksharp
Occasional Contributor
Posts: 12

## Re: How to count no. of unique values only

Patrick and KSharp., thank you for the reply.

The method suggested by Patrick works for me.

Again thanks

Rockerd.
Solution
‎03-03-2017 08:00 AM
Posts: 3,852

## Re: How to count no. of unique values only

> The method suggested by Patrick works for me.

More easily obtained with PROC FREQ LEVELS

[pre]
Number of Variable Levels

Variable Levels
--------------------
Name 19
Sex 2
[/pre]
Super User
Posts: 10,761

## Re: How to count no. of unique values only

Hi . data null;
You must work hard on documentation. It is pretty easy.
Valued Guide
Posts: 653

## Re: How to count no. of unique values only

Thank you Data _NULL_ for reminding me of these options. I believe that the LEVELS option is an alias for the NLEVELS option. Without a TABLE statement you not only get the summary shown by Data _NULL_, but also a summary for each level of each variable. Adding a TABLE statement yields only the overall summary for the selected variables. Since you want all variables try the following:

[pre]
proc freq data=sashelp.class nlevels;
table _all_ / noprint;
run;
[/pre]
🔒 This topic is solved and locked.