Not applicable
Posts: 0

# Number of Distinct Values?

Is there a simple way to determine the number of distinct values in an observation?

I'm working with many columns with integer values. I can use descriptive statistics to determine the number of (non-missing) values or the minimum value etc.

numentries = n(of col_start--col_end)
minvalue = min(of col_start--col_end)

Many values are repeated. I would like to know the number of distinct values there are for each observation.
Not applicable
Posts: 0

## Re: Number of Distinct Values?

Hi JohnI,

I'm not sure what you mean by simple, but something like this might help:

data have;
input id var1 var2 var3 var4 var5;
cards;
1 1 2 3 4 5
2 1 2 2 3 4
3 3 3 2 1 3
4 4 1 2 3 1
;
run;

data want(keep=id var1--var5 n distinct);
set have;
array v_(*) var1--var5;
array _dist_(10) _temporary_;
do j = 1 to 10;
_dist_(j) = .;
end;
do i = 1 to dim(v_);
_dist_(v_(i)) = 1;
end;
distinct = 0;
do j = 1 to 10;
if _dist_(j) ne . then distinct = distinct + 1;
end;
n = n(of var1--var5);
run;

I've only used five variables and a limit of 10 for the value, but you could extend this for n variables and a wide range of integer values.

Robert
Not applicable
Posts: 0

## Re: Number of Distinct Values?

Thank you Robert.

By simple I meant a 'one-liner' like N or Max.

But, I was able to use your code for my needs (and I learned something too).

Thanks again, John
SAS Super FREQ
Posts: 9,368

## Re: Number of Distinct Values?

Hi:
You could also investigate the NLEVELS option of PROC FREQ or just proc freq in general:
[pre]
data have;
input id var1 var2 var3 var4 var5;
cards;
1 1 2 3 4 5
2 1 2 2 3 4
3 3 3 2 1 3
4 4 1 2 3 1
;
run;

proc freq data=have nlevels;
table _numeric_;
run;
[/pre]

cynthia
Discussion stats
• 3 replies
• 180 views
• 0 likes
• 2 in conversation