Hello everyone,
I'm wondering if sas could count distinct values in each row. For example, my data looks similar like:
ID Col1 Col2 Col3 Col4
1 12 12 13 14
2 23 34 34 34
3 55 67 . 45
4 33 33 33 33
I woud like to have the count of unique values that each ID has, something like:
ID # Uniquevalues
1 3
2 2
3 3
4 1
I appreciate any ideas that may help. Thank you!
Another way using transpose with sql:
proc transpose data=have out=want;
by id;
run;
proc sql;
create table uniq as
select id, count(distinct col1) as Unique_values from want group by id;
quit;
Assumptions make a difference in how complex the code becomes. I'm going to assume:
Here's one way to go about it (assuming you actually have 50 variables):
data want;
set have;
array col {50};
call sortn (of col1-col50);
unique_vals = (col1 > .);
do _n_=2 to 50;
if col{_n_} > col{_n_-1} then unique_vals + 1;
end;
keep id unique_vals;
run;
If the assumptions should be different, we can always adjust.
Good luck.
Thank you for your input and assumptions. I tried your codes, but in the new dataset it showed all zeros in column " unique_vals". I just change the number 50 to exact columns I have. Any suggestions?
Those assumptions are correct. I also would like to be clear that the numbers are randomly organized and not in any orders in each row (like NOT col4>col3>col2>col1); at the same time, there could be more than one missing values in each row, which can be at any columns.
Thank you,
OK, the program is short enough that you could probably post your log. It should be an easy fix.
Another way using transpose with sql:
proc transpose data=have out=want;
by id;
run;
proc sql;
create table uniq as
select id, count(distinct col1) as Unique_values from want group by id;
quit;
smart way!
Another array option:
data have;
input ID Col1 Col2 Col3 Col4;
cards;
1 12 12 13 14
2 23 34 34 34
3 55 67 . 45
4 33 33 33 33
;
run;
data want;
set have;
array temp(50) _temporary_;
array col(50) col1-col50;
do i=1 to dim(col);
if col(i) not in temp then
temp(i)=col(i);
end;
ct=n(of temp(*));
call missing (of temp(*));
keep id ct;
run;
It works! Thank you!
It is IML thing. data have; input ID Col1 Col2 Col3 Col4; cards; 1 12 12 13 14 2 23 34 34 34 3 55 67 . 45 4 33 33 33 33 5 . . . . ; run; proc iml; use have(keep=id); read all var {id}; close; use have(keep=col:); read all var _num_ into x; close; n=countunique(x,'row')-(countmiss(x,'row')^=0); create want var{id n}; append; close; quit;
For some reason both of your posts in this thread are truncated. Maybe you can try posting your code as SAS code?
Yes, indeed, I was using IE. IMHO, the quality of this forum is not in par with SAS reputation. If I use Chrome, I sometimes have a login issue. Another one is of course, The notorious Hash colon problem, the colon gets to turn into literal spelling. Wonder will it get ever fixed.
OR Hash Table. data have; input ID Col1 Col2 Col3 Col4; cards; 1 12 12 13 14 2 23 34 34 34 3 55 67 . 45 4 33 33 33 33 5 . . . . ; run; data want; if _n_ eq 1 then do; declare hash h(); h.definekey('k'); h.definedone(); end; set have; array x{*} col:; do i=1 to dim(x); if not missing(x{i}) then do;k=x{i};h.replace();end; end; n=h.num_items; h.clear(); drop i k col:; run;
Thanks to everyone's help! The problem already been solved.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.