- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello everyone,
I'm wondering if sas could count distinct values in each row. For example, my data looks similar like:
ID Col1 Col2 Col3 Col4
1 12 12 13 14
2 23 34 34 34
3 55 67 . 45
4 33 33 33 33
I woud like to have the count of unique values that each ID has, something like:
ID # Uniquevalues
1 3
2 2
3 3
4 1
I appreciate any ideas that may help. Thank you!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another way using transpose with sql:
proc transpose data=have out=want;
by id;
run;
proc sql;
create table uniq as
select id, count(distinct col1) as Unique_values from want group by id;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Assumptions make a difference in how complex the code becomes. I'm going to assume:
- You may have missing values in your data
- The sample data is representative, but you actually have more variables than 4.
- Since the original variables are not being kept in the output, they can be changed by the program
Here's one way to go about it (assuming you actually have 50 variables):
data want;
set have;
array col {50};
call sortn (of col1-col50);
unique_vals = (col1 > .);
do _n_=2 to 50;
if col{_n_} > col{_n_-1} then unique_vals + 1;
end;
keep id unique_vals;
run;
If the assumptions should be different, we can always adjust.
Good luck.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for your input and assumptions. I tried your codes, but in the new dataset it showed all zeros in column " unique_vals". I just change the number 50 to exact columns I have. Any suggestions?
Those assumptions are correct. I also would like to be clear that the numbers are randomly organized and not in any orders in each row (like NOT col4>col3>col2>col1); at the same time, there could be more than one missing values in each row, which can be at any columns.
Thank you,
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
OK, the program is short enough that you could probably post your log. It should be an easy fix.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another way using transpose with sql:
proc transpose data=have out=want;
by id;
run;
proc sql;
create table uniq as
select id, count(distinct col1) as Unique_values from want group by id;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
smart way!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Another array option:
data have;
input ID Col1 Col2 Col3 Col4;
cards;
1 12 12 13 14
2 23 34 34 34
3 55 67 . 45
4 33 33 33 33
;
run;
data want;
set have;
array temp(50) _temporary_;
array col(50) col1-col50;
do i=1 to dim(col);
if col(i) not in temp then
temp(i)=col(i);
end;
ct=n(of temp(*));
call missing (of temp(*));
keep id ct;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It works! Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
It is IML thing. data have; input ID Col1 Col2 Col3 Col4; cards; 1 12 12 13 14 2 23 34 34 34 3 55 67 . 45 4 33 33 33 33 5 . . . . ; run; proc iml; use have(keep=id); read all var {id}; close; use have(keep=col:); read all var _num_ into x; close; n=countunique(x,'row')-(countmiss(x,'row')^=0); create want var{id n}; append; close; quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
For some reason both of your posts in this thread are truncated. Maybe you can try posting your code as SAS code?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, indeed, I was using IE. IMHO, the quality of this forum is not in par with SAS reputation. If I use Chrome, I sometimes have a login issue. Another one is of course, The notorious Hash colon problem, the colon gets to turn into literal spelling. Wonder will it get ever fixed.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
OR Hash Table. data have; input ID Col1 Col2 Col3 Col4; cards; 1 12 12 13 14 2 23 34 34 34 3 55 67 . 45 4 33 33 33 33 5 . . . . ; run; data want; if _n_ eq 1 then do; declare hash h(); h.definekey('k'); h.definedone(); end; set have; array x{*} col:; do i=1 to dim(x); if not missing(x{i}) then do;k=x{i};h.replace();end; end; n=h.num_items; h.clear(); drop i k col:; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks to everyone's help! The problem already been solved.