- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
how to get a count of distinct values from a variable without using proc sql ?
example dataset:
ID Name
101 AAA
102 BBB
103 CCC
105 EEE
101 AAA
103 CCC
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Which variable to be used to find the distinct observations - either ID or Name can be used in your example. The simplest way is to use Proc sort with nodupkey.
For unique Name:
proc sort data = have nodupkey;
by Name;
run;
For unique ID:
proc sort data = have nodupkey;
by id;
run;
There several ways to do with data step and hash.
Editor's note: @Tom suggests PROC FREQ with NLEVELS option. Here's a full example of how that could work.
data t;
infile datalines dsd delimiter=',';
input id $ name $;
datalines;
101,AAA
102,BBB
103,CCC
105,EEE
101,AAA
103,CCC
104,CCC
;
run;
proc freq data=t nlevels noprint;
/* save values of unique ID */
tables id /
out=uniqueId (where=(count=1));
/* save values of unique Name */
tables name /
out=uniqueName (where=(count=1));
/* save values of unique combination */
tables id * name /
out=uniqueWhole (where=(count=1));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Which variable to be used to find the distinct observations - either ID or Name can be used in your example. The simplest way is to use Proc sort with nodupkey.
For unique Name:
proc sort data = have nodupkey;
by Name;
run;
For unique ID:
proc sort data = have nodupkey;
by id;
run;
There several ways to do with data step and hash.
Editor's note: @Tom suggests PROC FREQ with NLEVELS option. Here's a full example of how that could work.
data t;
infile datalines dsd delimiter=',';
input id $ name $;
datalines;
101,AAA
102,BBB
103,CCC
105,EEE
101,AAA
103,CCC
104,CCC
;
run;
proc freq data=t nlevels noprint;
/* save values of unique ID */
tables id /
out=uniqueId (where=(count=1));
/* save values of unique Name */
tables name /
out=uniqueName (where=(count=1));
/* save values of unique combination */
tables id * name /
out=uniqueWhole (where=(count=1));
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PROC FREQ NLEVELS
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you Tom.