How to get distinct observations from a dataset without SQL

How can I get a count of the distinct values of a variable without using PROC SQL?

Example dataset:

ID      Name
101     AAA
102     BBB
103     CCC
105     EEE
101     AAA
103     CCC


Accepted Solutions
Solution
Tuesday
Super Contributor
Posts: 251

Re: how to get distinct observation from a dataset


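Which variable should define the distinct observations? In your example either ID or Name could be used. The simplest way is PROC SORT with the NODUPKEY option, which keeps the first observation for each BY value. Adding OUT= writes the result to a new dataset, so that have itself is not overwritten.

For unique Name:

proc sort data=have out=unique_name nodupkey;
  by Name;
run;

For unique ID:

proc sort data=have out=unique_id nodupkey;
  by id;
run;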
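
Since the question asks for a count, one way to get it from the sorted result is to read the observation count of the de-duplicated dataset. This is a minimal sketch, assuming the input dataset is named have as in the steps above; distinct_id and n_distinct are illustrative names, not from the original post.

```sas
/* Write the de-duplicated IDs to a new dataset so HAVE is left untouched */
proc sort data=have(keep=id) out=distinct_id nodupkey;
  by id;
run;

/* NOBS= is filled in at compile time, so no observations are actually read */
data _null_;
  if 0 then set distinct_id nobs=n_distinct;
  put "Distinct ID values: " n_distinct;
  stop;
run;
```

PROC SORT also writes a note to the log saying how many observations with duplicate key values were deleted, which is a quick cross-check on the count.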

 

There are also several ways to do this with a DATA step and a hash object.
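
As one possible version of the hash approach, the sketch below loads the ID values into a hash object and reads back the number of keys it holds. It assumes the input dataset is named have and the key variable is id; n_distinct, rc and distinct_id are illustrative names, not from the original post.

```sas
data _null_;
  /* Define ID in the program data vector without reading any rows */
  if 0 then set have(keep=id);

  /* Load HAVE into a hash keyed on ID. By default only the first
     observation for each key value is stored, so duplicates are dropped. */
  declare hash h(dataset: "have(keep=id)");
  h.defineKey("id");
  h.defineData("id");
  h.defineDone();

  /* NUM_ITEMS is the number of items held, i.e. the distinct ID count */
  n_distinct = h.num_items;
  put "Distinct ID values: " n_distinct;

  /* Optionally save the distinct values themselves */
  rc = h.output(dataset: "distinct_id");
  stop;
run;
```

Unlike PROC SORT, this does not need the data to be sorted, but the distinct key values have to fit in memory.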

 

Editor's note: @Tom suggests PROC FREQ with the NLEVELS option. Here's a full example of how that could work.

 

data t;
infile datalines dsd delimiter=',';
input  id $ name $;
datalines;
101,AAA
102,BBB
103,CCC
105,EEE
101,AAA
103,CCC
104,CCC
;
run;

proc freq data=t nlevels noprint;

/* save the ID values that occur exactly once;
   drop the WHERE= option to keep every distinct ID */
tables id /
   out=uniqueId (where=(count=1));

/* save the Name values that occur exactly once */
tables name /
   out=uniqueName (where=(count=1));

/* save the ID*Name combinations that occur exactly once */
tables id * name /
   out=uniqueWhole (where=(count=1));
run;
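
To get the distinct counts themselves from NLEVELS, the NLevels table can be captured with ODS OUTPUT. This is a sketch, assuming the dataset t created above; level_counts is an illustrative name. As far as I know, NOPRINT on the PROC FREQ statement suppresses the NLevels table as well, so here NOPRINT is placed on the TABLES statement instead.

```sas
/* Capture the NLevels table; TABLES ... / NOPRINT hides the frequency
   tables but still lets PROC FREQ produce the NLevels table */
ods output nlevels=level_counts;
proc freq data=t nlevels;
  tables id name / noprint;
run;

/* TableVar holds the variable name, NLevels the number of distinct values */
proc print data=level_counts noobs;
  var TableVar NLevels;
run;
```

For the dataset t above, this should report 5 levels for id and 4 for name.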

 



All Replies
Occasional Contributor
Posts: 8

Re: how to get distinct observation from a dataset

thank you

Super User
Posts: 6,351

Re: how to get distinct observation from a dataset

PROC FREQ NLEVELS

Occasional Contributor
Posts: 8

Re: how to get distinct observation from a dataset

Thank you Tom.

☑ This topic is SOLVED.
