BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Sharan
Obsidian | Level 7

How can I determine the distinct values of many variables with just one syntax?

For instance to know the distinct values of 'Make' from Sashelp. cars data set, the syntax is as follows:
proc sql;
select distinct(make)
from sashelp.cars;
quit;

 

BUT if we would like to know the distinct values of all the variables in one syntax?

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Before actually listing the VALUES I would suggest getting an idea just how many there might be. The Proc Freq Nlevels will do that. Coupled with ODS SELECT that is the only thing in the result.

 

ods select nlevels;
proc freq data=sashelp.cars nlevels;
tables _all_ / missing;
run; 

Consider if you run Proc Freq on a data set that has a unique identifier for every record and the data set is moderately large. You may end up printing millions of records with one value each line. (May actually never complete if the data set is large enough).

 

In the output from above you see that the Invoice variable has 425 values. Since there are only 428 observations in the data set that means you have very few duplicated values and use of "distinct Invoice" may not be particularly helpful.

 

You also get to see quickly which variables have any (not how many, just if any are) missing values (cylinder the only one in this case because data includes the Mazda RX-8, last of the Wankle engines that use rotors instead of piston/cylinders)

View solution in original post

4 REPLIES 4
FreelanceReinh
Jade | Level 19

Hi @Sharan,

 

For a quick overview you can use PROC FREQ:

proc freq data=sashelp.cars;
tables _all_ / missing;
run; 
ballardw
Super User

Before actually listing the VALUES I would suggest getting an idea just how many there might be. The Proc Freq Nlevels will do that. Coupled with ODS SELECT that is the only thing in the result.

 

ods select nlevels;
proc freq data=sashelp.cars nlevels;
tables _all_ / missing;
run; 

Consider if you run Proc Freq on a data set that has a unique identifier for every record and the data set is moderately large. You may end up printing millions of records with one value each line. (May actually never complete if the data set is large enough).

 

In the output from above you see that the Invoice variable has 425 values. Since there are only 428 observations in the data set that means you have very few duplicated values and use of "distinct Invoice" may not be particularly helpful.

 

You also get to see quickly which variables have any (not how many, just if any are) missing values (cylinder the only one in this case because data includes the Mazda RX-8, last of the Wankle engines that use rotors instead of piston/cylinders)

Sharan
Obsidian | Level 7
This is very useful. Thank you!
Tom
Super User Tom
Super User

If you just want the NLEVELS printed without the actual values then just use the / NOPRINT option on the TABLES statement.

proc freq data=sashelp.class nlevels;
  tables _all_ / noprint;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1406 views
  • 2 likes
  • 4 in conversation