- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
How can I determine the distinct values of many variables with just one syntax?
For instance to know the distinct values of 'Make' from Sashelp. cars data set, the syntax is as follows:
proc sql;
select distinct(make)
from sashelp.cars;
quit;
BUT if we would like to know the distinct values of all the variables in one syntax?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Before actually listing the VALUES I would suggest getting an idea just how many there might be. The Proc Freq Nlevels will do that. Coupled with ODS SELECT that is the only thing in the result.
ods select nlevels; proc freq data=sashelp.cars nlevels; tables _all_ / missing; run;
Consider if you run Proc Freq on a data set that has a unique identifier for every record and the data set is moderately large. You may end up printing millions of records with one value each line. (May actually never complete if the data set is large enough).
In the output from above you see that the Invoice variable has 425 values. Since there are only 428 observations in the data set that means you have very few duplicated values and use of "distinct Invoice" may not be particularly helpful.
You also get to see quickly which variables have any (not how many, just if any are) missing values (cylinder the only one in this case because data includes the Mazda RX-8, last of the Wankle engines that use rotors instead of piston/cylinders)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @Sharan,
For a quick overview you can use PROC FREQ:
proc freq data=sashelp.cars;
tables _all_ / missing;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Before actually listing the VALUES I would suggest getting an idea just how many there might be. The Proc Freq Nlevels will do that. Coupled with ODS SELECT that is the only thing in the result.
ods select nlevels; proc freq data=sashelp.cars nlevels; tables _all_ / missing; run;
Consider if you run Proc Freq on a data set that has a unique identifier for every record and the data set is moderately large. You may end up printing millions of records with one value each line. (May actually never complete if the data set is large enough).
In the output from above you see that the Invoice variable has 425 values. Since there are only 428 observations in the data set that means you have very few duplicated values and use of "distinct Invoice" may not be particularly helpful.
You also get to see quickly which variables have any (not how many, just if any are) missing values (cylinder the only one in this case because data includes the Mazda RX-8, last of the Wankle engines that use rotors instead of piston/cylinders)
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you just want the NLEVELS printed without the actual values then just use the / NOPRINT option on the TABLES statement.
proc freq data=sashelp.class nlevels;
tables _all_ / noprint;
run;