How can I determine the distinct values of many variables with just one syntax?
For instance to know the distinct values of 'Make' from Sashelp. cars data set, the syntax is as follows:
proc sql;
select distinct(make)
from sashelp.cars;
quit;
BUT if we would like to know the distinct values of all the variables in one syntax?
Before actually listing the VALUES I would suggest getting an idea just how many there might be. The Proc Freq Nlevels will do that. Coupled with ODS SELECT that is the only thing in the result.
ods select nlevels; proc freq data=sashelp.cars nlevels; tables _all_ / missing; run;
Consider if you run Proc Freq on a data set that has a unique identifier for every record and the data set is moderately large. You may end up printing millions of records with one value each line. (May actually never complete if the data set is large enough).
In the output from above you see that the Invoice variable has 425 values. Since there are only 428 observations in the data set that means you have very few duplicated values and use of "distinct Invoice" may not be particularly helpful.
You also get to see quickly which variables have any (not how many, just if any are) missing values (cylinder the only one in this case because data includes the Mazda RX-8, last of the Wankle engines that use rotors instead of piston/cylinders)
Hi @Sharan,
For a quick overview you can use PROC FREQ:
proc freq data=sashelp.cars;
tables _all_ / missing;
run;
Before actually listing the VALUES I would suggest getting an idea just how many there might be. The Proc Freq Nlevels will do that. Coupled with ODS SELECT that is the only thing in the result.
ods select nlevels; proc freq data=sashelp.cars nlevels; tables _all_ / missing; run;
Consider if you run Proc Freq on a data set that has a unique identifier for every record and the data set is moderately large. You may end up printing millions of records with one value each line. (May actually never complete if the data set is large enough).
In the output from above you see that the Invoice variable has 425 values. Since there are only 428 observations in the data set that means you have very few duplicated values and use of "distinct Invoice" may not be particularly helpful.
You also get to see quickly which variables have any (not how many, just if any are) missing values (cylinder the only one in this case because data includes the Mazda RX-8, last of the Wankle engines that use rotors instead of piston/cylinders)
If you just want the NLEVELS printed without the actual values then just use the / NOPRINT option on the TABLES statement.
proc freq data=sashelp.class nlevels;
tables _all_ / noprint;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.