I am trying to create a macro that calculates number of levels (distinct categories) for all the character variables in a dataset. Currently, i am using PROC SQL to calculate distinct categories. If i have 100 variables, i have to run PROC SQL 100 times in a loop. Is there a better way to do it? Is "stackods" kind of option available in PROC SQL so that i can reshape my dataset easily?
Sample Code -
proc sql;
create table abc as
select count(distinct(sex)) as sex_levels
from sashelp.class;
quit;
Sample Output -
Variable | N_Levels |
Sex | 2 |
Age | 6 |
It's just the sample code and output for 2 variables. I need to do it for all the character variables in a dataset.
Note : All my categorical variables are stored as character variables (string) in my dataset.
Thanks in anticipation!
PROC FREQ NLEVELS ....
I am with Tom: proc.freq.nlevels provides the answer, ... for the variables in the data-set.
I have written several papers on extrapolating from n-levels.
the information you are seeking is in contents output data set which has a variable type -- in (1:n, 2:c)
you have to sort both out-nlevels and out-contents by name in order to merge them.
check this SASware ballot item.
%let data = sashelp.class;
%let data = sashelp.heart;
proc freq data = &data
(keep = _character_)
nlevels;
ods exclude onewayfreqs;
ods output
nlevels = out_n_levels
(keep = tablevar nlevels
rename = (tablevar = name
nlevels = n_levels));
*to match out_contents;
proc print;
How would you tackle a data set with more than 70k levels for at least one variable? (proc freq always yields OoM error.)
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.