Ronein
Meteorite | Level 14

Hello

I want to run proc freq for all numeric variables.

I want to export it to a data set and not print it.

In this code I see the distribution only for one variable. Why??

proc freq data=sashelp.class noprint;
tables _numeric_/ out=want (drop = percent);
run;
PaigeMiller
Diamond | Level 26

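Why? Because that's the way SAS programmed it: when one TABLES statement contains several table requests, the OUT= data set holds only one of them (according to the PROC FREQ documentation, the last request, which here is Weight). I don't know why they made that decision.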

 

But you can use ODS OUTPUT to get all the frequencies in one humongous table that isn't particularly easy to work with.

 

ods output onewayfreqs=outputdatasetname; 

 

--
Paige Miller
Ronein
Meteorite | Level 14

Thanks.

This code produces output that is not comfortable to read.

ods output onewayfreqs=want; 
proc freq data=sashelp.class;
tables age height weight;
run;
ods output close;

This is the required output.

proc freq noprint data = sashelp.class; 
tables age / out = data_age 
(rename = (age = category)); 
run;
proc freq noprint data = sashelp.class; 
tables height / out = data_height 
(rename = (height = category )); 
run;
proc freq noprint data = sashelp.class; 
tables weight / out = data_weight 
(rename = (weight = category )); 
run;

proc sort data = data_age; by category; run;
proc sort data = data_height; by category; run;
proc sort data = data_weight; by category; run;
data want;
retain Var_name;
set data_age data_height data_weight indsname = source;
dsname = scan(source,2,'.');  /* extract the data set name */
Var_name=substr(dsname,6);
drop dsname;
run;


 

Is there a way to create it via one proc freq?

 

 

PaigeMiller
Diamond | Level 26

Is there a way to create it via one proc freq?

 

Not that I know of. 

 

Have you looked at the NLEVELS option?

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hello @Ronein,

 

After preparing an intermediate dataset with PROC TRANSPOSE and PROC SORT you can obtain the desired output dataset with one PROC FREQ step:

proc transpose data=sashelp.class out=tmp(drop=name rename=(col1=category)) name=Var_name;
by name;
run;

proc sort data=tmp;
by Var_name;
run;

proc freq data=tmp noprint;
by Var_name;
tables category / out=want;
run;
ballardw
Super User

Learning point: Proc Freq supports multiple Tables statements.

If you think you must do something like this:

 

proc freq noprint data = sashelp.class; 
tables age / out = data_age 
(rename = (age = category)); 
run;
proc freq noprint data = sashelp.class; 
tables height / out = data_height 
(rename = (height = category )); 
run;
proc freq noprint data = sashelp.class; 
tables weight / out = data_weight 
(rename = (weight = category )); 
run;

 

That can be replaced with:

proc freq noprint data = sashelp.class; 
   tables age / out = data_age 
                  (rename = (age = category)); 
   tables height / out = data_height 
                  (rename = (height = category )); 
   tables weight / out = data_weight 
                  (rename = (weight = category )); 
run;

But I would say it shows a lack of imagination to call the ODS output "not comfortable to read".

And to get something that looks a bit like your mashed-together example (but without creating 100 data sets):

 

ods output onewayfreqs=tempfreq; 
proc freq data=sashelp.class;
tables age height weight;
run;

data want;
   set tempfreq;
   length var_name $ 32 Category $ 16;
   Category = cats(of F_:);
   var_name = scan(table,2);
   keep var_name category Frequency percent;
run;

If you look in the ODS output data set you will find two "versions" of each variable: the original variable and one with "F_" prefixed to the name. The F_ variable holds the formatted value and is character.

Each observation will have only one of the F_ variables populated, so you can build the "category" by passing the F_: list to the CATS function, which picks up whichever formatted value is present.

You may ask why to use the F_ variables instead of the raw ones. Consider what happens if you have multiple formats (especially custom ones) assigned to those numeric variables. If you create a single variable, like your "category", from the raw values, what FORMAT would you assign? There's nothing like trying to read a bunch of numbers that are a mixture of date, datetime, currency and other measurements and make sense of them all with a single format assigned. Using the F_ variables removes that headache.

You could use MAX(list) on the raw variables, but unless you spent some time on your variable names you will have trouble getting a simple list that works. Remember that the frequency and percentage variables are numeric too, so you can't use the _numeric_ list.

 

If you want the percentages and the cumulative frequency values I would suggest changing the default variable names as the "percentage of what" and "Cumulative of what" have sort of changed. A long descriptive Label on those variables would be a good idea as well.
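
A rough sketch of that renaming and labeling, not a definitive version. The new names (pct_within_var and so on) and the label text are made up for illustration; Percent, CumFrequency and CumPercent are the names the OneWayFreqs table normally uses.

```sas
/* Capture the one-way tables for the three numeric variables */
ods output onewayfreqs=tempfreq;
proc freq data=sashelp.class;
   tables age height weight;
run;

data want_labeled;
   set tempfreq;
   length var_name $ 32 category $ 16;
   category = cats(of F_:);      /* formatted value of whichever variable this row belongs to */
   var_name = scan(table, 2);    /* Table holds the text "Table Varname" */
   keep var_name category frequency percent cumfrequency cumpercent;
   /* RENAME takes effect on output, so KEEP and LABEL still use the original names */
   rename percent      = pct_within_var       /* hypothetical name */
          cumfrequency = cumfreq_within_var   /* hypothetical name */
          cumpercent   = cumpct_within_var;   /* hypothetical name */
   label percent      = 'Percent of observations within this variable'
         cumfrequency = 'Cumulative frequency within this variable'
         cumpercent   = 'Cumulative percent within this variable';
run;
```

The point of the longer names and labels is that the percentages are still computed per original variable, even though the rows now sit in one stacked data set.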

 

 

The Table variable has the text "Table Varname". So you can pull the variable name quite easily from the Table variable.

whymath
Lapis Lazuli | Level 10

The use of CATS() here is quite tricky; I usually use COALESCEC() for this.

 

Update: After trying it, I think CATT() may be better than CATS() in this case, because it avoids removing the leading spaces of the category.
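
To compare the two alternatives side by side, something like the following should do. It is only a sketch: the data set name want_alt and the variable names cat_coalesce and cat_catt are invented for this example, and the $16 length is simply carried over from the earlier code.

```sas
/* Same ODS OUTPUT step as in the earlier reply */
ods output onewayfreqs=tempfreq;
proc freq data=sashelp.class;
   tables age height weight;
run;

data want_alt;
   set tempfreq;
   length var_name $ 32 cat_coalesce cat_catt $ 16;
   /* COALESCEC returns the first non-missing character argument, unchanged */
   cat_coalesce = coalescec(of F_:);
   /* CATT removes trailing blanks only, so leading blanks of the value survive */
   cat_catt = catt(of F_:);
   var_name = scan(table, 2);
   keep var_name cat_coalesce cat_catt frequency percent;
run;
```

Since only one F_ variable is populated per row, both functions should return that single value; they differ from CATS() in that neither strips its leading blanks.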

PaigeMiller
Diamond | Level 26

If you are trying to determine the number of unique levels for each variable, you want to use the NLEVELS option of PROC FREQ. Although, as I said, the number of unique levels for numeric variables isn't particularly meaningful in most cases.
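
A minimal sketch of that, assuming the level counts should go to a data set (the name level_counts is just a placeholder):

```sas
/* NLevels is the ODS table that the NLEVELS option produces */
ods output nlevels=level_counts;
proc freq data=sashelp.class nlevels;
   /* NOPRINT on the TABLES statement suppresses the frequency tables
      but still lets the NLevels table be produced and captured */
   tables _numeric_ / noprint;
run;

proc print data=level_counts;
run;
```

Note that NOPRINT goes on the TABLES statement here. Putting NOPRINT on the PROC FREQ statement suppresses all displayed output, and then ODS OUTPUT has nothing to capture.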

--
Paige Miller
Reeza
Super User

https://gist.github.com/statgeek/e0903d269d4a71316a4e

 

*Run frequency for tables;
ods table onewayfreqs=temp;
proc freq data=sashelp.class;
	table _numeric_;
run;

*Format output;
data want;
length variable $32. variable_value $50.;
set temp;
Variable=scan(table, 2);

Variable_Value=strip(trim(vvaluex(variable)));

keep variable variable_value frequency percent cum:;
label variable='Variable' 
	variable_value='Variable Value';
run;

*Display;
proc print data=want(obs=20) label;
run;

 
