BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
learner_sas
Quartz | Level 8

Hello Everybody,

I am trying to create a macro so that I can check distribution of each of the variable of the given dataset. Here is my basic query.


PROC CONTENTS DATA=test VARNUM OUT=NAME;RUN;

DATA NAME;

  SET NAME;

ID = _N_;

RUN;

PROC SQL;

SELECT NAME,ID

INTO :VARNAME,:NUMVAR

FROM WORK.NAME;

QUIT;

%macro dist(dsname = ,varname=);

PROC FREQ DATA=&dsname;

tables &varname/list missing out=&varname._dist;

run;

%mend;

%dist(dsname=test,varname=var1);

%dist(dsname=test,varname=var2);

%dist(dsname=test,varname=var3);

in macro %dist I have type all the variable names if I want to see the distribution of variables. I am wondering if we can incorporate &varname macro with in %dist macro so that it would iterate macro %dist for all the variables in test data. I am aware that it can be time and system consuming for larger tables.

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
CTorres
Quartz | Level 8

In this casse I think you need to return to the "report" solution instead of the "SAS Table" solution because Proc FREQ can not create "one way frequencies" with two or more variables in the tables statement (or two or more tables statements each one with one variable) in a single dataset.

I then suggest the following code to generate two PDF files each one containing the frequency distribution for numeric or character variables:

ods html close;

ods pdf file='c:\temp\class_num.pdf';

proc freq data=sashelp.class order=freq page;

  tables _NUMERIC_ / nocum nopercent ;

run;

ods pdf close;

ods pdf file='c:\temp\class_Char.pdf';

proc freq data=sashelp.class order=freq page;

  tables _CHARACTER_ / nocum nopercent ;

run;

ods pdf close;

ods html;

I am not very expert with ODS so maybe there is a better way to get the same result.

I hope this helps,

CTorres

View solution in original post

10 REPLIES 10
MaikH_Schutze
Quartz | Level 8

Hi,

You should be able to get a distribution of all variables simply by using the keyword _ALL_; for example:

proc freq data=dsn;

tables _all_;

run;

Alternatively, you can use _NUMERIC_ for all numeric variables or _CHARACTER_ for all character variables that are already defined in the data set.

Hope that helps,

M.

MaikH_Schutze
Quartz | Level 8

I guess, technically, it's called a "SAS name list", not a keyword: SAS(R) 9.2 Language Reference: Concepts, Second Edition

learner_sas
Quartz | Level 8

Thank you . I was thinking on hard way.

CTorres
Quartz | Level 8

Hi,

If you need to generate SAS datasets instead of a printed report you could use the "CALL EXECUTE" solution suggested by TonKari. See below:

147  data _null_;
148    length tables $80;
149    set sashelp.vcolumn end=eof;
150    where libname='SASHELP' and memname='CLASS';
151    if _N_=1 then call execute('proc freq noprint data=sashelp.class;');
152    tables=cat('tables ',strip(name),' / nocum out=freq_',strip(name),';');
153    call execute(tables);
154    if eof then call execute('run;');
155  run;

NOTE: There were 5 observations read from the data set SASHELP.VCOLUMN.
      WHERE (libname='SASHELP') and (memname='CLASS');
NOTE: DATA statement used (Total process time):
      real time           0.42 seconds
      cpu time            0.17 seconds


NOTE: CALL EXECUTE generated line.
1   + proc freq noprint data=sashelp.class;
2   + tables Name / nocum out=freq_Name;
3   + tables Sex / nocum out=freq_Sex;
4   + tables Age / nocum out=freq_Age;
5   + tables Height / nocum out=freq_Height;
6   + tables Weight / nocum out=freq_Weight;
7   + run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.FREQ_NAME has 19 observations and 3 variables.
NOTE: The data set WORK.FREQ_SEX has 2 observations and 3 variables.
NOTE: The data set WORK.FREQ_AGE has 6 observations and 3 variables.
NOTE: The data set WORK.FREQ_HEIGHT has 17 observations and 3 variables.
NOTE: The data set WORK.FREQ_WEIGHT has 15 observations and 3 variables.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

CTorres

learner_sas
Quartz | Level 8

Thank you,it is even better than I thought. Now I am wondering if following scenarios would be possible

I want to set all the numeric variables distribution table in to one table and all character variables into the other tables so that I can see if distribution of variable in one glance rather than going through all the variables table. I thought adding by statement in execute statement would work but it did not work since table is not sorted properly.

CTorres
Quartz | Level 8

I think I dont really understand what you want to get.

Please post a sample of the "all the numeric variables distribution table" because I think maybe proc freq is not a good tool to get a distribution of numeric values unless you group then somehow using a format.

For character variables I woud suggest to use proc summary to get the frequency distribution in one table. Something like this:

proc summary data=sashelp.cars nway;
  class make type origin cylinders;
  output out=summary (drop=_type_);
run;

learner_sas
Quartz | Level 8

I want to get 2 tables instead of  tables for each variable. Table one would be all numeric variables distribution put together and table two with all the character variables. I agree proc freq for numeric variable might be tricky sometimes but it is preferable way for me since I do not know variable distribution to format the numeric variable properly. Since I have to use this process frequently therefore I thought it would be great if there is a single process for all the variables which helps me to check quality of data I received.

ballardw
Super User

One way, likely to be several.

First know which variables are numeric and character in your data set. I prefer to use the SAS dictionary tables instead of contents but the idea would be similar but the data type in contents output is 1 2 and numeric. (BTW instead of your step to get ID you could use the VARNUM variable in the contents output unless you really want the variables ordered alphabetically and sensitive to case)

Replace LIBRARY and DATASET with the library and dataset name of your starting set. These must be in uppercase due to the way SAS stores information about data. Use WORK if that is where your single level datasets unless you have a USER library set.

proc sql noprint;

   select name into :charvars separated by ' '

   from dictionary.columns

      where Libname='LIBRARY' AND MEMNAME='DATASET'

      AND TYPE='char'

;

   select name into :numvars separated by ' '

   from dictionary.columns

      where Libname='LIBRARY' AND MEMNAME='DATASET'

      AND TYPE='num'

;

quit;

That created to macro variables with lists of names. You can split the data by:

Data NumericVars (Keep = &numvars)

        CharacterVars (keep=&charvars)

;

set have; /* where HAVE is replaced by the name of your dataset*/

run;

CTorres
Quartz | Level 8

In this casse I think you need to return to the "report" solution instead of the "SAS Table" solution because Proc FREQ can not create "one way frequencies" with two or more variables in the tables statement (or two or more tables statements each one with one variable) in a single dataset.

I then suggest the following code to generate two PDF files each one containing the frequency distribution for numeric or character variables:

ods html close;

ods pdf file='c:\temp\class_num.pdf';

proc freq data=sashelp.class order=freq page;

  tables _NUMERIC_ / nocum nopercent ;

run;

ods pdf close;

ods pdf file='c:\temp\class_Char.pdf';

proc freq data=sashelp.class order=freq page;

  tables _CHARACTER_ / nocum nopercent ;

run;

ods pdf close;

ods html;

I am not very expert with ODS so maybe there is a better way to get the same result.

I hope this helps,

CTorres

TomKari
Onyx | Level 15

Yes, it's actually fairly easy. Look at the "CALL EXECUTE" statement...it's exactly what you're looking for. I do this all the time.

Tom

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 12150 views
  • 5 likes
  • 5 in conversation