Desktop productivity for business analysts and programmers

create distribution of all the variables of a dataset using macro

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 89
Accepted Solution

create distribution of all the variables of a dataset using macro

Hello Everybody,

I am trying to create a macro so that I can check distribution of each of the variable of the given dataset. Here is my basic query.


PROC CONTENTS DATA=test VARNUM OUT=NAME;RUN;

DATA NAME;

  SET NAME;

ID = _N_;

RUN;

PROC SQL;

SELECT NAME,ID

INTO :VARNAME,:NUMVAR

FROM WORK.NAME;

QUIT;

%macro dist(dsname = ,varname=);

PROC FREQ DATA=&dsname;

tables &varname/list missing out=&varname._dist;

run;

%mend;

%dist(dsname=test,varname=var1);

%dist(dsname=test,varname=var2);

%dist(dsname=test,varname=var3);

in macro %dist I have type all the variable names if I want to see the distribution of variables. I am wondering if we can incorporate &varname macro with in %dist macro so that it would iterate macro %dist for all the variables in test data. I am aware that it can be time and system consuming for larger tables.

Thank you


Accepted Solutions
Solution
‎01-08-2015 02:25 PM
Regular Contributor
Posts: 180

Re: create distribution of all the variables of a dataset using macro

In this casse I think you need to return to the "report" solution instead of the "SAS Table" solution because Proc FREQ can not create "one way frequencies" with two or more variables in the tables statement (or two or more tables statements each one with one variable) in a single dataset.

I then suggest the following code to generate two PDF files each one containing the frequency distribution for numeric or character variables:

ods html close;

ods pdf file='c:\temp\class_num.pdf';

proc freq data=sashelp.class order=freq page;

  tables _NUMERIC_ / nocum nopercent ;

run;

ods pdf close;

ods pdf file='c:\temp\class_Char.pdf';

proc freq data=sashelp.class order=freq page;

  tables _CHARACTER_ / nocum nopercent ;

run;

ods pdf close;

ods html;

I am not very expert with ODS so maybe there is a better way to get the same result.

I hope this helps,

CTorres

View solution in original post


All Replies
Contributor
Posts: 45

Re: create distribution of all the variables of a dataset using macro

Hi,

You should be able to get a distribution of all variables simply by using the keyword _ALL_; for example:

proc freq data=dsn;

tables _all_;

run;

Alternatively, you can use _NUMERIC_ for all numeric variables or _CHARACTER_ for all character variables that are already defined in the data set.

Hope that helps,

M.

Contributor
Posts: 45

Re: create distribution of all the variables of a dataset using macro

I guess, technically, it's called a "SAS name list", not a keyword: SAS(R) 9.2 Language Reference: Concepts, Second Edition

Frequent Contributor
Posts: 89

Re: create distribution of all the variables of a dataset using macro

Thank you . I was thinking on hard way.

Regular Contributor
Posts: 180

Re: create distribution of all the variables of a dataset using macro

Hi,

If you need to generate SAS datasets instead of a printed report you could use the "CALL EXECUTE" solution suggested by TonKari. See below:

147  data _null_;
148    length tables $80;
149    set sashelp.vcolumn end=eof;
150    where libname='SASHELP' and memname='CLASS';
151    if _N_=1 then call execute('proc freq noprint data=sashelp.class;');
152    tables=cat('tables ',strip(name),' / nocum out=freq_',strip(name),';');
153    call execute(tables);
154    if eof then call execute('run;');
155  run;

NOTE: There were 5 observations read from the data set SASHELP.VCOLUMN.
      WHERE (libname='SASHELP') and (memname='CLASS');
NOTE: DATA statement used (Total process time):
      real time           0.42 seconds
      cpu time            0.17 seconds


NOTE: CALL EXECUTE generated line.
1   + proc freq noprint data=sashelp.class;
2   + tables Name / nocum out=freq_Name;
3   + tables Sex / nocum out=freq_Sex;
4   + tables Age / nocum out=freq_Age;
5   + tables Height / nocum out=freq_Height;
6   + tables Weight / nocum out=freq_Weight;
7   + run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.FREQ_NAME has 19 observations and 3 variables.
NOTE: The data set WORK.FREQ_SEX has 2 observations and 3 variables.
NOTE: The data set WORK.FREQ_AGE has 6 observations and 3 variables.
NOTE: The data set WORK.FREQ_HEIGHT has 17 observations and 3 variables.
NOTE: The data set WORK.FREQ_WEIGHT has 15 observations and 3 variables.
NOTE: PROCEDURE FREQ used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

CTorres

Frequent Contributor
Posts: 89

Re: create distribution of all the variables of a dataset using macro

Thank you,it is even better than I thought. Now I am wondering if following scenarios would be possible

I want to set all the numeric variables distribution table in to one table and all character variables into the other tables so that I can see if distribution of variable in one glance rather than going through all the variables table. I thought adding by statement in execute statement would work but it did not work since table is not sorted properly.

Regular Contributor
Posts: 180

Re: create distribution of all the variables of a dataset using macro

I think I dont really understand what you want to get.

Please post a sample of the "all the numeric variables distribution table" because I think maybe proc freq is not a good tool to get a distribution of numeric values unless you group then somehow using a format.

For character variables I woud suggest to use proc summary to get the frequency distribution in one table. Something like this:

proc summary data=sashelp.cars nway;
  class make type origin cylinders;
  output out=summary (drop=_type_);
run;

Frequent Contributor
Posts: 89

Re: create distribution of all the variables of a dataset using macro

I want to get 2 tables instead of  tables for each variable. Table one would be all numeric variables distribution put together and table two with all the character variables. I agree proc freq for numeric variable might be tricky sometimes but it is preferable way for me since I do not know variable distribution to format the numeric variable properly. Since I have to use this process frequently therefore I thought it would be great if there is a single process for all the variables which helps me to check quality of data I received.

Super User
Posts: 11,121

Re: create distribution of all the variables of a dataset using macro

One way, likely to be several.

First know which variables are numeric and character in your data set. I prefer to use the SAS dictionary tables instead of contents but the idea would be similar but the data type in contents output is 1 2 and numeric. (BTW instead of your step to get ID you could use the VARNUM variable in the contents output unless you really want the variables ordered alphabetically and sensitive to case)

Replace LIBRARY and DATASET with the library and dataset name of your starting set. These must be in uppercase due to the way SAS stores information about data. Use WORK if that is where your single level datasets unless you have a USER library set.

proc sql noprint;

   select name into :charvars separated by ' '

   from dictionary.columns

      where Libname='LIBRARY' AND MEMNAME='DATASET'

      AND TYPE='char'

;

   select name into :numvars separated by ' '

   from dictionary.columns

      where Libname='LIBRARY' AND MEMNAME='DATASET'

      AND TYPE='num'

;

quit;

That created to macro variables with lists of names. You can split the data by:

Data NumericVars (Keep = &numvars)

        CharacterVars (keep=&charvars)

;

set have; /* where HAVE is replaced by the name of your dataset*/

run;

Solution
‎01-08-2015 02:25 PM
Regular Contributor
Posts: 180

Re: create distribution of all the variables of a dataset using macro

In this casse I think you need to return to the "report" solution instead of the "SAS Table" solution because Proc FREQ can not create "one way frequencies" with two or more variables in the tables statement (or two or more tables statements each one with one variable) in a single dataset.

I then suggest the following code to generate two PDF files each one containing the frequency distribution for numeric or character variables:

ods html close;

ods pdf file='c:\temp\class_num.pdf';

proc freq data=sashelp.class order=freq page;

  tables _NUMERIC_ / nocum nopercent ;

run;

ods pdf close;

ods pdf file='c:\temp\class_Char.pdf';

proc freq data=sashelp.class order=freq page;

  tables _CHARACTER_ / nocum nopercent ;

run;

ods pdf close;

ods html;

I am not very expert with ODS so maybe there is a better way to get the same result.

I hope this helps,

CTorres

PROC Star
Posts: 1,146

Re: create distribution of all the variables of a dataset using macro

Yes, it's actually fairly easy. Look at the "CALL EXECUTE" statement...it's exactly what you're looking for. I do this all the time.

Tom

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 3389 views
  • 3 likes
  • 5 in conversation