BookmarkSubscribeRSS Feed
SASuser4321
Obsidian | Level 7

My goal is to combine these tables into one without having to manually run my macro each time for each column.

The code I currently have with me is the following:

%macro task_Oct(set,col_name);

data _type_;
set &set;
call symputx('col_type', vtype(&col_name));
run;

proc sql; 
create table work.oct27 as
select "&col_name" as variable, 
"&col_type" as type, 
            nmiss(&col_name) as missing_val, 
            count(distinct &col_name) as distinct_val
    from &set;
quit;

%mend task_Oct;

%task_Oct(sashelp.cars,Origin)

The above code gives me the following output:

|Var   |Type     |missing_val|distinct_val| 
|Origin|Character|0          | 3          | 

But the sashelp.cars data sheet has 15 columns and so I would like to output a new data sheet which has 15 rows with 4 columns.

I would like to get the following combined table as the output of my code:

|Var   |Type     |missing_val|distinct_val| 
|Make  |Character|0          | 38         |
|Model |Character|0          | 425        | 
|Type  |Character|0          | 6          | 
|Origin|Character|0          | 3          | 
... ... ... 

Since I'm using a macro, I could run my code 15 different times by manually entering the names of the columns and then merging the tables into 1; and it wouldn't be a problem. But what if I have a table with 100s of columns? I could use some loop statement but I'm not sure how to go about that in this case. Help would be appreciated. Thank you.
I wish to solve this task using DO loops.

 

12 REPLIES 12
PeterClemmensen
Tourmaline | Level 20

Is it a requirement to use Do Loops? And why?

 

Or do you simply want a table with col name, type, the number of missing values and the number of distinct values?

SASuser4321
Obsidian | Level 7

My current code allows me to view the table for one column and I think that if I use DO loops, I can get a table for each column and then if I use PROC Append, I can just stack all the data sets together to get one nice looking table. This way I can also use elements of my current code without them having to go waste and having to rewrite a completely fresh code. Sure, there may be faster ways of doing this (without using DO loops) but I'm currently learning SAS and if there is a way to solve this task with DO loops, I would be interested in that.

PeterClemmensen
Tourmaline | Level 20

Ok. As pointet out by others, I do not recommend using this method. But since you're learning SAS and this is in a learning context, I have expanded your code a bit to help you.

 

When you say Do Loop, I assume you want to use a macro %do loop in your code and do what you've already written for as many variables available in the input data. See the code below and feel free to ask further

 

%macro task_Oct(set);

   proc sql;
      select nvar into :nvar 
      from dictionary.tables
      where libname=upcase(scan("&set.", 1, '.')) 
        and memname=upcase(scan("&set.", 2, '.'));
   quit;

   %do i = 1 %to &nvar.;

      proc sql;
         select  name, type
            into :name, :type
         from dictionary.columns
         where libname=upcase(scan("&set.", 1, '.')) 
           and memname=upcase(scan("&set.", 2, '.'))
         and varnum = &i.;
      quit;

      proc sql; 
      create table table_&i. as
      select distinct "&name" as variable 
           , "&type" as type
           , cmiss(&name) as missing_val
           , count(distinct &name) as distinct_val
          from &set.;
      quit;
   %end;

   data want;
      set table_:;
   run;

%mend task_Oct;

%task_Oct(sashelp.cars)
PaigeMiller
Diamond | Level 26

@SASuser4321 wrote:


I wish to solve this task using DO loops


In general, it is a mistake to specify the method of solution (in this case a DO loop), rather than to say "This is the output I want". By specifying the method of solution (DO loop), you miss a much simpler solution. Also, DO loops in SAS ought to be the last resort, rather than the first choice.


PROC FREQ with the NLEVELS option does what you want, without any looping.

 

ods select nlevels;
proc freq nlevels data=sashelp.cars;
tables _all_;
run;

  

--
Paige Miller
SASuser4321
Obsidian | Level 7

proc freq doesn't tell me that there are 2 missing values in the Cylinders column in the sashelp.cars data set. It counts the 2 "." as 1, while I would like the output table of the code to make it clear that there are indeed 2 missing values.

PaigeMiller
Diamond | Level 26

Okay, got it. You will have to combine data from several sources (PROC FREQ, PROC SUMMARY, PROC CONTENTS), but again no DO loops needed.

 

ods select none;
proc freq nlevels data= sashelp.cars;
    ods output nlevels=nlevels;
    tables _all_;
run;
ods select all;
proc summary data=sashelp.cars;
    var _numeric_;
    output out=_summary_ nmiss=;
run;
proc transpose data=_summary_(drop=_type_ _freq_) out=_summary_t;
run;
proc sort data=_summary_t;
    by _name_;
run;
proc contents noprint data=sashelp.cars out=_contents_;
run;
proc sort data=nlevels;
    by tablevar;
run;
proc sort data=_contents_;
    by name;
run;
proc format;
    value typef 1='Numeric' 2='Character';
run;
data final;
    merge nlevels(rename=(tablevar=name)) _contents_(keep=name type) _summary_t(rename=(_name_=name col1=n_missing));
    by name;
    format type typef.;
run;
--
Paige Miller
SASuser4321
Obsidian | Level 7

@PaigeMiller  In this code, how did you know that we had to write "_numeric_" and not something else after var?
And what is the purpose of "nmiss="? Thanks.

proc summary data=sashelp.cars;
    var _numeric_;
    output out=_summary_ nmiss=;
run;

 

 

PaigeMiller
Diamond | Level 26

_NUMERIC_ allows PROC SUMMARY to operate on all numeric variables, I don't have to list them by name.

 

NMISS= calculates the number of missing values for each variable in the VAR statement.

--
Paige Miller
SASuser4321
Obsidian | Level 7
Thank you. But why is there nothing after the "=" sign after NMISS? NMISS=?
PaigeMiller
Diamond | Level 26

Nothing after the equal sign sets the new variable name (the name of the new variable that has the count of number of missings) to be the same as the original variable name. On the other hand if you had NMISS=POTATO, then the new variable name is POTATO. It was fine to use NMISS= in this case.

--
Paige Miller
SASuser4321
Obsidian | Level 7

@PaigeMiller Thank you. But when I run your code, I don't see the number of missing values for the character variables in the final table. I think this could be because your summary table (_summary_t) has 10 rows (only numeric) instead of 15?

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 3481 views
  • 0 likes
  • 3 in conversation