DATA Step, Macro, Functions and more

Performing the same analysis on many data tables

Accepted Solution Solved
Reply
Super Contributor
Posts: 413
Accepted Solution

Performing the same analysis on many data tables

Hi,

 

I have many data tables each corresponding to a city (such as new_york, boston etc). In each data table I have data for a person's age. salary and number of childern, and I would to make the following proc means for these variables in the following way: 

 

proc means data=new_york n mean max min range std ;

     age salary children;

 run;

 

But instead of doing this manually for each city, is it possible to do this for a list of cities, some sort of "macro for data tables?"

 

Thank you!


Accepted Solutions
Solution
‎09-15-2016 09:14 PM
Super Contributor
Posts: 413

Re: Performing the same analysis on many data tables

[ Edited ]

Thanks Reeza for the direction, the ucla link does actually have prety much the same thing and I got my wanted result!

 

But just one thing that I would like to note. Here is my code :

 

 

%let data_list = boston new_york

%macro means_for_all(all_data);
  %let k=1;
  %let the_data= %scan(&all_data, &k);
  %do %while("&the_data" NE "");
    title "data table is &the_data";
    proc means data=&the_data n min max mean;
      var _all_;
    run;
    %let k = %eval(&k + 1);
    %let the_data = %scan(&all_data, &k);
  %end;
%mend;


%means_for_all(&data_list)

First I put the names of the tables into a macro variable "&data_list". Then when I run the macro the way above, i.e I put  "&data_list" inside the parantheses, I get error messages such as work.data.mean doesn't exist, so for some reason sas treats the proc means options as if they were data names, BUT the result is the same as if when calling the macro I had inputted the variable names one by one, strange... 

View solution in original post


All Replies
Super User
Posts: 17,832

Re: Performing the same analysis on many data tables

Why do you have a table for each city? Can you combine them into one? That would be the fastest way. 

Super Contributor
Posts: 413

Re: Performing the same analysis on many data tables

Hi Reeza,

 

The tables are separate, but is there a way to combine them? (I assume that a new column "city" should be added which wil contain the city's name in it, but I can't figure out how)

 

Thank you!

Respected Advisor
Posts: 4,649

Re: Performing the same analysis on many data tables

Use indsname= option in the set statement to get the dataset name

 

data all;
set NewYork Boston ... indsname=source;
city = source;
run;

proc means data=all ...;
by city;
var ...;
run;
PG
Super Contributor
Posts: 413

Re: Performing the same analysis on many data tables

Hi PGStats,

 

thanks for the code, it works perfectly!

 

Could you please give me a general solution for the case where I have many data tables but the variables in each table might be diferent so that I can make a proc means for all of the variables for all the data tables?

 

Thanks!

Super User
Posts: 9,681

Re: Performing the same analysis on many data tables


You can combine these tables together firstly:

data want;
 set NewYork Boston .......   indsname=indsname;
dsn=indsname;
run;

Then run BY statement:

proc means data=want;
by dsn;
.........

Super Contributor
Posts: 413

Re: Performing the same analysis on many data tables

[ Edited ]

Hi Ksharp,

 

thnka sofr the code, it works greatly.

 

Just as I asked PGStats, is it possible to a more general solution for the case where I have many data tables and the variables in each table might be different so that I can do a proc means for all the variables for all the tables?

 

 

Thanks! 

Super User
Posts: 17,832

Re: Performing the same analysis on many data tables

1. Write the code for base case

 

Proc means data=mydataset n min mean max;
Run;

2. Determine fields that need to change in each iteration - these will be macro variable parameters. 

Proc means data=mydataset ... ;

 

The dataset variable name will need to change.

 

3. Convert to macro variable

%let dsin=mydataset;

Proc means data = &dsin ...
Run;

4. Wrap in macro code and convert macro variables to parameters. 

%macro summarize(data_in);

Proc means data=&data_in n min mean max;
Run;

%mend;

5.

Execute macro for desired dataset

 

%summarize(sashelp.class);

Intro to macro language

 http://www.ats.ucla.edu/stat/sas/seminars/sas_macros_introduction/

Super Contributor
Posts: 413

Re: Performing the same analysis on many data tables

Hi Reeza,

 

I understood parts 4 and 5 which are making a macro and indeed it works when I do the data tables one by one like this:

%summarize(cities);
%summarize(countries);
etc

But is it possible to make a code which will run the same macro but for a list of data tables?

 

Thanks!

Super User
Posts: 17,832

Re: Performing the same analysis on many data tables

@ilikesas Yes. Look at CALL EXECUTE. The example in the documentation is very clear. 

 

If you need further help with it, post what you've tried and highlight what isn't working. 

Solution
‎09-15-2016 09:14 PM
Super Contributor
Posts: 413

Re: Performing the same analysis on many data tables

[ Edited ]

Thanks Reeza for the direction, the ucla link does actually have prety much the same thing and I got my wanted result!

 

But just one thing that I would like to note. Here is my code :

 

 

%let data_list = boston new_york

%macro means_for_all(all_data);
  %let k=1;
  %let the_data= %scan(&all_data, &k);
  %do %while("&the_data" NE "");
    title "data table is &the_data";
    proc means data=&the_data n min max mean;
      var _all_;
    run;
    %let k = %eval(&k + 1);
    %let the_data = %scan(&all_data, &k);
  %end;
%mend;


%means_for_all(&data_list)

First I put the names of the tables into a macro variable "&data_list". Then when I run the macro the way above, i.e I put  "&data_list" inside the parantheses, I get error messages such as work.data.mean doesn't exist, so for some reason sas treats the proc means options as if they were data names, BUT the result is the same as if when calling the macro I had inputted the variable names one by one, strange... 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 559 views
  • 4 likes
  • 4 in conversation