DATA Step, Macro, Functions and more

Macro for accessing multiple datasets and calculating a mean

Reply
New Contributor
Posts: 4

Macro for accessing multiple datasets and calculating a mean

Hello everyone!

I have limited experience with macros and wonder if you can help me figure out how to do this. I'll explain what I want to do:

I have 20 years of data in 20 SAS data sets, one for each year. Each file has different numbers of observations. I want to perform longitudinal analysis on some variables. So let's say as an example that I wanted the mean AGE throughout the years.

I think a macro would be very useful so I don't have to do the same thing 20 times. I want to access each data set, calculate the mean AGE for that year, and input it into a new data set that contains the mean AGE for the 20 year period.

I feel like I'm close to achieving that with the following code and subsequent error message (for only 3 years, not 20):

%let years = 3 ;

%let var = age ;

%macro means ;

       data final ; run ;

       names =           {"cno_std_93","cno_std_94","cno_std_95","cno_std_96","cno_std_97","cno_std_98","cno_std_99","cno_std_00","cno_std_01","cno_std_02","cno_std_03","cno_std_04","cno_std_05","cno_std_06","cno_std_07","cno_std_08","cno_std_09","cno_std_10","cno_std_11"} ;

       %do i = 1 %to &years ;

            data new ; set cno.names(&i) ;

                 keep &var ;

            run ;

            proc means data = new mean ;

                 var &var ;

                 output out = means mean = meanofvar ;

            run ;

            data final ;

                 merge final means ;

            run ;

       %end ;

%mend means ;

%means(3) ;

And I get the following error:

NOTE: Line generated by the invoked macro "MEANS".

1                              names =

                               -----

                               180

1  ! {"cno_std_93","cno_std_94","cno_std_95","cno_std_96","cno_std_97","cno_std_98","cno_std_99","cno_

1  ! std_00","cno_std_01","cno_std_02","cno_std_03","cno_std_04","cno_std_05","cno_std_06","cno_std_07

1  ! ","cno_std_08"

ERROR 180-322: Statement is not valid or it is used out of proper order.

Thanks in advance for any help or advice you can give me!

Best,

SE Greaves

Super User
Posts: 10,500

Re: Macro for accessing multiple datasets and calculating a mean

No macro needed I think:

Data final;

     set cno_std_: ;

run;

will combine all of the data sets that start with cno_std_

You can take the mead of all of the variables

proc means data = new mean ;

        var <put your list of vars here>;

         output out = means mean = /autoname ; /* autoname will create output names like var_mean*/

run ;

I am not sure quite what you are attempting to do with the merge at the end though.

New Contributor
Posts: 4

Re: Macro for accessing multiple datasets and calculating a mean

Oh wow ok, this is great! You're right, let me try it out and see if I could do all the analyses I intend to do this way...

I just realized I made a big mistake on the code. I am resetting the data set on every iteration. I intended to start with an empty data set, then get the AGE variable from the 1st year data set, and merge it to that empty data set, same with the next years, only without deleting the "final" data set... So at the end I would have merged all the means together into one data set.

Anyway, I think your advice might be what I needed, thanks a lot! I'll let you know how it goes...

Best,

SE

Super User
Posts: 10,500

Re: Macro for accessing multiple datasets and calculating a mean

It may be helpful to describe exactly what the final analysis would be.

For instance, it looks like you may be requesting to do some of this by year. If your datasets no not have a year variable we can likely add that using some options such as:

data want;

     set cno_std_:  indsname=dsn;

     length year $ 2;

     year = scan(dsn,3,'_');

run;

And the analysis

proc means data = want;

     CLASS year;

        var <put your list of vars here>;

         output out = means mean = /autoname ;

run ;

will calculate an overall mean and one for each year.

If you were going to use the result of this data to print a report you may be able to generate the report output directly from another report procedure:

proc tabulate data=want;

     class year;

     var <variable list>;

     table (variables) *(mean std max min),

               year all='All years';

run;

Will generate a table with each variable as a row with subrows of the requested statistics and columns for each year plus an overall combined result in the last column.

Note that if you have years such as 00 01 02 then they will appear before 93 94 95.

But that is fixable in the data step.

New Contributor
Posts: 4

Re: Macro for accessing multiple datasets and calculating a mean

Thanks a lot ballardw!

In the end I realized I was making life pretty hard for myself and what I wanted was not hard at all. This is how I managed to do it:

%macro means(ds, var, year) ;

  %let name = &var.&year ;

  proc means data = &ds mean noprint ;

  var &var ;

  output out = means mean = &name ;

  run ;

  data means ; set means ; keep &name ; label &name = &name ; run ;

  data final ; merge final means ; run ;

%mend means ;

%let var = age ;

data final ; &var = "&var" ; run ;

%means(cno.cno_std_93 , &var , 93) ;

%means(cno.cno_std_94 , &var , 94) ;

%means(cno.cno_std_95 , &var , 95) ;

%means(cno.cno_std_96 , &var , 96) ;

%means(cno.cno_std_97 , &var , 97) ;

%means(cno.cno_std_98 , &var , 98) ;

%means(cno.cno_std_99 , &var , 99) ;

%means(cno.cno_std_00 , &var , 00) ;

%means(cno.cno_std_01 , &var , 01) ;

%means(cno.cno_std_02 , &var , 02) ;

%means(cno.cno_std_03 , &var , 03) ;

%means(cno.cno_std_04 , &var , 04) ;

%means(cno.cno_std_05 , &var , 05) ;

%means(cno.cno_std_06 , &var , 06) ;

%means(cno.cno_std_07 , &var , 07) ;

%means(cno.cno_std_08 , &var , 08) ;

%means(cno.cno_std_09 , &var , 09) ;

%means(cno.cno_std_10 , &var , 10) ;

Thanks for all your help!

Sincerely,

SE

Ask a Question
Discussion stats
  • 4 replies
  • 267 views
  • 6 likes
  • 2 in conversation