Macro for using data step to several dataset

Reply
Contributor hua
Contributor
Posts: 33

Macro for using data step to several dataset

Dear all,

 

I'm a beginner for SAS macro. I want to set up a macro to run the data step for several datasets. For example, my code for one datasets is like:

 

proc sort data=VarA; by year;
proc means data=VarA n nmiss noprint;
by year;
var a;
output out=summary1;
data a_summary; set summary1;
if _STAT_ ^= "N" then delete;
keep year a;
run;

 

And I also have datasets: VarB, VarC, ... VarZ..., total 32 datasets. For each dataset, I want do the same procedure, but the names of datasets (VarA, VarB,...)  and vars (a, b, ...) are different. So I'm wondering if I could set up a macro to do this. 

 

Any help is much appreciated. Thanks!

 

Best,

Hua

Esteemed Advisor
Posts: 6,646

Re: Macro for using data step to several dataset

First of all, you have to decide how to build the name of the final output dataset; it needs to enclose all parameters, otherwise results will be overwritten.

Then replace the variable parts of your code with macro variables; keep in mind to terminate the macro variable names with a dot:

data &ds._&var._summary;

Then set those variables with %let, and test if your code works for one parameter set. Once that is made sure of, enclose the code in

%macro yourmac(ds,var);
/* code */
%mend;

and you're done.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Esteemed Advisor
Esteemed Advisor
Posts: 7,203

Re: Macro for using data step to several dataset

"I'm a beginner for SAS macro" - tip one, macro is never needed.  It can be useful in limited situations, however if you find yourself using it for simple programming needs your doing something wrong.  In your example (and I don't have any test data to run this on):

proc sort data=vara; 
  by year;
run;

proc means data=vara n nmiss noprint;
  by year;
  var a b c d;
  output out=summary;
run;

data a_summary (keep=year a b c d);
  set summary;
  if _stat_ ne "N" then delete;
run;

Basically there is no need to do the code more than once as everything in the var line will be analysed.  You haven't split your data up have you, bad idea.  Also, note how I use indetations, put each code line on a new line, finish blocks, use consistent coding.  It is far more important to make your code readable.

Valued Guide
Posts: 505

Re: Macro for using data step to several dataset

Macro for using data step to several dataset

Not a macro solution.

You can deconstruct my algorithm and make it a macro.

insired by
https://goo.gl/x5t1PT
https://communities.sas.com/t5/General-SAS-Programming/Macro-for-using-data-step-to-several-dataset/m-p/348747

HAVE 16 datasets  each with three years 2 different variables
==============================================================

Up to 40 obs WORK.META total obs=96

Obs    YEAR     DSN     VAR1    VAR2

  1    2015    CE67F     CE      EC   data set CE67F with varaibles CE and EC
  2    2016    CE67F     CE      EC
  3    2017    CE67F     CE      EC

  4    2015    DF68G     DF      FD   data set DF68G with varaibles Df and FD
  5    2016    DF68G     DF      FD
  6    2017    DF68G     DF      FD

  7    2015    EG69H     EG      GE
  8    2016    EG69H     EG      GE
  9    2017    EG69H     EG      GE


Here is two of the 16 datasets look like
========================================

Dateset CE67F

Up to 40 obs from CE67F total obs=6

Obs    YEAR    CE    EC    Mean

 1     2015    40    99   (99 + 78)/2 =88.5
 2     2015    71    78

 3     2016    81    57
 4     2016    37     8

 5     2017     5     9
 6     2017    52    15

Dataset  DF68G

Up to 40 obs from DF68G total obs=6

Obs    YEAR    DF    FD

 1     2015    18    46
 2     2015    27    54

 3     2016     2    87
 4     2016    93    32

 5     2017    29     4
 6     2017    31    39

WANT  (summarize by dsn and year 6 obs to 3 means)
====

Up to 40 obs from sum32 total obs=6

Obs     DSN     YEAR  INPVAR  INPVAR2   VAR1    VAR2

 1     CE67F    2015     CE      EC     55.5    88.5  (var2=(99 + 78)/2 =88.5)
 2     CE67F    2016     CE      EC     59.0    32.5
 3     CE67F    2017     CE      EC     28.5    12.0

 4     DF68G    2015     DF      FD     22.5    50.0
 5     DF68G    2016     DF      FD     47.5    59.5
 6     DF68G    2017     DF      FD     30.0    21.5

WORKING CODE
============

    DOSUBL
        set &dsn(rename=(&var1=var1 &var2=var2));
        proc append data=getone base=all32;

    DOSUBL
        proc summary data=all32 mean nway;

FULL SOLUTION
=============

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;

data meta_pre;
 retain year dsn var1 var2;
 retain cnt 0;
 do ltr=67 to 87;
   cnt=cnt+1;
   length dsn $5 var1 var2 $2;
   dsn=cats(byte(ltr),byte(ltr+2),put(ltr,2.),byte(ltr+3));
   var1=cats(byte(ltr),byte(ltr+2));
   var2=cats(byte(ltr+2),byte(ltr));
   do year=2015 to 2017;output;end;
   dsn=cats(byte(ltr+3),byte(ltr+2),put(ltr,2.),byte(ltr+1));
   var1=cats(byte(ltr-1),byte(ltr+2));
   var2=cats(byte(ltr-2),byte(ltr));
   do year=2015 to 2017;output;end;
   drop ltr cnt;
   if cnt=16 then stop;
 end;
run;quit;

proc sort data=meta_pre out=meta;
by dsn year var1 var2;
run;quit;

data _null_;
  set meta;
   if _n_=1 then call streaminit(123);
   call symputx('dsn',dsn);
   call symputx('var1',var1);
   call symputx('var2',var2);
   rc=dosubl('
     data &dsn;
         do year=2015,2015,2016,2016,2017,2017;
            &var1 = int(100*(rand("Uniform")));
            &var2 = int(100*(rand("Uniform")));
            output;
         end;
     ');
run;quit;

*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

proc datasets lib=work;
delete all32;
run;quit;

%symdel dsn var1 var2/ nowarn;
data _null_;

  * put all 16 datasets into all32;
  do until (eof);

     set meta end=eof;

     call symputx('dsn',dsn);
     call symputx('var1',var1);
     call symputx('var2',var2);

     rc=dosubl('
        data getone;
           set &dsn(rename=(&var1=var1 &var2=var2));
           dsn=symget("dsn");
           inp1=symget("var1");
           inp2=symget("var2");
        run;quit;
        proc append data=getone base=all32;
        run;quit;
     ');

  end;

  if rc ne 0 then do;
      put "*** proc append failed *** on dsn=" dsn;
      stop;
  end;

  * summarize the big dataset;
  rc=dosubl('
       proc summary data=all32 mean nway;
         class  dsn year inp1 inp2 ;
         var var1 var2;
         output out=sum32(drop=_type_ _freq_) mean=;
       run;quit;
    ');

   if rc ne 0 then do;
      put "*** proc summary failed ***";
      stop;
   end;

   stop;

run;quit;


SYMBOLGEN:  Macro variable DSN resolves to CE67F
SYMBOLGEN:  Macro variable VAR1 resolves to CE
SYMBOLGEN:  Macro variable VAR2 resolves to EC
NOTE: There were 6 observations read from the data set WORK.CE67F.
NOTE: The data set WORK.GETONE has 6 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1209.25k
      OS Memory           18668.00k
      Timestamp           04/10/2017 08:17:28 PM
      Step Count                        626  Switch Count  0


NOTE: Appending WORK.GETONE to WORK.ALL32.
NOTE: There were 6 observations read from the data set WORK.GETONE.
NOTE: 6 observations added.
NOTE: The data set WORK.ALL32 has 78 observations and 6 variables.
NOTE: PROCEDURE APPEND used (Total process time):
      real time           0.01 seconds
      user cpu time       0.00 seconds
      system cpu time     0.00 seconds
      memory              1209.25k
      OS Memory           18924.00k
      Timestamp           04/10/2017 08:17:28 PM
      Step Count                        626  Switch Count  0




Contributor hua
Contributor
Posts: 33

Re: Macro for using data step to several dataset

Thank you for your code. It's a complicate one, and it will help if I want a macro to deal with my data.
Contributor hua
Contributor
Posts: 33

Re: Macro for using data step to several dataset

Right! I haven't use the macro. And I just copy and paste to get my results. Thank you!

Esteemed Advisor
Posts: 6,646

Re: Macro for using data step to several dataset

Concur with @rogerjdeangelis and @RW9 that you most probably can solve your issue without the use of a macro. My post was intended to give you tips for developing a macro when you really need one.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor hua
Contributor
Posts: 33

Re: Macro for using data step to several dataset

Thank you anyway. Your code gave me some ideas about macro, and I found my situation didn't need a macro.
Ask a Question
Discussion stats
  • 7 replies
  • 169 views
  • 2 likes
  • 4 in conversation