DATA Step, Macro, Functions and more

Looping over a dataset

Reply
Regular Contributor
Posts: 162

Looping over a dataset

Hi,

 

I have a macro which takes a single dataset as input(test ) and does manipulations and creates 4 datasets match02a ,notmatch02a, match02b,notmatch02B.

 

%macro enhance;

 

data test1;

set test;

where condition =xxx;

run;

data test2;

set test;

where condition =yyy;

run;

data match02a notmatch02a;

set test1;

/* manipulations*/

run;

 

data match02b notmatch02b;

set test2;

/* manipulations*/

run;

%mend enhance;

%enhance

 

I am trying to pass a list of dataset to the above macro and create specific output for each dataset

 

Below is the another macro which I am using to split the dataset test into 3 datasets depending on the country name..When I execute the below macro I get three temporary datasets :  out_country1, out_country2,out_country3. 

 

 

 

%macro temp1;

 

/* define which libname.member table, and by which column */

%let TABLE= test;

%let COLUMN=country;

proc sql noprint;

/* build a mini program for each value */

/* create a table with valid chars from data value */

select distinct

   cat("out_",compress(&COLUMN.,,'kad'),"(where=(&COLUMN.=",quote(&COLUMN.),"))") into :allsteps separated by ' '

  from &TABLE.;

quit;

 

%put NOTE: &allsteps.;

data &allsteps.;

   set &table;

   run;

 

%mend temp1;

%temp1;

 

 

I need to passing these three datasets (out_country1, out_country2,out_country3. ) to enhance macro to create 4 datasets from each

  1.  match02a_country1 ,notmatch02a_country1, match02b_country1,notmatch02B_country1.
  2. match02a_country2 ,notmatch02a_country2, match02b_country2,notmatch02B_country2.
  3. match02a_country3 ,notmatch02a_country3, match02b_country3,notmatch02B_country3.

 

 

How do I pass the datasets as a parameter to enhance macro and generate output for each? is it possible to loop over these datasets and pass this as a parameter?

 

It will be great if you can guide me here

Super User
Posts: 10,497

Re: Looping over a dataset

Your macro Enhance should have two parameters, the  input dataset name and the country name.

 

%macro enhance (datain=, country=);
 
data test1;
   set &datain;
   where condition =xxx;
run;
data test2;
   set &datain;
   where condition =yyy;
   run;
data match02a_&country notmatch02a__&country;
   set test1;
/* manipulations*/
run;
 
data match02b__&country notmatch02b__&country;
   set test2;
/* manipulations*/
run;
%mend enhance;

Then make your calls such as

 

%enhance(datain=test, country=USA);

 

It sounds like you may want to use something like Call Execute from a data step to call stuff. That is often a better approach then creating a macro variable list and then parsing through the list to make macro calls.

I'm not sure why you are putting the stuff into that ALLSTEPS macro variable.

Regular Contributor
Posts: 162

Re: Looping over a dataset

Hi ballardw,

 

Thanks a lot for the reply.

 

I will look into call execute approach.

 

Also the reason for using allsteps macro variable is as follows.

 

I am planning to split the dataset test by country. country has 3 distinct values - country 1 ,country2,country3. So here i will have test_country1,test_country2, test_country3. I am using allsteps macro variable to create three temporary datasets.  I am planning to run the enhance macro for all three subsets in parallel to as to save time. and then to concatenate the results. So i need to pass these datasets name to the macro.

 

Also since the test may have varying number of states one day it might have 3 ,next day it might have 2 . So i would like to make it dynamic.

 

Thanks,

 

 

Regards,

Sheeba Swaminathan

Respected Advisor
Posts: 3,889

Re: Looping over a dataset

1. Change your %enhance macro so that it takes the country name as a parameter which you then use as part of your table names.

 

2. Create a SAS table with the country names in it.

 

3. In a data step (set country name table) use call execute() calling your macro and passing in the country name for every iteration of the data step.

 

There are quite a few examples/threads in this forum using above method

Regular Contributor
Posts: 162

Re: Looping over a dataset

Hi Patrick,

 

Thanks a lot for the reply.

 

I will check out the examples.

 

I will add the country names to the table. I am planning to execute the enhance macro parallely for each country . So i will have to create more than one table .

 

Thanks,

 

Regards,

Sheeba Swaminathan

Respected Advisor
Posts: 4,646

Re: Looping over a dataset

[ Edited ]

Your code above makes me think that you are missing two key concepts.


1) Macro parameters.

When a macro is defined as

 

%macro myMacro(param);

 

and called with

 

%myMacro(myValue);

 

it is equivalent to


%macro myMacro();
%let param=myValue;

 

so that you can call the macro repeatedly with different parameter values.

 

2) BY processing.

Dividing a dataset into many subsets to perform separate analyses is not efficient in SAS. It is much more efficient to keep the data together and to have one or many variables identifying the subsets. Almost every SAS procedure can perform separate processing on data subsets with the BY statement. Doing it this way keeps your data in a single structure, is simpler to code, and is usually much more efficient.

 

Most importantly, understanting 2) often makes 1) irrelevant. Adding Country and MatchType to your data and processing BY Country MatchType; might be enough to free you from macro processing and complications.

PG
Regular Contributor
Posts: 162

Re: Looping over a dataset

Hi PG Stats,

 

 

Thanks a lot for the details.

 

Sure. I will add the macro parameters.

 

one thing which i missed to add in enhance macro was that at the below point, it is not reading only from test1 and test2 but there is a join happening with another dataset . I cannot do a merge here due to the issue of many to many matches and i need to stick on to the cartesian product. More over the manipulations mentioned below are also dynamic and comes from a table .

 

I am really sorry for not adding this information.

 

Thanks again

 

 

data match02a notmatch02a;

Join of test1 with dataset2;

/* manipulations*/

run;

 

data match02b notmatch02b;

Join of test1 with dataset2;

/* manipulations*/

run;

 

Regards,

Sheeba Swaminathan

 

%macro enhance;

 

data test1;

set test;

where condition =xxx;

run;

data test2;

set test;

where condition =yyy;

run;

data match02a notmatch02a;

set test1;

/* manipulations*/

run;

 

data match02b notmatch02b;

set test2;

/* manipulations*/

run;

%mend enhance;

%enhance

Super User
Posts: 10,497

Re: Looping over a dataset

[ Edited ]

Anything that changes such as input data sets, variable names, even procedure options, should be macro parameters. That is a matter of degree not kind. Add the name of a macro variable to the definition and use where needed.

 

If you have code that already does everything you need then YOU know where the changes go and should be able to set the parameters as needed. There may be some fiddly bits regarding how to pass the parameter if it involves special characters or delimiters but that is one use of the macro quoting functions such as %str an %nrstr are designed for.

Regular Contributor
Posts: 162

Re: Looping over a dataset

Hi ballardw,

 

Thanks a lot for the details.

 

I will add the changes and i will update on how it goes.

 

Thanks,

 

Regards,

Sheeba

Ask a Question
Discussion stats
  • 8 replies
  • 364 views
  • 4 likes
  • 4 in conversation