DATA Step, Macro, Functions and more

call symput within data _null_: multiple iterations

Accepted Solution Solved
Reply
Contributor fre
Contributor
Posts: 35
Accepted Solution

call symput within data _null_: multiple iterations

[ Edited ]

 

Hi,

 

I have a dataset "TABLES" which contains 100 rows and only 1 variable "name_dataset".

Each row contains the name of another SAS dataset.

 

I want to write a procedure which

- reads those 100 rows one by one, (read the name of the SAS dataset)

- via a macro, reads each dataset (one by one), do some modifications, and append the results to a master table.

 

Seems easy, but I'm a little bit stuck here. 

I thought that with the code below, it would do 100 iterations and at each iteration it would save the name of the dataset of the current iteration into a macrovariable and executes the macro that follows.(readOneFile).  But what it does, it takes only the last value (row 100) and only executes 1 time the macro.  So it's only 1 iteration if I understand well.  Does anybody knows in what way I have to modify my code so that i actually does 100 times the same, but each time with another dataset/table?

 

 

This is the code I have so far:

 

%macro readAllFiles(table_master=);

 

... some code, which works well ...

 

    data _null_;

      set test.TABLES;  /* TABLES contains 100 observations and only 1 field 'name_dataset' */

      call symput ('name_dataset', name_dataset);

      %readOneFile(lib=source,table_in=&name_dataset, table_out=temp_table)

      PROC APPEND BASE=&table_master(compress=yes) DATA=temp_table force;

      RUN;

      %delete_table(table=temp_table);

    run;

%mend;

 

%macro ReadOneFile (lib=,table_in=,table_out=);

... some code, which works well ...

%mend;

 

 

 


Accepted Solutions
Solution
‎10-16-2015 03:51 AM
Trusted Advisor
Posts: 1,932

Re: call symput within data _null_: multiple iterations

[ Edited ]

Here's a shorter method, where data _null_ reads all observations in test.tables

 

%macro readAllFiles(table_master=);
    data _null_;
        set test.tables end=eof;
        call symputx('name_dataset'||left(_n_),name_dataset);
        if eof then call symputx('nobs',_n_);
    run;
    %do ii=1 %to &nobs;
        %readOneFile(lib=source,table_in=&&name_dataset&ii,table_out=temp_table)
        proc append ...
        run; 
        %delete_table(table=temp_table)
    %end;
%mend;

 

View solution in original post


All Replies
Trusted Advisor
Posts: 1,932

Re: call symput within data _null_: multiple iterations

[ Edited ]


%macro ReadOneFile (lib=,table_in=,table_out=);

... some code, which works well ...

%mend;


Since you have specifically avoided telling us what is in this macro, we have to speculate, which is never a good thing. But regardless, my speculation is that you cannot include this macro inside a data _null_; data step, because only data step code can appear inside a data step.

 

 The general outline of what you need to do is

 

%macro readAllFiles(table_master=);
%do ii=1 %to 100;
    data _null_;
        set test.tables(obs=&ii firstobs=&ii);
        call symputx('name_dataset',name_dataset);
    run;
    %readOneFile(lib=source,table_in=&name_dataset,table_out=temp_table)
    proc append ...
    run; 
    %delete_table(table=temp_table)
%end;
%mend;

 

Contributor fre
Contributor
Posts: 35

Re: call symput within data _null_: multiple iterations

[ Edited ]
Posted in reply to PaigeMiller

Hi PaigeMiller,

 

thank you very much,  this is what I was looking for.

Sorry for not including all the code, I thought I wrote down all the relevant code for explaining my problem.

 (in the second macro, there were some procs involved, such as import, sort, sql)

 

So what I learned is that in a datastep, you can not make a reference to a macro.

 

I thought I could avoid using a DO-loop because the data _null_ step would iterate all records just till the last record.  But that's also not the case.

If I have a table with a variable number of rows instead of 100 rows, is there a short way to replace  " %do ii = 1 %to 100;  " ?

 

Or do I have to use a proc SQL (with a count) to determine the number of rows before the do loop?

Trusted Advisor
Posts: 1,932

Re: call symput within data _null_: multiple iterations

[ Edited ]

fre wrote:

So what I learned is that in a datastep, you can not make a reference to a macro.

This is not what I said, and is not correct. In a datastep, you must only have data step commands or a macro that resolves to data step commands. Here is a trivial example.

 

%macro trivial;
%do ii=1 %to 2; y=ⅈ output; %end;
%mend

data trivial_data_step;
%trivial
run;

I thought I could avoid using a DO-loop because the data _null_ step would iterate all records just till the last record.  But that's also not the case.

Again, that's not what I said, and not true. data _null_; can indeed iterate through an entire data step. I could have written my macro to use data _null_ to iterate through the entire data set.

Or do I have to use a proc SQL (with a count) to determine the number of rows before the do loop?

PROC SQL works or the %nobs macro works

Contributor fre
Contributor
Posts: 35

Re: call symput within data _null_: multiple iterations

Posted in reply to PaigeMiller

Thank you very much for correcting me.

Valued Guide
Posts: 765

Re: call symput within data _null_: multiple iterations

[ Edited ]

Hi, here's one way how to determine how many iterations are needed to append all the data sets in your list ...

 

* make some data sets to use in PROC APPEND;

data class1;
set sashelp.class;
run;

 

data class2;
set sashelp.class;
run;

 

* the data set with list of data sets;
data list;
input dset $30.;
datalines;
sashelp.class
work.class1
work.class2
;

 

* how many observations in the list of data sets;
data _null_;
call symputx('ndsets',obs);
stop;
set list nobs=obs;
run;

 

* APPEND makes a new data set (specified as the argument) ;

* number of APPEND iterations specified in loop with macro variable NDSETS;

%macro append(newdataset);
%do j=1 %to &ndsets;

 

data _null_;
rec = &j;
set list point=rec;
call symputx('name',dset);
stop;
run;

 

proc append base=&newdataset data=&name;
run;

 

%end;
%mend;

 

%append(class123);

 

You could also move the determination of itaerations inside the macr, then specify the data set with the file list as an argument ...

 

%macro append(filelist,newdataset);

 

data _null_;
call symputx('ndsets',obs);
stop;
set &filelist nobs=obs;
run;


%do j=1 %to &ndsets;

 

data _null_;
rec = &j;
set &filelist point=rec;
call symputx('name',dset);
stop;
run;

 

proc append base=&newdataset data=&name;
run;

 

%end;
%mend;

 

%append(list, class123);

Solution
‎10-16-2015 03:51 AM
Trusted Advisor
Posts: 1,932

Re: call symput within data _null_: multiple iterations

[ Edited ]

Here's a shorter method, where data _null_ reads all observations in test.tables

 

%macro readAllFiles(table_master=);
    data _null_;
        set test.tables end=eof;
        call symputx('name_dataset'||left(_n_),name_dataset);
        if eof then call symputx('nobs',_n_);
    run;
    %do ii=1 %to &nobs;
        %readOneFile(lib=source,table_in=&&name_dataset&ii,table_out=temp_table)
        proc append ...
        run; 
        %delete_table(table=temp_table)
    %end;
%mend;

 

Valued Guide
Posts: 765

Re: call symput within data _null_: multiple iterations

Posted in reply to PaigeMiller

Hi ... could you show a complete example of how that works given the following ... want to append SASHELP.CLASS, CLASS1, CLASS2.  Thanks.

 

* make some data sets to use in PROC APPEND;

data class1;
set sashelp.class;
run;

 

data class2;
set sashelp.class;
run;

 

* the data set with list of data sets;
data list;
input dset $30.;
datalines;
sashelp.class
work.class1
work.class2
;

Super User
Super User
Posts: 7,077

Re: call symput within data _null_: multiple iterations

You can use CALL EXECUTE to generate code that will sit on a stack and execute AFTER the data step stops.

So your DATA step should look something like this:

 

data _null_;
   set test.TABLES;  
   call execute(cats('%nrstr(%readOneFile)'
    ,'(lib=source,table_in=',name_dataset
    ,',table_out=temp_table)'
    ));
   call execute('PROC APPEND BASE=&table_master(compress=yes) DATA=temp_table force; RUN;');
run;

%delete_table(table=temp_table);
Contributor fre
Contributor
Posts: 35

Re: call symput within data _null_: multiple iterations

@PaigeMiller

Thank you for sharing this shorter version, which worked very well in my program. 

It looks cleaner then your first solution (which also worked very well).

If you don't mind, I will change the accepted solution to this second shorter version of you.

 

@Tom

Thank you for this alternative version.  It looks very interesting that you can stack some code while in a datastep to run after a datastep completes.  I've learned a lot since the beginning of this thread.

Trusted Advisor
Posts: 1,932

Re: call symput within data _null_: multiple iterations


fre wrote:

@PaigeMiller

Thank you for sharing this shorter version, which worked very well in my program. 

It looks cleaner then your first solution (which also worked very well).

If you don't mind, I will change the accepted solution to this second shorter version of you.



Hello @fre

 

I'm glad you liked the 2nd solution I provided. I often hear people say words like "cleaner", and I'm not sure what you mean by that. Anyway, over many years of programming, I have decided that I prefer the first program to the second program, although that is entirely subjective and I'm not even sure I can state why; but as you said, both work as desired and if you like the second solution better, that's great, I'm glad it works for you.

Contributor fre
Contributor
Posts: 35

Re: call symput within data _null_: multiple iterations

Posted in reply to PaigeMiller

Maybe 'cleaner' wasn't the right word to use. 

The second solution just looks more efficient (but i'm not sure if it actually is) because you only access the dataset "test.tables" 1 time.   In the first solution, it looks like the same table is accessed 100 times.

Most important is that both solutions work, so I'm already a happy man.

Super User
Super User
Posts: 7,077

Re: call symput within data _null_: multiple iterations

I really do not like the idea of generating hundreds of macro variables. 

Either use CALL EXECUTE to generate the code or you can actually use a data step to write the code to a file.  Writing the code to a file is even easier to debug since you can examine the file and submit pieces of it to make sure your code generation logic is correct.

Trusted Advisor
Posts: 1,932

Re: call symput within data _null_: multiple iterations

[ Edited ]

Tom wrote:

I really do not like the idea of generating hundreds of macro variables. 

 


Yes, now that you mention it, I think this is the reason I began using the first code rather than the second code.

 

Either use CALL EXECUTE to generate the code or you can actually use a data step to write the code to a file.  Writing the code to a file is even easier to debug since you can examine the file and submit pieces of it to make sure your code generation logic is correct.


This certainly is good advice — however I feel that for me personally, I have reached the point where I can write working macro code without the need for a file of code to help me see if I have gotten it right. So its faster for me to not write the code to a file.

Contributor fre
Contributor
Posts: 35

Re: call symput within data _null_: multiple iterations

Posted in reply to PaigeMiller

 

It's a pity only one answer can be marked as "the solution".

Just did some research on 'call execute', and indeed, very powerful too to handle my original question.

So dear reader, this thread contains at least 3 solutions so far.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 578 views
  • 0 likes
  • 4 in conversation