BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
fre
Quartz | Level 8 fre
Quartz | Level 8

 

Hi,

 

I have a dataset "TABLES" which contains 100 rows and only 1 variable "name_dataset".

Each row contains the name of another SAS dataset.

 

I want to write a procedure which

- reads those 100 rows one by one, (read the name of the SAS dataset)

- via a macro, reads each dataset (one by one), do some modifications, and append the results to a master table.

 

Seems easy, but I'm a little bit stuck here. 

I thought that with the code below, it would do 100 iterations and at each iteration it would save the name of the dataset of the current iteration into a macrovariable and executes the macro that follows.(readOneFile).  But what it does, it takes only the last value (row 100) and only executes 1 time the macro.  So it's only 1 iteration if I understand well.  Does anybody knows in what way I have to modify my code so that i actually does 100 times the same, but each time with another dataset/table?

 

 

This is the code I have so far:

 

%macro readAllFiles(table_master=);

 

... some code, which works well ...

 

    data _null_;

      set test.TABLES;  /* TABLES contains 100 observations and only 1 field 'name_dataset' */

      call symput ('name_dataset', name_dataset);

      %readOneFile(lib=source,table_in=&name_dataset, table_out=temp_table)

      PROC APPEND BASE=&table_master(compress=yes) DATA=temp_table force;

      RUN;

      %delete_table(table=temp_table);

    run;

%mend;

 

%macro ReadOneFile (lib=,table_in=,table_out=);

... some code, which works well ...

%mend;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Here's a shorter method, where data _null_ reads all observations in test.tables

 

%macro readAllFiles(table_master=);
    data _null_;
        set test.tables end=eof;
        call symputx('name_dataset'||left(_n_),name_dataset);
        if eof then call symputx('nobs',_n_);
    run;
    %do ii=1 %to &nobs;
        %readOneFile(lib=source,table_in=&&name_dataset&ii,table_out=temp_table)
        proc append ...
        run; 
        %delete_table(table=temp_table)
    %end;
%mend;

 

--
Paige Miller

View solution in original post

14 REPLIES 14
PaigeMiller
Diamond | Level 26


%macro ReadOneFile (lib=,table_in=,table_out=);

... some code, which works well ...

%mend;


Since you have specifically avoided telling us what is in this macro, we have to speculate, which is never a good thing. But regardless, my speculation is that you cannot include this macro inside a data _null_; data step, because only data step code can appear inside a data step.

 

 The general outline of what you need to do is

 

%macro readAllFiles(table_master=);
%do ii=1 %to 100;
    data _null_;
        set test.tables(obs=&ii firstobs=&ii);
        call symputx('name_dataset',name_dataset);
    run;
    %readOneFile(lib=source,table_in=&name_dataset,table_out=temp_table)
    proc append ...
    run; 
    %delete_table(table=temp_table)
%end;
%mend;

 

--
Paige Miller
fre
Quartz | Level 8 fre
Quartz | Level 8

Hi PaigeMiller,

 

thank you very much,  this is what I was looking for.

Sorry for not including all the code, I thought I wrote down all the relevant code for explaining my problem.

 (in the second macro, there were some procs involved, such as import, sort, sql)

 

So what I learned is that in a datastep, you can not make a reference to a macro.

 

I thought I could avoid using a DO-loop because the data _null_ step would iterate all records just till the last record.  But that's also not the case.

If I have a table with a variable number of rows instead of 100 rows, is there a short way to replace  " %do ii = 1 %to 100;  " ?

 

Or do I have to use a proc SQL (with a count) to determine the number of rows before the do loop?

PaigeMiller
Diamond | Level 26

@fre wrote:

So what I learned is that in a datastep, you can not make a reference to a macro.

This is not what I said, and is not correct. In a datastep, you must only have data step commands or a macro that resolves to data step commands. Here is a trivial example.

 

%macro trivial;
%do ii=1 %to 2; y=ⅈ output; %end;
%mend

data trivial_data_step;
%trivial
run;

I thought I could avoid using a DO-loop because the data _null_ step would iterate all records just till the last record.  But that's also not the case.

Again, that's not what I said, and not true. data _null_; can indeed iterate through an entire data step. I could have written my macro to use data _null_ to iterate through the entire data set.

Or do I have to use a proc SQL (with a count) to determine the number of rows before the do loop?

PROC SQL works or the %nobs macro works

--
Paige Miller
fre
Quartz | Level 8 fre
Quartz | Level 8

Thank you very much for correcting me.

MikeZdeb
Rhodochrosite | Level 12

Hi, here's one way how to determine how many iterations are needed to append all the data sets in your list ...

 

* make some data sets to use in PROC APPEND;

data class1;
set sashelp.class;
run;

 

data class2;
set sashelp.class;
run;

 

* the data set with list of data sets;
data list;
input dset $30.;
datalines;
sashelp.class
work.class1
work.class2
;

 

* how many observations in the list of data sets;
data _null_;
call symputx('ndsets',obs);
stop;
set list nobs=obs;
run;

 

* APPEND makes a new data set (specified as the argument) ;

* number of APPEND iterations specified in loop with macro variable NDSETS;

%macro append(newdataset);
%do j=1 %to &ndsets;

 

data _null_;
rec = &j;
set list point=rec;
call symputx('name',dset);
stop;
run;

 

proc append base=&newdataset data=&name;
run;

 

%end;
%mend;

 

%append(class123);

 

You could also move the determination of itaerations inside the macr, then specify the data set with the file list as an argument ...

 

%macro append(filelist,newdataset);

 

data _null_;
call symputx('ndsets',obs);
stop;
set &filelist nobs=obs;
run;


%do j=1 %to &ndsets;

 

data _null_;
rec = &j;
set &filelist point=rec;
call symputx('name',dset);
stop;
run;

 

proc append base=&newdataset data=&name;
run;

 

%end;
%mend;

 

%append(list, class123);

PaigeMiller
Diamond | Level 26

Here's a shorter method, where data _null_ reads all observations in test.tables

 

%macro readAllFiles(table_master=);
    data _null_;
        set test.tables end=eof;
        call symputx('name_dataset'||left(_n_),name_dataset);
        if eof then call symputx('nobs',_n_);
    run;
    %do ii=1 %to &nobs;
        %readOneFile(lib=source,table_in=&&name_dataset&ii,table_out=temp_table)
        proc append ...
        run; 
        %delete_table(table=temp_table)
    %end;
%mend;

 

--
Paige Miller
MikeZdeb
Rhodochrosite | Level 12

Hi ... could you show a complete example of how that works given the following ... want to append SASHELP.CLASS, CLASS1, CLASS2.  Thanks.

 

* make some data sets to use in PROC APPEND;

data class1;
set sashelp.class;
run;

 

data class2;
set sashelp.class;
run;

 

* the data set with list of data sets;
data list;
input dset $30.;
datalines;
sashelp.class
work.class1
work.class2
;

Tom
Super User Tom
Super User

You can use CALL EXECUTE to generate code that will sit on a stack and execute AFTER the data step stops.

So your DATA step should look something like this:

 

data _null_;
   set test.TABLES;  
   call execute(cats('%nrstr(%readOneFile)'
    ,'(lib=source,table_in=',name_dataset
    ,',table_out=temp_table)'
    ));
   call execute('PROC APPEND BASE=&table_master(compress=yes) DATA=temp_table force; RUN;');
run;

%delete_table(table=temp_table);
fre
Quartz | Level 8 fre
Quartz | Level 8

@PaigeMiller

Thank you for sharing this shorter version, which worked very well in my program. 

It looks cleaner then your first solution (which also worked very well).

If you don't mind, I will change the accepted solution to this second shorter version of you.

 

@Tom

Thank you for this alternative version.  It looks very interesting that you can stack some code while in a datastep to run after a datastep completes.  I've learned a lot since the beginning of this thread.

PaigeMiller
Diamond | Level 26

@fre wrote:

@PaigeMiller

Thank you for sharing this shorter version, which worked very well in my program. 

It looks cleaner then your first solution (which also worked very well).

If you don't mind, I will change the accepted solution to this second shorter version of you.



Hello @fre

 

I'm glad you liked the 2nd solution I provided. I often hear people say words like "cleaner", and I'm not sure what you mean by that. Anyway, over many years of programming, I have decided that I prefer the first program to the second program, although that is entirely subjective and I'm not even sure I can state why; but as you said, both work as desired and if you like the second solution better, that's great, I'm glad it works for you.

--
Paige Miller
fre
Quartz | Level 8 fre
Quartz | Level 8

Maybe 'cleaner' wasn't the right word to use. 

The second solution just looks more efficient (but i'm not sure if it actually is) because you only access the dataset "test.tables" 1 time.   In the first solution, it looks like the same table is accessed 100 times.

Most important is that both solutions work, so I'm already a happy man.

Tom
Super User Tom
Super User

I really do not like the idea of generating hundreds of macro variables. 

Either use CALL EXECUTE to generate the code or you can actually use a data step to write the code to a file.  Writing the code to a file is even easier to debug since you can examine the file and submit pieces of it to make sure your code generation logic is correct.

PaigeMiller
Diamond | Level 26

@Tom wrote:

I really do not like the idea of generating hundreds of macro variables. 

 


Yes, now that you mention it, I think this is the reason I began using the first code rather than the second code.

 

Either use CALL EXECUTE to generate the code or you can actually use a data step to write the code to a file.  Writing the code to a file is even easier to debug since you can examine the file and submit pieces of it to make sure your code generation logic is correct.


This certainly is good advice — however I feel that for me personally, I have reached the point where I can write working macro code without the need for a file of code to help me see if I have gotten it right. So its faster for me to not write the code to a file.

--
Paige Miller
fre
Quartz | Level 8 fre
Quartz | Level 8

 

It's a pity only one answer can be marked as "the solution".

Just did some research on 'call execute', and indeed, very powerful too to handle my original question.

So dear reader, this thread contains at least 3 solutions so far.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 14 replies
  • 5277 views
  • 0 likes
  • 4 in conversation