Merging Files Automatically

Contributor JAR
Posts: 45

Dear All,

I have to pull data from several datasets into one. The following code works perfectly:

data final;
  set d3 d4 d5 d6 d7 d8 d9 d10;
run;

As the number of datasets changes each time, I wonder if there is a way to call them in one step (similar to Array D3-D10).

If it is not possible, is there a way to use a macro...?

Regards,

JAR

PROC Star
Posts: 7,474

I think I read that this is possible in 9.3.

PROC Star
Posts: 7,474

Of course, even without 9.3, you could always use something like:

data d1;
  x=1;
  output;
run;

data d2;
  x=2;
  output;
run;

* the colon is a prefix wildcard ... every data set whose name starts with d;
data all;
  set d:;
run;

PROC Star
Posts: 7,474

And, while I had never tried it, it works in 9.2 as well:

data all;
  set d1-d2;
run;

What I must have read is the new ability to do the same thing in the data statement itself.

Contributor JAR
Posts: 45

I am using the learner's edition of Enterprise Guide. The engine is still 9.1, and your code does not work in it:

data all;
  set d1-d2;
run;

Regards,

JAR

PROC Star
Posts: 7,474

You could always approximate it using a combination of PROC SQL and a DATA step. For example:

proc sql noprint;
  select memname into :files separated by " "
    from dictionary.tables
    where libname="WORK" and memname like 'D%';
quit;

data want;
  set &files.;
run;
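
A slightly tighter variant of the query above, as a sketch only. It assumes the data sets of interest live in WORK and are named D followed by digits only (D3, D10, ...), so that other members beginning with D (a hypothetical DEMO, say) are not picked up. The macro variable name dfiles is made up for illustration. If nothing matches, &dfiles is never created and the DATA step will fail, and the data sets are read in whatever order the query returns them.

```sas
proc sql noprint;
  /* keep only WORK data sets named D + digits, e.g. D3 or D10 */
  select memname into :dfiles separated by ' '
    from dictionary.tables
    where libname = 'WORK'
      and memtype = 'DATA'
      and memname like 'D%'
      and notdigit(trim(substr(memname, 2))) = 0;
quit;

data want;
  set &dfiles;
run;
```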

Frequent Contributor
Posts: 82

This should work as well:

%macro combine;
  data final;
    set
    %do i=3 %to 10;
      d&i
    %end;
    ;
  run;
%mend;

%combine;
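
Since the number of data sets changes from run to run, the same macro could take the range as parameters instead of hard-coding 3 and 10. A minimal sketch, in which the parameter names first, last and out are my own invention and every data set d&first through d&last is assumed to exist:

```sas
%macro combine(first=3, last=10, out=final);
  %local i;
  data &out;
    set
    /* generates d3 d4 ... d10 for the default parameter values */
    %do i = &first %to &last;
      d&i
    %end;
    ;
  run;
%mend combine;

%combine(first=3, last=10)
```

Only the call needs to change when the range does, e.g. %combine(first=3, last=12).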

Valued Guide
Posts: 765

Hi ... as more and more data sets get added, would PROC APPEND be faster for concatenating them?


%macro fakedata;
  %do j=1 %to 10;
    data d&j;
      do j=1 to 1e6;
        output;
      end;
    run;
  %end;
%mend;

* make 10 data sets ... d1 through d10;
%fakedata;

data _null_;
  do ds=1 to 10;
    call execute(catt('proc append base=final data=d',ds,';run;'));
  end;
run;

PROC Star
Posts: 7,474

Interestingly, yes, PROC APPEND (and probably the APPEND statement in PROC DATASETS as well) is quite a bit more efficient. I wonder why the same operation uses a different algorithm in a DATA step. There shouldn't be any need to re-read each file when appending additional files, but the processing time indicates otherwise.
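
For reference, the PROC DATASETS route mentioned above might look like the sketch below. It is untimed here, and it assumes d1 and d2 exist in WORK with the same variables; FINAL is created by the first APPEND if it does not already exist.

```sas
proc datasets library=work nolist;
  /* each APPEND statement adds one data set to the end of FINAL */
  append base=final data=d1;
  append base=final data=d2;
quit;
```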

Regular Contributor
Posts: 184

APPEND is a specialized tool, and that allows a degree of optimization (block operations, etc.). The DATA step is a very flexible thing, but at a cost. It drags all of the data, an observation at a time, through the program data vector. That adds overhead.

I'm pretty sure there's no re-reading. That would have to be deliberately contrived.

Also: OPEN=DEFER may help in the DATA step, if the data sets meet the requirements.
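
A minimal sketch of the OPEN=DEFER idea, assuming d3 through d5 all contain the same variables (as I understand it, variables that appear only in later data sets are not picked up, and SAS issues a warning):

```sas
data final;
  /* open each input data set only when the previous one is exhausted */
  set d3 d4 d5 open=defer;
run;
```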
