Data Appending in a Loop

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

Data Appending in a Loop

Every month we run the following to create data files:

 

%macro getdata;
%let year1 = 2000;
%let year2 = 2011;
%let year3 = 2012;
%let year4 = 2017;

%do i = &year1 %to &year2;
	data egtask.dailysup_&i.;set ibs_arc.dsup&i.;run;
%end;

%do i = &year3 %to &year4;
	data egtask.dailysup_&i.;set ibs.dsup&i.;run;
%end;
%mend getdata;
%getdata;

PROC SQL;
CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
SELECT * FROM EGTASK.DAILYSUP_2000
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2001
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2002
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2003
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2004
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2005
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2006
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2007
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2008
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2009
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2010
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2011
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2012
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2013
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2014
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2015
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2016
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2017
;
run;

 

As you can see, the sql command is not very elegant. Is it possible to put it into a loop so one would just be able to use the year variables at the start of the macro to make the table? This way we would not need to add a new select * line at the end of every year; it would just be fully automated.

 

Thanks!


Accepted Solutions
Solution
2 weeks ago
Super User
Posts: 1,238

Re: Data Appending in a Loop

Based on @Astounding's and @Reeza's notes:

 

Assuming archieve data desn't change, then create base - run it once - as:

     

data egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsuo2011;
run;

After successfull run you can deside if to cancel it from archieve or keep for backup;

 

having the base created, you can continue monthly with next code, either:

data SASCOMB.DAILY_SUP_APPEND;
    set egtask.dsup2000_2011    /* the base file */
          ibs.dsup2012 - ibs.dsup2017
   ;
run;

or using my fixed code addapted to above:

%macro join(ystart,yend);
%DO;
     PROC SQL;
     CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
     select * from egtask.dsup2000_2011
  %do year=&ystart %to &yend;
      outer union corr
      SELECT * FROM EGTASK.DAILYSUP_&year
  %end;
   ; QUIT;
%END;
%mend join;
%join(2012,2017);

 

 

  

View solution in original post


All Replies
Super User
Posts: 1,238

Re: Data Appending in a Loop

Assuming your code works fine, you can replace the SQL step with next code:

%macro join(ystart,yend);
%DO;
     PROC SQL;
     CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
  %do year=&ystart %to &yend;
      SELECT * FROM EGTASK.DAILYSUP_&year
       OUTER UNION CORR 
  %end;
   ; QUIT;
%END; %mend join; %join(2000,2017);
Grand Advisor
Posts: 17,396

Re: Data Appending in a Loop

Wouldn't PROC APPEND be more efficient, even if you did have to do it monthly?

Respected Advisor
Posts: 4,992

Re: Data Appending in a Loop

Potentially, there is a lot wrong with this process.  Making the code more elegant by setting up a loop?  No way.  That would just hide the bad features that are already built in so that they would never get changed.  Let's take a look under the hood here ...

 

Does the underlying data for years 2000 to 2016 ever change?  If not, why not just save the combined result through 2016 permanently.  Then you can add in the data for 2017 monthly, to create an up-to-date picture.

 

Why copy data sets to the EGTASK library?  That's a waste of resources.  SQL can operate on the original data from the IBS and IBS_ARC libraries.

 

Trying to add macro loops to cover up the blemishes?  Not the right solution.

Contributor
Posts: 23

Re: Data Appending in a Loop

@Astounding.

I have inherited this code and I come from an eviews background, so bear with me. Yes the past data can change, so it needs to be rerun. Can you elaborate on using the original data from the libraries as opposed to copying to EGTASK? How would this be implemented with Shmuel's idea.

Thanks!
Respected Advisor
Posts: 4,992

Re: Data Appending in a Loop

[ Edited ]

Before we get to any macro language, picture this code as a goal:

 

PROC SQL;
CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
SELECT * FROM ibs_arc.DSUP_2000
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2001
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2002
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2003
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2004
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2005
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2006
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2007
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2008
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2009
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2010
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2011
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2012
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2013
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2014
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2015
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2016
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2017
;
run;

 

Copying the data sets from their original locations almost certainly represents an extra unnecessary step (unless the EGTASK folder also needs to contain a separate copy of the data).

 

Could the SQL code be replaced with something better?  I'm not sure.  I don't know enough about how OUTER UNION CORR works to combine files.  If that's the right way to go, Schmuel's idea is close.  It does generate the words OUTER UNION CORR an extra time, but that can be addressed by macro language.

Respected Advisor
Posts: 4,992

Re: Data Appending in a Loop

BCNAV,

 

Now you're looking in the right direction ... what's the process, what does it need to be, what could it become?

 

Question:  When you say the data for past years could change, does that include the earlier archived years (2000 to 2011)?  If those years are fixed, they could be combined and saved that way to speed up the monthly process.

 

Question for the board:  What does OUTER UNION CORR do?  Is it identical to concatenating the data sources in a SET statement, or does it imply some other result?  Don't make me message @PGStats

Super User
Posts: 1,238

Re: Data Appending in a Loop

SQL - OUTER UNION CORR - 

Next documentation taken from:

https://v8doc.sas.com/sashtml/proc/zueryexp.htm

 

CORRESPONDING (CORR) Keyword
The CORRESPONDING keyword is used only when a set operator is specified. 
CORR causes PROC SQL to match the columns in table-expressions by name and not by ordinal position.
Columns that do not match by name are excluded from the result table, except for the OUTER UNION operator. See OUTER UNION . For example, when performing a set operation on two table-expressions,
PROC SQL matches the first specified column-name (listed in the SELECT clause)
from one table-expression with the first specified column-name from the other.
If CORR is omitted, PROC SQL matches the columns by ordinal position.

 

Grand Advisor
Posts: 17,396

Re: Data Appending in a Loop

Here's a one way that's damn easy to code but may not be the most efficient Smiley Happy

This assumes you're only appending data and would replace your entire program.

 

%let year1 = 2000;
%let year2 = 2011;
%let year3 = 2012;
%let year4 = 2017;

data want;
 set 
     ibs_arc.dsup&year1. - ibs_arc.dsup&year2. 
     ibs_arc.dsup&year3. - ibs_arc.dsup&year4. ;
 run;

 

 

 

Contributor
Posts: 23

Re: Data Appending in a Loop

The reason they are copied to EGTASK, is that IT has some of the data as "current" (last few years), while older stuff is in an archive folder. This way the data to append is all in one spot. I do like Shmuel's idea and I can see it work, but it could also be nicer not to copy the data locally to EGTASK, as Astounding notes. Not sure it is worth the effort, but could his idea be modified to get data from the two sources?  Maybe not worth the effort.

 

thx

 

 

Grand Advisor
Posts: 17,396

Re: Data Appending in a Loop

Your archive data isn't changing is it? You can make that your 'base' table and append to it once, you shouldn't combine them all each time.

Solution
2 weeks ago
Super User
Posts: 1,238

Re: Data Appending in a Loop

Based on @Astounding's and @Reeza's notes:

 

Assuming archieve data desn't change, then create base - run it once - as:

     

data egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsuo2011;
run;

After successfull run you can deside if to cancel it from archieve or keep for backup;

 

having the base created, you can continue monthly with next code, either:

data SASCOMB.DAILY_SUP_APPEND;
    set egtask.dsup2000_2011    /* the base file */
          ibs.dsup2012 - ibs.dsup2017
   ;
run;

or using my fixed code addapted to above:

%macro join(ystart,yend);
%DO;
     PROC SQL;
     CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
     select * from egtask.dsup2000_2011
  %do year=&ystart %to &yend;
      outer union corr
      SELECT * FROM EGTASK.DAILYSUP_&year
  %end;
   ; QUIT;
%END;
%mend join;
%join(2012,2017);

 

 

  

Super User
Posts: 1,238

Re: Data Appending in a Loop

On 2nd thought - why copy archive data - maybe creating the base as a view will be more efficient:

data egtask.dsup2000_2011 / view=egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsuo2011;
run;
Contributor
Posts: 23

Re: Data Appending in a Loop

[ Edited ]

In an effort to keep things simple for others, I decided to use the simpler code by Shmuel (despite my liking of macros and loops). Everyone can understand it, is intuitive, and can easily be changed. Thanks to all!!

 

/* Use a view as a "virtual dataset" so that data is not compiled unnecessarily */
data egtask.dsup2000_2011 / view=egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsup2011;
run;
data SASCOMB.DAILY_SUP_APPEND_2;
    set egtask.dsup2000_2011    /* The base file view from above*/
          ibs.dsup2012 - ibs.dsup2017
   ;
run;
proc sql; drop view egtask.dsup2000_2011; run;

Grand Advisor
Posts: 17,396

Re: Data Appending in a Loop

@BCNAV Please mark one of his answers as the solution.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 118 views
  • 3 likes
  • 4 in conversation