BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BCNAV
Quartz | Level 8

Every month we run the following to create data files:

 

%macro getdata;
%let year1 = 2000;
%let year2 = 2011;
%let year3 = 2012;
%let year4 = 2017;

%do i = &year1 %to &year2;
	data egtask.dailysup_&i.;set ibs_arc.dsup&i.;run;
%end;

%do i = &year3 %to &year4;
	data egtask.dailysup_&i.;set ibs.dsup&i.;run;
%end;
%mend getdata;
%getdata;

PROC SQL;
CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
SELECT * FROM EGTASK.DAILYSUP_2000
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2001
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2002
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2003
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2004
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2005
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2006
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2007
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2008
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2009
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2010
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2011
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2012
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2013
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2014
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2015
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2016
 OUTER UNION CORR 
SELECT * FROM EGTASK.DAILYSUP_2017
;
run;

 

As you can see, the sql command is not very elegant. Is it possible to put it into a loop so one would just be able to use the year variables at the start of the macro to make the table? This way we would not need to add a new select * line at the end of every year; it would just be fully automated.

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Based on @Astounding's and @Reeza's notes:

 

Assuming archieve data desn't change, then create base - run it once - as:

     

data egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsuo2011;
run;

After successfull run you can deside if to cancel it from archieve or keep for backup;

 

having the base created, you can continue monthly with next code, either:

data SASCOMB.DAILY_SUP_APPEND;
    set egtask.dsup2000_2011    /* the base file */
          ibs.dsup2012 - ibs.dsup2017
   ;
run;

or using my fixed code addapted to above:

%macro join(ystart,yend);
%DO;
     PROC SQL;
     CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
     select * from egtask.dsup2000_2011
  %do year=&ystart %to &yend;
      outer union corr
      SELECT * FROM EGTASK.DAILYSUP_&year
  %end;
   ; QUIT;
%END;
%mend join;
%join(2012,2017);

 

 

  

View solution in original post

15 REPLIES 15
Shmuel
Garnet | Level 18

Assuming your code works fine, you can replace the SQL step with next code:

%macro join(ystart,yend);
%DO;
     PROC SQL;
     CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
  %do year=&ystart %to &yend;
      SELECT * FROM EGTASK.DAILYSUP_&year
       OUTER UNION CORR 
  %end;
   ; QUIT;
%END; %mend join; %join(2000,2017);
Reeza
Super User

Wouldn't PROC APPEND be more efficient, even if you did have to do it monthly?

Astounding
PROC Star

Potentially, there is a lot wrong with this process.  Making the code more elegant by setting up a loop?  No way.  That would just hide the bad features that are already built in so that they would never get changed.  Let's take a look under the hood here ...

 

Does the underlying data for years 2000 to 2016 ever change?  If not, why not just save the combined result through 2016 permanently.  Then you can add in the data for 2017 monthly, to create an up-to-date picture.

 

Why copy data sets to the EGTASK library?  That's a waste of resources.  SQL can operate on the original data from the IBS and IBS_ARC libraries.

 

Trying to add macro loops to cover up the blemishes?  Not the right solution.

BCNAV
Quartz | Level 8
@Astounding.

I have inherited this code and I come from an eviews background, so bear with me. Yes the past data can change, so it needs to be rerun. Can you elaborate on using the original data from the libraries as opposed to copying to EGTASK? How would this be implemented with Shmuel's idea.

Thanks!
Astounding
PROC Star

Before we get to any macro language, picture this code as a goal:

 

PROC SQL;
CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
SELECT * FROM ibs_arc.DSUP_2000
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2001
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2002
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2003
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2004
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2005
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2006
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2007
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2008
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2009
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2010
 OUTER UNION CORR 
SELECT * FROM ibs_arc.DSUP2011
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2012
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2013
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2014
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2015
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2016
 OUTER UNION CORR 
SELECT * FROM ibs.DSUP2017
;
run;

 

Copying the data sets from their original locations almost certainly represents an extra unnecessary step (unless the EGTASK folder also needs to contain a separate copy of the data).

 

Could the SQL code be replaced with something better?  I'm not sure.  I don't know enough about how OUTER UNION CORR works to combine files.  If that's the right way to go, Schmuel's idea is close.  It does generate the words OUTER UNION CORR an extra time, but that can be addressed by macro language.

Astounding
PROC Star

BCNAV,

 

Now you're looking in the right direction ... what's the process, what does it need to be, what could it become?

 

Question:  When you say the data for past years could change, does that include the earlier archived years (2000 to 2011)?  If those years are fixed, they could be combined and saved that way to speed up the monthly process.

 

Question for the board:  What does OUTER UNION CORR do?  Is it identical to concatenating the data sources in a SET statement, or does it imply some other result?  Don't make me message @PGStats

Shmuel
Garnet | Level 18

SQL - OUTER UNION CORR - 

Next documentation taken from:

https://v8doc.sas.com/sashtml/proc/zueryexp.htm

 

CORRESPONDING (CORR) Keyword
The CORRESPONDING keyword is used only when a set operator is specified. 
CORR causes PROC SQL to match the columns in table-expressions by name and not by ordinal position.
Columns that do not match by name are excluded from the result table, except for the OUTER UNION operator. See OUTER UNION . For example, when performing a set operation on two table-expressions,
PROC SQL matches the first specified column-name (listed in the SELECT clause)
from one table-expression with the first specified column-name from the other.
If CORR is omitted, PROC SQL matches the columns by ordinal position.

 

Reeza
Super User

Here's a one way that's damn easy to code but may not be the most efficient 🙂

This assumes you're only appending data and would replace your entire program.

 

%let year1 = 2000;
%let year2 = 2011;
%let year3 = 2012;
%let year4 = 2017;

data want;
 set 
     ibs_arc.dsup&year1. - ibs_arc.dsup&year2. 
     ibs_arc.dsup&year3. - ibs_arc.dsup&year4. ;
 run;

 

 

 

BCNAV
Quartz | Level 8

The reason they are copied to EGTASK, is that IT has some of the data as "current" (last few years), while older stuff is in an archive folder. This way the data to append is all in one spot. I do like Shmuel's idea and I can see it work, but it could also be nicer not to copy the data locally to EGTASK, as Astounding notes. Not sure it is worth the effort, but could his idea be modified to get data from the two sources?  Maybe not worth the effort.

 

thx

 

 

Reeza
Super User

Your archive data isn't changing is it? You can make that your 'base' table and append to it once, you shouldn't combine them all each time.

Shmuel
Garnet | Level 18

Based on @Astounding's and @Reeza's notes:

 

Assuming archieve data desn't change, then create base - run it once - as:

     

data egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsuo2011;
run;

After successfull run you can deside if to cancel it from archieve or keep for backup;

 

having the base created, you can continue monthly with next code, either:

data SASCOMB.DAILY_SUP_APPEND;
    set egtask.dsup2000_2011    /* the base file */
          ibs.dsup2012 - ibs.dsup2017
   ;
run;

or using my fixed code addapted to above:

%macro join(ystart,yend);
%DO;
     PROC SQL;
     CREATE TABLE SASCOMB.DAILY_SUP_APPEND AS 
     select * from egtask.dsup2000_2011
  %do year=&ystart %to &yend;
      outer union corr
      SELECT * FROM EGTASK.DAILYSUP_&year
  %end;
   ; QUIT;
%END;
%mend join;
%join(2012,2017);

 

 

  

Shmuel
Garnet | Level 18

On 2nd thought - why copy archive data - maybe creating the base as a view will be more efficient:

data egtask.dsup2000_2011 / view=egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsuo2011;
run;
BCNAV
Quartz | Level 8

In an effort to keep things simple for others, I decided to use the simpler code by Shmuel (despite my liking of macros and loops). Everyone can understand it, is intuitive, and can easily be changed. Thanks to all!!

 

/* Use a view as a "virtual dataset" so that data is not compiled unnecessarily */
data egtask.dsup2000_2011 / view=egtask.dsup2000_2011;
      set ibs_arc.dsup2000 - ibs_arc.dsup2011;
run;
data SASCOMB.DAILY_SUP_APPEND_2;
    set egtask.dsup2000_2011    /* The base file view from above*/
          ibs.dsup2012 - ibs.dsup2017
   ;
run;
proc sql; drop view egtask.dsup2000_2011; run;

Reeza
Super User

@BCNAV Please mark one of his answers as the solution.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 15 replies
  • 5395 views
  • 5 likes
  • 4 in conversation