BookmarkSubscribeRSS Feed
milts
Pyrite | Level 9

I have two tables which I need to combine as sometimes some dates are found in table A and not in table B and vice versa. My desired result is that for those overlaps on consecutive days be combined.

How can I achieve this using proc SQL only?

Table A

id  startdate   enddate--------------------------
101 12/28/2013  12/31/2013

Table B

id  startdate   enddate--------------------------
101 12/15/2013  12/15/2013
101 12/16/2013  12/16/2013
101 12/28/2013  12/28/2013
101 12/29/2013  12/31/2013

Desired Result

id  startdate   enddate-------------------------
101 12/15/2013  12/16/2013
101 12/28/2013  12/31/2013

 

3 REPLIES 3
ballardw
Super User

It appears that you are collapsing intervals from data set B. Are you doing the same from dataset A? You should include an example for the data set A that clearly isn't in B so we have some more idea what is going on as I can see the final result coming only from B.

 

And why a restriction for proc sql only? Since order is important that restriction may add complexity to code that could be very straight forward in data step code.

Haikuo
Onyx | Level 15

Sometimes solving problems are all about choosing the right tools. Like @ballardw mentioned, data step is the right tool for this kind of task. That said, it still can be done using Proc SQL, but you may not like the complexity of the code:

proc sql;
	create table have as
		select * from table_a
			union corr
		select * from table_b
	;
quit;

proc sql;
	create table want_sql as    
		select distinct id, min(startdate, (select min(startdate) from have where id=a.id and (startdate between a.startdate-1 and a.enddate+1 
			or enddate between a.startdate and a.enddate or a.startdate between startdate-1 and enddate+1 or a.enddate between startdate and enddate))) as startdate format=date9.,
			max(enddate, (select max(enddate) from have where id=a.id and (startdate between a.startdate-1 and a.enddate+1 
			or enddate between a.startdate and a.enddate or a.startdate between startdate-1 and enddate+1 or a.enddate between startdate and enddate))) as enddate format=date9.
		from have a
	;
quit;

 

 The idea is simple: try to group the interval; while the implementaion is strenous. Good Luck!

hbi
Quartz | Level 8 hbi
Quartz | Level 8

Hi,

 

Below is another pure PROC SQL approach. It uses a self-join twice to (detect earlier and later rows) and enables two rows being collapsed into one. However, it does not deal with collapsing three or more rows into one; not sure if that kind of scenario would happen in your data, based on the example you provided. 

 

PROC SQL;
  CREATE TABLE have (
    id        num, 
    startdate num format=date11., 
    enddate   num format=date11.);
QUIT;


PROC SQL;
INSERT INTO have
values (101, '15-Dec-2013'd, '15-Dec-2013'd)
values (101, '16-Dec-2013'd, '16-Dec-2013'd)
values (101, '28-Dec-2013'd, '28-Dec-2013'd)
values (101, '29-Dec-2013'd, '31-Dec-2013'd)
/* add some additional data for good measure ... */
/* note that there are 3 consecutive rows (1/15 to 1/28/2014) */
/* values (201, '15-Jan-2014'd, '17-Jan-2014'd) */
/* values (201, '18-Jan-2014'd, '26-Jan-2014'd) */
/* values (201, '27-Jan-2014'd, '28-Jan-2014'd) */
/* values (201, '01-Feb-2014'd, '01-Feb-2014'd) */
;
QUIT;


PROC SQL;
  CREATE TABLE want AS 
  SELECT id
       , startdate
       , coalesce(nextRow_enddate, enddate) format=date11. AS wanted_enddate
  /* subquery joins current row to earlier row and future row (if they exist) */
  FROM (SELECT aa.id
             , aa.startdate
             , aa.enddate
             , monotonic()   as seq_num
             , next.enddate  as nextRow_enddate
             , prior.enddate as priorRow_enddate
        FROM have as aa
        LEFT JOIN have as next
               ON next.id = aa.id
              AND next.startdate = aa.enddate+1
        LEFT JOIN have as prior
               ON prior.id = aa.id
              AND prior.enddate = aa.startdate-1
        WHERE priorRow_enddate IS NULL) as qry;
QUIT;

 

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1467 views
  • 1 like
  • 4 in conversation