DATA Step, Macro, Functions and more

full join

Reply
Contributor
Posts: 37

full join

I do not understand why the result of the following full join triples datetime1 field:

 

data tab1;
input datetime1:datetime16.  class1:$70. var1:8.3  ;
datalines;
17DEC15:00:00:00  XBC1_01 0
run;
data tab2;
informat datetime1 datetime16.;
input datetime1:datetime16.  class1:$70. var1:8.3;
datalines;
17DEC15:00:00:00 XBC1_06 .
17DEC15:00:00:00 XBC1_07 .
17DEC15:00:00:00 XBC1_09 .
06JAN16:20:50:00 XBC1_06 1323.510288
06JAN16:20:50:00 XBC1_09 1242.7005635
06JAN16:20:50:00 XBC1_07 1476.8484275
run;
 
proc sql;
create table tab_full as
select distinct
       coalesce(a.datetime1, b.datetime1, c.datetime1, d.datetime1)as dt1 format=datetime18. ,
       a.var1 as XBC1_01, 
       b.var1 as XBC1_06,
       c.var1 as XBC1_07,
       d.var1 as XBC1_09
from tab1 as a 
full join tab2 (where=( class1='XBC1_06')) as b 
on (a.datetime1=b.datetime1)
full join tab2 (where=( class1='XBC1_07')) as c
on (a.datetime1=c.datetime1) 
full join tab2 (where=( class1='XBC1_09')) as d
on (a.datetime1=d.datetime1)
order by dt1;
quit;

I would expect the following result table "tab_full":

 

dt1

XBC1_01

XBC1_06

XBC1_07

XBC1_09

1

17DEC15:00:00:00

0

.

.

.

2

06JAN16:20:50:00

.

1323.510288

1476.8484275

1242.7005635

 

 
And not the following:

 

dt1

XBC1_01

XBC1_06

XBC1_07

XBC1_09

1

17DEC15:00:00:00

0

.

.

.

2

06JAN16:20:50:00

.

.

.

1242.7005635

3

06JAN16:20:50:00

.

.

1476.8484275

.

4

06JAN16:20:50:00

.

1323.510288

.

.

Super User
Super User
Posts: 7,949

Re: full join

[ Edited ]
Posted in reply to mario_pellegrini

The reason is there is no match from b,c,d for 06JAN in a, hence all three rows get copied across - i.e. there is no join match.  Let me just explain further, on the first join step, the row is brought in, there is no match so obs is written out, then the next join same thing, and the third, so you end up with three obs going out as there is no match.

 

 

I would suggest that maybe you are using SQL but don't really know what you are doing with it, correct?  Its just the very odd use of datastep where statements in amongst SQL, its not good.  Your merge (which is what it is) can be really simplfied:

data want;
  merge tab1 (rename=(var1=xbc1_01))
        tab2 (rename=(var1=xbc1_06) where=(class1="XBC1_06"))
        tab2 (rename=(var1=xbc1_07) where=(class1="XBC1_07"))
        tab2 (rename=(var1=xbc1_09) where=(class1="XBC1_09"));
  by datetime1;
run;

 

Contributor
Posts: 37

Re: full join

this is an excellent solution..but because the data merge does not behave like the proc sql?

 

 

Ask a Question
Discussion stats
  • 2 replies
  • 218 views
  • 1 like
  • 2 in conversation