DATA Step, Macro, Functions and more

SQL questions for joining two tables

Reply
Frequent Contributor
Posts: 109

SQL questions for joining two tables

I've two data sets I need to use SQL to stack up or join after doing the count (# of records). I then need to sum up the count from data set 1 and the count from data set 2. Since the data sets are really huge, I'm trying to do in fewer steps if possible. Here is the code: I used UNION to join  but it would only read in the count variable from data 1, not the count from data 2. If I use OUTER UNION Corr, it will treat those from data 1 and data 2 as two records, naturally. I was hoping there is a way SAS SQL can smartly consolidate the two cases. I know I can output the count-processed data for data 1, and then for data 2, and then create data 3 through FULL OUTER JOIN but I was hoping not to output the intermediate steps to cut my data running time. Is there a way?  Thanks in advance.

 

 

data d1; input id1 id2;
datalines;
1 11
2 12
2 12
3 11
3 14
3 14
;
data d2; input id1 id2;
datalines;
1 11
4 15
4 15
;
proc sql; create table d3 as 
	select distinct id1, id2, count(*) as C1 from d1 group by id1, id2
	outer union corr 
	select distinct id1, id2, count(*) as C2 from d2 group by id1, id2
	order by id1, id2;
quit;
proc print data=d1; run;
proc print data=d2; run;
proc print data=d3; run;

 

Here is what I got if I use UNION or OUTER UNION CORR.

If using UNION       If using OUTER UNION CORR    
Obs id1 id2 C1   Obs id1 id2 C1 C2
1 1 11 1   1 1 11 . 1
2 2 12 2   2 1 11 1 .
3 3 11 1   3 2 12 2 .
4 3 14 2   4 3 11 1 .
5 4 15 2   5 3 14 2 .
          6 4 15 . 2

 

 

Here is the data I want.

Obs id1 id2 C1 C2
1 1 11 1 1
3 2 12 2 .
4 3 11 1 .
5 3 14 2 .
6 4 15 . 2
Super User
Posts: 6,763

Re: SQL questions for joining two tables

[ Edited ]

For a SQL guru, this should be easy.  Until one replies, you can just follow a slightly different path:

 

Actually create summaries of d1 and d2, such as:

 

proc sql;

create table d1counts as select id1, id2, count(*) as c1 from d1 group by id1, id2;

create table d2counts as select id1, id2, count(*) as c2 from d2 group by id1, id2;

quit;

 

It should be possible to create d1counts and d2counts as views rather than data sets.

 

Then put them together:

 

data d3;

merge d1counts d2counts;

by id1 id2;

run;

Esteemed Advisor
Posts: 5,526

Re: SQL questions for joining two tables

You want a full join:

 

proc sql; 
create table d3 as
select 
    coalesce(a.id1, b.id1) as id1, 
    coalesce(a.id2, b.id2) as id2, 
    C1, C2 
from 
    (select id1, id2, count(*) as c1 from d1 group by id1, id2) as a
	full join
	(select id1, id2, count(*) as c2 from d2 group by id1, id2) as b
        on a.id1=b.id1 and a.id2=b.id2
	order by id1, id2;
quit;
PG
Super User
Posts: 10,770

Re: SQL questions for joining two tables

data d1; input id1 id2;
datalines;
1 11
2 12
2 12
3 11
3 14
3 14
;
data d2; input id1 id2;
datalines;
1 11
4 15
4 15
;

proc sql;
select coalesce(a.id1,b.id1) as id1,coalesce(a.id2,b.id2) as id2,c1,c2
 from (select id1,id2,count(*) as c1 from d1 group by id1,id2) as a 
      full join
      (select id1,id2,count(*) as c2 from d2 group by id1,id2) as b 
      on a.id1=b.id1 and a.id2=b.id2;
quit;
Ask a Question
Discussion stats
  • 3 replies
  • 111 views
  • 1 like
  • 4 in conversation