Hi all,
I'm currently having a brain blockage struggling with defining a SQL which returns all rows from both tables with no matching key to the other table.
Sample data
data have1;
var1='1';
do key=1,2,3,4;
output;
end;
stop;
run;
data have2;
var2=2;
do key=1,4,5,6;
output;
end;
stop;
run;
Desired result
Key1 | Key2 | Var1 | Var2 |
---|---|---|---|
2 | 1 | ||
3 | 1 | ||
5 | 2 | ||
6 | 2 |
It would be easy enough to use a SAS Merge but I'm dealing with two big Oracle tables (>100M rows each) so this needs to run in-database - and the job is on the critical path so I need something which performs as good as possible.
I can create additional indexes on the source tables if required - so assume such indexes exist.
Really would appreciate some guidance with this.
Environment is:
SAS 9.4 under 64bit Linux
Oracle 11c
Thanks,
Patrick
Hi,
Perhaps this visual + code will help.
Code option for...
...seems to do the trick without creating a Cartesian product. I'll try this one then unless someone suggest something even faster - cursor would be acceptable.
I have quite a bit of freedom of how I implement as long as I don't change table structures (which are part of a SAS solution used by OoTB processes).
proc sql;
select coalesce(have1.key,have2.key) as key, have1.var1, have2.var2
from have1 full outer join have2
on have1.key=have2.key
where have1.key is NULL or have2.key is NULL
;
quit;
This is Proc SQL code you need to modify it for oracle. I think in Oracle SQL it is minus instead of except.
Proc SQL;
create table result as
(select h1.key as key1,
h1.var1 as var1,
. as var2
from have1 h1
except
select h1.key as key1,
h1.var1 as var1,
. as var2
from have1 h1,
have2 h2
where h1.key = h2.key)
union
(select h2.key as key1,
' ' as var1,
h2.var2 as var2
from have2 h2
except
select h2.key as key1,
' ' as var1,
h2.var2 as var2
from have1 h1,
have2 h2
where h1.key = h2.key);
quit;
I would expect the Outer Join approach to perform better. What's your take on this?
OK. Mr Patrick.M ,
data have1; var1='1'; do key=1,2,3,4; output; end; stop; run; data have2; var2=2; do key=1,4,5,6; output; end; stop; run; proc sql; create table want as select * from have1 where key not in (select key from have2) outer union corr select * from have2 where key not in (select key from have1) ; quit;
Xia Keshan
That's the coding option I had initially in mind but then felt that this means multiple passes through the data. I will run an "explain" for what you suggest and the "Outer Join" approach and see where Oracle tells me the cost is lower. I can't really test against real data yet as my development environment doesn't have the real data volumes (and also the hardware is very different to what it will be in Prod).
Patrick,
Why not ask Oracle Admin to do that , since you want to process these tables in Oracle .
If you can process them in SAS, I suggest to use Hash Table .
Xia Keshan
Thank you for your time and support.
What I will do for now is to take the Full Outer Join approach as this seems to work and is the "documented" coding approach. So that's how I'm going to implement first.
I then will take your advice and involve the Oracle DBA asking to performance tune the SQL in case this job should remain on the critical path and really be long running.
Pulling the data into SAS and do stuff there is absolutely no option as the tables I'm dealing with have more than 100M rows each but the delta will be less than 1% - and I need the result set in Oracle and not SAS for further processing.
Thanks,
Patrick
An alternative code , see which one would be faster . I guess it is me,since you are asking vertically union not horizontally union .
proc sql;
create table want as
select *
from have1 as a
where not exists(select * from have2 where key=a.key)
outer union corr
select *
from have2 as b
where not exists(select * from have1 where key=b.key) ;
quit;
Xia Keshan
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.