09-24-2017 02:42 PM
Hi, i want to make something like this work in proc sql:
create table sql.table as
select some_id, date from connection to oracle
select some_id, date, row_number()over(partition by some_id order by date) as row from dataset d join sql_table s on d.some_id=s.some_id
) where row=1
Basically i need to create table in oracle based on dataset joined with oracle table and take the first date for each id.
If i use "from oracle" pass through it doesn't see the dataset, if i make it the other way around i can't use window functions(row_number()) becouse they are not in SAS. For now i create a table from that dataset and then do a pass through but it's large and i need to do it in one go. Any suggestions?
09-24-2017 04:59 PM
The usual approach with this type of problem is to load the SAS table into Oracle as a temporary table first, then use SQL passthru to join to the temporary table. You can't really avoid moving one of the tables into the software environment where you are going to do the processing if they are large as it is the only way the join is going to work efficiently.
09-24-2017 05:12 PM
With a heterogenous join and if you want to use Oracle specific functions then there is no other way than to first upload the SAS table into Oracle.
Why do you need it "in one go"?
You can also upload into a temporary table and you can use bulk-load.
As you certainly know data transfer between SAS and the DBMS is most of the time the bottleneck so you want to minimise such transfer. Which way to go depends on your table volumes and cardinality.
If that's a regular process and performance is critical then also consider to maintain a permanent table in Oracle for your SAS data and only send updates to this table.... but if and what's possible depends of course on your actual circumstances so just throwing ideas here.
09-24-2017 08:01 PM
How big is the SAS data set? One option depending on the logic is to move the list of ID's over as macro variables.
Or load the SAS table to your Oracle env.