OscarBoots2
Calcite | Level 5

Hi Forum,

 

Using SAS EG 7.1 with Windows 7.

 

I have 2 large tables, one with hundreds of thousands of records and one with millions of records, that I access via an Oracle Pass Through query.

 

To keep the run time down, I first created a Pass Through that extracts the common data between the 2 tables, reducing the data to 60K unique IDs.

 

I now want to use that data as a join in a new Pass Through so I can get some more fields from these large tables.

 

The plan is that the tables now only need to check 60K of records instead of the entire table.

 

Unfortunately, a subquery still takes too long to run, and when I try to add the extracted dataset in the Pass Through, it's not recognised.

 

I wonder if the 60K unique IDs can be saved somewhere (a temp table?) so they will be usable in the second Pass Through?

 

Any suggestions welcome.

 

Thanks

34 REPLIES
OscarBoots2
Calcite | Level 5

This may help explain my issue:

 

I had a thought that if a temp table isn't an option, maybe an export to a new location may be an answer?

 

Just thinking of any possibilities.

 

 

proc sql;
connect to oracle as ora (path="(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)
(HOST = xxxxxxxxxxxxxxx.com)(PORT = 1234))
(CONNECT_DATA = (SERVICE_NAME=xxxxxxxx)))" 
user=xxxxxx pw=xxxxxxxx);

create table WORK.xxxxxxxxx as select * from connection to ora
(
select distinct
A.ID

from QUERY A
LEFT JOIN QUERY_NEXT B
ON A.ID = B.ID

/*-----------------------------------------------*/

/*Is it possible to add code here to export the data back to another location?*/

/*--------------------------------------------------*/

);
disconnect from ora;
quit;
kiranv_
Rhodochrosite | Level 12

You can create the final tables in multiple steps in Oracle by using a global temporary table, and then access those tables outside of CONNECT by using a LIBNAME. I haven't tested any of this and it has been a while since I did something like this in Oracle, but you will get the idea. I know @SuryaKiran is well versed with Oracle pass-through; I would like to hear what he thinks on this and what his suggestion would be.

 

proc sql;
connect to oracle as ora (path="(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)
(HOST = xxxxxxxxxxxxxxx.com)(PORT = 1234))
(CONNECT_DATA = (SERVICE_NAME=xxxxxxxx)))" 
user=xxxxxx pw=xxxxxxxx connection =global);

execute( create global temporary table xyz on commit preserve rows as
select distinct
A.ID

from QUERY A
LEFT JOIN QUERY_NEXT B
ON A.ID = B.ID ) by ora;
execute( commit ) by ora;

/* a second temp table can be derived from the first */
execute( create global temporary table hello on commit preserve rows as
select * from xyz
where name = 'whatever' ) by ora;
execute( commit ) by ora;

disconnect from ora;
quit;




/* you can use this temporary table by libname and create your sas table */


libname oradb oracle user=scott pw=tiger path=lupin connection=global dbmstemp=yes;

proc sql;
create table work.table_you_want as select * from oradb.hello ;
quit;

  

kiranv_
Rhodochrosite | Level 12

Yes, this is done easily. Below is sample code for Teradata, but Oracle has global temporary tables which you can use the same way. The code below creates 2 temporary tables and then creates the final table. Please look into the paper linked below; even though it is mostly for Teradata, it can easily be applied to Oracle.  http://analytics.ncsu.edu/sesug/2017/SESUG2017_Paper-57_Final_PDF.pdf

proc sql;
connect to teradata (server=myserver user=myuserid pw=mypass connection=global);
/* 1st volatile table created */
execute( create volatile table staging_customer as
(select * from edwwrkuser.Customer_table
where create_dt between '2017-01-01' and '2017-01-31')
with data primary index(cust_id) on commit preserve rows ) by teradata;
execute( commit work ) by teradata;
/* second volatile table creation (staging_txn_tbl) is not shown */
/* final table to be created */
execute( create volatile table Final_cust_txn as
(select a.*, b.txn_id from staging_customer a
inner join staging_txn_tbl b
on a.cust_id = b.cust_id)
with data no primary index on commit preserve rows ) by teradata;
execute( commit work ) by teradata;
disconnect from teradata;
quit;

 

OscarBoots2
Calcite | Level 5

Thanks kiranv,

 

That's the sort of solution I'm after.

 

Let me give it a try.

 

Thanks

OscarBoots2
Calcite | Level 5

Hi kiranv,

 

Are you able to use my posted code to show me the correct Oracle syntax for this or suggest an article I could check?

 

I'm having some trouble blending the two together.

 

No worries if you're a Teradata user, I am too, and I'm trying to get this right not having used Teradata for a while.

 

Thanks

Patrick
Opal | Level 21

You can always combine queries as below. That's certainly as fast as first running the inner query, storing it in some table and then using that table for the next query.

Look at the query bit in brackets as an implicit view.

create table want as
  select
    outer.*
  from table3 outer
  left join ( your current query joining two tables ) inner
    on outer.blah = inner.blah
OscarBoots2
Calcite | Level 5

Thanks Patrick,

 

I have tried a subquery but it is still far too slow to help in this situation.

 

Cheers

Patrick
Opal | Level 21

Reading your original post again: you're first joining the two tables to get the common keys, and then you want to join again to the one big table to get additional columns. That won't speed things up.

Just join the two tables and get all the columns you want in a single select statement. If the join is too slow then reformulate your join condition: ideally in a way that lets it use existing indexes, and if those don't exist then by avoiding OR conditions and by putting the comparisons that will rarely be true first in your condition (so the rest doesn't need testing anymore).
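A minimal sketch of that single-query approach, reusing the connection style from the original post (the path placeholder, column names, and the join type are illustrative assumptions, not the OP's actual schema):

```sas
proc sql;
connect to oracle as ora (path="..." user=xxxxxx pw=xxxxxxxx);
create table work.want as select * from connection to ora
(
  /* one pass: match the keys and pull the extra columns at the same time */
  select A.ID, A.extra_field, B.other_field
  from QUERY A
  inner join QUERY_NEXT B
    on A.ID = B.ID
);
disconnect from ora;
quit;
```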

If using pass-through, also look into Oracle hints that let you tell Oracle to execute the query in parallel.
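For instance, a PARALLEL hint sits right after the SELECT keyword inside the pass-through query (the degree of 4 here is just an illustration; check with your DBA what the instance allows):

```sas
create table work.want as select * from connection to ora
(
  select /*+ PARALLEL(4) */ A.ID
  from QUERY A
);
```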

 

And last but not least: make sure you set the READBUFF option to a bigger value than the installation default, because maybe the time-consuming part is not your query on the Oracle side but the data transfer from Oracle to SAS.
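READBUFF can be set on the LIBNAME statement (or as a data set option); a sketch, with the value 5000 picked arbitrarily:

```sas
/* fetch 5000 rows per round trip instead of the small default */
libname ora oracle user=xxxxxx pw=xxxxxxxx path="..." readbuff=5000;
```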

SuryaKiran
Meteorite | Level 14

It seems to me like you're trying to avoid heterogeneous joins between a SAS table and Oracle tables. The best approach is to load the SAS table (just the key variables you need) into a permanent table in Oracle and then run a pass-through, in-database query. Most environments will not allow users to create permanent tables in PROD; in this situation, check whether you can create a GLOBAL TEMPORARY table in Oracle or a MULTISET VOLATILE TABLE in Teradata. If yes, then load the 60K key variable values into the temp table.

Note: Loading data into a temp table is not as fast as loading data into a permanent table.

 

libname ora oracle user= pw= path= connection=global DBMSTEMP=YES;
  proc sql;
  connect to oracle (user= pw= path= connection=global);
   execute (create global temporary table TEMP
       (ID number) on commit preserve rows) by oracle;
  quit;

proc append base=ora.TEMP(Bulkload=YES) data=SAS_DATASET;
run;

Remember to keep CONNECTION=GLOBAL and ON COMMIT PRESERVE ROWS
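Once the keys are loaded, the second pass-through can join TEMP to the big table entirely on the Oracle side. A sketch, assuming the same global connection; BIG_TABLE and the column list are placeholders for the OP's actual table:

```sas
proc sql;
connect to oracle (user= pw= path= connection=global);
create table work.final as select * from connection to oracle
(
  /* the 60K keys in TEMP restrict the scan of the big table */
  select B.*
  from TEMP T
  inner join BIG_TABLE B
    on T.ID = B.ID
);
disconnect from oracle;
quit;
```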

 

 

 

 

Thanks,
Suryakiran
OscarBoots2
Calcite | Level 5

Thanks to all for your input so far.

 

Suryakiran, I've changed your code to do what I think I need to do to get the connection right, can you advise if it's correct or not.

 

Note, I have HOST & PATH in my connection criteria; how do I adapt it to the Global Temp criteria?

 

Thanks

 

libname ora oracle user=xxxxxxxx pw=xxxxxxx path= ?
(HOST = XXXXXXXXXXXXXXX.com)
(PORT = 1234)
connection=global DBMSTEMP=YES;
proc sql;
connect to oracle (user=XXXXXXXX pw=XXXXXXXX path= ?
connection=global);
execute (create global temporary table TEMP (ID number) on commit preserve rows) by oracle;
quit;

proc append base=ora.TEMP(Bulkload=YES) data=SAS_DATASET;
run;

 

 

SuryaKiran
Meteorite | Level 14

You are using a complex connection string instead of defining your connection entries in tnsnames.ora. Work with your DBA to set up tnsnames.ora so you can connect to Oracle with less hassle. Anyway, if you prefer this way then check this LIBNAME:

 

libname oralib1 oracle path="(DESCRIPTION=(ADDRESS = (PROTOCOL = TCP)(HOST =)(PORT = ))
(CONNECT_DATA = (SERVICE_NAME=)))" user=myuser password=mypasswd connection=global dbmstemp=yes;

For more information take a look at this topic: A quick guide to connecting to Oracle from SAS by @JBailey

Thanks,
Suryakiran
OscarBoots2
Calcite | Level 5

Thanks Suryakiran,

 

That worked, or at least didn't error, so I assume I've now created a temp table, because there's no visible result.

 

How can I now insert my records from a query into this table?

 

Thanks

SuryaKiran
Meteorite | Level 14
/* This query will only create an empty table */
proc sql;
connect to oracle (user=XXXXXXXX pw=XXXXXXXX path= ? connection=global);
execute (create global temporary table TEMP (ID number) on commit preserve rows) by oracle;
quit;

/* Insert records into the temp table using PROC APPEND */
proc append base=ora.TEMP(Bulkload=YES) data=SAS_DATASET;
run;
Thanks,
Suryakiran
OscarBoots2
Calcite | Level 5

Thanks, should I be adding a select statement after 'data=' in order to get results?
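One reading of that last question: DATA= in PROC APPEND expects a data set or view, not a SELECT statement, so the selection has to exist first. A sketch, where work.keys is a hypothetical view over the 60K-ID table created in the first pass-through (work.xxxxxxxxx stands in for its name):

```sas
proc sql;
create view work.keys as
  select distinct ID
  from work.xxxxxxxxx;   /* the table from the first pass-through */
quit;

proc append base=ora.TEMP(Bulkload=YES) data=work.keys;
run;
```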


Discussion stats
  • 34 replies
  • 3190 views
  • 2 likes
  • 4 in conversation