10-12-2016 07:10 AM
Currently I have a long code (BASE 9.2) where work folder is stored locally. And this is slow. How can I enforce data processing for all intermediate steps to be performed on the professional rdbms (ORACLE). Can I just reassign at the begining of the code the local work to ORACLE?
10-12-2016 07:39 AM
WORK is a very special location that is created by the SAS process at startup. Instead of shifting it to the RDBMS (which cannot satisfy some of the particulars of the WORK location, as there are objects stored there that are not datasets), rather get some high-speed storage for it. SSDs are recommended.
And you probably need to work on the performance of your code. Identify the long-running steps, and see what you can do about them.
Do a google search for "improve sas performance" and apply the many suggestions in those documents.
Or come here for guidance on how to improve certain steps.
How large is your code (in lines of code), anyway?
10-12-2016 07:46 AM - edited 10-12-2016 07:59 AM
No, you can't. SAS and Oracle are totally different systems. Oracle doesn't have the concept of WORK the way SAS has it.
When interacting with a database and larger data volumes then you need to pay a bit more attention on how you design your data flow.
What you want to do, is to keep data transfer between the database and SAS to a minimum. What you normally would try to do, is to keep processing in-database until you've reduce data volumes to the level how you need it in SAS, then transfer the data and do the rest of data manipulations in SAS.
What this means for you on the Oracle side:
1. Quite a bit of code to be written in Oracle SQL syntax using the pass-through facility
2. On the Oracle side either create intermediary tables (if the DBA allows for it) and drop these tables immediately once no more required; OR... pack all the logic into a single Oracle query.
There is a concept of "temporary tables" in Oracle which I confused in the beginning with SAS Work tables. It's actually not the same as the table structure as such is not at all temporary (you still need to create and/or drop it). Temporary is only the data attached to this table structure.
You can use such temporary Oracle tables out of SAS but it's most of the time not really what you're after (as it's not the same like a SAS table in WORK ). IMHO temporary Oracle tables are only worth considering if you need the same table structure regularly out of different processes. The main advantage of temporary tables is basically that you don't have to drop the data (delete/truncate) once you're done with your job and that different processes can attach different data to the same temporary table structure at the same time in parallel (and each process gets only its own data).
10-12-2016 08:03 AM
The main advantage of temporary tables is basically that you don't have to drop the data (delete/truncate) once you're done with your job
Yes, this is exactly the reason. The final data afterwards will be anyway stored in the same ORACLE, but moved to its final tables. Actually SAS is used just for retreiving data from different sources (csv, txt, ORACLE, SQL Server) at the same time, merge it and do with that some basic transformations (for which SAS is overqualified:-)).
The code itself is split up into many different files, but put together I get 286 A4 pages. So the intention to move the intermediate steps to ORACLE was just kind of quick win (just by redirecting at the very begining of the script). Next steps will be anyway to rewrtite the whole script towards SQL (on ORALCE), but using SAS just as ETL tool (other ETL tools are not allowed unfort.).
10-12-2016 08:11 AM - edited 10-12-2016 08:34 AM
other ETL tools are not allowed unfort
Why is that unfortunate? SAS is really great at reading data from many sources and text files in any crazy structure.
So what you possibly want to aim for is ETLTL.
Use SAS to extract the data, do some data validation and "basic" transformations (like re-coding; SAS Formats are great for that) and then load the data into Oracle staging tables. Do the lookup validations, joins and loads directly in Oracle (using pass-through SQL you still can run everything out of SAS but in this stage SAS becomes the "remote control" only).
Does that make sense?
P.S: Once you've got your staging tables loaded you can develop all the Oracle SQL directly with SQL clients (eg. SQL Developer). Once everything is working you just need to "wrap" SAS pass-through Execute syntax around the Oracle SQL (and this could even be PL/SQL). I've done this successfully in the past - and that's what I call using SAS as "remote control".
The advantage of such an approach is that from a scheduling and monitoring perspective everything runs out of SAS.
10-12-2016 08:20 AM
unfort. just because I think SAS deserves more, than just how is it utilized in the project I am working on. The utilization of SAS in this project is like to "shoot with cannons on sparrows".
Sure, it make sense so! But obviously I have to forget about my "quick win"...