SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

proc sql to retrieve 10m rows from Oracle does not finish

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 133
Accepted Solution

proc sql to retrieve 10m rows from Oracle does not finish

Hello SAS Experts,

I am without a clue why the performance in SAS is so bad. I run the SQL via EG 7.1 on our SAS Server using SAS/ACCESS for Oracle.

The SQL Script itself selects abou 10-12m rows in Oracle in less than 30 minutes (Oracle SQL Developer) if I submit the same

SQL script via EG it just does not finish. I come back the next day and its still running. Something is really strange.

Any ideas would be amazing!



The below SQL embedded in the proc sql takes 3 minutes to run on the Oracle DB and does not finish after 3 hours within a SAS program running on a SASApp Server. why is that? 10m rows.


proc sql;
   connect to oracle as oracleadb (user=xxxx password='xxxx' path=DMP_ADB buffsize=50000);
     create table SASDWH.sasdwh_fact as
     (
       select *
           from connection to oracleadb          
             (

                /* takes 3 minutes to run in Oracle, 10m rows */
                SELECT
                       ....
            )  /*connection close*/      
     );
   disconnect from oracleadb;
quit;


Accepted Solutions
Solution
‎06-13-2016 07:02 AM
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

Well, you obviously ran out of disk space in library SASDWH.

Try using the compress=yes dataset option on your output dataset; this is especially helpful when long character fields are present.

If that does not help, either free up disk space by removing unneeded data or somehow get more space.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super User
Posts: 5,429

Re: proc sql to retrieve 10m rows from Oracle does not finish

What happends if you issue this SQL in the via the SQL*NET client on the SAS server, but not using SAS?

The gut feeling is that the bottleneck is on the network.

Another way to test the bandwith is to ftp files between the hosts.

Data never sleeps
Frequent Contributor
Posts: 133

Re: proc sql to retrieve 10m rows from Oracle does not finish

Hi,

 

I have not spoken to IT regarding the network but the SQL client selects the 10m rows in 30 minutes. SAS proc sql does not finish but quits with an error:

 

ERROR: Insufficient space in file xxxxx is damaged. I/O processing did not complete.

SAS System                              14:08 Wednesday, June 8, 2016

ORACLE: Pipelined I/O thread did not timely terminate

ORACLE: Pipelined I/O thread terminated on 2nd attempt

NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
NOTE: SQL Statement used (Total process time):
      real time           41:27.17
      user cpu time       3:19.85
      system cpu time     1:05.44
      memory              1351358.96k
      OS Memory           1369752.00k
      Timestamp           06/08/2016 02:50:17 PM
      Step Count                        2  Switch Count  1425
      Page Faults                       0
      Page Reclaims                     4367
      Page Swaps                        0
      Voluntary Context Switches        1499183
      Involuntary Context Switches      19569
      Block Input Operations            656
      Block Output Operations           62433400
     

 

Solution
‎06-13-2016 07:02 AM
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

Well, you obviously ran out of disk space in library SASDWH.

Try using the compress=yes dataset option on your output dataset; this is especially helpful when long character fields are present.

If that does not help, either free up disk space by removing unneeded data or somehow get more space.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 133

Re: proc sql to retrieve 10m rows from Oracle does not finish

Posted in reply to KurtBremser
Hi Kurt,
I changed to the library engine from V9 to SPD and set the partsize to 1250M and the new error is:

ERROR: The table or an index was not created because of insufficient disk space.

I wonder how I can see how much disk space I acutally need or if its a trial and error.
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

Since you could use the DB client, you have access to the SAS server outside of SAS (commandline, GUI). This means you can look at the disk space of your SASDWH location with tools depending on the type of the operating system (ie df on UNIX)

 

What if you just import 1 million records instead of 10 million?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 133

Re: proc sql to retrieve 10m rows from Oracle does not finish

Posted in reply to KurtBremser
Hi. I just talked to the SAS Server Admin. I have filled the whole 30GB partition in minutes trying to write a SAS data set with about 10m rows (also using compress) ok. I will try it with 1m now. yes, we use Unix. I am suprised that 10m rows take up more than 30gb...that doesnt seem right.
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

Some things to ponder:

Look at the structure of your dataset (create it with just a few records and use proc contents). This will show you the observation size.

Next, look at the state of the partition before and while you run your data step.

Third, look if there's a quota regulation active on the partition. You might run into a personal quota limit long before the filesystem actually flows over.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 133

Re: proc sql to retrieve 10m rows from Oracle does not finish

Posted in reply to KurtBremser
Hi,

I run the proc contents on the 1m rows that I managed to load.

Blocking Factor (obs/block) 264
Data Partsize 1310664960

Obersvations: 1000000
Variables: 100
Obs length: 3960
Compressed: No
Sorted: No

Engine: META
membertype: data
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

So you see that you get roughly 3960 * 1000000 bytes. Thats about 3.8 GB. Multiply by 10, and you get your filesystem overflow past 30 GB.

Also note that

Compressed: No

says you did not yet use the (compress=yes) dataset option, which might reduce the physical dataset size significantly if mostly empty character variables are present in the dataset.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

You should be able to log on to a UNIX server using your SAS server credentials, using a SSH client like PuTTY.

Determine the physical location of your SASDWH by right-clicking on the library in the EG server list and selecting Properties

Then do

df -k location

from the UNIX commandline. This shows you filesystem size and usage in kbytes.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Super User
Posts: 7,955

Re: proc sql to retrieve 10m rows from Oracle does not finish

Have you taken this up with your IT people so they can debug the situation.  Could be any number of things, space on drive, network connection, ODBC driver etc.

Super User
Posts: 3,254

Re: proc sql to retrieve 10m rows from Oracle does not finish

[ Edited ]

Try a select count(*) only with your program. Does it complete and how long does it take? This will test the database response, removing the time to download the data.

Super User
Posts: 10,028

Re: proc sql to retrieve 10m rows from Oracle does not finish

Add libname option readbuff=100000 into it to make it faster. Or can you just use data step to copy Or Try PROC COPY .
Super User
Posts: 7,782

Re: proc sql to retrieve 10m rows from Oracle does not finish

Since SAS uses a locally installed client to the DB in question, I'd use that same client to run the same select and store the resulting data in a local flat file. You might have a misconfiguration in the Oracle client to begin with.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 16 replies
  • 914 views
  • 1 like
  • 6 in conversation