BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Peter_B
Calcite | Level 5

Hi,

 

I have some large tables that I need to query but sometimes the query takes so long, SAS EG times out or crashes.

 

I got a list of the Indexes from our IT guys & read up on some ways to improve my query but I think my syntax is wrong.

 

Below are a couple of examples that just don't run at the moment.

 

Any tips as to my syntax welcomed.

 

I read that writing (idxwhere=no) after the table name will stop SAS doing a full table scan.

 

My error is just a general syntax error so I must have it really wrong. 

 

Thanks for any help.

 

Select *

from Table_a A (idxwhere=no)

Inner Join Table_b B

on A.ID = B.ID

------------------------------------------

Select *

from Table_a A (idxname=Cust_id)

Inner Join Table_b B

on A.ID = B.ID
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@Peter_B

Shouldn't the Index field exist on the table when I query it?

"index fields" don't exist. There can be fields for which an index has been created. Look at an index as a separate object besides of the table. It's kind-off a table of content so when you're looking for something instead of having to scan through the whole table, you first go to the index which tells you where the data lives in the table.

 

There is always the odd exception but normally there is nothing special you need to do/code for Oracle to use an index if and as appropriate. You don't need to specifically "mention" the index name in your code.

 

Looking at the code you've previously posted I would expect C.CMX_TRAN_ID to be indexed (or it's even the primary key).

 

Select xxxx
from TDRCORE.FCM_CSDP C 
INNER JOIN FCMCORE.FCM_RULES_FIRED F ON C.CMX_TRAN_ID = F.CMX_TRAN_ID

 

 

May be it's just the data volume your query returns, which makes that slow. Run a Select count(*) from <your query> to test this theory.

 

Also: Is there also a Create Table in your actual SQL or are you actually creating output. If the latter then this could certainly take up a bit of time as soon as data volumes grow.

 

And last but not least:

Make sure that libname option READBUFF (for TDRCORE and FCMCORE) is set to a much higher value than the default.

http://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.3&docsetId=acreldb&docsetTarget=p0fmg...

 

 

View solution in original post

13 REPLIES 13
andreas_lds
Jade | Level 19

Have a look at the documentation of proc sql - you need

proc sql;
 <insert query here>;
quit;

 

EDIT: Hit "post" to early.

Kurt_Bremser
Super User

As an alternative, use sorting with proc sort and a data step merge (unless you have to deal with a many-to-many relationship).

You might be surprised by the performance.

 

When encountering a syntax error, always post the complete log of the failing step(s). Use the {i} button for logs.

SASKiwi
PROC Star

If you provided more details on your table joins like posting the SAS log of one of them as well as table row numbers we might be able to provide alternative techniques.

Peter_B
Calcite | Level 5

Hi SASKiwi,

 

The error is as below:

 

ERROR: ORACLE prepare error: ORA-00933: SQL command not properly ended. 

The Proc SQL query runs OK without the '(idxwhere=no)' in the where clause below:

 

Select xxxx
from TDRCORE.FCM_CSDP C (idxwhere=no)
INNER JOIN FCMCORE.FCM_RULES_FIRED F
ON C.CMX_TRAN_ID = F.CMX_TRAN_ID

My question is where & why this query fails because of this Index reference & what's the correct way to apply this Index?

 

Thanks

SASKiwi
PROC Star

IDXWHERE is only valid when querying SAS data, not Oracle data.

Peter_B
Calcite | Level 5

Ok, thanks, so how do I apply an index to an Oracle query?

 

 

Patrick
Opal | Level 21

@Peter_B

Oracle does a lot of this stuff for you "automatically". If you know what indexes exist then try to formulate your WHERE clauses and JOIN conditions along these indexes, i.e. if the is an index for ColA, ColB, ColC the write your select to first select on ColA before ColB.

 

What's also important when interfacing with data bases: Try and push processing to the data base so that you reduce data volumes as much as possible before you transfer the result set into SAS.

1. Avoid joins between SAS and Oracle tables (load the smaller table to the side of the larger table first)

2. Avoid using SAS functions which the SAS Access engine can't push to the database (as else all the data gets first transferred to the SAS side and only then processed)

 

Here the list of supported SAS functions:

http://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.3&docsetId=acreldb&docsetTarget=p0f64...

 

Use the following options in your code as this will show you in the SAS log what part of implicit SAS SQL has actually been sent to the database for execution - and though which part has only been executed on the SAS side.

options sastrace=',,,d' sastraceloc=saslog nostsuffix;

 

And if performance becomes really important and you need full control: Use explicit SQL pass-through as this allows you to write the SQL directly in the Oracle SQL flavor and also use Oracle SQL specific functions and syntax (including hints). 

You can develop such SQL directly in clients like SQL Developer and once working just copy/paste the code into the SAS code part for your explicit SQL. 

 

Peter_B
Calcite | Level 5

Thanks Patrick,

 

Your explanation helps a lot.

 

So I now understand the Index fields should be used in the Join between the 2 tables I;m using.

 

I have a list of the Index fields that IT has created on the Large Transaction table I'm joining to but none of those fields exist on my Transaction or main tables?

 

Shouldn't the Index field exist on the table when I query it?

 

Do I need to go back to IT & ask how I join these 2 tables using an Index considering I can't see any Index field?

 

Or should I just use one of the Index fields & join both tables by an Alias?

 

Thanks for you help.   

Patrick
Opal | Level 21

@Peter_B

Shouldn't the Index field exist on the table when I query it?

"index fields" don't exist. There can be fields for which an index has been created. Look at an index as a separate object besides of the table. It's kind-off a table of content so when you're looking for something instead of having to scan through the whole table, you first go to the index which tells you where the data lives in the table.

 

There is always the odd exception but normally there is nothing special you need to do/code for Oracle to use an index if and as appropriate. You don't need to specifically "mention" the index name in your code.

 

Looking at the code you've previously posted I would expect C.CMX_TRAN_ID to be indexed (or it's even the primary key).

 

Select xxxx
from TDRCORE.FCM_CSDP C 
INNER JOIN FCMCORE.FCM_RULES_FIRED F ON C.CMX_TRAN_ID = F.CMX_TRAN_ID

 

 

May be it's just the data volume your query returns, which makes that slow. Run a Select count(*) from <your query> to test this theory.

 

Also: Is there also a Create Table in your actual SQL or are you actually creating output. If the latter then this could certainly take up a bit of time as soon as data volumes grow.

 

And last but not least:

Make sure that libname option READBUFF (for TDRCORE and FCMCORE) is set to a much higher value than the default.

http://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.3&docsetId=acreldb&docsetTarget=p0fmg...

 

 

Peter_B
Calcite | Level 5

Thanks Patrick,

 

That's very informative, the ID field you mentioned is the PK so it may be the best option.

 

I'll ask about the READBUFF option & suggest this to IT support.

 

Thanks again.

Peter_B
Calcite | Level 5

I've just tried to run the Select count(*) from table query which is also taking a fair amount of time.

 

Would adjusting the READBUFF help here?

SASKiwi
PROC Star

READBUFF won't help here. This option is for optimising the data extraction from the database. Since select count(*) is only a single row it will have no impact. Since select count(*) is still slow then you can assume it is the query itself that is running slow, not the reading of the data across a network to your SAS server.

I suggest you post the complete SAS log of one of your slow queries including the actual run, CPU time and table row count notes.

I've always found Oracle tricky to get good performance out of for complex queries. One technique that I've found pretty useful for speeding up queries is by supplying an Oracle hint to parallelise it and by using SQL passthru to get total control. Here is an example of this:

https://communities.sas.com/t5/Base-SAS-Programming/Please-explain-the-purpose-of-statement-Select-p...

 

 

 

Peter_B
Calcite | Level 5

Another question for you about your comment 

 

so when you're looking for something instead of having to scan through the whole table, you first go to the index which tells you where the data lives in the table.

 

 

I also have the Index field names that don't match any fields, so how do I use the Index filed(s) in my query?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2357 views
  • 0 likes
  • 5 in conversation