Desktop productivity for business analysts and programmers

long time to get the results for query

Reply
Contributor brm
Contributor
Posts: 30

long time to get the results for query

Hi,

I'm working on Environment where we have connection to the database from sas server.We are using EG 4.2 to access those database files.

I'm running below query,it is running since 40 min.I have million records to process.

select fs.sender_name "Specialty Pharmacy",

pat.drc_case_id "DRC CASE ID",

ship.most_recent_ship_date "Milestone_Date",

ship.ndc_nbr "Product",

'Ship' as "Milestone_Type"

bi.num_of_days_supplied "Days of Therapy"

from cld_account_cooked acct,

cld_benefit_cooked bi,

received_files rf,

cload.file_senders fs,

cld_patient_cooked pat

left join cld_detail_cooked ship on ship.cld_patient_cooked_id = pat.cld_patient_cooked_id

where rf.file_date_of_report >= '01-Jan-2010'

and fs.sender_name = 'SDI CuraScript'

and rf.received_file_id = pat.received_file_id

and acct.cld_account_cooked_id = pat.cld_account_cooked_id

and bi.cld_patient_cooked_id = pat.cld_patient_cooked_id

and fs.file_sender_id = rf.file_sender_id)

Could somebody suggest me what i can do to improve performance.

Thanks,
brm.
Frequent Contributor
Posts: 91

Re: long time to get the results for query

Hi,

I assume that you are accessing an external relational database, using some access engine. If this is the case, or you are using your local ODBC from SAS EG, then, your query is probably not implicitly able to use SQL passthrough. That means, that SAS need to copy the entire database contents before it is able to process the query, as apposed to using the DBMS to carry out the processing and only pushing through the data that you need. This is the difference between waiting minutes, possibly hours, compared to a few seconds.

If you dig up from the SAS Global Forum 2011 the Paper 'Explicit SQL Pass-through: Is it Still Useful', there are some hints on this issue. This article is also useful:

http://support.sas.com/resources/papers/sgf2008/optimization.pdf

Nick
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

Yes Nick i'm trying to access an external relational database, using SAS Server.

All database tables r there in one folder called XYZ.

If i'm trying to access those tables programatically,for example

XYZ.tablename...EG giving error saying not able to find that folder.

Can you suggest me how can i access those tables...rather than point and click interface.

Thanks,
brm
Trusted Advisor
Posts: 2,114

Re: long time to get the results for query

Can you share a copy of the log with the error message as well as the log for assigning the LIBNAME. I suspect your LIBNAME statement.

When you use the point-and-click interface to get to the database, you are actually accessing it from the client rather than the server. That too may be part of the slowness; see
http://www.youtube.com/watch?v=OSTa1EUpKT8
for an example of the problem.
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

I'm using this kind of environment for first time..I don't know how to access this files(without
point and click interface).

i'm doing like this.


unders servers...sas server(name) Smiley FrustratedASAPP

folder under sas server(where oracle table are stored)
ctest_data

i'm trying to access ctest_data.tablename.Getting error saying,libname ctest_data
not found.

we don't have any kind of access to server(i mean userid password or path or dsn)..
I don't know how to use libname statement.

Pls someone help me on this issue.

Thanks,
brm
Respected Advisor
Posts: 4,135

Re: long time to get the results for query

Hi

Right click on the table and select "Add to project".

Double click the table under the project flow. If this opens it and you can see data then access to the data base as such works. If this doesn't work then contact your SAS admin as this is then an issue you can't resolve.

If access to data works:
Open a program window and drag the table from the process flow into the program window.

This will create a comment in your program window with the correct libref and table name. Just use exactly this one in your FROM statement.

HTH
Patrick
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

Patrick,

Access to the data worked....but if i drag the table and placing in the program window it's not genarating libname statement.

Thanks,
brm
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

Hi,

Following the same steps u suggested,but still giving an error saying library is not assigned.

/*Data Source: CLOAD_RT.RECEIVED_FILES */

/*Data Source: CLOAD_RT.FILE_SENDERS */
PROC SQL;
CREATE TABLE TEST AS
SELECT * FROM CLOAD_RT.RECEIVED_FILES;
QUIT;

ERROR: Libname CLOAD_RT is not assigned.
Respected Advisor
Posts: 4,135

Re: long time to get the results for query

Hi

If you can click on the table as such and you see the data connection to the database works and rights to access table data are sufficient.

Dragging the table into the program window will generate the correct syntax for libref.tablename to be used in any program window.

The error you're getting is possibly caused because the library is NOT pre-assigned - meaning the library node is initially white and not yellow. If so then first right click on this node and choose "assign".

Also: If you have several servers showing up in EG make sure you're running your code on the correct server (the one under which the library node shows up). In your program window right click and then select the correct server.

You don't have to issue any libname statement - but if you're interested how the libname statement looks like then issue in a program window the following command (once the library node is yellow): libname cload_rt list;

HTH
Patrick
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

Hi Patrick,

Library is assigned,it is yellow in colour.I ran the libname statement as u suggested.

libname cload_rt list;

libname cload_rt list;
ERROR: Libname CLOAD_RT is not assigned.
ERROR: Error in the LIBNAME statement.

What might be the problem for this?..one more thing is i have to run three quries like i posted,and then i have to combine those three results.If i do it by point and
click interface,i have to run three subquries and then joining those results...which is causing load on the server.

If i'm able to run the query through programming i can use union and send everything at once....it might reduce load on SAS Server.

There are 1.2 million records in the final output table.

Thanks,
Bhavani.
Respected Advisor
Posts: 4,135

Re: long time to get the results for query

Hi Bhavani

The point and click interfaces generate also code and it's code only which is sent to the SAS Server for execution. So if this works the you should also be able to send code from a programming node to the SAS Server.

To start with valid syntax you could copy the code generated by the query builder into a programming window (i.e. from the code preview or if already executed once then also from the code tab).

If the library node is yellow AND point and click works then the only explanation I have is that you're sending the code in the program window to the wrong server (where the libname is not defined).
Have you selected the correct server for the program node?
Try the following at the beginning of your code:
%put _global_;
...and then check in the log the value for "_SASSERVERNAME" which should be the name of the server where the library is assigned.

I agree that executing everything within the DBMS and then only load the result set into SAS is generally the way to go.

If in your case this is an outer union then you're probabely also fine by defining 3 queries with views as output (just select as output view in the query builder) and then append these views via "Task\Data\Append Table...".

HTH
Patrick

Message was edited by: Patrick
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

Thanks everyone for the help......Finally i'm able to run the query without issues....Again i have to check with large amount of data.
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

Here is the libname ststement and error message.

libname CLOAD_RT SASApp user=xyz password=XXXXXXXX path='abc';
ERROR: The SASAPP engine cannot be found.
ERROR: Error in the LIBNAME statement.


16 PROC SQL;
17 SELECT fs.sender_name FROM CLOAD_RT.FILE_SENDERS fs
18 WHERE fs.sender_name = 'SDI CuraScript';
ERROR: Libname CLOAD_RT is not assigned.
Contributor brm
Contributor
Posts: 30

Re: long time to get the results for query

I don't have oracle installed on local machine
Respected Advisor
Posts: 4,135

Re: long time to get the results for query

Hi

Using below option will show you in the log what Oracle SQL syntax actually gets sent to the DB.
options sastrace=',,,d' sastraceloc=saslog nostsuffix;
from: http://support.sas.com/documentation/cdl/en/acreldb/63647/HTML/default/viewer.htm#a000433982.htm

You're joining 6 tables (one of it a left join). Depending of what path Oracle takes this could be quite inefficient.

It sometimes helps to just re-order the joins a bit. What I like to do in such a case is to use a tool like SQL developer, write my SQL query in Oracle syntax and then use explain - and also look at the "Costs". This makes tweaking the code much easier.

If you don't have this possibility then one way worth trying could be to identify tables where an inner join reduces volumes the most and then reformulate your query accordingly. I.e. let's say the inner join of account and patient:

select fs.sender_name "Specialty Pharmacy",
pat.drc_case_id "DRC CASE ID",
ship.most_recent_ship_date "Milestone_Date",
ship.ndc_nbr "Product",
'Ship' as "Milestone_Type"
bi.num_of_days_supplied "Days of Therapy"

from
(
cld_account_cooked acct,
cld_benefit_cooked bi,
received_files rf,

cload.file_senders fs,
(select
pat.drc_case_id "DRC CASE ID",pat.received_file_id
from
cld_account_cooked acct,
cld_patient_cooked pat
where acct.cld_account_cooked_id = pat.cld_account_cooked_id
) pat
)
left join cld_detail_cooked ship on ship.cld_patient_cooked_id = pat.cld_patient_cooked_id

where rf.file_date_of_report lt;= '01-Jan-2010'
and fs.sender_name = 'SDI CuraScript'
and rf.received_file_id = pat.received_file_id
and bi.cld_patient_cooked_id = pat.cld_patient_cooked_id
and fs.file_sender_id = rf.file_sender_id


As inner joins perform normally much better than left joins you could also try to have all the inner joins in such a sub-select bracket and only then add the left join.

Make also sure that the table for the left join has a 1 to zero or one cardinality or else you might end up with much more resulting rows than expected (and this would be a reason for the long running query).

By the way: Should the date end with a 'd': '01-Jan-2010'd


HTH
Patrick Message was edited by: Patrick
Ask a Question
Discussion stats
  • 14 replies
  • 303 views
  • 0 likes
  • 4 in conversation