DATA Step, Macro, Functions and more

few questions in SAS.

Reply
N/A
Posts: 0

few questions in SAS.

Hi friends,
I have couple of question in sas

I am connecting to a SQL server through a SQL pass through query by OLEDB connection. But it takes too much time to fetch data. I have a huge data, i know but what may be the other possible reasons?
If i use READBUFF option what is the optimal value for it?
and i also have some space problem since i frequently occur "file is full or damaged" ERROR. even though i store that dataset at a permanent library. so what is maximum size of dataset and how to increase that size.

I have to fetch some history data also , so is there some way so that i do not need to fetch data form SQL every time, but i use the permenent dataset and append new data to it everytime.?

thank you.
Regards
Avi.
SAS Employee
Posts: 160

Re: few questions in SAS.

If you are accessing your SQL database through ODBC then avoid using SAS functions in your where statement. (otherwise all rows will be transfered to your SAS session)

Check if your database has any indexes, you can use in you where statement.
Trusted Advisor
Posts: 2,113

Re: few questions in SAS.

Avi,

I've not used OLEDB much; it's pretty much obsolete. Much better to use SAS/Access to Oracle if you have it and ODBC if you don't. SAS/Access to Oracle is optimized to get the maximum performance and features out of the linkage, so using that could make the transfer much quicker.

If you are doing SQL pass-through, then the WHERE and other constructs go through to the host UNLESS SAS recognizes them as not being supported. You probably want to read the SAS manuals on OLEDB to see if that is causing the entire table (or set of tables) to come to SAS from Oracle.

The maximum size of a SAS dataset is generally bigger than the maximum allowed by the operating system; I suspect the OS disk drive is running out of space (or out of the allocation allowed by your network admin). You can modify the length of the individual columns as they are stored into the table by PROC SQL; see the manual for how to do that. You also may be getting the error on your WORK library, especially if SAS decides to bring everything over from Oracle, so check the source.

To link new data to an archive, you can bring in the new data and combine it with the archive in a separate step. Since space is an issue, you may want to create a VIEW to do the combining.

Doc Muhlbaier
Duke
N/A
Posts: 0

Re: few questions in SAS.

Hi Doc,

The Database to which I am connecting is MSSQL Server. There is NO Where clause to my query therefore I cannot optimize my query there. As I need to generate many reports that needs all data for either of reports. And I have to use OLEDB only as i have only got that.
The other problem seems to be, i need to generate may of datasets for aggregating my data, though i am reusing the datasets as n when its use is over but still i think i should be some more efficient.

Regards
Avi
Trusted Advisor
Posts: 2,113

Re: few questions in SAS.

Try searching support.sas.com for optimization techniques; here is one I found

http://support.sas.com/documentation/cdl/en/acreldb/61890/HTML/default/a002252991.htm

If you are not using a WHERE, then I infer that you are bring the entire table across. Minimizing the length of the variables you store in a SAS dataset will make subsequent reads faster. Multiple reports generally run faster off the SAS datasets than if you were to query the database each time.

You may find a lot of good ideas for optimization in the BBU "Professional SAS Programming Secrets" by Aster & Seidman.
N/A
Posts: 0

Re: few questions in SAS.

Thanks Doc

thanks for help. I will surly read the book.
But i I am surprise even there is no where clause still its taking so much time .
i have tried with series of values for readbuff option but there was no appreciable change in performance.
You said
"Multiple reports generally run faster off the SAS datasets than if you were to query the database each time."

i did not get that . can u explain more..

regards
Avi
Trusted Advisor
Posts: 2,113

Re: few questions in SAS.

You said
"Multiple reports generally run faster off the SAS datasets than if you were to query the database each time."

Performance tuning has lots of variables to deal with, so any generalization needs to be tested locally. "Your mileage may vary." Here is my reasoning for this question. SAS is optimized to read SAS datasets. If you are bringing across an entire table (e.g. no where processing), then that is the equivalent of reading a SAS dataset sequentially, so SAS usually does that quicker than it would using a remote database. However, there are other variables in the mix: How fast is the disk drive for the SAS dataset? Is the disk drive local or on a network drive? How fast is the network connection, if needed? How fast is the connection to the database server (it could be different from the connection to the network server to the database server).

The most common situation is a remote database server and getting the data from it requires running the server tools as well as transmission over the network. A work SAS dataset on a bus-attached (local) drive is often substantially faster than the network transfer.
Trusted Advisor
Posts: 2,113

Re: few questions in SAS.

You said
"Multiple reports generally run faster off the SAS datasets than if you were to query the database each time."

Performance tuning has lots of variables to deal with, so any generalization needs to be tested locally. "Your mileage may vary." Here is my reasoning for this question. SAS is optimized to read SAS datasets. If you are bringing across an entire table (e.g. no where processing), then that is the equivalent of reading a SAS dataset sequentially, so SAS usually does that quicker than it would using a remote database. However, there are other variables in the mix: How fast is the disk drive for the SAS dataset? Is the disk drive local or on a network drive? How fast is the network connection, if needed? How fast is the connection to the database server (it could be different from the connection to the network server to the database server).

The most common situation is a remote database server and getting the data from it requires running the server tools as well as transmission over the network. A work SAS dataset on a bus-attached (local) drive is often substantially faster than the network transfer.
Ask a Question
Discussion stats
  • 7 replies
  • 194 views
  • 0 likes
  • 3 in conversation