Optimise a simple SAS code

Occasional Contributor
Posts: 8

Optimise a simple SAS code

I have this code:

data temp;
  set temp1(keep = a b c d);
run;

Can the run time be reduced by any means?

The dataset (temp1) has 7 million rows and the step takes a long time. Is there anything we can do about it? Please help.

Super Contributor
Posts: 578

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

So the only thing you're doing is making a copy of temp1 with a subset of the columns?

Occasional Contributor
Posts: 8

Re: Optimise a simple SAS code

No, I am copying the dataset from the server. It's a huge dataset, so it takes a lot of time. I am creating the new dataset on my local machine.

Super User
Posts: 5,435

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

If this is a common task, you want to do an infrastructure review so you can avoid any bottlenecks. This would include the server (CPU, I/O), the network, and the speed of your PC (again, CPU and disk I/O).

Do you copy the table because the server connection is slow, or do you have a laptop and want to bring your data from the office?

Data never sleeps
Super User
Posts: 5,516

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

A couple of things to consider ...

Do you really need to create this data set at all, or could you use TEMP1 for analysis?

The setting for the COMPRESS option can affect many of the resources required.  You could run a PROC OPTIONS to see your default setting for COMPRESS.  While this may reduce the time, it may increase the use of other resources:

data temp (compress=NO);
  set temp1 (keep=a b c d);
run;

The results will depend on characteristics of your data, so it is difficult to predict ahead of time.
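To check the current default before experimenting, something like the following should print it to the log. This is only a minimal sketch; the COMPRESS option name is the one discussed above.

```sas
/* Show the session's current COMPRESS system option in the log */
proc options option=compress;
run;
```

The COMPRESS= data set option on the DATA statement overrides the system option for that one output data set, so the two settings can be compared by timing the same step twice.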

Good luck.

Super User
Posts: 5,435

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

If you need to do this copy at all (what is the requirement for this task?), why are both the source and target tables in SAS Work? What is the application?

If you read from a permanent location, you could consider moving your source table to SPDE, which allows you to do multi-threaded table scans.
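As a rough sketch of the SPDE idea, assuming the source table lives in a permanent library: the librefs SRC and SPEEDY and every path below are made up for illustration and need to be replaced with real locations at your site.

```sas
/* Hypothetical librefs and paths: adjust to your site */
libname src '/data/permanent';                    /* assumed location of the source table */
libname speedy spde '/data/spde/meta'             /* SPDE metadata path */
        datapath=('/data/spde/d1' '/data/spde/d2'); /* data partitions, ideally on separate disks */

/* One-off copy of the source table into the SPDE library */
data speedy.temp1;
  set src.temp1;
run;

/* Later reads come from the SPDE copy */
data temp;
  set speedy.temp1(keep=a b c d);
run;
```

Whether this helps depends on the disks behind the data paths; if they all sit on the same saturated device, there may be little or no gain.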

Data never sleeps
Super Contributor
Posts: 358

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

Hi:

It also depends on where your temp1 dataset is located.  If it is on a SAN or in Oracle for example, the I/O for the file could be what is taking all the time.

I have found that PROC SQL sometimes works faster than a data step in these cases:

PROC SQL;
  CREATE TABLE TEMP as
  SELECT A, B, C, D
  FROM TEMP1;
QUIT;

Occasional Contributor
Posts: 8

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

Hi guys, I am actually taking the data from a server, so the temp1 dataset is at the server location and I am creating the new dataset on my local desktop. I just wanted to check whether there is any technique that would reduce the time needed to get those columns. The dataset is huge, so this set of statements alone takes 2 hours.

Super User
Posts: 5,516

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

Unless you have a very old PC, 2 hours is longer than it would take to run the program if both data sets were located on the PC.

One possibility would be to use PROC COPY to copy the data set.  You would have to copy the entire data set, and then run a second program later on the PC to subset the variables.  That combination could easily be faster, but it would depend on how many variables are in the original data set.

If you have the storage space on your PC, it's worth testing how long PROC COPY would take.
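The two-step approach could look roughly like this. The librefs SRV and PCDATA and both paths are placeholders, not anything taken from your setup.

```sas
/* Hypothetical librefs: SRV points at the server location, PCDATA at a folder on the PC */
libname srv    '\\server\share\sasdata';
libname pcdata 'C:\sasdata';

/* Step 1: copy the whole data set as-is */
proc copy in=srv out=pcdata memtype=data;
  select temp1;
run;

/* Step 2: subset the variables locally */
data pcdata.temp;
  set pcdata.temp1(keep=a b c d);
run;
```

Comparing the real time of step 1 in the log with the original 2 hours shows whether the full copy is worth it.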

Occasional Contributor
Posts: 8

Re: Optimise a simple SAS code

Posted in reply to Astounding

Hi....

The dataset is actually too big to copy to the local drive. It would take a very long time, around 2 days, and the connection to the server breaks after 2-3 hours.

I am getting the data from the server database, so I need to find an approach with a shorter execution time.

I copy the data because I use a laptop and all of the data is on the server.

Super User
Posts: 5,435

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

Is there a SAS server involved in the setup, or do you just have a bunch of PC licenses and a shared disk on the network?

It sounds like you need a more centralized data store with a compute server. That kind of setup is less dependent on network bandwidth. It can be built either with SAS/CONNECT or by having Enterprise Guide clients talk to a SAS Workspace Server (part of the Intelligence Platform). The actual SAS module for this is Integration Technologies, but it is often part of the bundles SAS offers (BI Server, DI Server etc.).

Data never sleeps
Respected Advisor
Posts: 4,173

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

The code sample you've given us in your original post...

data temp;
  set temp1(keep = a b c d);
run;

...creates the SAS table "temp" in SAS Work from the table "temp1", which is also in SAS Work. So this would run on the same server and on the same file system. You are also only copying 4 columns in your sample code. From what you describe, this sample code does not represent your reality, and you will need to tell us in more detail where your data is coming from and what your code actually looks like.

7M rows are not that much if it's only about 4 variables so I must assume that the way you've written your real code actually copies the full data to your laptop before it sub-sets it.

If you tell us exactly what your environment looks like (e.g. the source data resides in a DBMS (which one?), there is a remote server to which you can connect via SAS/CONNECT, etc.) and if you post your real code, then I'm sure someone here can come up with some ideas for improving performance (e.g. READBUFF=, multithreading, the SPDE engine, ...).
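If the source does turn out to be a DBMS table, a first check might look like the sketch below. Oracle is only assumed here because it was mentioned as an example earlier in the thread; the user, password, path and schema values are placeholders, and the READBUFF= value is a guess to tune.

```sas
/* Write the SQL that SAS sends to the DBMS into the log */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Assumed Oracle connection; all connection values are placeholders */
libname ora oracle user=myuser password=XXXXXXXX path=mydb schema=myschema
        readbuff=5000;   /* rows fetched per round trip; a starting value, not a recommendation */

/* With KEEP= on the SET statement, only these columns should be requested */
data temp;
  set ora.temp1(keep=a b c d);
run;
```

The trace output in the log shows whether the generated SELECT names only the four columns or pulls the whole table across the network.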

By the way: Which SAS version are you on? Since SAS 9.4 PROC DS2 is production and allows for multithreading with SAS datasets.
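For reference, a threaded DS2 version of the sample step might look like this. It is a sketch only: the thread name SUBSET_TH and the thread count of 4 are arbitrary choices, and whether it runs any faster depends on where the bottleneck really is.

```sas
proc ds2;
  /* Thread program: each thread reads a share of the rows from TEMP1 */
  thread subset_th / overwrite=yes;
    keep a b c d;
    method run();
      set temp1;
    end;
  endthread;
  run;

  /* Data program: collects the rows produced by the threads */
  data temp (overwrite=yes);
    dcl thread subset_th t;
    method run();
      set from t threads=4;   /* thread count is a guess to tune */
    end;
  enddata;
  run;
quit;
```

Note that the row order in the output is not guaranteed to match the input when several threads are used.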

PROC Star
Posts: 1,760

Re: Optimise a simple SAS code

Posted in reply to anshulgoel

To download a data set, proc download is the fastest method.
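A minimal sketch of that, assuming SAS/CONNECT is licensed on both sides and a remote session can be started: the session name SRV, the libref SRVLIB and its path are placeholders, and the connection details that SIGNON needs are not shown.

```sas
/* SRV is a placeholder session name; host and login setup are site-specific */
signon srv;

rsubmit srv;
  libname srvlib '/path/on/server';   /* assumed server-side location of TEMP1 */

  /* Runs in the server session: only the four kept columns cross the network */
  proc download data=srvlib.temp1(keep=a b c d) out=work.temp;
  run;
endrsubmit;

signoff srv;
```

In PROC DOWNLOAD, OUT= refers to a library in the local session, so WORK.TEMP ends up on the laptop.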

Respected Advisor
Posts: 4,173

Re: Optimise a simple SAS code

Hi Chris

I believe with Proc DS2 now being production in SAS9.4 and allowing for multi-threading also for SAS data sets, using such an approach could be even faster than Proc Download.

PROC Star
Posts: 1,760

Re: Optimise a simple SAS code

I seriously doubt it Patrick.

1- We are transferring data between hosts; proc DS2 was never optimised for this at all, proc download was

2- Multithreading only makes sense if the CPU is the bottleneck, which I doubt is the case here (what are the log figures?)

   Usually I/O is the bottleneck, and for a download, the network is even slower and becomes the bottleneck.

   If you run 10 processes instead of one when the network or the disk is saturated, there is no point. It actually makes things worse.

3- Even for data steps that run on one host, using proc DS2 with several threads is no guarantee things will speed up. Quite the opposite in many cases.

    Proc DS2 is a solution to some problems, but certainly not a cure-all.
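To get the log figures mentioned in point 2, the original step can be rerun with FULLSTIMER switched on:

```sas
/* Print detailed timing and resource usage for every step in the log */
options fullstimer;

data temp;
  set temp1(keep=a b c d);
run;
```

If the real time reported in the log is far larger than the CPU time, the step is mostly waiting on disk or network, and extra threads are unlikely to help.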
