vomer
Obsidian | Level 7

Hi guys,

I have some fairly large data sets that are created on a monthly basis. Do you have any tips on data compression techniques to reduce the time SAS takes to create these data sets?

Thanks for the tips!

10 REPLIES
sassharp
Calcite | Level 5

Instead of running PROC SQL against a libname reference, go through explicit pass-through to Oracle or whichever DBMS you use. Also try the DBKEY= option in PROC SQL joins. It can drastically reduce join times.
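A minimal sketch of explicit pass-through, assuming SAS/ACCESS Interface to Oracle is licensed. The macro variables (ora_user, ora_pw, ora_path) and the table and column names (sales, customers, cust_id, amount, segment) are hypothetical placeholders, not anything from the original post:

```sas
proc sql;
    /* Open a connection; the query inside the parentheses runs in Oracle */
    connect to oracle as ora
        (user=&ora_user password=&ora_pw path="&ora_path");

    /* Only the joined result travels back to SAS */
    create table work.joined as
    select *
    from connection to ora (
        select a.cust_id, a.amount, b.segment
        from sales a
        join customers b
          on a.cust_id = b.cust_id
    );

    disconnect from ora;
quit;
```

The idea is that the database does the join and SAS receives only the result, rather than pulling both tables across the network first.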

sassharp
Calcite | Level 5

Just paste some code here. There are legends here who can help you tune up your code.

Astounding
PROC Star

vomer,

You're asking two different questions.  Compression affects disk space usage.  As a general rule, this does not speed up programs but rather slows them down slightly.  What are you trying to accomplish?

vomer
Obsidian | Level 7

Mainly looking to speed up the code run time. Wondering if there are any tricks that everyone here has used to speed things up in general.

Tom
Super User

Summarize as early in the process as possible.

art297
Opal | Level 21

Like others have said, we would need to know your data, your data needs, and your environment before anyone could suggest valid possibilities.

Tom mentioned summarizing as early as possible.  I agree, but would even back up a step before that.  Only import and/or keep the data that are really necessary.
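As a rough illustration of both points (keep only what you need, then summarize early), here is a sketch that uses SASHELP.CLASS as a stand-in for a large monthly table. The output data set name and the age cutoff are arbitrary choices for the example:

```sas
/* KEEP= and WHERE= are applied as the data is read, so the unneeded
   columns and rows never enter the step. */
proc summary data=sashelp.class
        (keep=sex age height      /* only the columns that are used   */
         where=(age >= 13))       /* only the rows that are needed    */
        nway;
    class sex;
    var height;
    /* One row per SEX with the mean height; drop the bookkeeping columns */
    output out=work.height_by_sex (drop=_type_ _freq_)
           mean=mean_height;
run;
```

Everything downstream then works on the small summary table instead of the full detail data.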

FriedEgg
SAS Employee

Astounding,

Since the slowest computing operation is typically I/O, compression often does speed up programs by letting the disk move the data faster, at the cost of increased CPU usage.  It all depends on where your performance bottleneck is.

vomer,

Please provide additional information about how these files are currently created each month, along with any system information you know or feel comfortable sharing, especially OS, available storage types, CPU count, RAM, etc.

What is the nature of the data? Does it have natural segmentation for the analysis you intend to do with it?  Why do you feel this program is running slowly?  What are its current performance metrics, and where would you want them to be?

Increasing performance is a very large question to ask so vaguely.

twocanbazza
Quartz | Level 8

In my experience compressing has actually sped up the process. It really depends on where the bottleneck is, i.e. if you have a slow disk, writing and reading as little as possible helps.

Barry

LinusH
Tourmaline | Level 20

Looking at compression as such: if your data is mainly character, try the ordinary CHAR method (COMPRESS=CHAR).

If you have lots of numerical data, try BINARY.

If you have lots of integer-valued numeric variables, you can try specifying a length shorter than the default 8 bytes.

As Barry says, whether compression will help processing time depends on your system and data. The penalty for compression is more CPU cycles.

My experience is that you need at least 50% compression to gain shorter processing time.
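A small sketch of those options, using made-up data set and variable names (demo_char, demo_bin, id, note). When a compressed data set is created, the log includes a note saying by what percentage its size changed, which you can compare against the 50% rule of thumb:

```sas
data work.demo_char (compress=char)     /* aimed at character-heavy rows */
     work.demo_bin  (compress=binary);  /* aimed at numeric-heavy rows   */
    /* Store the integer ID in 4 bytes instead of the default 8.
       Only safe for integer values; short lengths lose precision
       on large or fractional numbers. */
    length id 4 note $200;
    do id = 1 to 100000;
        note = 'short text in a wide column';
        output;   /* no data set named, so each row goes to both outputs */
    end;
run;
```

Running it once per method on a sample of your real data is probably the quickest way to see which one, if either, pays off.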

How long is your run time anyway? Do you have indexes defined?

Depending on the requirements, some re-modelling of the data can sometimes help.

Data never sleeps
ballardw
Super User

If I don't want to babysit the program while running I'd be tempted to schedule it to run while I'm out of the office, say overnight if possible. Then everything is done when I come in the morning.
