06-04-2013 05:13 AM
I'm an admin with SAS 9.3 installed on 64 bit Windows 2008 R2 VMs with 24-48 GB ram and 4-8 cpus. I've been tweaking settings for memsize and sortsize and added the cpucount, threads and alignsasiofiles options to sasv9.cfg, and I see improvements on test jobs, but I'm not a sas programmer and the user who wrote the test jobs is not available for consultation. I'm not sure what else to set in the cfg file since some things will depend on the job the user submits.
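For reference, the options mentioned above would look roughly like this in sasv9.cfg. This is only a sketch: the values are placeholders picked for illustration, not the ones actually in use, and the right numbers depend on the VM size and on how many sessions share it.

```sas
/* sasv9.cfg fragment: all values are illustrative assumptions */
-MEMSIZE 8G            /* upper limit on memory for one SAS session */
-SORTSIZE 2G           /* memory available to sorting; keep it below MEMSIZE */
-CPUCOUNT 4            /* number of CPUs threaded procedures may assume */
-THREADS               /* allow threaded processing where a procedure supports it */
-ALIGNSASIOFILES       /* align SAS data set pages on disk to help I/O */
```

Running `proc options option=(memsize sortsize cpucount threads); run;` in a session shows the values that actually took effect.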
Is there a sort of quick start document available that I can point my users to with a few highlights for checking their code before running it, especially for large jobs? I imagine something that would prioritize a few things, like:
- convert datasets to 9.3
- calculate dataset size
- use statements like sasfile, keep, drop, etc
I've seen this guide on optimizing systems performance, but some of it looks like it's directed at admins.
I don't expect our users to squeeze every last drop out of a resource, but they can probably benefit quite a bit if they can hit some highlights. Thanks.
06-04-2013 06:08 AM
My 5 cents:
A lot of SAS users are not your typical IT developers but specialised business users. In my experience you won't succeed in asking these guys to pre-calculate dataset sizes and things like that. They will develop their code more in a "trial and error" manner.
There is no SAS 9.3 dataset format. The format is unchanged since SAS version 7. There is, though, a 32-bit vs. 64-bit difference if they are moving data sets over from an older environment (Proc Migrate will help you convert the data sets).
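A minimal sketch of such a conversion, assuming the old files sit in one folder and the converted copies go to another (both paths and librefs are made up for the example):

```sas
/* Hypothetical folders: "old" holds data sets from the previous environment */
libname old 'D:\data\sas_old';
libname new 'D:\data\sas_new';

/* Copy every member of OLD into NEW in the native format of the current session */
proc migrate in=old out=new;
run;
```

The IN= and OUT= libraries have to be different locations, so the originals stay untouched.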
I believe the general recommendations you can make are:
- Reduce volumes as soon as possible (fewer rows, fewer columns).
- Reduce passes through data (so better do stuff in 3 data steps than in 10).
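Both points can be sketched in a single data step. The library, data set and column names below are hypothetical; the idea is to subset rows and columns while reading, and to do the derived calculation in the same pass instead of a later step.

```sas
/* Hypothetical names: big.sales, cust_id, sale_date, amount */
data work.sales_2012;
    /* keep= limits the columns, where= limits the rows, both before the data reaches the step */
    set big.sales (keep=cust_id sale_date amount
                   where=(sale_date >= '01JAN2012'd));
    /* derive new values in the same pass rather than in a separate data step */
    amount_k = amount / 1000;
run;
```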
I wouldn't recommend "sasfile" as this loads a whole dataset into memory. If too many users are doing this then you might end up with memory problems especially as the datasets stay in memory for the whole session if users don't specifically unload them.
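If someone does use it, the pattern to teach is to close the file explicitly once the repeated reads are done. A sketch with a hypothetical data set and column:

```sas
/* Hypothetical data set mylib.lookup */
sasfile mylib.lookup load;     /* read the whole data set into memory */

/* several passes over the same data now read from memory */
proc means data=mylib.lookup; run;
proc freq data=mylib.lookup; tables region; run;

sasfile mylib.lookup close;    /* release the memory again */
```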
Are your users complaining about performance issues right now? Most of the time it's I/O. I would define "compress=yes" as default. You also might want SAS work and utilloc on separate and fast disks. And the general data storage area for permanent datasets should have a "reasonable" throughput to the SAS Workspace Server.
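In sasv9.cfg these suggestions could look something like the fragment below. The drive letters and folder names are assumptions for the example only.

```sas
/* sasv9.cfg fragment; paths are hypothetical */
-COMPRESS YES          /* compress newly created data sets by default */
-WORK "E:\saswork"     /* WORK library on its own fast disk */
-UTILLOC "F:\sasutil"  /* utility files (e.g. from threaded sorts) on a separate disk */
```

Users can still switch compression off for a single data set with the compress=no data set option where it doesn't pay off.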
Otherwise: monitor your server, but maybe don't ask or expect too much from your users. In the end it's often much cheaper to increase compute power if necessary than to ask highly paid business specialists to invest more time in performance tweaking their code. This doesn't mean that trying to implement some "good practices" isn't worth it.
If there is also SAS/Access involved: Check the settings on metadata libnames for insertbuff, readbuff and dbcommit. The defaults are normally way too low.
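Written as a plain LIBNAME statement (the metadata library definition exposes the same options), this might look as follows. The ODBC engine, the DSN name and all three values are assumptions for illustration; sensible numbers depend on the database and on row width.

```sas
/* ODBC example; the DSN name and all buffer values are hypothetical */
libname dw odbc datasrc=mydsn
        readbuff=5000      /* rows fetched per read from the database */
        insertbuff=5000    /* rows sent per insert */
        dbcommit=10000;    /* rows inserted before a commit is issued */
```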
06-04-2013 06:51 AM
Once you've got answers, please also mark the most helpful ones as "helpful" and the one that was most useful for you as "correct", because this way "we" see that the question is answered and don't look into it anymore.
It's also always nice if you don't just ask a question but also give some sort of feedback to the ones spending their time answering your question.
06-04-2013 08:26 AM
Thanks, Patrick. We're a campus research center, so a lot of the users are students who are not very familiar with sas when they start. I definitely can't ask too much of them, but they are trainable. We have some tutorials, but they were developed a while ago so they probably don't include some features in the newer versions.
I have the vmware/windows environment set up with an iscsi san that is operating pretty efficiently and users are generally happy with the performance, but most jobs are not very big. I have the option of setting up users on their own server vm temporarily for large, complex jobs, so that's what I'm trying to plan for. If I can get them to easily do the size calculation, it may be possible to put the whole file in memory.
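One way users could get at the size without doing the math by hand is to query the dictionary tables. This is a rough sketch: MYLIB is a hypothetical libref (it has to be written in uppercase in the where clause), and rows times observation length is only an estimate of the uncompressed size that ignores page overhead.

```sas
/* Hypothetical libref MYLIB; lists every data set with an estimated and an on-disk size */
proc sql;
    select memname,
           nobs,
           obslen,
           nobs * obslen as est_bytes format=comma20.,  /* rough uncompressed size */
           filesize format=comma20.                     /* size of the file on disk */
    from dictionary.tables
    where libname = 'MYLIB' and memtype = 'DATA';
quit;
```

Comparing est_bytes with the memory available on the VM gives a first idea of whether the whole file could fit in memory.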
I'll check out the compress and sas/access options.
06-04-2013 10:50 AM
There is a whole lot more to writing efficient programs. Setting a few options is just a tiny piece of the task. Here's a good place to start:
06-04-2013 11:30 AM
Yes, all the concepts still apply. Sometimes the percentage savings vary from one release to the next, such as the savings from applying a format. But the concepts are all still applicable.