06-12-2015 02:08 PM
Hi folks, I'm hoping someone can help me out here.
I've read a lot of discussions on SASWork and the talk of adding quotas to users, specific types of joins that can plug up SASWork, etc.
I've also received some great help from SAS themselves.
What I'm wondering is, what is your organization's allocation for space for SASWork?
Our current allocation is 194 GB for SASWork and we have approximately 100 users (most are basic users, with about a dozen power users who bring down a lot of data).
I'm just wondering what everyone else has for their SASWork allocation compared to their number of users?
06-12-2015 02:58 PM
750GB for ~200 users.
I think it depends more on the size of the data they will be working with, though. What's the typical dataset size? For us, a single year of data can be 32 GB.
06-12-2015 03:35 PM
I'd be comfortable with the 194 GB for the 100 basic users, but power users can EASILY chew through 50 to 100 GB each.
I would think that 750 to 1,000 GB, with good data management practices and enforcement, would do nicely.
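As a rough check on those numbers, here is a back-of-the-envelope sketch in Python. The function name and the `concurrent_share` parameter are assumptions made up for illustration; the 194 GB pool, the dozen power users and the 50 to 100 GB per power user are the figures from this thread.

```python
def work_space_range_gb(basic_pool_gb, power_users, low_gb, high_gb,
                        concurrent_share=1.0):
    """Return a (low, high) WORK estimate in GB.

    concurrent_share is an assumed fraction of power users running
    heavy jobs at the same time (1.0 means all of them at once).
    """
    active = power_users * concurrent_share
    return (basic_pool_gb + active * low_gb,
            basic_pool_gb + active * high_gb)


# Worst case: all 12 power users busy at once.
print(work_space_range_gb(194, 12, 50, 100))
# Assumed case: only half of them busy at the same time.
print(work_space_range_gb(194, 12, 50, 100, concurrent_share=0.5))
```

With every power user busy at once the upper bound lands well above 1,000 GB, so a 750 to 1,000 GB allocation implicitly assumes that the power users do not all peak together.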
06-13-2015 02:20 AM
Why use a single shared WORK for all kinds of usage and users?
The sizing of WORK depends on the size of the data being processed. Spreading the workload over time and sharing those resources can decrease the total amount needed.
500 GB can be quite a normal starting point for some usage these days. Tuning the I/O system is one of the things to start with; it should not come at the end. That is the difference between a sales approach and a technical one aligned with the business.
06-13-2015 09:57 PM
SAS provides a utility for cleaning up WORK directories left behind by old SAS processes, for example those that are a day old or older. I don't know if old WORK directories are contributing to your problem, but if they are, you can check it out here:
It is common practice to schedule this utility to run daily.
06-15-2015 03:41 AM
Depends on your data sizes. One very good way to handle these problems is to implement quota limits for users on shared resources. This eliminates the "showstoppers" and means that only the individual users who overrun get the disk full messages.
A side effect of running a quota system is that you can get a report on individual disk usage with a simple command (in AIX this is repquota).
Then you can investigate why some users overrun their quota and others don't. Often it is just plain laziness in removing unneeded work data, or you see them doing inefficient things with PROC SQL. Or they include the proverbial kitchen sink in work files when only three columns are needed for analysis.
In your situation, I'd start with limiting individual users to 10 or 15 GB and then see what happens.
My personal rule is that no shared resource where users have write access runs without quota management.
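Where quotas are not yet switched on, a similar per-user picture can be approximated by summing file sizes per owner. The following is a minimal Python sketch, not a replacement for repquota: the path `/saswork` and the function names are hypothetical, and it adds up apparent file sizes rather than the allocated blocks a quota system counts.

```python
import os
from collections import defaultdict


def usage_by_owner(root):
    """Sum file sizes in bytes under root, keyed by numeric owner uid."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            totals[st.st_uid] += st.st_size
    return dict(totals)


def over_limit(totals, limit_bytes):
    """Return (uid, bytes) pairs above limit_bytes, largest first."""
    heavy = [(uid, size) for uid, size in totals.items() if size > limit_bytes]
    return sorted(heavy, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    WORK_ROOT = "/saswork"      # hypothetical path, adjust to your site
    LIMIT = 10 * 1024 ** 3      # 10 GB, the starting limit suggested above
    for uid, size in over_limit(usage_by_owner(WORK_ROOT), LIMIT):
        print(uid, round(size / 1024 ** 3, 1), "GB")
```

Unlike a real quota, this only reports after the fact; it does not stop a runaway process.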
06-15-2015 08:22 AM
Not bad, Kurt, those quotas. Too bad it tends to be forgotten that this is not the default setup of a SAS installation as guided by SAS.
Designing file systems to isolate groups of users on a shared machine is not mentioned. When you do that, you can easily add the mandatory OS security controls: groups and ownership.
If you want to use ulimits on a Unix-based system, you cannot rely on the individual user's Unix settings; instead you have to script it in the SASApp configuration. The settings of the object spawner's user ID are shared by all SAS processes that the object spawner starts. By default, unlimited is needed.
I see those quotas coming back as part of all those misalignments, with a VM being offered for every type of user or group. Building VMs is something Unix admins understand, and they do not like the SAS approaches, just as the security guys do not like the SAS stuff.
06-15-2015 09:29 AM
Well, the ulimits are taken from the userid running the spawner, but the quotas depend on the user requesting the workspace server.
Using VMs to separate users would also constitute licensing problems.
06-17-2015 09:33 PM
We have 400GB allocated for about 20 active SAS developers, and several hundred ad hoc report requests during the day. UTILLOC is also in a separate folder.
But what can happen is one step in one program can run a bad cartesian join against a database table. When SAS attempts to bring all of that down to WORK, the disk could fill up. And so what's essential is a smart disk space monitoring script or utility that sends alerts. Don't rely on the UNIX admin's tools.
The shell script I wrote sends us e-mails when the WORK space usage goes past a threshold, e.g. 70%, and then sends a new e-mail if the usage % has increased since the last check 15 mins ago. I also get an alert on my phone via the e-mail app. The world record size for a WORK dataset we had was approx 140GB before I contacted the developer.
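The alert rule described above can be sketched in a few lines. This is a Python illustration under assumptions, not the shell script itself (which is not shown in this thread): the function names are invented, and the mail sending and the storage of the previous reading between runs are left out.

```python
import shutil


def used_percent(path):
    """Percentage of the file system holding path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def should_alert(current_pct, previous_pct, threshold=70.0):
    """Alert when usage is past the threshold and has grown since the last check.

    previous_pct is the reading from the prior run, or None on the first run.
    """
    if current_pct <= threshold:
        return False
    return previous_pct is None or current_pct > previous_pct


if __name__ == "__main__":
    # "/" stands in for the WORK file system; use your own mount point.
    current = used_percent("/")
    print(round(current, 1), should_alert(current, previous_pct=None))
```

Run from cron every 15 minutes, the caller would keep the last reading in a small state file and pass it in as `previous_pct`, so that a steady or falling level above the threshold does not trigger repeat e-mails.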
I've also scheduled the SAS WORK clean-up utility to run every hour via CRON, but this was when we had less disk space.
06-18-2015 01:39 AM
"Don't rely on the UNIX admin's tools."
Nope. Actually the system admin tools are THE way to go here. Everything else is a stopgap, as it takes much less than a few minutes to create a 100 GB file on a modern machine and storage.
I repeat: quotas, quotas, quotas, as they act proactively and stop the offending process before the admin even has to go into action. Bad programming is therefore not detected when the whole service comes crashing down, but when a single desperate user calls for help.
06-18-2015 01:56 AM
"Nope. Actually the system admin tools are THE way to go here."
Sorry...what I meant was to have both in place, the standard UNIX system monitoring, and your own monitoring.
At some sites, a generic UNIX server disk space alert can be received by techos who know nothing of the context and seriousness of the alert. And the threshold tends to be set high, e.g. 90% or more.
For SAS WORK, you want to get early notification of WORK space usage creeping up, not wait until it's too late, and/or get spammed with system-generated alerts and new incident tickets every 5 mins.
If standard UNIX resource monitoring can run with a variety of rules, thresholds, notification recipients, etc, then all good....just use that.
06-18-2015 02:14 AM
Understood. But I still find it better to not only monitor disk usage, but actively prevent an overrun.
Back in the 80's, HP-UX by default prevented writes by any user other than root to any file system that was at 90% or higher.
Nowadays, advanced quota systems (as in IBM's JFS2) make the job quite easy.
06-18-2015 03:15 AM
If overruns could be blocked for standard users and yet allow critical system users (schedulers etc) to keep running, that sounds like an excellent capability to have.
I'll look into quotas for our servers.
06-18-2015 03:50 AM
Speaking from my experience with JFS2:
First, one needs to enable quota management for users and/or groups on the respective filesystem.
With this, the system sets up one quota management class (0) for user (and group, if activated). Users are automatically assigned to class 0, which initially has no limits.
You then add a new class (e.g. 1) that also has no limits, assign the system user(s) to that class, and implement sensible limits on class 0. After that, you only need to take action when a new user is added that needs unlimited access, when an additional class is needed, or to modify the 0 class (which automatically affects all users that are not assigned to another class).
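To make the class logic concrete, here is a toy model in Python. It is not JFS2 syntax and does not talk to the file system; the account name `sasbatch` and the 15 GB figure are assumptions chosen only for illustration.

```python
UNLIMITED = None  # marker for "no limit"

# Limit per quota class in GB. Class 0 is the default class;
# class 1 is the extra class kept without limits.
class_limits_gb = {0: 15, 1: UNLIMITED}

# Only the exceptions are listed; everyone else falls into class 0.
class_of_user = {"sasbatch": 1}  # hypothetical scheduler account


def limit_for(user):
    """Return the limit in GB for user, or None if the user is unlimited."""
    return class_limits_gb[class_of_user.get(user, 0)]


print(limit_for("sasbatch"))
print(limit_for("some_analyst"))
```

The point of the design is that changing the class 0 entry changes the limit for every ordinary user at once, while the short exception list stays untouched.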