Minimum disk space required for saswork and sasdata

Occasional Contributor
Posts: 5

Hi,

I am trying to work out how much disk space to allocate for the sasdata and saswork locations on my Linux server.

I have around 50 GB of raw data to start with, which I need to use to build a data mart for the analysis team.

The team will use this data mart for the various analyses they run.

Any suggestions on how I can calculate the space requirement?

Thanks,

Super User
Posts: 981

Re: Minimum disk space required for saswork and sasdata

Hi,

I can give you an initial estimate (based on nothing but your raw data): about 100 GB for data and 200 GB for work and temp. You should also keep in mind the sizing for your users' SAS folders.

By the way, everything will depend on:

- the number of users
- the predicted growth of your data
- other data sources beyond the raw data
- other variables specific to your business needs

Keep the second consideration in mind: the future. I would never advise sizing only for the present. Plan for growth so you don't run into trouble with your disks in the near future: SAS data and requirements grow pretty fast!

Occasional Contributor
Posts: 5

Re: Minimum disk space required for saswork and sasdata

Thanks Juan.

Let me use the parameters you have mentioned to start my estimate.

Esteemed Advisor
Posts: 5,984

Re: Minimum disk space required for saswork and sasdata

Factors:

- Size of data. Depending on the format of the raw data, the SAS tables will be between 0.5 and 2.0 times the size of your source data (using compress=yes); tables with wide character columns will shrink considerably. Compression also reduces I/O load during processing.

- Number of users, especially concurrent users. Size your SASWORK as ((size of biggest dataset - uncompressed!) * 3 + (size of biggest dataset probably merged) * 2) * (number of concurrent users). SASUTIL should be (size of biggest dataset - uncompressed!) * (number of concurrent users).
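
The rules of thumb above can be turned into a quick calculation. What follows is only a rough sketch in Python: the function names and the example figures (a 10 GB biggest dataset, a 15 GB merged dataset, 4 concurrent users) are made up for illustration and do not come from this thread. Only the 50 GB of raw data and the multipliers are taken from the posts above, and reading "probably merged" as "the largest dataset a merge is likely to produce" is an assumption.

```python
def sas_table_size_range_gb(raw_gb, low_factor=0.5, high_factor=2.0):
    """Return the (low, high) expected size of the SAS tables in GB."""
    return raw_gb * low_factor, raw_gb * high_factor


def saswork_gb(biggest_uncompressed_gb, biggest_merged_gb, concurrent_users):
    """SASWORK rule of thumb: (biggest * 3 + merged * 2) * concurrent users."""
    return (biggest_uncompressed_gb * 3 + biggest_merged_gb * 2) * concurrent_users


def sasutil_gb(biggest_uncompressed_gb, concurrent_users):
    """SASUTIL rule of thumb: biggest uncompressed dataset * concurrent users."""
    return biggest_uncompressed_gb * concurrent_users


# 50 GB of raw data is the figure from the original question.
low, high = sas_table_size_range_gb(50)
print(f"sasdata: between {low:.0f} and {high:.0f} GB")

# Hypothetical inputs, chosen only to show the arithmetic.
print(f"SASWORK: {saswork_gb(10, 15, 4)} GB")
print(f"SASUTIL: {sasutil_gb(10, 4)} GB")
```

Swap in your own biggest-dataset sizes and user count; the result is a starting point to refine, not a guarantee.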

Wherever users have write access, set up a quota system so that no single user can cause a loss of service for all the others. This means SASWORK, SASUTIL and the volume for the home directories (where the SASUSER libs will reside). If the quotas are sized correctly, a quota overrun will signal bad coding practice (like a piece of SQL that causes a Cartesian join where 10,000 * 10,000 suddenly results in 100,000,000 records).
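
To see why a runaway join hits a quota so quickly, here is a small Python sketch of the arithmetic. The row counts are the ones from the example above; the 200 bytes per row is a purely hypothetical width chosen for illustration, so the resulting size is only indicative.

```python
def cartesian_join_rows(left_rows, right_rows):
    """A Cartesian join pairs every left row with every right row."""
    return left_rows * right_rows


def estimated_size_gb(rows, bytes_per_row):
    """Rough uncompressed size in GB (1 GB taken as 1024**3 bytes)."""
    return rows * bytes_per_row / 1024**3


ASSUMED_BYTES_PER_ROW = 200  # hypothetical row width, not from the thread

rows = cartesian_join_rows(10_000, 10_000)
size_gb = estimated_size_gb(rows, ASSUMED_BYTES_PER_ROW)

print(f"{rows:,} rows, roughly {size_gb:.1f} GB in SASWORK")
```

Two small tables can therefore produce a work file far larger than either input, which is exactly the kind of overrun a per-user quota is meant to catch.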

As Juan said, keep a wary eye on data growth.
