GauravT
Calcite | Level 5

Hi,

I am trying to work out how much disk space to allocate for the sasdata and saswork locations on my Linux server.

I have around 50 GB of raw data to start with, from which I need to build a data mart for the analysis team.

The team will use this data mart for the various analyses they run.

Any suggestions on how to calculate the space requirement?

thanks,

JuanS_OCS
Amethyst | Level 16

Hi,

I can give you an initial estimate (based on nothing but your raw data): about 100 GB for data and 200 GB for work and temp. You should also account for the size of your users' SAS folders.

That said, everything will depend on:

- the number of users

- the predicted growth of your data

- data sources other than the raw data

- other variables specific to your business needs

Keep the second consideration in mind: the future. I would never advise sizing only for the present. Plan ahead so you don't run into trouble with your disks in the near future: SAS data and requirements grow pretty fast!

GauravT
Calcite | Level 5

Thanks Juan.

I will use the parameters you mentioned as a starting point for my estimate.

Kurt_Bremser
Super User

Factors:

- Size of data. Depending on the format of the raw data, the SAS tables will be between 0.5 and 2.0 times the size of your source data (using compress=yes); tables with wide character columns will shrink considerably. Compression also reduces I/O load during processing.

- Number of users, especially concurrent users. Size your SASWORK as ((size of biggest dataset - uncompressed!) * 3 + (size of biggest dataset probably merged) * 2) * (number of concurrent users). SASUTIL should be (size of biggest dataset - uncompressed!) * (number of concurrent users).
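The two rules of thumb above can be turned into a quick calculation. This is only a sketch in Python, not a SAS tool: the function names are made up for illustration, the example inputs (10 GB biggest dataset, 15 GB merged result, 5 concurrent users) are hypothetical, and reading "biggest dataset probably merged" as the size of the largest merged result is an interpretation.

```python
# Sizing sketch based on the rules of thumb above.
# All function names and example figures are hypothetical placeholders.

def sas_table_size_range_gb(raw_gb, low_factor=0.5, high_factor=2.0):
    """Return (min, max) expected SAS table size for a given raw data size."""
    return raw_gb * low_factor, raw_gb * high_factor

def saswork_gb(biggest_uncompressed_gb, biggest_merged_gb, concurrent_users):
    """SASWORK = (biggest uncompressed * 3 + biggest merged * 2) * concurrent users."""
    return (biggest_uncompressed_gb * 3 + biggest_merged_gb * 2) * concurrent_users

def sasutil_gb(biggest_uncompressed_gb, concurrent_users):
    """SASUTIL = biggest uncompressed * concurrent users."""
    return biggest_uncompressed_gb * concurrent_users

if __name__ == "__main__":
    # 50 GB of raw data is the figure from the original question.
    low, high = sas_table_size_range_gb(50)
    print(f"SAS tables: between {low} and {high} GB")

    # Assumed values: biggest dataset 10 GB uncompressed,
    # biggest merged result 15 GB, 5 concurrent users.
    print(f"SASWORK: {saswork_gb(10, 15, 5)} GB")
    print(f"SASUTIL: {sasutil_gb(10, 5)} GB")
```

Replace the assumed inputs with measured values from your own environment, and add headroom for data growth on top of the result.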

Wherever users have write access, set up a quota system so that a single user can't cause a loss of service for all others. This means SASWORK, SASUTIL and the volume for the home directories (where the SASUSER libs will reside). If sized correctly, a quota overrun will signal bad coding practice (like a piece of SQL that causes a Cartesian join where 10,000 * 10,000 suddenly results in 100,000,000 records).
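To see why a Cartesian join trips a quota so quickly, here is a small arithmetic sketch in Python. The row width (200 bytes) and the per-user quota (10 GB) are assumed values for illustration only, and the function names are hypothetical.

```python
# Rough estimate of how large an accidental Cartesian join can get.
# bytes_per_row and quota_gb are assumptions, not measured values.

def cartesian_rows(left_rows, right_rows):
    """A Cartesian join pairs every left row with every right row."""
    return left_rows * right_rows

def result_size_gb(rows, bytes_per_row):
    """Uncompressed size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return rows * bytes_per_row / 1e9

def exceeds_quota(size_gb, quota_gb):
    """True when the result would not fit into the user's quota."""
    return size_gb > quota_gb

if __name__ == "__main__":
    rows = cartesian_rows(10_000, 10_000)
    size = result_size_gb(rows, bytes_per_row=200)
    print(f"{rows:,} rows, about {size} GB")
    print("over quota" if exceeds_quota(size, quota_gb=10) else "within quota")
```

Two modest 10,000-row tables already produce a result far larger than either input, which is why the quota overrun is a useful warning sign.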

As Juan said, keep a wary eye on data growth.
