Smile1
Calcite | Level 5

Hi~ all,

 

I have a 1TB D drive, so I located my SAS WORK library there. However, 1TB is no longer enough: my D drive now fills up when I run my SAS program.

 

So I wonder whether I can locate the WORK library on the D drive and the C drive at the same time...

1 ACCEPTED SOLUTION
Kurt_Bremser
Super User

We have an X-Y issue here. You think you need Y (more WORK space), but X is your actual problem, and X in this case means steps that need far too much space given the data size you have. So you need to fix THAT first. Throwing storage at a problem that is out to consume "infinite" resources will only delay the crash, not stop it from happening.


12 REPLIES
PaigeMiller
Diamond | Level 26

Here's an example of how to put the WORK library on the disk that has the most space

https://documentation.sas.com/doc/en/pgmmvacdc/9.4/hostwin/n1qr5dmzagn9krn1lt1c276963za.htm#n03r1zm3...
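As a minimal sketch of what that looks like (the folder path D:\SASWork is a hypothetical example, not taken from the linked page), WORK can be pointed at a chosen disk with the WORK system option at invocation:

sas.exe -work "D:\SASWork"

or with the corresponding line in the sasv9.cfg configuration file:

-WORK "D:\SASWork"

As far as I know, WORK takes a single location at a time, which is why using two drives at once needs a different approach.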

--
Paige Miller
Smile1
Calcite | Level 5
Thank you for your information, Paige. I want to use the D and C drives at the same time, though, rather than putting WORK on just one of them. Maybe I have to consider whether I can reduce the size of my data set...
ballardw
Super User

@Smile1 wrote:

Hi~ all,

 

I have a 1TB D drive, so I located my SAS WORK library there. However, 1TB is no longer enough: my D drive now fills up when I run my SAS program.

 

So I wonder whether I can locate the WORK library on the D drive and the C drive at the same time...


I strongly suggest that you determine what filled up a terabyte of disk space. It sounds like you either need to clean up after your code as you go along, removing data sets that are no longer needed (PROC DATASETS will do this; see the sketch below), or make sure your code isn't creating unneeded copies of data sets.
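A minimal sketch of that cleanup (temp1 and temp2 are hypothetical data set names):

proc datasets library=work nolist;
    delete temp1 temp2;   /* drop intermediate data sets that are no longer needed */
quit;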

 

Are you experimenting with macros and creating data sets inside %do/%end loops?

Smile1
Calcite | Level 5
Thank you for your response, ballardw. No, I am not experimenting with macros or creating data sets inside %do/%end loops.

I downloaded data sets from WRDS (Dealscan and S34 13F); their sizes are about 210,000 × 15 and 33,000,000 × 26 (observations × variables). I tried to join these two data sets; that's why a terabyte of disk space filled up.

Maybe I have to reconsider whether I can further reduce the size of my data sets...
ChrisNZ
Tourmaline | Level 20

Notes:

1. If the data sets are sorted by the join key, the join will not use much space at all (see the sketch after this list).

2. Also consider compressing the data sets (this will not influence the size of the utility files).

3. A good option for performance is to have the WORK library on one drive and the utility folder on a different drive.

4. A slow option that saves space is to compress the utility folders.

5. This might interest you, to spread WORK across locations, but it is not a better option than the first three pieces of advice above.
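A minimal sketch of points 1 and 2 (the data sets deals and holdings and the key firm_id are hypothetical names, not from the thread):

options compress=yes;             /* point 2: compress data sets created from here on */

proc sort data=work.deals;    by firm_id; run;
proc sort data=work.holdings; by firm_id; run;

data work.joined;
    merge work.deals (in=a) work.holdings (in=b);   /* point 1: a sorted merge needs little extra space */
    by firm_id;
    if a and b;                   /* keep only keys present in both tables (inner join) */
run;

For point 3, the utility file location can be set at invocation with the UTILLOC system option (e.g. -UTILLOC "C:\SASUtil", a hypothetical path) while WORK stays on the other drive.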

 

Smile1
Calcite | Level 5
Hi Chris,

Thank you for your information. I'm still not sure whether it is possible to spread WORK across locations.
FYI, I'm trying to filter out the data that doesn't have stock information and then join again.
ChrisNZ
Tourmaline | Level 20

> I'm not sure whether it is possible to spread WORK

Have you seen point 5?
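One hedged sketch of getting a second drive involved (this may or may not be what point 5 describes; the path C:\SASUser and the data set names are hypothetical): assign the USER library, and SAS then writes one-level data set names there instead of into WORK:

libname user "C:\SASUser";    /* folder on the second drive */

data big_temp;                /* one-level name: stored in USER, not in WORK */
    set work.holdings;
run;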

Kurt_Bremser
Super User

Assuming a variable length of 8 bytes, your larger data set comes out at about 33,000,000 × 26 × 8 bytes ≈ 7 GB, so it's VERY far from the 1 TB you have available.

Work through your steps one by one, watch the disk consumption during each step (to get a feel for the utility files), and run a PROC CONTENTS on the resulting data sets to see their physical size. Once you have identified a problem step that eats WORK space or creates monster data sets, see how you can optimize that step; if in doubt, post the code and log here.
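A minimal sketch of that size check (work.joined is a hypothetical data set name):

proc contents data=work.joined;
run;

The output lists the number of observations and variables and, in the host-dependent section, the file size on disk.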

 

I work routinely with data sets in the 30+ million observation range, with a disk quota of 10 GB in WORK and 3 other filesystems available for temporary storage, each also with a 10 GB quota. That means all my jobs get by with 40 GB of temporary storage; your 1 TB HAS to be sufficient unless you make serious mistakes (e.g. a cartesian join "under the hood").
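To illustrate that failure mode (table names are hypothetical): in PROC SQL, listing two tables with no join condition produces a cartesian product, and a join on a key that is not unique in both tables multiplies matching rows the same way:

proc sql;
    create table work.blowup as
    select *
    from work.deals, work.holdings;   /* no WHERE clause: every row pairs with every row */
quit;

With 210,000 and 33,000,000 rows, that cross product is roughly 7 × 10^12 rows; no 1 TB drive will hold it.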

Smile1
Calcite | Level 5
Hi Kurt,

Thank you for your response. I'm still not sure whether it is possible to spread WORK across locations.
FYI, I'm trying to filter out the data that doesn't have stock information and then join again.

Thank you for your suggestion. I will post my code and log here if something goes wrong.
Kurt_Bremser
Super User

We have an X-Y issue here. You think you need Y (more WORK space), but X is your actual problem, and X in this case means steps that need far too much space given the data size you have. So you need to fix THAT first. Throwing storage at a problem that is out to consume "infinite" resources will only delay the crash, not stop it from happening.

Smile1
Calcite | Level 5
Hi Kurt,

Yes, I agree with you. I will try to drop unnecessary columns from my data set.
Kurt_Bremser
Super User

PS: if you want to do something REALLY useful for your WORK, add 2 (or more) SSDs to your computer, define a striped volume on them, and use that for WORK. This will speed up any operation on WORK by orders of magnitude.

