BookmarkSubscribeRSS Feed
NMB82
Obsidian | Level 7

I'm running SAS EG 7.1 on a Citrix machine at work. I'm trying to do a very large sql pull (50gb) and keep running out of disk space. Our server drive has plenty of storage, so I assume switching the working directory to a folder on that drive temporarily will avoid this error?

The error says "*insufficient disk space....file is damaged. I/O processing did not complete. You may be able to execute sql statement successfully if you allocate more space to the WORK library."* I don't seem to have access to the config file and I've tried to change it programmatically with no luck. I am trying to save the resulting dataframe to a server folder already with a LIBNAME statement, but I think the temporary files created in WORK during the process are too much to handle. It seems the DLGCDIR function, that would help with this, starting with SAS 9.4 doesn't work with EG 7.1. Any help?

I've tried both:

    x 'cd "Q:\folder"';

and

   data _null_;
       rc = system( 'cd "Q:\folder"' );
       if rc = 0 
          then putlog 'Command successful';
          else putlog 'Command failed';
    run;

These run fine, but in the log it still says my working directory is unchanged:

    SYMBOLGEN:  Macro variable SASWORKLOCATION resolves to "C:\Users\user\AppData\Local\Temp\SEG5432\SAS Temporary 
                Files\citrixMachineDrive\Prc2/"

Here is my simplified SAS code. I am using UNION on several datasets in my Q server drive folder (TB's of space) within the FROM statement (50GB total). I'm then left joining on a smaller dataset from the database and saving the resulting dataset to the Q drive folder. This was code that was previously run on a weekly/monthly basis, but now we want a full year of data. Worse case scenario is I rework the related code into PySpark, but I was hoping to get this done quickly and that will take me some time.

LIBNAME PCLIB = 'Q:/folder';



PROC SQL FULLSTIMER;
CREATE TABLE PCLIB.NEED (COMPRESS=ON) AS
SELECT
   A.COL1,
   CASE WHEN B.COL2 NE '' THEN B.COL2 ELSE A.COL3 END AS COL4,
   SUM(COL5) AS COL6,
   SUM(COL7) AS COL8
FROM (SELECT * FROM PCLIB.TABLE1 UNION
      SELECT * FROM PCLIB.TABLE2 UNION
      SELECT * FROM PCLIB.TABLE3 UNION
      SELECT * FROM PCLIB.TABLE4 UNION
      SELECT * FROM PCLIB.TABLE5 UNION
      SELECT * FROM PCLIB.TABLE6
      ) AS A
LEFT JOIN TABLE_db B ON A.ID=B.ID
GROUP BY
COL1,
COL4;
QUIT;
16 REPLIES 16
SASKiwi
PROC Star

Run this in your SAS EG session:

%put %sysfunc(pathname(WORK));

Does this match the Citrix WORK location? Typically SAS WORK storage should be located on specially configured SAS App server storage for best performance.

 

EDIT: I'm assuming that you are using a remote SAS server. If you are using a connection profile in EG then this would be the case.

NMB82
Obsidian | Level 7

I ran that and the result was a location I have no access to. Pardon the blackout...those are work-related user/pc numbers.

 

SASwork.jpg

SASKiwi
PROC Star

Are you running SAS locally for remotely? If you use a connection profile in EG then you will be connecting to a remote SAS server.

NMB82
Obsidian | Level 7

It's run remotely on a citrix virtual machine. I have no direct access to my local pc through SAS...only the company server.

SASKiwi
PROC Star

That's doesn't really answer my question. EG is only a client app that works with either a local SAS installation, which in your case would also be Citrix, or with a remote SAS installation, in your case remote from Citrix. Do you use a connection profile in EG? If you do, then you are connecting to a remote (from Citrix) server-based SAS installation.

 

Patrick
Opal | Level 21

In your EG connection does it say "do not use a profile" or are you connected to a server profile?

Patrick_0-1648951331693.png

EG is just the client. The relevant bit for what you're dealing with is where SAS executes.

 

NMB82
Obsidian | Level 7

@SASKiwi @Patrick 

 

I understand now. I do not use a profile...it says exactly that, "do not use a profile." I haven't used SAS EG a whole lot. I just started at this company and we're currently moving everything from SAS to Python/Spark.

Patrick
Opal | Level 21

May-be share your SQL. If this is just about copying data from a database to a "local" disk then it depends very much on your SQL how much SAS WORK actually is required.

 

Looks like you don't have enough space on your C-Drive.

First measure:

1. Close all your EG sessions

2. Go to C:\Users\<your user>\AppData\Local\Temp

3. Delete all folders starting with name SEG-....

Note: ....\AppData\.. is a hidden folder. You either need to change your folder views to also show hidden files and folders or copy/paste the path C:\Users\<your user>\AppData into the File explorer to get past the hidden folder.

 

If after this there is still not enough space on your C-drive and you can't delete other files to create the required disk space:

 

WORK gets created during SAS invocation. 

In my pretty much vanilla local SAS installation the .cfg that defines the path is here: C:\Program Files\SASHome\SASFoundation\9.4\nls\u8\sasv9.cfg

Patrick_0-1648952826535.png

IF you change this .cfg directly: Take first a backup! 

I'm not sure how exactly you would need to replace !TEMP but it must be something dynamic and not a hardcoded path.

 

WORK should point to the fastest disk available so repointing it to some network drive will highly likely impact SAS performance.

SASKiwi
PROC Star

In that case, since EG and SAS configuration is locked down under Citrix, you will either need to talk to your SAS administrator regarding getting more space, or write directly to your own SAS library where there is more space, like this: LIBNAME MyLIb 'd:\MyFolder';

Sajid01
Meteorite | Level 14

Hello @NMB82 
1.Looks like the SAS workspace is in your %APPDATA%  folder and typically users have access to it even if it is a remote desktop. Remember you are able to write to it. Check if your SAS / SAS EG is distributed/streamed as a virtual app.

2. To transfer convert SAS datasets to python data frame one does not need SAS ("we're currently moving everything from SAS to Python/Spark".). Python pandas will do it. I would use something like this

import pandas as pd
df=pd.read_sas("Path to your sas dataset")

3,Even if you are not able to locate/modify your SAS configuration files, you can still use the space on server. Simply have the remote location mapped and mounted at the following location.(I got this location from your posts above.). Typically the users do have the access themselves, but in case of difficulty request assistance from Desktop support.

C:\Users\user\AppData\Local\Temp\SEG5432 

 

AhmedAl_Attar
Rhodochrosite | Level 12

@NMB82 

From looking at all information exchanges so, this is what I was able to conclude - Please feel free to correct my misunderstanding:

  • It looks like your company have installed SAS software (Enterprise Guide) on a Citrix Presentation Server, and they stream the display back to the user's desktop.  i.e. Users such as yourself, would see an Icon in their Citric Portal, and when they click on, it would start the software on the Citrix Server, and stream the Graphical User Interfaces (GUI) back to user's desktop. There are no local processing on the user's desktop, everything is running on the Citrix Server, only the Keyboard strokes and mouse clicks from the user's Desktop are recognized, and applied to the SAS software (Enterprise Guide) instance running on the Citrix Server.
  • That's why you are seeing SAS Enterprise Guide running without Profile - It's running locally on the Citrix Server, and not on your desktop!
  • As a user, you have no access to the SAS configuration nor the ability to change it!

Now that we have that out of the way, let's focus on how to address your Query issues "I'm trying to do a very large sql pull (50gb) and keep running out of disk space"

  • If you are pulling data from a Database (Not SAS), then I would recommend 
    • Using this SAS OPTIONS statement to trace your SQL query

OPTIONS DEBUG=DBMS_SELECT SASTRACE=',,,d' SASTRACELOC=SASLOG NOSTSUFFIX;

  • Try to assign a library to a location on the Citrix Server filesystem that has enough disk space, You probably would need to get that information from your Citrix Administration team, and save your query result in that library instead of the WORK library
  • Ask yourself, do your really need all 50 GB?
    • Can you review your query and minimize your returned records & columns? 
    • Use the COMPRESS=YES option to compress the output data set. This may help with reducing the required disk space to store the data.

Hope this helps,

Ahmed 

NMB82
Obsidian | Level 7

Thank you all, I'll give this all a shot. I've also updated the OP with my SAS code.

 

@Sajid01 @SASKiwi @Patrick 

SASKiwi
PROC Star

@NMB82  - Just bear in mind that all of the table joining in your SQL query will be done in SAS WORK regardless of whether you store the results in a permanent library. So a solution to increasing your SAS WORK space is still needed.

Patrick
Opal | Level 21

@NMB82 wrote:

Thank you all, I'll give this all a shot. I've also updated the OP with my SAS code.

 

@Sajid01 @SASKiwi @Patrick 


@NMB82 Now that you've shared your code we might find a coding alternative that will require much less disk space during processing. The SQL you formulated will require a lot of sorting - and such intermediary tables get stored in UTILLOC which normally resides in the same place like WORK ...which is on your C-drive. Sorting normally requires a multiple of disk space of your source table during processing.

Some tweaks to your code will likely drastically reduce the required disk space - but we need to know a bit more about your data to provide a solution that will work for you.

 

Questions

1. How much free space do you have on your C-drive?

2. Are COL5 and COL7 from the "table" with alias A or B?

3. You are using UNION and not UNION ALL. UNION alone will dedup your data but requires a lot of sorting to do so. Do you really need to dedup?

4. What's the volume of the table with alias B? If you execute the following code what's the volume of table TEST?

data test(compress=no);
  set TABLE_db(keep=id col2);
run;

5. For your "group by col1, col4": Can there be a distinct value combination that exists in more than one of your source tables Table1 - Table4? Or could we eventually sum table by table separately?

 

N.B: Changing the .cfg to repoint WORK would be the very last measure - and you would also need to repoint UTILLOC. Let's first try and tweak your code.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 2192 views
  • 10 likes
  • 7 in conversation