Alin1984
Fluorite | Level 6

Hi all,

 

I need your suggestion on an issue that our users are experiencing intermittently. From the investigation I have managed to do so far, they are getting I/O errors when transferring SAS tables from Work to a file server over the network. The error doesn't appear every time, which is why it is hard to gather evidence. Platform details: a 16-core server with 128 GB RAM, a 250 GB C drive where Windows Server 2016 is installed, a 250 GB D drive where SAS is installed, and a 3 TB S drive that holds the Work space. Throughput to and from the Work drive is sufficient for SAS requirements, and there is enough free space on all drives for what the users are doing (they use EG clients to manipulate data).

 

Findings until now: 

- user runs a simple DATA step to copy a table from Work to a folder on an external file server over the network (see the sketch after this list)

- initial size of the table in the Work area is 93 GB (a big table with 2,000 attributes and 3 million records); the table is uncompressed

- destination folder is a COMPRESSED folder on an external file server.

- an I/O error is returned and the destination table is damaged (with no possibility of repair using PROC DATASETS)

- the same error may or may not happen on smaller tables of around 40 GB, but only if other users are writing to the same folder at the same time

- I did the same test with the 93 GB table, transferring it to an uncompressed folder on the same file server, and it worked fine

- the user then manually moved the table from the uncompressed folder to the compressed one, which decreased its size to 24 GB; this is the current workaround, and the table is undamaged when transferred this way

- the network connection was monitored during the transfer and nothing appeared to fail; the process was not interrupted either

- the user also used the X command to copy the file into the compressed folder, and it arrived damaged as well (which makes me think that the Windows process that compresses big files is somehow not working correctly)
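
For reference, a minimal sketch of the two transfer methods described above (the library name, paths, and table name are placeholders, not the actual ones we use):

/* destination library pointing at the compressed folder on the file server */
libname dest '\\fileserver\share\compressed_folder';

/* method 1: DATA step copy from Work */
data dest.bigtable;
  set work.bigtable;
run;

/* method 2: operating-system copy via the X command */
x 'copy "S:\saswork\bigtable.sas7bdat" "\\fileserver\share\compressed_folder"';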

 

The question:

Naturally I want to conclude that when you transfer to a compressed folder with a DATA step, so that Windows compresses the file while it is being written, SAS loses its pointer to the rows of the table and terminates the file abnormally if the file is too big.

What is your opinion? Am I jumping to conclusions? Any other tests I might do?

7 REPLIES
VDD
Ammonite | Level 13

When you say compressed files, are you using a third-party zipping tool like 7-Zip or WinZip, or are you using SAS compression?

 

Alin1984
Fluorite | Level 6

I am saying compressed FOLDER (a Windows feature). Right-click a folder, go to Properties, then Advanced, and you will see the "miraculous" option "Compress contents to save disk space". By default it is unchecked for all Windows folders, but our users checked it so they could save space on their file server.

 

I know, it was news to me as well 🙂

ballardw
Super User

@Alin1984 wrote:

I am saying compressed FOLDER (a Windows feature). Right-click a folder, go to Properties, then Advanced, and you will see the "miraculous" option "Compress contents to save disk space". By default it is unchecked for all Windows folders, but our users checked it so they could save space on their file server.

 

I know, it was news to me as well 🙂


Sounds like time for a big "Don't do that!"

 

I might guess that writing the file takes long enough that the OS "compression" starts working on parts of the file while it is still being written, and that results in corrupted records. I would suggest using the COMPRESS= option on the LIBNAME statement used to identify the destination of the data.
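
A minimal sketch of that suggestion (the library name and path are placeholders):

/* COMPRESS=YES tells SAS to compress every data set written through this
   library, so the destination folder would not need NTFS compression */
libname dest '\\fileserver\share\uncompressed_folder' compress=yes;

data dest.bigtable;
  set work.bigtable;
run;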

Kurt_Bremser
Super User

Mind that this is a Windows feature, brought to us by our second-most-ridiculed software company (#1 is Adobe, of course). Such stuff can only be expected to work reliably in odd leap years. Investing in more disk space is much better than relying on this.

 

Consider using dedicated SAN infrastructure for your storage needs.

Tom
Super User

Does the Windows compressed folder option even work for files that large?

I would suggest using the operating system to move the files.

So instead of using a DATA step like

libname out 'path to compressed folder';
data out.bigfile;
  set work.bigfile;
run;

or a PROC COPY step

proc copy inlib=work outlib=out;
  select bigfile;
run;

use the operating system:

data _null_;
  length fname $300 cmd $500;
  /* full path of the data set file in the WORK library */
  fname=catx('\',pathname('work'),'bigfile.sas7bdat');
  /* build the Windows COPY command with quoted source and destination */
  cmd=catx(' ','copy',quote(trim(fname)),quote(trim(pathname('out'))));
  /* run the command through a pipe and echo its output to the log */
  infile dummy pipe filevar=cmd;
  input;
  put _infile_;
run;
Alin1984
Fluorite | Level 6

Found this: https://blogs.msdn.microsoft.com/ntdebugging/2008/05/20/understanding-ntfs-compression/

Apparently it is a known issue at Microsoft, so it is not really a SAS issue.

Well, at least this thread might be useful for other users who encounter an I/O error and run into the same problem.

Kurt_Bremser
Super User

If your users want to reduce disk space consumption, they should try the COMPRESS=YES option on their data sets. It is reliable and works quickly with very little CPU load. When you have lots of mostly empty character variables, the reduction in physical file size can be considerable.
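
A minimal sketch at the data set level (names are placeholders; the option can also be set on the LIBNAME statement or globally with OPTIONS COMPRESS=YES):

/* compress the output data set; the log reports the size reduction */
data dest.bigtable(compress=yes);
  set work.bigtable;
run;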
