DATA Step, Macro, Functions and more

SAS dataset size

Accepted Solution Solved
Reply
Contributor
Posts: 37
Accepted Solution

SAS dataset size

Hi Guys,

 

Currently i have a problem where i have retreiving data from a DB (around 10G) and save it as Sas7dat , but doing so the file size is 100G almost 10 times of its original file.

 

Do you guys have a solution for that ? thx in advance.

 

libname Folder 'C:\Sas';

 

proc sql.....create table sqltable as ....

Select a,b,c from A.Oracle .

........

 

data SAS.file;

set sqltable;

run;

 


Accepted Solutions
Solution
‎02-04-2017 08:45 PM
Super User
Super User
Posts: 6,498

Re: SAS dataset size

Most likely you have long character variables. SAS stores all strings as fixed length, padded with spaces. Most database systems, like Oracle, have a VARCHAR() type that does not store the trailing spaces.

 

If you do have long character variables then turning on COMPRESS dataset option will help.  But SAS will probably not compress as well as a true database engine could.

 

You can set the system option.

options compress=yes;

Or you could set it on the LIBNAME statement so that it is the default for all files written to that libref.

libname folder 'my path' compress=yes;
....
create table folder.mytable as 
....

Or use it as a dataset option instead to apply it to an individual dataset.

create table folder.mytable (compress=yes) as ...

Note that SAS uses a very simple compression algorithm. This means that for some tables the overhead added for implementing the compression will cause the table to be larger so check your SAS log.  When it writes the file it will note in the log the effect of the compression.

View solution in original post


All Replies
Solution
‎02-04-2017 08:45 PM
Super User
Super User
Posts: 6,498

Re: SAS dataset size

Most likely you have long character variables. SAS stores all strings as fixed length, padded with spaces. Most database systems, like Oracle, have a VARCHAR() type that does not store the trailing spaces.

 

If you do have long character variables then turning on COMPRESS dataset option will help.  But SAS will probably not compress as well as a true database engine could.

 

You can set the system option.

options compress=yes;

Or you could set it on the LIBNAME statement so that it is the default for all files written to that libref.

libname folder 'my path' compress=yes;
....
create table folder.mytable as 
....

Or use it as a dataset option instead to apply it to an individual dataset.

create table folder.mytable (compress=yes) as ...

Note that SAS uses a very simple compression algorithm. This means that for some tables the overhead added for implementing the compression will cause the table to be larger so check your SAS log.  When it writes the file it will note in the log the effect of the compression.

Contributor
Posts: 37

Re: SAS dataset size

Thx Tom, you help me a lot.

 

Do you think if i save it as csv as a datasource/DB instead of sas7dat works ?

 

what is the drawback of doing so ?

 

thanks

Super User
Posts: 10,476

Re: SAS dataset size

If you are going to do any analysis is SAS then if you saved that data as CSV instead of a SAS dataset then before any action on the data by SAS you would have to read it into SAS and create a SAS data set.

 

Whether saving the data as csv will work for other actions it really depends on what you may be attempting.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 209 views
  • 4 likes
  • 3 in conversation