Help using Base SAS procedures

Created Table Is HUGE

Reply
New Contributor
Posts: 4

Created Table Is HUGE

I have the following code:

PROC SQL;

CONNECT TO ODBC(dsn='SomeDataSource');

CREATE TABLE Data_1 AS

SELECT * FROM CONNECTION TO ODBC(select col1, col2, col3 from SomeTable);

QUIT;

In the source DB col1, col2, col3 are all varchar(255)  It is MS SQL Server if that matters.

My problem is that the size of the created table in SAS is very large relative to the data.  More specifically, it looks like the table is creating all columns as char instead of varchar.  We do not have any records where the length of the contents of col1, col2, or col3 are the full 255 characters.  If I change the SQL portion to convert(varchar(10), col1) the resulting table is much smaller, and no data is lost as these columns are short.  I don't have the ability to modify the source DB, and am forced to convert many of the columns to reduce the size of the table.  I am able to get it down to under 20 MB instead of over 100 MB.

I am sure I am doing something wrong and that there must be a better way for me to do it but I can't for the life of me figure it out.

Any help would be appreciated.

Respected Advisor
Posts: 4,659

Re: Created Table Is HUGE

How about telling SAS what you want the string lengths to be :

PROC SQL;

CONNECT TO ODBC(dsn='SomeDataSource');

CREATE TABLE Data_1 AS

SELECT col1 length=10, col2 length=10, col3 length=10

FROM CONNECTION TO ODBC(select col1, col2, col3 from SomeTable);

QUIT;

PG

PG
New Contributor
Posts: 4

Re: Created Table Is HUGE

Disclaimer, I am totally new to this and I may be totally out of touch with reality.

I am hoping there is an easier and more maintainable way.  Since I don't have control of source DBs, I don't know when they change and when my defined data types are no longer correct.

In my perfect world, SAS would inspect the data types of the data set and create variable length columns based on that.

Respected Advisor
Posts: 4,659

Re: Created Table Is HUGE

Internally, SAS supports only two data types: numeric (float, 1 to 8 bytes) and character (fixed length).Upon import, everything is converted to these datatypes.  Boolean, date and integer datatypes are numerics with different formats. Other datatypes can be created when exporting, depending on the format associated with columns.

PG

PG
New Contributor
Posts: 4

Re: Created Table Is HUGE

Thanks for the info.  I see that SAS is doing the best it can with it... sadly it is very inefficient.

Respected Advisor
Posts: 4,659

Re: Created Table Is HUGE

I also thought that at first but it turns out to be efficient enough for most purposes. Look at the COMPRESS= dataset option if you need to save space.

PG

PG
Respected Advisor
Posts: 4,659

Re: Created Table Is HUGE

In some circumstances it may be more efficient to define your dataset as a view (CREATE VIEW Data_1 AS...) that will take no space at all but will access your database every time you use it.

PG

PG
New Contributor
Posts: 4

Re: Created Table Is HUGE

Awesome feedback and suggestions everyone!!!

I really appreciate the advice.  I am going to try both views and compression and see which works best for us.

Thank you again!

Super User
Posts: 3,115

Re: Created Table Is HUGE

In this situation, as PGStats has already mentioned, suggest you set OPTIONS COMPRESS = BINARY; prior to importing all of your tables via ODBC. Using this we compact most of the database tables we import by 80% or more.

In fact we maintain OPTIONS COMPRESS = BINARY; as our session default so that all downstream datasets are similary compressed and this enhances runtime performance as well as saving space. 

Ask a Question
Discussion stats
  • 8 replies
  • 282 views
  • 6 likes
  • 3 in conversation