How to trade off to save variable space?

Frequent Contributor
Posts: 75

Hi Folks,

Suppose a character variable holds values of varying length, and I want to set its length with a LENGTH statement. When SAS reads the source file and encounters values shorter than the declared length, a lot of space is wasted. Is there an option that lets me trade this off?

I trust this question may have been asked before by somebody else, but I was unsuccessful in my search for the option so far. Thanks.

Mark

Contributor
Posts: 43

Re: How to trade off to save variable space?

If I understand correctly, you are wondering if having a SAS variable defined at a certain length will waste space, for example:

Data temp ;
     length FirstName $20 ;
     FirstName = 'Joe1' ;
     output ;
     FirstName = 'Joe2                        ' ;
     output ;
     FirstName = 'Joe3                        ' ;
     FirstName = TRIM(FirstName) ;
     output ;
run ;

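It was my understanding that only 4 bytes would be used to store 'Joe1', not 20 (my follow-up test below shows otherwise). If you assign a value with trailing blanks, they are kept up to the declared length. Note that TRIM() removes trailing blanks only; to remove both leading and trailing blanks, use STRIP(). Either way, assigning the result back to FirstName pads it out to 20 bytes again.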
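A quick way to check how much room a character variable really occupies is to compare LENGTH(), which reports the position of the last non-blank character, with VLENGTH(), which reports the declared storage length. This is only a minimal sketch; the variable names len and vlen are made up for illustration.

```sas
data _null_ ;
  length FirstName $20 ;
  FirstName = 'Joe1' ;
  len  = length(FirstName) ;    /* position of the last non-blank character */
  vlen = vlength(FirstName) ;   /* declared (stored) length of the variable */
  put len= vlen= ;              /* writes len=4 vlen=20 to the log */
run ;
```

The value looks 4 characters long, but the variable still takes 20 bytes in every observation.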

Message was edited by: Carla Wilson (accidentally submitted before I was done typing.)

Actually, I did a little more investigating.

Here is a SAS link about saving space by using the length statement on numeric variables:

http://support.sas.com/documentation/cdl/en/hosto390/61886/HTML/default/viewer.htm#mvs-length-length...

Here is a link for saving space with character variables:

http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001336082.htm
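For character variables, one common approach is to measure the longest value actually present and then re-declare the variable with that length. The sketch below assumes a made-up data set RAW with an oversized variable NAME; the macro variable MAXLEN is also just an illustrative name.

```sas
/* Hypothetical sample data: NAME is declared far longer than needed. */
data raw ;
  length name $200 ;
  input name $ ;
  datalines ;
Joe
Alexandra
Mark
;
run ;

/* Find the longest value present and store it in a macro variable. */
proc sql noprint ;
  select max(length(name)) into :maxlen
  from raw ;
quit ;
%let maxlen = &maxlen ;   /* strips the leading blanks from the value */

/* Re-create the data set with the shorter length.
   LENGTH must come before SET to take effect; SAS may log a
   warning about multiple lengths for NAME, which is expected here. */
data right_sized ;
  length name $&maxlen ;
  set raw ;
run ;
```

This only helps when the longest value is much shorter than the declared length. If a few values are long and most are short, the short ones are still padded to the new length.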

Contributor
Posts: 43

Re: How to trade off to save variable space?

Well, I am clearly wrong in thinking that string variables are stored with a variable number of bytes.  I ran a quick test to demonstrate.  (So, I learned something new today.)

No, I don't know of a VARCHAR(n) equivalent, but maybe that would be a good thing to suggest to SAS.

data test_string3_val3 ;
length teststring $3 ;
do i = 1 to 10000 ;
  teststring = 'Joe' ;
  output ;
end ;
drop i ;
run ;

data test_string399_val3 ;
length teststring $399 ;
teststring = 'Joe' ;
do i = 1 to 10000 ;
  output ;
end ;
drop i ;
run ;

data test_string399_val399 ;
length teststring $399 ;
do i = 1 to 399 by 3 ;
  teststring = 'Joe' || trim(teststring) ;
end ;
lentest = length(teststring) ;
put "Length(teststring) = " lentest ;
do i = 1 to 10000 ;
  output ;
end ;
drop i ;
run ;

data test_string399_val399_compressed (compress=yes) ;
set test_string399_val399 ;
run ;


proc datasets library=work ;
run ;
quit ;

RESULTS:

# Name                              Type  Size (bytes)  Last Modified
2 TEST_STRING399_VAL3               DATA       4113408  07Mar13:11:11:34
3 TEST_STRING399_VAL399             DATA       4113408  07Mar13:11:13:23
4 TEST_STRING399_VAL399_COMPRESSED  DATA         17408  07Mar13:11:17:28
5 TEST_STRING3_VAL3                 DATA         33792  07Mar13:11:11:34
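As a rough sanity check on these sizes, the raw character data is simply the number of observations times the declared length. The sketch below just does that arithmetic; the variable names are made up, and putting the remaining difference down to header and page overhead is my assumption, not something I measured.

```sas
data _null_ ;
  nobs    = 10000 ;
  raw_399 = nobs * 399 ;   /* bytes of character data at $399 */
  raw_3   = nobs * 3 ;     /* bytes of character data at $3   */
  put raw_399= comma12. raw_3= comma12. ;
run ;
```

That gives 3,990,000 and 30,000 bytes, which is close to the 4113408 and 33792 reported above. So the uncompressed file size tracks the declared length, not the length of the values.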

Super User
Posts: 7,039

Re: How to trade off to save variable space?

Use the data set option COMPRESS=YES.  The variable keeps its full declared length, blanks included, but when SAS writes the observations to disk the compression algorithm does a very good job of squeezing out the runs of blanks.

data cars (compress=yes);
set sashelp.cars;
length longvar $400 ;
run;

NOTE: There were 428 observations read from the data set SASHELP.CARS.
NOTE: The data set WORK.CARS has 428 observations and 16 variables.
NOTE: Compressing data set WORK.CARS decreased size by 60.00 percent.
      Compressed is 4 pages; un-compressed would require 10 pages.
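If you would rather not add the data set option to every step, the same setting is also available as a system option. A minimal sketch, where CARS2 is a made-up data set name:

```sas
options compress=yes ;       /* applies to data sets created from here on */

data cars2 ;
  set sashelp.cars ;
  length longvar $400 ;
run ;

proc contents data=cars2 ;   /* the attribute list shows whether the data set is compressed */
run ;
```

A COMPRESS= data set option on an individual step still overrides the system option, so you can switch compression off for data sets where it does not pay.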

Occasional Contributor
Posts: 10

Re: How to trade off to save variable space?

If you are looking for a field that is equivalent to a VARCHAR field in Teradata or DB2, I do not think that SAS has one.  Compressing the data file will work, but it will cost you a bit of processing speed.

Super Contributor
Posts: 578

Re: How to trade off to save variable space?

Actually, I think in many cases compressing data sets speeds up processing: it costs more CPU time but reduces I/O, and I/O is often the bottleneck.
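The easiest way to settle this for your own data is to time it. The sketch below turns on FULLSTIMER and reads an uncompressed and a compressed copy of the same data; BIG_PLAIN and BIG_COMP are made-up names, and the row count is arbitrary.

```sas
options fullstimer ;   /* log real time and CPU time for every step */

/* Write the same observations to an uncompressed and a compressed copy. */
data big_plain (compress=no) big_comp (compress=yes) ;
  length longvar $400 ;
  longvar = 'Joe' ;
  do i = 1 to 100000 ;
    output ;
  end ;
  drop i ;
run ;

/* Read each copy once, then compare "real time" and "cpu time" in the log. */
data _null_ ;
  set big_plain ;
run ;

data _null_ ;
  set big_comp ;
run ;

options nofullstimer ;
```

Results will depend on your disks and on how compressible the data is, so treat a single run as a hint, not a benchmark.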

SAS(R) 9.2 Companion for Windows, Second Edition
