Ronein
Onyx | Level 15

Hello

I am creating a data set.

I want to ask a few questions:

1-W is a char var.

Is it mandatory to put $ in the INPUT statement after the var name W?

I am using a LENGTH statement, so I think it is not essential to add $ in the INPUT statement.

2-In the LENGTH statement, should it be written as $4. or $4 or $ 4 (with a space between $ and 4)?

3-In which cases do we use a LENGTH statement when we create a char var?

Is it essential to use a LENGTH statement, or only if the char var is long? (What is the minimum length of a char var that requires defining it in a LENGTH statement?)

4-How do you know what length is required?

Should I count the letters in the longest value and then put this number in the LENGTH statement?

What happens if, for example, the required length is 4 but I define a length of 100?

Is it just a waste of memory?

 

 

DATA tbl;
Length W  $ 4;
input X   W;
cards;
789 1234
009 0009
1 9999
;
Run;
s_lassen
Meteorite | Level 14

Here are some answers:

  1. If you have already defined the variable as character (for example, in a LENGTH statement), you can omit the $ in the INPUT statement.
  2. All three are valid. I normally use $4 (no period, as the period makes it look like a format).
  3. Length statements are used when you want to define a length different from the one set by SAS as default (the various functions have default lengths, sometimes depending on the input variables). E.g. S=substr(longstring,3,2) will give S the same length as LONGSTRING, so you may want to set the length to 2 first. When using only $ (no informat) as the input modifier, SAS defaults to a length of 8.
  4. If you use Options COMPRESS=CHAR, the waste of disk space from having strings "too long" is minimal. Memory is normally not a problem in these situations. So set the length generously enough that you are sure the values will fit, and don't waste too much time counting characters.
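
A minimal sketch of point 3, assuming nothing beyond base SAS. The data set names LENGTHS and DEFAULTS and the variables SHORT and NAME are made up for illustration:

```sas
data lengths;
   length short $ 2;                   /* declared before first use */
   longstring = 'abcdefghij';          /* length 10, taken from the literal */
   s     = substr(longstring, 3, 2);   /* s inherits length 10 from LONGSTRING */
   short = substr(longstring, 3, 2);   /* short keeps its declared length of 2 */
run;

data defaults;
   input name $;                       /* no LENGTH, no informat: length 8 */
datalines;
Christopher
;
run;

proc contents data=lengths;            /* the Len column lists the stored lengths */
run;
```

In DEFAULTS, NAME should end up holding only Christop, because list input with a bare $ stores at most 8 characters unless a longer length was declared first.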
ballardw
Super User

Length and informat, while related, are not the same thing.

 

This matters especially in the world of fixed-column input (old, I know, but still valid).

There are times when you may have to read data with a different informat for one file even while setting the length longer than the actual data.

Consider these three data steps that attempt to read fixed-column data, where the first three characters should be read into a character variable and the next three into a numeric variable.

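data example;
   length x $ 20;
   /* list input reads up to the next blank: x gets abc123, nothing is left on the line for y */
   input x  y;
datalines;
abc123
;

data example2;
   length x $ 20;
   /* x still gets abc123; @4 then moves the pointer back to column 4, so y = 123 */
   input x @4  y;
datalines;
abc123
;


data example3;
   length x $ 20;
   /* the $3. informat reads exactly three columns: x = abc, then y = 123 */
   input x $3.  y;
datalines;
abc123
;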
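
For comparison, a hedged sketch of the same read done with column input, which names the column ranges directly. The data set name EXAMPLE4 is made up here, and the PUT statement is only added so the values appear in the log:

```sas
data example4;
   length x $ 20;            /* storage length, longer than the data */
   input x $ 1-3  y 4-6;     /* column input: columns 1-3 to x, 4-6 to y */
   put x= y=;                /* write both values to the log */
datalines;
abc123
;
run;
```

The log should show x=abc y=123. The length of 20 controls how X is stored, while the column range controls how much of the input line is read into it.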

One issue with assigning a longer length than needed is that some functions will pad their result with trailing blanks. Consider the following code:

data _null_;
   length x $ 20;
   x='abc';
   y = quote(x);
   put y=;
run;

When you run the above step the log will show a result of

y="abc                 "

Notice all of the spaces after the c before the closing quote character.

This will happen with a large number of character functions, which is why the STRIP() function (or the older combination trim(left(var))) is used.
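
As a small sketch of that fix, here is the same step with the blanks removed before quoting. The variable Z is added here only to show the older idiom next to STRIP():

```sas
data _null_;
   length x $ 20;
   x = 'abc';
   y = quote(strip(x));         /* remove leading and trailing blanks first */
   z = quote(trim(left(x)));    /* older equivalent of strip() */
   put y= z=;
run;
```

Both values should now print as "abc" with no blanks before the closing quote.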

 

The final bit is "know thy data". If you have a document describing a data source that indicates the longest value that will be in a field then I would follow that document.

Sometimes the whole process is iterative because your data isn't documented and you have to make some guesses. Some of those guesses will be wrong and you may have to go back and change things to accommodate later knowledge.
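
One way to make that iteration less of a guess, sketched with the data from the original question: read the variable with a deliberately generous length, then ask SAS for the longest value actually present. The length of 100 and the column name LONGEST_W are arbitrary choices for illustration:

```sas
data tbl;
   length w $ 100;            /* deliberately generous first guess */
   input x w;
datalines;
789 1234
009 0009
1 9999
;
run;

proc sql;
   /* LENGTH ignores trailing blanks, so this is the longest real value */
   select max(length(w)) as longest_w
   from tbl;
quit;
```

For this data the query should report 4, which could then go into the LENGTH statement, with some headroom if later files may carry longer values.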

 
