I am creating a new character variable in a DATA step. The default length is not sufficient, so I need to explicitly state the length.
It seems that both statements below will work. I was just curious if one was more correct than the other or if it's just a matter of preference. Any insight is appreciated. Thanks in advance.
format DESCRIPTION $8.;
length DESCRIPTION $ 8;
data HAVE;
  input PRODUCT $ AMT;
  datalines;
001 100
002 300
003 1000
;
run;

data WANT1;
  set HAVE;
  format DESCRIPTION $8.;
  if PRODUCT = "001" then DESCRIPTION = "BIKE";
  else if PRODUCT = "002" then DESCRIPTION = "FRISBEE";
  else if PRODUCT = "003" then DESCRIPTION = "BASEBALL";
run;

data WANT2;
  set HAVE;
  length DESCRIPTION $ 8;
  if PRODUCT = "001" then DESCRIPTION = "BIKE";
  else if PRODUCT = "002" then DESCRIPTION = "FRISBEE";
  else if PRODUCT = "003" then DESCRIPTION = "BASEBALL";
run;
Accepted Solutions
Use the LENGTH statement to DEFINE the length required to STORE a variable.
The FORMAT statement is for telling SAS what format to use to DISPLAY the value.
The FORMAT statement does NOT define the length any more than referencing the variable in any other statement would. When it is the first place a new variable appears, it simply gives SAS more information (the width of the display format you are attaching to the variable) on which to base its GUESS about the length you wanted than many other statements that reference the variable would.
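To make that guessing concrete, here is a minimal sketch. The dataset name GUESS_DEMO and the variables A, B, and C are made up for illustration and are not from the original question. Each variable is first seen in a different kind of statement, so SAS derives its length from different information.

```sas
data GUESS_DEMO;
  A = "BIKE";        /* first reference is an assignment: length guessed as 4 */
  format B $8.;      /* first reference is a FORMAT: length guessed from width 8 */
  length C $ 8;      /* first reference is a LENGTH: length explicitly set to 8 */
  B = "BASEBALL";
  C = "BASEBALL";
run;

/* Compare the Len and Format columns for A, B, and C */
proc contents data=GUESS_DEMO;
run;
```

In the PROC CONTENTS output, B and C should both show a length of 8, but only B carries an attached $8. format, while A should show a length of 4.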
There is no need to attach the $8. format to a character variable with a storage length of 8 bytes; SAS already knows how to display character variables. There is also a downside to attaching formats like that to character variables. For example, suppose you later use that data as input to a data step that increases the length of the variable so it can add something on the end. If the $8. format is still attached to the variable, your printout might truncate the value.
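The truncation scenario can be sketched roughly as follows. The dataset names SHORT_DESC and LONG_DESC and the appended text are hypothetical, chosen only to show the effect.

```sas
/* Start with an 8-byte variable that also has $8. attached */
data SHORT_DESC;
  length DESCRIPTION $ 8;
  format DESCRIPTION $8.;
  DESCRIPTION = "BASEBALL";
run;

/* Lengthen the variable; LENGTH comes before SET so it takes effect */
data LONG_DESC;
  length DESCRIPTION $ 20;
  set SHORT_DESC;
  DESCRIPTION = catx(" ", DESCRIPTION, "BAT");
run;

/* The inherited $8. format limits the display to the first 8 characters */
proc print data=LONG_DESC;
run;

/* Removing the format lets the full stored value display */
proc print data=LONG_DESC;
  format DESCRIPTION;
run;
```

The stored value is the full 12 characters in both printouts; only the display differs. A FORMAT statement that names the variable without a format removes the attached format for that step.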
As you observed, both statements will create a character variable with a length of 8.
From a purist point of view, if your intention is to define the length of a variable, then using the LENGTH statement is the "right" way to do it. The FORMAT statement is an indirect way to achieve the same result, but people will need more "SAS insight" to understand your code and why you're doing what you're doing.
The LENGTH statement would be the preferred method of specifying a length for a variable.
One reason is that if the variable already exists in the data set, the LENGTH statement can trigger a warning similar to this:
WARNING: Length of character variable XXXXXX has already been set. Use the LENGTH statement as the very first statement in the DATA STEP to declare the length of a character variable.
Attempting to use FORMAT in this case will only affect the display, not the actual length, and can lead to some odd behavior when comparisons using the stored values don't match the displayed values.
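A small sketch of both situations, assuming a pre-existing 4-byte variable. The dataset names EXISTING, TRY_LENGTH, and TRY_FORMAT are invented for the example.

```sas
data EXISTING;
  length DESCRIPTION $ 4;
  DESCRIPTION = "BIKE";
run;

/* LENGTH after SET: the length is already fixed at 4 by EXISTING, */
/* so the log should show the warning and the length stays 4      */
data TRY_LENGTH;
  set EXISTING;
  length DESCRIPTION $ 8;
run;

/* FORMAT on an existing variable: no warning, stored length still 4. */
/* A narrower format is used here to show display vs. stored value.   */
data TRY_FORMAT;
  set EXISTING;
  format DESCRIPTION $2.;
run;

/* Prints BI, yet the WHERE clause matches on the stored value BIKE */
proc print data=TRY_FORMAT;
  where DESCRIPTION = "BIKE";
run;
```

The last step shows the kind of mismatch described above: the row is selected because the stored value is BIKE, even though the printout shows only BI.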