There is another issue with trailing blanks. When you read in the data, you might work hard to preserve the trailing blanks. However, consider this program:
[pre]
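data muppets;
  /* Declare the character variables up front; Muppet holds at most 20 characters */
  length On_show $15 Muppet $20;
  infile datalines dsd dlm=',';
  input On_show $ Muppet $ numtrail;
  /* LENGTH returns the position of the last non-blank character */
  calclength = length(Muppet);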
datalines;
"Sesame Street","Kermit   ",3
"Muppet Show","Miss Piggy ",1
"Sesame Street","Snuffleupagus  ",2
"Muppet Show","Gonzo   ",3
"Fraggle Rock","Gobo Fraggle    ",4
"Fraggle Rock","Uncle Traveling Matt",0
;
run;
[/pre]
It's all very well to have trailing blanks explicitly included when the variable is read in. However, the LENGTH statement defines the MUPPET variable as $20, so every value is stored in a fixed 20 bytes, and shorter values are padded on the right with blanks. It doesn't matter whether Kermit has 3 trailing spaces or Miss Piggy has 1 trailing space; they are individual variable VALUES, and each one is stored internally at the variable's full length of 20 characters.
If we use the LENGTH function to determine the number of characters in any given value of MUPPET, the function excludes any trailing blanks. If you run the program, the calculated length is only 6 for the value read as "Kermit" with 3 trailing blanks, and 10 for the value read as "Miss Piggy" with 1 trailing blank.
This is the output you get when you do a PROC PRINT of the data being read above:
[pre]
Variable and Calculated Length -- "trailing blanks" are ignored

Obs    On_show          Muppet                  calclength    numtrail
  1    Sesame Street    Kermit                           6           3
  2    Muppet Show      Miss Piggy                      10           1
  3    Sesame Street    Snuffleupagus                   13           2
  4    Muppet Show      Gonzo                            5           3
  5    Fraggle Rock     Gobo Fraggle                    12           4
  6    Fraggle Rock     Uncle Traveling Matt            20           0
[/pre]
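For comparison, here is a minimal sketch (not part of the program above) that puts the trimmed length next to the stored length. It assumes the MUPPETS data set created earlier is still in WORK; the data set name muppets_len and the variables len_trimmed, len_padded, and len_defined are made up for illustration. LENGTHC counts trailing blanks and VLENGTH reports the defined length of the variable, so both should come back as 20 on every row, no matter how many blanks were typed inside the quotes.
[pre]
data muppets_len;
  set muppets;
  len_trimmed = length(Muppet);   /* trailing blanks excluded */
  len_padded  = lengthc(Muppet);  /* trailing blanks counted, including the padding */
  len_defined = vlength(Muppet);  /* defined length of the variable */
run;

proc print data=muppets_len;
  var Muppet len_trimmed len_padded len_defined;
run;
[/pre]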
And if you look at the variable attributes with PROC SQL, you see that the length of the variable is the defined maximum of 20:
[pre]
Length as stored in the descriptor portion of the dataset

               Column    Column
Column Name    Type      Length
-------------------------------
On_show        char          15
Muppet         char          20
numtrail       num            8
calclength     num            8
[/pre]
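The query behind that listing isn't shown, so this is only a guess at one way to get a similar report: reading the DICTIONARY.COLUMNS table. It assumes the data set is WORK.MUPPETS; library and member names are stored in uppercase in the dictionary tables, so the WHERE clause uses uppercase literals.
[pre]
proc sql;
  title 'Length as stored in the descriptor portion of the dataset';
  /* One row per variable in WORK.MUPPETS */
  select name, type, length
    from dictionary.columns
    where libname = 'WORK' and memname = 'MUPPETS';
quit;
title;  /* clear the title afterwards */
[/pre]
PROC CONTENTS on the same data set would report the same lengths.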
I don't understand WHY the trailing blanks are significant, but I do understand how SAS deals with the LENGTH of character variables. So I find myself wondering whether this is a moot question: perhaps an important distinction for some other software, but one which, in the end, doesn't really matter in SAS.
cynthia