DATA Step, Macro, Functions and more

Keeping Trailing spaces in SAS dataset

Reply
N/A
Posts: 0

Keeping Trailing spaces in SAS dataset

Hi

I read in a text file eg.

filename indat 'C:\songs.txt';
infile indat delimiter=' ' missover length=reclen;


the format of the file is similar to

:3 4 Hi
:7 4 How

so the format is a ':', followed by a delimiter space ' ', followed by a numeric, delimiter space ' ', followed by a numeric, delimiter space ' ', followed by a character

However after the character there is often a trailing space ' ' which is not a delimiter but a part of what I would like to read into the SAS dataset. This ' ' always gets missed by the SAS input statement. How do I import this trailing space.
Super Contributor
Super Contributor
Posts: 3,174

Re: Keeping Trailing spaces in SAS dataset

Hmm - you say you have a delimiter character (a blank-space) that is also part of a data-field, but only sometimes? Might there be more than 1, possibly 2 or more? Suggest you re-define the delimiter character as something other than your data or figure out a way to enclose your data values in quotation-marks.

But, still, what you are looking for is that SAS carry forward any imbedded (extra) blank in your character variable(s). SAS variables have a declared (or implied) LENGTH attribute and so you will have the padded-blanks stored as part of the variable -- but I suppose the question is to ask: what do you expect to do with those trailing blank characters?

Scott Barry
SBBWorks, Inc.
N/A
Posts: 0

Re: Keeping Trailing spaces in SAS dataset

> figure
> out a way to enclose your data values in
> quotation-marks.
>

Hi thanks for your suggestion. I think if I enclose it in quotation marks, it might work.

So I'll have SAS reading in

: 3 4 Hi to
: 3 4 "Hi "

How do I tell SAS to read in everything between the quotation " marks in the input step?

Because I have to set the delimiter to a space ' ' I'm afriad SAS will input in

"Hi and " into different fields because it still thinks the space is the delimiter
Super Contributor
Super Contributor
Posts: 3,174

Re: Keeping Trailing spaces in SAS dataset

Take advantage of the SAS support http://support.sas.com/ website either using its SEARCH facility or consider using a Google advanced search such as the one below:

infile reading delimited files quotation marks site:sas.com


Scott Barry
SBBWorks, Inc.
Contributor
Posts: 34

Re: Keeping Trailing spaces in SAS dataset

You can use QUOTE informat in order to read the Vaue which has enclosed by quotes.
Super Contributor
Super Contributor
Posts: 3,174

Re: Keeping Trailing spaces in SAS dataset

Reading the data is not the real challenge - the OP wants to have SAS character variables retain any trailing spaces up to the declared SAS LENGTH definition (or the SAS default-assigned).

Scott Barry
SBBWorks, Inc.
SAS Super FREQ
Posts: 8,743

Re: Keeping Trailing spaces in SAS dataset

There is another issue in regard to trailing blanks. When you read in the data, you might work hard to read in the trailing blanks. However, consider this data:
[pre]
data muppets;
length On_show $15 Muppet $20;
infile datalines dsd dlm=',';
input On_show $ Muppet $ numtrail;
calclength = length(Muppet);
return;
datalines;
"Sesame Street","Kermit ",3
"Muppet Show","Miss Piggy ",1
"Sesame Street","Snuffleupagus ",2
"Muppet Show","Gonzo ",3
"Fraggle Rock","Gobo Fraggle ",4
"Fraggle Rock","Uncle Traveling Matt",0
;
run;
[/pre]

It's all very well to have trailing blanks explicitly included when the variable is read in. However, the LENGTH of the MUPPET variable will be $20. So it doesn't matter if Kermit has 3 trailing spaces or Miss Piggy has 1 trailing space -- they are individual variable VALUES, which will be internally stored with a maximum length of 20 characters.

If we use the LENGTH function to determine the number of characters in any given value for MUPPET, the LENGTH function -excludes- any trailing blanks and you see that the calculated length (if you run the program) shows only 6 as the length for what was read as "Kermit " (with 3 trailing blanks) and shows 10 as the length for what was read as "Miss Piggy " (with 1 trailing blank).

This is the output you get when you do a PROC PRINT of the data being read above:
[pre]
Variable and Calculated Length -- "trailing blanks" are ignored

Obs On_show Muppet calclength numtrail

1 Sesame Street Kermit 6 3
2 Muppet Show Miss Piggy 10 1
3 Sesame Street Snuffleupagus 13 2
4 Muppet Show Gonzo 5 3
5 Fraggle Rock Gobo Fraggle 12 4
6 Fraggle Rock Uncle Traveling Matt 20 0

[/pre]

And, if you look at the variable characteristics with PROC SQL, you see that the length of the variable is the maximum, or $20:
[pre]
Length as stored in the descriptor portion of the dataset

Column Column
Column Name Type Length
--------------------------------------------------------------------------
On_show char 15
Muppet char 20
numtrail num 8
calclength num 8

[/pre]

Since I don't understand WHY the trailing blanks are significant, and since I do understand how SAS deals with the LENGTH of character variables, I find myself wondering whether this is a moot question or some kind of important distinction for some other software, but a distinction which, in the end, doesn't really matter in SAS.

cynthia
Ask a Question
Discussion stats
  • 6 replies
  • 1629 views
  • 0 likes
  • 4 in conversation