DATA Step, Macro, Functions and more

reading multiple records

Reply
Frequent Contributor
Posts: 100

reading multiple records

Greetings:

Trying out an example from Chapter 20 of SAS Certification Prep Guide, don't know why I could not read in the second observation. I'll copy the raw data and the codes below. It's hard not to feel doomed when an example from the book does not run. I must have done something wrong. Then I tried to modify the code. When the modified code finally worked, I was still not sure.
Thank you for any suggestions!

Raw data:
Lee Athnos
1215 Raintree Circle
Phoenix AZ 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124

Code in the book:
Data work.Address;
infile Example;
input FirstName $ LastName $ /
Address $ 1 - 20 /
City & $10. State$ Zip$;
run;
proc print data=work.address;
run;

Result:
The second observation is missing. From the log message, I guess the problem comes from a shorter address in the second observation. But I thought the column input (Address $ 1 - 20) should have taken care of this problem. ?

SAS log message:

NOTE: LOST CARD.
FirstName=Heidie LastName=Baker Address=Vienna VA 22124 City= State= Zip= _ERROR_=1 _N_=2
NOTE: 6 records were read from the infile EXAMPLE.
The minimum record length was 10.
The maximum record length was 20.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.ADDRESS has 1 observations and 6 variables.

The next twist:
I tried this modified list input approach below and it worked. I'm puzzled. I thought & only works if the raw data values are separated by two or more consecutive blanks. The address lines in this case do not have them.

Data work.Address;
infile Example;
input FirstName $ LastName $ /
Address & $20. /
City & $10. State$ Zip$;
run;
proc print data=work.address;
run;
PROC Star
Posts: 7,364

Re: reading multiple records

You were close! Exchange the & after city for a colon i.e.,

City : $10. State$ Zip$;

HTH,
Art
Frequent Contributor
Posts: 100

Re: reading multiple records

Thanks for the reply! The & follows city because some city names have embedded blanks in the raw data. (Sorry I didn't paste in those observations. My bad!) The & after Address puzzles me but that is the only way I could read the second obs. The address line has embedded blanks so : does not work. My book used column input and a /, which seems reasonable to me but it does not read obs with a shorter address lines.
PROC Star
Posts: 7,364

Re: reading multiple records

Are there one or two spaces between city and state?

If there are two or more, then I think that the following will work:


Data work.Address;
infile cards;
input FirstName $ LastName $ /
Address $ 1 - 20 /
City &: $10. State $ Zip $;
cards;
Lee Athnos
1215 Raintree Circle
New York NY 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124
;
run;

Art
Respected Advisor
Posts: 3,902

Re: reading multiple records

Hi

One possible issue could be that your raw data records are "too short".

It could be that in the example code the lines were all filled up with blanks but when you copied this code into your SAS environment these blanks got "cut off".

Try the following:
infile Example TRUNCOVER;

HTH
Patrick
PROC Star
Posts: 7,364

Re: reading multiple records

I'll add one more possibility in the hope that Peter C might see this and suggest an infile magic way of tackling the problem.

If the problem does end up being that you DON'T have at least two spaces between city and state, then you could simply input then parse the entire third line. E.g.:

data work.Address (drop=_Smiley Happy;
infile cards;
input FirstName $ LastName $ /
Address $ 1 - 20 /
_Third_Line & $80.;
format City $10.;
Zip=scan(_Third_Line,-1);
State=scan(_Third_Line,-2);
call scan(_Third_Line, -2, _position, _length);
City=substr(_Third_Line,1,_position-1);
cards;
Lee Athnos
1215 Raintree Circle
New York NY 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124
;

Art
PROC Star
Posts: 7,364

Re: reading multiple records

Since Peter hasn't responded, I posted the problem on SAS-L. While I didn't really get a better solution, I did get some ideas about how to combine perl with infile magic and, as a result, was able to produce the following possible code:

data Address ( Drop = Text Pattern Match ) ;
Length City $ 40
State $ 2
Zip $ 5 ;
infile cards;
input FirstName $ LastName $ /
Address $ 1 - 20 /
@;
_infile_ = prxchange('s/ ([A-Z]{2})/ $1/',-1,_infile_);
input city & state zip;
cards;
Lee Athnos
1215 Raintree Circle
New York NY 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124
;
run;

Of course, my head hurts now, but I definitely did learn some new things. I can only hope that the offered solutions are actually close to what the original problem might be. The above code was simply a way to deal with the problem I think exists with the current data (namely, a lack of 2 or more spaces between city and zipcode).

Art
PROC Star
Posts: 7,364

Re: reading multiple records

One correction: I was a bit to quick in posting and forgot to remove the drop option.


data Address;
Length City $ 40
State $ 2
Zip $ 5
;
infile cards ;
input FirstName $ LastName $ /
Address $ 1 - 20 /
@;
_infile_ = prxchange('s/ ([A-Z]{2})/ $1/',-1,_infile_);
input city & state zip;
cards;
Lee Athnos
1215 Raintree Circle
New York NY 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124
;
run;

HTH,
Art
Super User
Posts: 9,690

Re: reading multiple records

>I thought & only works if the raw data values are separated by two or more consecutive blanks. The address lines in this case do not have them.


Hi.
& has some same thing with ' : ', ' &' also stop input when encounter delimiter ( but it is double delimiter not as ' : ' which is single delimiter),So you will input address which contains single delimiter(blank).
column input (Address $ 1 - 20) will input all the twenty column even though it encounters delimiter and end character of line.

So if you want use column input ,Try to use Patrick suggested(use option 'truncover', it will stop input when encounter end character of line).

[pre]
filename example 'c:\test.txt';
data address;
infile Example truncover;
input FirstName $ LastName $ /
Address $ 1 - 20 /
City & $10. State$ Zip$;
run;
proc print data=work.address;
run;

[/pre]

Or add some more blanks after address until over 20 columns.



Ksharp
Frequent Contributor
Posts: 100

Re: reading multiple records

Big thanks to all of you (Art, Patrick and KSharp)! As a weekend SAS learner, I can't tell you enough how nice it is to get expert volunteer help here.

There are two blanks in the raw data between City and State. But thanks to Art for trying to find a way to address the "what if there is only one blank" question. I simply typed up the raw data example as what was showing on my SAS book. Patrick has a great point that there are probably blanks padded after the address values in the raw dataset used by the book. (Obviously I couldn't tell by reading the hardcopy. And I suspect most readers do not try to replicate the examples from an exam prep book, so I don't blame the author. :-)

In a nutshell, I tested the truncover idea, it worked great! Then I went back to the raw data and padded additional blanks in the address line. After this modification, the column input worked (without Truncover).

It has been a good lesson for me. Cheers.
PROC Star
Posts: 7,364

Re: reading multiple records

One more lesson for you then. Someone (Søren Lassen) on SAS-L posed a non prx way to solve the missing space problem and attained the desired result. Specifically, Søren proposed using the reverse function, and revers informat, on _infile_:

data Address;
length firstname lastname address $20
city $40 state $2 zip $5;
informat zip state city $revers.;
input firstname lastname /
address & /
@;
_infile_=reverse(_infile_);
input zip state city &;
cards;
Lee Athnos
1215 Raintree Circle
New York NY 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124
;
run;

Art Edited to reflect actually proposed code.

Message was edited by: art297
Valued Guide
Posts: 2,175

Re: reading multiple records

Art
thank you for thinking of me!
Family comes first at christmas and they seem to feel my SAS family should take second place.
Art you were kind to suggest I might have something to offer for a problem like this. However, all possible answers seem to be offered before I catch up with the action!
Perhaps it might open understanding if I describe my interpretation of the problem when there are not enough spaces to reliably use the & modifier (which collapses multiple delimiters as one, only when more than one delimiter is present):

When data positions are not fixed, always predefine (informat) lengths, and the informats where needed - it makes the "input" so much clearer, having just a list of variables.
Seek the reliable rules for parsing - Here these are :
1 there should be a state followed by a zip as the last two things on the third line
2 the prior text is the city name
3 the whole of the second line is the street address
4 assume first and second words of the first line are first and last name (so there won't be middle names nor initials nor titles nor suffix like II/jnr/esq/retd, etcetera)

I have not used the $revers. informat to solve such input problems before but have to recognise that it really does belong in the "_infile_ magic" toolkit!

Without using informat $revers. or PRX my approach would be something like[pre]data Address;
length firstname lastname address $20 city $40 state $2 zip $5;
input firstname lastname ;
input ;
address = _infile_ ;
input @ ;
zip = scan( _infile_, -1 ) ;
call scan( _infile_, -2, POS, LEN ) ;[/pre] * the beginning of the State defines also the end of the city name;
* and call scan() can provide the position for the N-th word
* Here it provides the beginning of the second last word (State) ;[pre] state = substrn( _infile_, pos, len ) ;
city = substrn( _infile_, 1, pos -1 ) ;
cards;
Lee Athnos
1215 Raintree Circle
New York NY 85044
Heidie Baker
1751 Diehl Road
Vienna VA 22124
;[/pre]It is not really "_infile_ magic" as this approach does not modify the _infile_ buffer.
Appropriate respect should go to Søren Lassen for the real "_infile_ magic"!

regards and seasons greetings to all the Forum posters and listeners
peterC
Frequent Contributor
Posts: 100

Re: reading multiple records

Thank you for the detailed explanation! I must be one of the 20% slow learners consuming 80% of volunteer help. Learned a few character functions before in SAS training classes... It's great to see how they can help in real life.
Frequent Contributor
Posts: 100

Re: reading multiple records

Thank you for sharing the magic solution! Did not even know the reverse function before. (For a beginner, most things are brand new to decipher :-).
Ask a Question
Discussion stats
  • 13 replies
  • 309 views
  • 0 likes
  • 5 in conversation