How do I read raw data that spans multiple lines?

New User
Posts: 1

Hi Community!


I have a question about reading a specific type of raw data.


Suppose I have data that looks something like this:


340 West 85th Street
New York NY 10024
40.79 -73.98
7901 Annapolis Road
Lanham MD 20706
38.95 -76.88


I want the 1st line to be read as Address, the 2nd line as City (New York), State (NY), and Zip (10024), and the 3rd line as Latitude (40.79) and Longitude (-73.98).


So the code may look something like this (it is not correct):

data temp;
input #1 Address $ 
#2 City $ State $ Zip
#3 Latitude Longitude;

Of course, this does not work. How can I read raw data like this, where each record spans multiple lines?


Thank you!


Occasional Contributor
Posts: 7

Re: How do I read raw data that spans multiple lines?

Hi Konsenlin,

You could write something like this, as long as there are two blanks between the city and the state. The & modifier lets a value contain single blanks, so the two blanks act as the delimiter that ends a multi-word string.

I gave the City and Address variables a length of 30.


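data temp;
    infile rawdata;  /* rawdata is an assumed fileref for the raw text file */
    length Address $30 City $30 State $2;
    /* & lets a value contain single blanks; it ends at two consecutive blanks */
    input #1 Address & $30.
          #2 City & $30. State $ Zip
          #3 Latitude Longitude;
run;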
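
For anyone who wants to try the & modifier without an external file, here is a minimal sketch with the sample data typed in-line. Note that I added a second blank after each city name; that is my change to the data and the approach depends on it. The DATALINES block and the PROC PRINT step are only there for illustration.

```sas
data temp;
    length Address $30 City $30 State $2;
    /* & reads a value that may contain single blanks;     */
    /* the value ends at two consecutive blanks or at the  */
    /* end of the line                                     */
    input #1 Address & $30.
          #2 City & $30. State $ Zip
          #3 Latitude Longitude;
    datalines;
340 West 85th Street
New York  NY 10024
40.79 -73.98
7901 Annapolis Road
Lanham  MD 20706
38.95 -76.88
;
run;

/* List the result to check that each group of 3 lines became 1 observation */
proc print data=temp;
run;
```

Zip is read as numeric here, which is fine for these two values but would drop a leading zero.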


Super User
Posts: 5,083

Re: How do I read raw data that spans multiple lines?

The fact that there are multiple words per variable makes the processing a little tricky.  Here's one way to approach it:


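data want;
    infile rawdata;  /* rawdata is an assumed fileref for the raw text file */
    length address $ 50 city $ 30 state $ 2 zip $ 5 dummy $ 1;
    /* Line 1: load the line, then copy it whole from the input buffer */
    input dummy;
    address = _infile_;
    /* Line 2: ZIP and state are the last two words; the city is the rest */
    input dummy;
    zip = scan(_infile_, -1);
    state = scan(_infile_, -2);
    /* 9 = blank + 2-character state + blank + 5-character ZIP */
    city = substr(_infile_, 1, length(_infile_) - 9);
    /* Line 3: two numeric values */
    input latitude longitude;
    drop dummy;
run;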

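You might want to inspect the ZIP codes to make sure none of them is longer than five characters, because the SUBSTR call assumes the state and ZIP code occupy exactly the last nine characters of the line. Also note that it is better to make the ZIP code character rather than numeric, so that leading zeros are preserved.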
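
If longer ZIP codes do turn up, here is a rough sketch of a variant that does not hard-code the 9. Treat it as one possible approach rather than something checked against your real file. It uses CALL SCAN to find where the state starts and passes a blank as the only delimiter, because SCAN's default delimiter list also includes the hyphen, which would split a ZIP+4 value. The variables line, pos, and len, the wider zip length of $10, and the in-line DATALINES data are my own additions for illustration.

```sas
data want;
    length address $50 city $30 state $2 zip $10 line $200;
    /* Line 1: a null INPUT loads the line; copy it whole as the address */
    input;
    address = _infile_;
    /* Line 2: keep a copy of the line to take apart */
    input;
    line  = _infile_;
    /* Last two blank-delimited words are the ZIP code and the state */
    zip   = scan(line, -1, ' ');
    state = scan(line, -2, ' ');
    /* pos = column where the state starts; the city is everything before it */
    call scan(line, -2, pos, len, ' ');
    city  = substr(line, 1, pos - 1);
    /* Line 3: two numeric values */
    input latitude longitude;
    drop line pos len;
    datalines;
340 West 85th Street
New York NY 10024
40.79 -73.98
7901 Annapolis Road
Lanham MD 20706
38.95 -76.88
;
run;
```

This version assumes line 2 always has at least one word before the state; a line with only a state and a ZIP code would make pos - 1 equal to zero and SUBSTR would complain.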
