Hi Community!
I have a question about reading a specific type of raw data.
Suppose I have data that looks something like this:
340 West 85th Street
New York NY 10024
40.79 -73.98
7901 Annapolis Road
Lanham MD 20706
38.95 -76.88
I want the 1st line to be read as Address; the 2nd line contains City (New York), State (NY), and Zip (10024); and the 3rd line contains Latitude (40.79) and Longitude (-73.98).
So the code may look something like this (it is not correct):
data temp;
input #1 Address $
#2 City $ State $ Zip
#3 Latitude Longitude;
datalines;
...;
run;
Of course, this does not work. I am just wondering how to read raw data like this, where each record spans multiple lines.
Thank you!
Hi Konsenlin,
You could write something like this, as long as you make sure there are two blanks between the city and the state. With the & modifier, two consecutive blanks act as the delimiter, so a value can contain single blanks.
I gave the city and address variables a length of 30.
data temp;
length address $30 city $30;
/* & lets a value contain single blanks; two blanks or the end of the line end it */
input #1 Address & $30.
#2 City & $30. State :$2. Zip
#3 Latitude Longitude;
datalines;
...;
run;
The fact that there are multiple words per variable makes the processing a little tricky. Here's one way to approach it:
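data want;
length address $ 50 city $ 30 state $ 2 zip $ 5 dummy $ 1;
/* Line 1: load the line into _infile_ and keep it whole as the address */
input dummy;
address = _infile_;
/* Line 2: zip and state are the last two words; the city is whatever precedes them */
input dummy;
zip = scan(_infile_, -1);
state = scan(_infile_, -2);
/* Drop the trailing 9 characters: blank + 2-letter state + blank + 5-digit zip */
city = substr(_infile_, 1, length(_infile_)-9);
/* Line 3: two plain numeric values */
input latitude longitude;
drop dummy;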
datalines;
...
;
You might want to inspect the zipcodes to make sure you don't have any longer values there. Also note, it's better to make zipcode character instead of numeric so you won't have to worry about leading zeros.
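To illustrate the leading-zero point, here is a minimal sketch. The zip value '02134' and the variable names zip_num and zip_chr are made up for the example and are not part of the data above.

```sas
data _null_;
  /* Hypothetical value, used only to show the effect */
  zip_num = input('02134', 5.);  /* numeric: the leading zero is not stored */
  zip_chr = '02134';             /* character: stored exactly as written */
  put zip_num= zip_chr=;         /* log shows: zip_num=2134 zip_chr=02134 */
run;
```

If the zip code is kept numeric anyway, a format such as Z5. can restore the leading zero for display, but a character variable avoids the problem altogether.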