BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Seems like there should be a simple solution here, but I'm struggling to find it.

Trying to read in data from a .dat file. Here's an example of the first 2 entries:


Generic College
1234 Main St
New York, NY 12345
718 555-1234
One Star Liberal Arts
Another Generic College
567 Main St
Boston, MA 23456
617 555-5678
Three Star Engineering


Each entry has School Name on the first line, followed by street address then city, st & zip code followed by area code & phone # & then a generic rating scale (one-five star) & the focus of the school

How I tried to read in the data:
input name $ ;
input address $ ;
input city $ st $ zip $ ;
input areacode $ phone $ ;
input star $1-10 focus $ ;

I'm using $ for all since I'm not actually manipulating the area or zip codes as numerical factors. I'm also using input in this fashion as I thought it would input in each line just as I wrote it and then I do do some kind of do loop to repeat throughout the rest of the data (a few thousand entries). However, it's not quite outputting it as planned. Iit's only reading info on each line up to the first space (ie - 1234 as the address instead of 1234 main st).


I'm at a loss of how to read in the data when each entry is separated on multiple lines. Each address & name and city is differing in length, so it's not like I can just use "input name $1-34." I think if I can get over this small hurdle that I can use a do loop to read in successive entries. Every dataset I've used/seen with SAS has had info posted in a very vanilla fashion:
name address city state zip phone stars focus
Where all info for each entry is on the same line, making it easier to read in


Any suggestions/advice for me to follow?
10 REPLIES 10
GreggB
Pyrite | Level 9
Try using a line pointer (/);
input name $ / address $ / city $ / etc.

the slash skips it to the next line of data.
Olivier
Pyrite | Level 9
Gregg is right : use slashes to go and read the next line.
To avoid stopping on the first space, specify the informats this way (not just $ but $ followed by the maximum length you would expect). The : before the $ tells SAS to stop when it encounters the delimiter (specify one in your INFILE statement, because default is blank !) or the end of the line.
[pre]
input name :$100. / address :$20. / city :$50. st $2. +1 zip $5.
/ areacode $3. +1 phone $8. / star $10. focus :$50. ;[/pre]
In this code, I use : when I don't know how long the values will be, but if I know the exact length (as for State, Zipcode of phone numbers) I don't mention any so that SAS reads exactly that number of characters.

Hope it helps.
Olivier
deleted_user
Not applicable
Thanks for the quick responses!

Quick question on the deilimiter. What should I specify it as? On the first & second line (name & address), it's just the end of the line that tells me it's time for the next entry, On the other 3rd line I'd have a comma to tell me the city is done and the last line has only a single space separating the star level with the school focus (on the final line,I'm thinking of counting the word "star" as it's own category and just dropping it so only the number shows in the output)
deleted_user
Not applicable
Update

Code I've used to get to this point:
data school ;
infile 'school.dat' DLM=' ' dsd;
input name & :$100. / address & :$30. / city & :$20. st $ zip $ 31-35
/ areacode $3. phone $8. / star $ star2 $focus & :$50. ;
drop star2;
run ;
proc print data=school (obs=10);
title 'Generic Title' ;
run ;


This gives me everything from zip code to the focus without issue. It also gives me the address without any problems

However, the state is always connected to the city in the output (and so nothing shows up under state). Some of the schools listed also have nicknames attached (such as "The" Ohio State University). On these such schools, the only part of the name that shows up is The (without the quotation marks)

Any more ideas on how to handle these 2 smaller issues?
ballardw
Super User
You can have multiple delimiters specified. So DLM=' ,' should treat the comma between city and state as desired.
deleted_user
Not applicable
Thanks ballardw, that did fix the issue with the states
deleted_user
Not applicable
1 last question

I'm trying to turn the alphabetical spelling of the stars (one, two, etc) into numeric characters

If I don't input the star rating as a character variable, it doesn't show up in the output (specifically in proc print, the star column is all .'s)

If I try to do if thens (if star=one then stars =1, etc), then I'm told that variable one (all the way through five) are uninitialized


Seems like excel's find & replace function would be easier than using SAS here. What am I missing?
RickM
Fluorite | Level 6
You are missing quotes around one i.e. star="one". Keep in mind that matching in SAS is case sensitive. You might want to do if upcase(star)="ONE". There might also be a SAS informat to read character versions of numbers but I'm not as sure about that.
deleted_user
Not applicable
Thanks RickM! Evidently I've been staring at these 10,000 entries far too long today
Ksharp
Super User
If you want very five rows to be a observation, then would be easy.


[pre]
data temp(drop=_: position length);
infile datalines truncover;
input #1 name $100.
#2 address $100.
#3 _city $100.
#4 areacode : $10. phone : $20.
#5 _star $100.;
city=scan(_city,1,',');
st=scan(_city,-2,', ');
zip=scan(_city,-1,', ');
star=catx(' ',scan(_star,1),scan(_star,2));
call scan(_star,3,position,length);
focus=substr(_star,position);
datalines;
Generic College
1234 Main St
New York, NY 12345
718 555-1234
One Star Liberal Arts
Another Generic College
567 Main St
Boston, MA 23456
617 555-5678
Three Star Engineering
;
run;
[/pre]




Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1108 views
  • 0 likes
  • 6 in conversation