BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
howardtsai
Calcite | Level 5

Dear SAS Community:

I want to import a .dat file that is tab delimited. Here is the file (also attached to post):

https://support.sas.com/kb/34/436.html

"Artists.dat", under the "Downloads" tab

 

When I import it using the INFILE statement, it would create a separate column for every space it encounters, rather than using solely the tab (five spaces). Here is the code I have so far:

 

DATA artists;
INFILE 'file_location/file_name';
INPUT Name $ Genre $ Nation $;
RUN;

 

As a result each artist's name, rather than being in a single column, gets divided into separate columns (2 or 3) because of the spaces between the artist's first and last names. Is this a problem of the source data format? Thank you for any advise,

 

Howard

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

Well, I'm confused.  I just clicked on the link in your question, and it looks like it shows how to read the file, using column input:

 

   INPUT Name $ 1-21 Genre $ 23-40 Origin $ 42;

If the data are not delimited, but they are in fixed columns, then column input makes sense.

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

9 REPLIES 9
Quentin
Super User

Is the delimiter an actual tab character, or is the delimiter five spaces?   Big difference. : )

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
howardtsai
Calcite | Level 5

Good question -- it seems like it is actually FIVE spaces in the .txt (after I convert it from .dat) file, and not an actual tab. What makes this a headache is that sometimes the values are divided by less than 5 spaces (3-4 spaces). Just seems like the original data file (artists.txt) wasn't written clearly in the first place.

Quentin
Super User

If there is a variable number of spaces between values, I would think the easiest thing would be to let SAS use a space as delimiter, and then change your code to read FirstName and LastName separately.  As long as there are no spaces within a FirstName or LastName, that should work.

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
howardtsai
Calcite | Level 5

Thank you! I had tried that too, but what complicates it further (arrgh) is artist names with multiple spaces, like "Pierre August Renoir" and "Vincent van Gogh." It's an easy fix, I can just squish the middle and last names together by deleting the space in the original data file. Still, even this simple example has created so much problems that I was just curious about potential solutions.


Howard

Quentin
Super User

Yeah, that's rough.  : )

 

When you look at the data (I haven't), is there a delimiter?  Cleary single space is not a delimiter.  Could you use two (or more) spaces as a delimiter?  If so, maybe try specifying dlmstr='  ' and see if that might help, something like:

 

data artists;
  INFILE cards dlmstr='  ';
  INPUT Name :$20. Genre :$20. Nation :$20.;
cards;
Vince Van Gogh   Impressionist      Netherlands
Pablo Picasso      Cubism      Spain
run ;

I think something like that would handle the rule that the delimiter is "at least two spaces".  

 

So first step would be identify what you can use a as a delimiter.  Then specify it with dlmstr= option if the delimiter is multiple characters.

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Quentin
Super User

Well, I'm confused.  I just clicked on the link in your question, and it looks like it shows how to read the file, using column input:

 

   INPUT Name $ 1-21 Genre $ 23-40 Origin $ 42;

If the data are not delimited, but they are in fixed columns, then column input makes sense.

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
howardtsai
Calcite | Level 5

AH yes that must be it -- I got the two kinds of data confused! Thank you so much Quentin for your help,

 

Howard

Quentin
Super User

Glad you got it working.  There are LOTS of ways to read from a text file in SAS.  I forget all the names (list input, formatted, column .... ?)   SAS may be known for analytics, but it's ability to read data from just about any source truly makes it a standout in the ETL world.

 

I took SAS Programming II in the Prudential Center in Boston in 1998, and managed to hold onto the course notes with my handwritten notes for about 10 years before somehow managing to lose it.   It had an excellent chapter on reading text files, and the instructor added his own tips. Now every time I try to read in a text file (which is only once or twice a year, since most data I work with now is in a database or JSON or similar), I still instinctively reach for the course notes and feel the despair of its loss. : )

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
howardtsai
Calcite | Level 5

Thank you so much Quentin! Yeah I have old college handwritten notes as well....I converted a lot of them into PDF so that was nice. I still enjoy taking notes by hand!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1059 views
  • 1 like
  • 2 in conversation