Import only some columns of a CSV

Frequent Contributor
Posts: 78

I have a CSV file with 95 columns and I need about 20 of them.  The way I handled this is to have 65 dummy vars (dummy0-dummy64) that are strings, 20 fields (the ones I care about) that are named/informatted and a giant INPUT statement that uses them to read the rows. The last thing I do in my data step is drop dummy0-dummy64.

Is there a better way to do this?  My input statement is kind of ugly.

Respected Advisor
Posts: 3,799

Re: Import only some columns of a CSV

You need only one dummy variable. You can use something like INPUT a b c 64*(d) e f g; where d is a one-byte character variable used to skip the unwanted fields.
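For anyone who wants to see the single-dummy idea written out, here is a minimal sketch. It uses a DO loop with a trailing @ instead of the repeat-factor shorthand above, since the loop form is plain list input. The file name, the variable names a, b, c, e, f, g, and the 3/64/3 column layout are all placeholders, not taken from the original poster's file.

```sas
/* Sketch only: file name, variable names, and the 3/64/3 layout are hypothetical. */
data want;
    infile 'myfile.csv' dsd truncover firstobs=2;  /* firstobs=2 assumes one header row */
    length skip $1;                                 /* one-byte throwaway variable */
    input a b c @;                                  /* read the first three fields, hold the line */
    do _i = 1 to 64;
        input skip @;                               /* each unwanted field overwrites skip */
    end;
    input e f g;                                    /* read the next three fields, release the line */
    drop skip _i;
run;
```

Because skip is reused for every unwanted field, only one variable has to be dropped at the end. Here a through g default to numeric; attach INFORMAT or LENGTH statements for character or date columns.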

I have a program that does something similar: it reads specific fields from CSVs. You define a data set containing the variables you want to read, properly typed and with INFORMATs attached, then call the program, and it reads the fields that match the variable names from each CSV. It will also read from a concatenated or wildcard FILEREF. The fields can be in any order; the program just looks for the names.

It's an interesting application of HASH and ARRAY. 
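To give a rough idea of matching fields by header name, here is a much simpler stand-in. It is not the attached program: it uses no HASH, reads every wanted column as character, and assumes one file with a header row and at most 95 columns. The file name and the wanted names id, name, city are hypothetical.

```sas
/* Sketch only: not the attached program. File name and wanted names are hypothetical. */
data want;
    infile 'myfile.csv' dsd truncover lrecl=32767;
    length id name city $50 _field $50;     /* wanted columns, all read as character */
    array wanted[*] id name city;
    array _pos[95] _temporary_;              /* _pos[k] = index into wanted[] for CSV column k */

    if _n_ = 1 then do;                      /* header row: record where each wanted name sits */
        do _k = 1 to dim(_pos);
            input _field @;
            do _j = 1 to dim(wanted);
                if upcase(_field) = upcase(vname(wanted[_j])) then _pos[_k] = _j;
            end;
        end;
        delete;                              /* no output row for the header */
    end;

    do _k = 1 to dim(_pos);                  /* data rows: keep only the mapped fields */
        input _field @;
        if _pos[_k] then wanted[_pos[_k]] = _field;
    end;
    drop _: ;
run;
```

Since the positions come from the header, the wanted columns can appear in any order in the file. Typed variables would need a conversion with INPUT() and the appropriate informat after the loop.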

Attachment
Frequent Contributor
Posts: 78

Re: Import only some columns of a CSV

Posted in reply to data_null__

That macro is really nifty; I'll give it a try next time. One of my colleagues solved the same problem using awk. He ends up with a single, rational, delimited file which he can then import simply. Neither he nor I have your SAS skills.
