05-08-2014 05:20 PM
I have a CSV file with 95 columns and I need about 20 of them. The way I handled this is to have 65 dummy vars (dummy0-dummy64) that are strings, 20 fields (the ones I care about) that are named/informatted and a giant INPUT statement that uses them to read the rows. The last thing I do in my data step is drop dummy0-dummy64.
Is there a better way to do this? My input statement is kind of ugly.
05-08-2014 06:15 PM
You only need one dummy variable. You can use something like INPUT a b c 64*(d) e f g; where d is a one-byte character variable used to skip the fields.
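The one-dummy approach might look something like this (a minimal sketch with made-up field names and informats; the real INPUT statement would list all 20 wanted variables in file order):

```
data want;
  infile 'have.csv' dsd truncover firstobs=2;
  length d $1;    /* one throwaway variable, reused for every skipped field */
  /* each d consumes one unwanted field; reading into d again just overwrites it */
  input id :$8. d d d amount :comma12. d name :$20.;
  drop d;
run;
```

Reusing the same variable for every skipped field keeps the statement readable and leaves only one name to DROP at the end.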
I have a program that does something similar, in that it reads specific fields from CSVs. You define a data set with the variables you want to read, properly typed and with INFORMATs attached, then you call the program and it reads the fields that match the variable names from each CSV. It will also read from a concatenated or wildcard FILEREF. The fields can be in any order; the program just looks for the names.
It's an interesting application of HASH and ARRAY.
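The name-matching idea can be sketched in a plain data step (this is a rough illustration, not the actual program described above; variable names and the file layout are hypothetical):

```
/* Sketch: locate wanted columns by name in the header row, then read
   only those columns from each data row using SCAN on _infile_. */
data want;
  length id $8 amount 8;
  infile 'have.csv' dsd truncover;
  retain pos_id pos_amount;
  input;                                      /* load the raw line into _infile_ */
  if _n_ = 1 then do;                         /* header row: find column positions */
    do i = 1 by 1 while (scan(_infile_, i, ',') ne '');
      name = scan(_infile_, i, ',');
      if name = 'id' then pos_id = i;
      else if name = 'amount' then pos_amount = i;
    end;
    delete;
  end;
  else do;                                    /* data rows: pull fields by position */
    id = scan(_infile_, pos_id, ',', 'm');
    amount = input(scan(_infile_, pos_amount, ',', 'm'), best32.);
  end;
  drop pos_: i name;
run;
```

The real program presumably replaces the hard-coded IF/ELSE name tests with a HASH lookup keyed on the header names and an ARRAY over the target variables, which is what makes it work for any variable list.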
05-09-2014 07:59 AM
That macro is really nifty; I'll give it a try next time. One of my colleagues solved the same problem using awk. He ends up with a single, rational, delimited file which he can then import simply. Neither he nor I have your SAS skills.
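The awk route can be as simple as picking fields by position (a hypothetical sketch; note that plain awk field-splitting does not handle quoted fields containing embedded commas):

```shell
# Write a small sample file, then keep only columns 1, 3, and 5.
printf 'id,junk1,name,junk2,amount\n7,x,alice,y,100\n' > sample.csv
awk -F',' 'BEGIN { OFS = "," } { print $1, $3, $5 }' sample.csv > trimmed.csv
cat trimmed.csv
# id,name,amount
# 7,alice,100
```

The trimmed file can then be handed to PROC IMPORT, since every remaining column is one you actually want.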