12-30-2015 02:30 PM
I have a challenge importing CSV files.
1) My CSV file has over 100 columns, but I only need fewer than 20. So far I have been running PROC IMPORT and then using a KEEP statement. Is there a way to import only a few of these 100+ columns using another approach, maybe INFILE?
2) The other issue I am having is that some of these 100+ variable names are very long, so they get truncated on import. How do I avoid that? I am considering INFILE with INFORMAT and INPUT statements, but I don't want to list all 100+ variable names in the INPUT statement when I only need 10-20 at most.
Thanks a bunch in advance.
12-30-2015 03:52 PM
12-30-2015 03:56 PM
One set of possible solutions for both questions is the same:
PROC IMPORT generates DATA step code to read CSV files. After you run PROC IMPORT, the code for that DATA step appears in the log.
Copy the code from the log into the editor. Remove from the INFORMAT, FORMAT and INPUT statements every variable that comes after the last one you want. If unwanted variables remain before that point, add a DROP statement for them.
To get usable, nicer names, search and replace the generated names with the ones you would like.
I recommend adding a LABEL statement to describe the columns you want.
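As a rough sketch of what the edited code might look like after those steps: the file path, variable names, informats and labels below are all made up for illustration, and the INFILE options are the ones PROC IMPORT usually writes for a CSV file, so check them against what your own log shows.

```sas
/* Sketch only: path, names, informats and labels are hypothetical. */
data work.want;
    /* INFILE options as PROC IMPORT typically generates them for a CSV;
       firstobs=2 skips the header row */
    infile 'C:\data\survey.csv' delimiter=',' missover dsd lrecl=32767 firstobs=2;

    /* Short names chosen by hand in place of the truncated generated ones */
    informat resp_id  best32.;
    informat visit_dt mmddyy10.;
    informat site     $20.;
    format   visit_dt mmddyy10.;

    /* Only the first three columns are listed; the rest of each row is not read */
    input resp_id visit_dt site $;

    /* Labels carry the long descriptive text from the original column headers */
    label resp_id  = 'Respondent identification number'
          visit_dt = 'Date of most recent clinic visit'
          site     = 'Name of participating study site';
run;
```

SAS variable names are limited to 32 characters, so the full header text is better kept in the label than in the name.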
Unfortunately, because of the way delimited data is read with list input, you can't skip any variables. But you need not read the entire row, just up to the last variable you want.
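When the columns you want are not next to each other, one possible way to handle the ones in between is to read them into a short throwaway variable and drop it. This is only a sketch: the path and the names resp_id, score and skip are hypothetical, and it assumes you want columns 1 and 5 of the file.

```sas
/* Sketch only: path and variable names are hypothetical. */
data work.want;
    infile 'C:\data\survey.csv' delimiter=',' missover dsd lrecl=32767 firstobs=2;

    /* One-character throwaway variable; longer values are simply truncated */
    length skip $1;

    informat resp_id best32.;
    informat score   best32.;

    /* Columns 2-4 are each read into skip, overwriting the previous value;
       nothing after column 5 is read */
    input resp_id skip skip skip score;

    /* Keep the throwaway variable out of the output data set */
    drop skip;
run;
```

The DSD option treats two consecutive delimiters as a missing value, so empty cells do not shift the remaining columns out of position.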