Our analytics group is attempting to work with data files in .csv format with variable type inconsistencies. In some cases the variables from the source files are character (this is correct) in other files the same variables with the same names are numeric and causing errors on import and data manipulation. Is there some way to format all input variables in a list and covert them to character formats if they happen to exist as numeric fields? If so what would the syntax look like?
Just write the code to read the files using a data step.
All you need to know is the names of the variables and the types and lengths to use.
data want ;
infile 'myfile.csv' dsd dlm=',' lrecl=100000 truncover firstobs=2 ;
length var1 8 var2 $50 var3 $25 ;
input var1 var2 var3 ;
run;
Assuming you're importing multiple files of the same type don't use proc import, use a "manual" import instead.
Run Proc Import once, get the code from the log and modify it to be what you want. Then use that code to import the rest of the files.
If you're importing many files you'll need to tweak each one somehow, but you could automate it if you did have naming conventions.
Reeza, thank you for the quick response. Do you have sample syntax? I have over 20 source files with this inconsistency issue and manual statements may work but I am hoping to avoid too much typing. I was able to fix the issue using Proc Import followed by put statements but each variable requires a put (about three lines of syntax) for about seven variables per file; any way I have thought of so far gets to quite a bit of coding. Any syntax shortcuts are welcome!
Are the files the same structure/format?
They are indeed.
How are the csv files being created? Some tools (e.g. Excel) will look at the first several rows to determine format.
Then once you've fixed it once you're done. Just change where the import for the .csv file. On a separate note you can read all files at once using a wild card in your filename statement.
If you need help besides this, I'd suggest posting code
Just write the code to read the files using a data step.
All you need to know is the names of the variables and the types and lengths to use.
data want ;
infile 'myfile.csv' dsd dlm=',' lrecl=100000 truncover firstobs=2 ;
length var1 8 var2 $50 var3 $25 ;
input var1 var2 var3 ;
run;
Tom,
Thank you, I will give this a try!
If you want PROC IMPORT to have an option to import all fields as character, please vote:
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.