11-08-2012 04:24 PM
I have a CSV file with standard delimiters, namely "a","b","c" etc. The problem is that when I import it, some of the variables are converted into dates (they are actually numbers), and almost everything else becomes a character variable (some should be, others should not).
I can't share any examples as it's all HIPAA. Just looking for reasons/solutions.
I don't want to do it with INFILE, as the files are not all the same and there are 280 variables.
11-08-2012 04:43 PM
The first thing to do is to set the GUESSINGROWS statement in PROC IMPORT to some large number.
That may fix it.
11-08-2012 04:58 PM
By any chance did you open the CSV file in Excel and then save it? Excel is known to convert text values such as 5-10 in a CSV to dates without warning.
SAS is likely to treat values such as 9/10/2012 or 9-10-2012 as dates, but a value with no punctuation, such as 9102012, won't usually be treated as a date.
When importing a CSV, the GUESSINGROWS option controls how many rows of data are examined to determine each variable's type. The default is 20 rows. If a variable has just about any character other than digits or . in the first 20 rows, SAS will by default make it character. If a character field has nothing but digits in the first 20 rows, it becomes numeric. This, at least, can be solved by using GUESSINGROWS=32767.
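As a rough sketch of what that looks like (the file path and output data set name here are made up, and GETNAMES=YES assumes the first row holds the column names), something along these lines should work:

```sas
/* Hypothetical path and data set name; substitute your own. */
proc import datafile="C:\data\myfile.csv"
            out=work.mydata
            dbms=csv
            replace;
    getnames=yes;         /* assumes row 1 contains the variable names */
    guessingrows=32767;   /* examine up to 32767 rows before choosing each variable's type */
run;
```

Scanning more rows makes the import slower, but the type chosen for each of the 280 variables is then based on far more of the data than the first 20 rows.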
Another issue you may have is that coded values with leading zeros may have been read as numeric, in which case the leading zeros are gone.
Since the data step that reads the file is generated when PROC IMPORT runs, you may find it helpful to either RECALL (F4) the code or copy it from the log and edit it, changing the informats and the INPUT statement to the desired types.
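To give an idea of the kind of edit meant here, below is a minimal sketch of a trimmed-down data step in the style PROC IMPORT writes to the log. The path and the three variable names are invented for illustration; your generated code will list all of your own variables, and you would only change the ones that were guessed wrong.

```sas
/* Hypothetical path and variable names, for illustration only. */
data work.mydata;
    infile "C:\data\myfile.csv" dsd dlm=',' firstobs=2 truncover lrecl=32767;
    informat code_id   $10.      /* read as character so leading zeros are kept */
             count_val best32.   /* a true number stays numeric */
             range_txt $20.;     /* values such as 5-10 stay text instead of becoming dates */
    input code_id $ count_val range_txt $;
run;
```

Once the types are right for one file, the same edited step can be reused for other files that share the layout.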
You could post a couple examples of the values being treated as dates.