SAS newbie here. I am trying to import a CSV file. Two columns are giving me trouble. I have county names in one column and state names in the other. Anything over 6 to 8 characters gets cut off in these columns once imported. Any idea on how I can keep SAS from cutting off my columns. I apologize if this is a problem that can be found by searching the forums, but I simply don't know what keywords to use to search for this problem.
Also, it is too big to convert to XLS. 400,000+ rows of data.
Rick, thanks for the reply. You'll have to forgive me since I'm drastically unfamiliar with SAS. Where would the length code fall within the following import code? And what length is appropriate? (I'm assuming that, say "length county $15" doesn't specify 15 characters...am I correct here?)
I added your guessingrows command and in the log i found a data step code with the informats and formats. I copied and pasted that and increased the number for county and state in both format and informat and that got it to work! Thanks for your help Rick!
don't use proc import,
Instead use a data step with
an infile pointing to your data,
length statements defining your columns
and finally an input statement to read them.
Then you will have control over the parsing required.
The basic idea takes longer to describe that to demo
Imagine I have a pipe-delimited file stored as c:\temp\gambol.csv holding text like
2|two of us are better|NV|-123,456.78|3-Apr-2011
Then this is the simple data step to read that[pre]data gambling ;
infile 'c:\temp\gambol.csv' DSD dlm='|' lrecl= 10000 firstobs=2 ;
length id 8 name $30 state $2 money 8 date 6 ;
informat money dollar20. date date11. ;
format money dollar20. date date11. ;
input id -- date ;
run ;[/pre]The length statement ensures name can hold up to 30 characters.
The DSD option on the infile statement invokes CSV-type parsing. DLM= indicates the delimiter.
That informat statement is very handy as it indicates how to parse those columns which are not simple text or numbers. The format statements ensures the internal storage is converted to something more like the input when you display this data.
An INPUT statement defines the columns to read - and as SAS already knows about the columns you want to read, and learned about them in the order in which they appear on the input file, you need only define the first and last separated by the indication (--) that you want all columns defined from ID to DATE.
The syntax is probably not shorter than proc import, but is much more reliable because SAS is not required to make the judgement of how far to read through the file to decide what informat to use for each column.
However, for a quick load of unknown data, proc import might be helpful.
With 400,000 rows to read I imagine your input data are already well documented, so no need to ask proc import to make up its own definitions.