I'm having trouble importing an excel CSV document into SAS 9.4. (~2 million rows, should only be 60 variables upon import, but it >250 variables) The first thousand or so lines look good, but eventually SAS is reading in multiple observation lines and putting them into one line creating hundreds of extra variables. Here is the code I've used to import it.
libname CDC "filepath";
PROC IMPORT OUT= cdc.filename
DATAFILE= "filepath\file.csv" out=Q12018
dbms=csv replace; delimiter=',';guessingrows=max; getnames=no ; RUN;
For example instead of data looking like this:
ID visit number name
0 012 joe
1 123 jane
2 234 john
It looks like this (after several thousand rows):
ID visit number name var4 var5 var6
0 012 joe
1 123 jane 2 234 john
Any help is appreciated! Thanks!
CSV files have nothing to do with Excel. Did you create the CSV file from an actual Excel file? If so do you still have the Excel file? Can you just import that?
SAS will not normally combine rows into one. In fact is is more normal to see the opposite problem where SAS sees more observations that you expect because some individual values include the end of line characters. Are you sure that the issue is not that your CSV file has poorly matched quotes and is confusing SAS?
Why use PROC IMPORT? Why not just run a data step to read it? You could start with using the data step that PROC IMPORT generates (although it does generate very ugly code).
Thank you. No the vendor sent the file in a "Microsoft Excel Comma Separated Values File" format with a .CSV extension. It's too big to open and save as anything else.
I have finally been able to open the document (even though it doesn't load completely due to the number of rows) and it is a problem with his file and not the import statement - it imported properly. Thanks anyways!
Note that SAS code is a very good tool for looking at files that might be too large for your normal Graphical User Interface tools.
For example you can use a simple data step to look at lines from the file. For example if you are seeing issues after the 123,402nd observation you could run this little step to see that the 5 lines near there look like by dumping them to the SAS log.
data _null_;
infile 'myfile.csv' firstobs=123401 obs=123405 ;
input ;
list;
run;
Thank you very much for that!
I also have this quarter of data split up into 4 files - hopefully it doesn't look like the other, but the extensions are ".CSV.AA", ".CSV.AB", ".CSV.AC", and ".CSV.AD" and they say "AB File" in the filetype. Does anyone know how to import this filetype or read it in? I'm going to have to import it and append them.
@carolynmalone wrote:
Thank you very much for that!
I also have this quarter of data split up into 4 files - hopefully it doesn't look like the other, but the extensions are ".CSV.AA", ".CSV.AB", ".CSV.AC", and ".CSV.AD" and they say "AB File" in the filetype. Does anyone know how to import this filetype or read it in? I'm going to have to import it and append them.
File extensions are just part of the NAME of the file. They do not necessarily have anything at all to do with what type of file it actually is.
Just look at the files with your favorite text editor and see what they contain. You can also just run a quick SAS data step to look at the file. For example this code will dump the first 5 lines from the file to the SAS log so you can see what they look like.
data _null_;
infile 'filename.CSV.AA' obs=5 ;
input;
list;
run;
Did they split the files by observations (rows)? If so then the first line of each probably has the same list of variable names in the same order. Or did they split the files by variables (columns)? If so then the files should each have the same number of lines (observations).
Try termstr= option
FILENAME x "filepath\file.csv" termstr=lf ;
FILENAME x "filepath\file.csv" termstr=crlf ;
PROC IMPORT OUT= cdc.filename
DATAFILE=x out=Q12018
dbms=csv replace; delimiter=',';guessingrows=max; getnames=no ; RUN;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.