10-09-2015 09:11 AM
Hello SAS users,
I have a csv file on Linux box. I am trying to read this file using SAS EG. But some of the value are wrapped text (I found it when I moved this file to windows and opened in Excel, The data values I have issues are wrapped text). I am not able to read this csv file into sas dataset without out errors.
sample of csv file as follows:
101,"Laptop","This device is in good condition, store it in the shelf"
102,"Monitors","we have 100 monitors, about 80 are in good condition.
and 20 need to be fixed"
103,"Mice","we have 250 mice most of them in good conditon."
I am using following code to read above csv file:
infile "infile_path" delimiter=',' DSD lrecl=32000 termstr=LF ;
attrib item_no length=8 format=11.
input item_id @;
attrib Item_name length=$8. format=$8.
input Item_name $ @;
attrib comment length=$80. format=$80.
input comment $ @;
in second row comments variable value is wrapped text ,it has multiple end of line.
how to read this kind of CSV files? Thanks in advance.
10-09-2015 10:17 AM
Issue is that you have either LF or CR/LF in the actual data. How big is the file and will you need to process this type of data again?
If you're going to be doing this a lot it may well be worthwhile going further back in the data stream and removing the LF or CR/LF from the data before extracting to the csv.
I recommend beating the data entry people with wet spaghetti noodles as well.
10-09-2015 01:09 PM
Thank you for you suggesion. It's very big file, has few millions of rows, we get these files everyday. we are bugging data management to change it , they are asking us to read some for now.
10-09-2015 10:36 AM
I agree with ballardw. Go back to the source and get them to fix it. I had it with a certain database sending CSV data, they changed a flag their end and fixed the problem.
10-09-2015 12:54 PM
Search this site as we have answered this question many times.
1) If the file was generated on a PC and you are lucky then the actual end of records indicators will be CR+LF and the embedded line breaks will just be CR or LF. If that is true then you can try reading the file using the INFILE option TERMSTR=CRLF. If you move the file from a PC to Unix to process with SAS then make sure to move it as BINARY and not TEXT so that the end of line characters are not modified.
2) If the file is consistent in the use of double quotes to protect values with embedded line breaks then you can create a fixed version by reading the file character by character and replacing the embedded LF or CR with some other character.