Hi All,
This may be a quick one... I used to be very proficient in SAS and returning to it after some time off. As I recall, a CSV file (such as the one attached) which has a text variable surrounded by double quotes should have no problem being read by SAS even if it has commas in the text, as long as the DSD option is enable.
Corresponding to the data I am uploading, which is a sample of public data from the CFPB, below is my import data step. Can someone explain to me why it is not working? It goes haywire on the second observation and sees a mid-text comma as a delimiter, rather than skipping it as a part of the quote. Help much appreciated as I shake this rust off...
data WORK.TEST ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile '\\nera-dcfs\work\User\Alexander.Kfoury\Learning\Code Samples\CFPB\Sample 3.csv'
delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Date_received $15. ;
informat Product $16. ;
informat Sub_product $39. ;
informat Issue $38. ;
informat Sub_issue $38. ;
informat Consumer_complaint_narrative $1265. ;
informat Company_public_response $48. ;
informat Company $40. ;
informat State $23. ;
informat ZIP_code $5. ;
informat Tags $2. ;
informat Consumer_consent_provided_ $20. ;
informat Submitted_via $3. ;
informat Date_sent_to_company mmddyy10. ;
informat Company_response_to_consumer $23. ;
informat Timely_response_ $3. ;
informat Consumer_disputed_ $3. ;
informat Complaint_ID best32. ;
format Date_received $15. ;
format Product $16. ;
format Sub_product $39. ;
format Issue $38. ;
format Sub_issue $38. ;
format Consumer_complaint_narrative $1265. ;
format Company_public_response $48. ;
format Company $40. ;
format State $23. ;
format ZIP_code $5. ;
format Tags $2. ;
format Consumer_consent_provided_ $20. ;
format Submitted_via $3. ;
format Date_sent_to_company mmddyy10. ;
format Company_response_to_consumer $23. ;
format Timely_response_ $3. ;
format Consumer_disputed_ $3. ;
format Complaint_ID best12. ;
input
Date_received $
Product $
Sub_product $
Issue $
Sub_issue $
Consumer_complaint_narrative $
Company_public_response $
Company $
State $
ZIP_code $
Tags $
Consumer_consent_provided_ $
Submitted_via $
Date_sent_to_company
Company_response_to_consumer $
Timely_response_ $
Consumer_disputed_ $
Complaint_ID
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;
Hi - pasting the log. I reviewed and was not clicking for me what the issue was. I will review again and perhaps you can tell me what I am missing.
It appears that the second line of data either had a linefeed/ carriage return entered into the comment text OR something else broke the line. So the second record actually is on two physical lines of data and the end of line encountered is causing your issue.
I am going to agree with @Reeza about reading the log. You would likely see some error about invalid data fo somehitn on the third data line because the quote ending the text from line two at the front of the line is going to get the quoted text pars out of alignment.
The fix may not be easy depending on the actual character at the end of line 2. Sometimes the INFILE option TERMSTR will address this if the problem is a single character such as LF or CR and you are using Windows. Then TERMSTR=CRLF will require both characters to actually terminate a line.
Another thing to consider is that perhaps MISSOVER is not the option for this data. Perhaps you do want to continue reading from the followiing line.
Why does line 4 start with a quoted string instead of a DATE value?
8 data _null_; 9 infile "&path\Sample 3.csv" ; 10 input; 11 list; 12 run; NOTE: The infile "...\Sample 3.csv" is: Filename=...\Sample 3.csv, RECFM=V,LRECL=32767,File Size (bytes)=2271, Last Modified=05Mar2020:13:13:05, Create Time=05Mar2020:13:13:04 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+-- 1 Date received,Product,Sub-product,Issue,Sub-issue,Consumer complaint narrative,Company public res 98 ponse,Company,State,ZIP code,Tags,Consumer consent provided?,Submitted via,Date sent to company,C 195 ompany response to consumer,Timely response?,Consumer disputed?,Complaint ID 270 2 6/5/2015,Credit reporting,,Incorrect information on credit report,Information is not mine,,Compan 98 y chooses not to provide a public response,"TransUnion Intermediate Holdings, Inc.",TX,78703,,Con 195 sent not provided,Web,6/5/2015,Closed with explanation,Yes,Yes,1408548 264 3 5/18/2015,Debt collection,"Other (i.e. phone, health club, etc.)",Disclosure verification of debt 98 ,Not disclosed as an attempt to collect,"I received a letter from IC System back in XXXX XXXX, 20 195 15 stating that I have a balance due from XXXX XXXX XXXX. I was shocked and researched the issue 292 with XXXX XXXX XXXX immediately. XXXX XXXX XXXX confirmed with me on the phone and in writing tha 389 t I did not owe them any balance and the communication from IC System is incorrect. Then, I conta 486 cted IC System to explain the situation to them on XXXX XXXX, 2015 and the representative ( XXXX 583 ) advised me to mail them a dispute letter. Then, I did mail IC System two dispute letters. After 680 I sent the dispute letter, I follow-up with IC System three times on the phone on XXXX XXXX, XXX 777 X XXXX and XXXX XXXX, 2015. IC System advised me that the status is pending the response from XXX 874 X XXXX, but they will not report my file to the credit bureaus as they still need to do a investi 971 gation. However, I just realized they reported my file to the credit bureaus few days ago without 1068 clarified my account with XXXX XXXX XXXX. I called IC System today, XXXX XXXX, 2015 and seeking 1165 for help. Their response to me is I am sorry they can not do anything about it, I can dispute tha 1262 t from the credit bureaus if I want to. I believe the whole process make me very frustrated and h 1359 elpless because of the mistake of a billing. 1403 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+-- 4 ",,"I.C. System, Inc.",TX,774XX,,Consent provided,Web,5/18/2015,Closed with explanation,Yes,No,13 98 81317 102 5 5/18/2015,Credit reporting,,Incorrect information on credit report,Public record,,Company chooses 98 not to provide a public response,Experian,TX,75791,,Consent not provided,Web,5/18/2015,Closed wi 195 th explanation,Yes,No,1379696 223
Make sure your CSV file follows SAS's rules for how delimited text files need to be formatted.
If you don't like SAS's rules then go up-vote this change request:
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.