BookmarkSubscribeRSS Feed
akfoury12
Calcite | Level 5

Hi All,

 

This may be a quick one... I used to be very proficient in SAS and returning to it after some time off. As I recall, a CSV file (such as the one attached) which has a text variable surrounded by double quotes should have no problem being read by SAS even if it has commas in the text, as long as the DSD option is enable.

 

Corresponding to the data I am uploading, which is a sample of public data from the CFPB, below is my import data step. Can someone explain to me why it is not working? It goes haywire on the second observation and sees a mid-text comma as a delimiter, rather than skipping it as a part of the quote. Help much appreciated as I shake this rust off...

 


data WORK.TEST ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile '\\nera-dcfs\work\User\Alexander.Kfoury\Learning\Code Samples\CFPB\Sample 3.csv'
delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;
informat Date_received $15. ;
informat Product $16. ;
informat Sub_product $39. ;
informat Issue $38. ;
informat Sub_issue $38. ;
informat Consumer_complaint_narrative $1265. ;
informat Company_public_response $48. ;
informat Company $40. ;
informat State $23. ;
informat ZIP_code $5. ;
informat Tags $2. ;
informat Consumer_consent_provided_ $20. ;
informat Submitted_via $3. ;
informat Date_sent_to_company mmddyy10. ;
informat Company_response_to_consumer $23. ;
informat Timely_response_ $3. ;
informat Consumer_disputed_ $3. ;
informat Complaint_ID best32. ;
format Date_received $15. ;
format Product $16. ;
format Sub_product $39. ;
format Issue $38. ;
format Sub_issue $38. ;
format Consumer_complaint_narrative $1265. ;
format Company_public_response $48. ;
format Company $40. ;
format State $23. ;
format ZIP_code $5. ;
format Tags $2. ;
format Consumer_consent_provided_ $20. ;
format Submitted_via $3. ;
format Date_sent_to_company mmddyy10. ;
format Company_response_to_consumer $23. ;
format Timely_response_ $3. ;
format Consumer_disputed_ $3. ;
format Complaint_ID best12. ;
input
Date_received $
Product $
Sub_product $
Issue $
Sub_issue $
Consumer_complaint_narrative $
Company_public_response $
Company $
State $
ZIP_code $
Tags $
Consumer_consent_provided_ $
Submitted_via $
Date_sent_to_company
Company_response_to_consumer $
Timely_response_ $
Consumer_disputed_ $
Complaint_ID
;
if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
run;

6 REPLIES 6
Reeza
Super User
Post your log please. The issue is usually clear in the log.
akfoury12
Calcite | Level 5

Hi - pasting the log. I reviewed and was not clicking for me what the issue was. I will review again and perhaps you can tell me what I am missing.

Reeza
Super User
I suspect that long line of text has some invisible characters that are messing things up but I can't see any issues. If you have the original file, try re-running the PROC IMPORT with GUESSINGROWS=MAX to see if you still have issues. It will take a while to run.
This may work, if the issue is not the invisible characters in the field, I'm expecting you have extra line returns in it.
Reeza
Super User
Yup, found it. The return is after the word billing. Basically SAS doesn't recognize hard returns which doesn't makes sense and is annoying as heck.

You first need to read the file, remove those characters and then parse it.
ballardw
Super User

It appears that the second line of data either had a linefeed/ carriage return entered into the comment text OR something else broke the line. So the second record actually is on two physical lines of data and the end of line encountered is causing your issue.

 

I am going to agree with @Reeza about reading the log. You would likely see some error about invalid data fo somehitn on the third data line because the quote ending the text from line two at the front of the line is going to get the quoted text pars out of alignment.

 

The fix may not be easy depending on the actual character at the end of line 2. Sometimes the INFILE option TERMSTR will address this if the problem is a single character such as LF or CR and you are using Windows. Then TERMSTR=CRLF will require both characters to actually terminate a line.

 

Another thing to consider is that perhaps MISSOVER is not the option for this data. Perhaps you do want to continue reading from the followiing line.

 

 

Tom
Super User Tom
Super User

Why does line 4 start with a quoted string instead of a DATE value?

8     data _null_;
9       infile "&path\Sample 3.csv" ;
10      input;
11      list;
12    run;

NOTE: The infile "...\Sample 3.csv" is:
      Filename=...\Sample 3.csv,
      RECFM=V,LRECL=32767,File Size (bytes)=2271,
      Last Modified=05Mar2020:13:13:05,
      Create Time=05Mar2020:13:13:04

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+--
1         Date received,Product,Sub-product,Issue,Sub-issue,Consumer complaint narrative,Company public res
      98  ponse,Company,State,ZIP code,Tags,Consumer consent provided?,Submitted via,Date sent to company,C
     195  ompany response to consumer,Timely response?,Consumer disputed?,Complaint ID 270
2         6/5/2015,Credit reporting,,Incorrect information on credit report,Information is not mine,,Compan
      98  y chooses not to provide a public response,"TransUnion Intermediate Holdings, Inc.",TX,78703,,Con
     195  sent not provided,Web,6/5/2015,Closed with explanation,Yes,Yes,1408548 264
3         5/18/2015,Debt collection,"Other (i.e. phone, health club, etc.)",Disclosure verification of debt
      98  ,Not disclosed as an attempt to collect,"I received a letter from IC System back in XXXX XXXX, 20
     195  15 stating that I have a balance due from XXXX XXXX XXXX. I was shocked and researched the issue
     292  with XXXX XXXX XXXX immediately. XXXX XXXX XXXX confirmed with me on the phone and in writing tha
     389  t I did not owe them any balance and the communication from IC System is incorrect. Then, I conta
     486  cted IC System to explain the situation to them on XXXX XXXX, 2015 and the representative ( XXXX
     583  ) advised me to mail them a dispute letter. Then, I did mail IC System two dispute letters. After
     680   I sent the dispute letter, I follow-up with IC System three times on the phone on XXXX XXXX, XXX
     777  X XXXX and XXXX XXXX, 2015. IC System advised me that the status is pending the response from XXX
     874  X XXXX, but they will not report my file to the credit bureaus as they still need to do a investi
     971  gation. However, I just realized they reported my file to the credit bureaus few days ago without
    1068   clarified my account with XXXX XXXX XXXX. I called IC System today, XXXX XXXX, 2015 and seeking
    1165  for help. Their response to me is I am sorry they can not do anything about it, I can dispute tha
    1262  t from the credit bureaus if I want to. I believe the whole process make me very frustrated and h
    1359  elpless because of the mistake of a billing.  1403
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+--
4         ",,"I.C. System, Inc.",TX,774XX,,Consent provided,Web,5/18/2015,Closed with explanation,Yes,No,13
      98  81317 102
5         5/18/2015,Credit reporting,,Incorrect information on credit report,Public record,,Company chooses
      98   not to provide a public response,Experian,TX,75791,,Consent not provided,Web,5/18/2015,Closed wi
     195  th explanation,Yes,No,1379696 223

Make sure your CSV file follows SAS's rules for how delimited text files need to be formatted.

If you don't like SAS's rules then go up-vote this change request: 

https://communities.sas.com/t5/SASware-Ballot-Ideas/Enhancements-to-INFILE-FILE-to-handle-delimited-...

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 725 views
  • 1 like
  • 4 in conversation