Hi @RichardinOz
I am trying to read in the attached .txt pipe delimited file using proc import. But the quotes prevents me from reading it properly . Please find attached sample file .
Any help will be deeply appreciated
proc import
datafile="Filename"
out=test dbms=DLM replace;
getnames=yes;
Delimiter='|';
Datarow=2;
run;
Hi , many thanks for your reply .. I would ideally like to use proc import .
I found this post
https://communities.sas.com/t5/SAS-Programming/Import-text-file-into-SAS/m-p/657300#M196977
When I executed this piece of code I was able to read the fiels.
But whats happened is there are other fields that are numeric in the file and because of that I get the error as below .
Error: NOTE: Invalid data for EXDT in line 472 295-304. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 EXDT=. %let repA=' '; /* replacement character LF */ %let repD=' '; /* replacement character CR */ %let dsnnme="XXX_yyyy.txt"; /* use full path of CSV file */ data _null_; /* RECFM=N reads the file in binary format. The file consists */ /* of a stream of bytes with no record boundaries. SHAREBUFFERS */ /* specifies that the FILE statement and the INFILE statement */ /* share the same buffer. */ infile &dsnnme recfm=n sharebuffers; file &dsnnme recfm=n; /* OPEN is a flag variable used to determine if the CR/LF is within */ /* double quotes or not. Retain this value. */ retain open 0; input a $char1.; /* If the character is a double quote, set OPEN to its opposite value. */ if a = '"' then open = ^(open); /* If the CR or LF is after an open double quote, replace the byte with */ /* the appropriate value. */ if open then do; if a = '0D'x then put &repD; else if a = '0A'x then put &repA; end; run; proc import datafile="XXX_yyyy.txt" out=work dbms=DLM replace; getnames=yes; Delimiter='|'; Datarow=2; guessingrows=20000; run;
The quotes don't look like they are going to be an issue. The issue appears to be that you have end of line characters in the middle of the values of one of the fields. If you cannot get those removed you might be able to use one of the many methods posted on this forum to pre-process the file to remove the embedded end of line characters.
The issue is records broken by a CRLF.
proc import cannot handle these.
This data step reads the data provided:
data WANT;
infile F delimiter='|' pad firstobs=2;
input ID $ REASON : $200. ;
if REASON =: '"' and char(REASON, length(REASON)) ne '"' then do until(STOP);
input @;
REASON=catt(REASON,compress(_INFILE_,'|'));
if char(REASON, length(REASON))='"' then STOP=1;
else input @1;
end;
REASON=dequote(REASON);
run;
ID | REASON |
---|---|
1234567 | Xxx has alway been great. I have alway been a xxxxxx member.Always supportive of financial struggles too |
1234567 | You’ re amazing.You asdasdasr, and the brown fox jumped over the fence to re-instate him, so that’s the positive score.On the plus side he gets the all the best faciliteis for. |
Hi , many thanks for your reply .. I would ideally like to use proc import .
I found this post
https://communities.sas.com/t5/SAS-Programming/Import-text-file-into-SAS/m-p/657300#M196977
When I executed this piece of code I was able to read the fiels.
But whats happened is there are other fields that are numeric in the file and because of that I get the error as below .
Error: NOTE: Invalid data for EXDT in line 472 295-304. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 EXDT=. %let repA=' '; /* replacement character LF */ %let repD=' '; /* replacement character CR */ %let dsnnme="XXX_yyyy.txt"; /* use full path of CSV file */ data _null_; /* RECFM=N reads the file in binary format. The file consists */ /* of a stream of bytes with no record boundaries. SHAREBUFFERS */ /* specifies that the FILE statement and the INFILE statement */ /* share the same buffer. */ infile &dsnnme recfm=n sharebuffers; file &dsnnme recfm=n; /* OPEN is a flag variable used to determine if the CR/LF is within */ /* double quotes or not. Retain this value. */ retain open 0; input a $char1.; /* If the character is a double quote, set OPEN to its opposite value. */ if a = '"' then open = ^(open); /* If the CR or LF is after an open double quote, replace the byte with */ /* the appropriate value. */ if open then do; if a = '0D'x then put &repD; else if a = '0A'x then put &repA; end; run; proc import datafile="XXX_yyyy.txt" out=work dbms=DLM replace; getnames=yes; Delimiter='|'; Datarow=2; guessingrows=20000; run;
@dennis_oz Sure you can read the data with a data step, fix it, write it out, and read it again via proc import.
But why?
A data step is the only way you can totally control how data is read.
Just use the most appropriate tool is my recommendation.
Hi Chris,
You are correct by saying infile has greater control when reading the file .
Project requirement is to use Proc Import as the file is coming from an external party and sometimes they can increase the number of columns they send without notice. Hence do not want the code to fail in such a scenario.
Your code is amazing and will definitely put it into use . Thank you for the prpmpt reply and appreciate your help !!!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.