bhupeshpanwar
Fluorite | Level 6

I have a pipe-delimited text file that has CR and LF characters inside double-quoted values. This causes problems when I import the file into SAS. I found some code that removes unwanted characters from an external file by reading it byte by byte, but the output file is always empty (0 bytes). I have pasted the code below. I have used this code successfully in the past, but I think I am missing something this time. Need help!

 

%let dsin= "X:\links.txt";
%let dsout="X:\output.txt";
%let repA=''; /* replacement character for LF */
%let repD=''; /* replacement character for CR */


data _null_;
infile &dsin recfm=n sharebuffers;
file &dsout recfm=n;
input a $char1.;
retain open 0;
if a = '"' then open = ^(open);
if open then do;
if a = '0D'x then put &repD;
else if a = '0A'x then put &repA;
end;
run;

ACCEPTED SOLUTION

Tom
Super User

If you want to use SHAREBUFFERS then the input and output files have to be the same file.


6 REPLIES
Tom
Super User

That should work.  It will not remove the CR and LF but it should replace them with spaces.

Show the log from your run.  I suspect that the input file was/is empty.

 

You could remove the SHAREBUFFERS option and add back an explicit write of the other characters. 

data _null_;
  infile &dsin recfm=n ;
  file &dsout recfm=n;
  input a $char1.;
  retain open 0;
  if a = '"' then open = ^(open);
  if open then do;
    if a = '0D'x then a=&repD;
    else if a = '0A'x then a=&repA;
  end;
  put a $char1.;
run;

Plus, without the SHAREBUFFERS option you could actually remove either the CR or the LF character, or both.

    if a = '0D'x then delete;
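
Put together, a minimal sketch of the variant that drops the characters instead of replacing them might look like the step below. It assumes the same &dsin and &dsout macro variables as in the question, that they point to two different files, and that the double quotes in the file are balanced. Both CR and LF are discarded when they occur inside quotes; everything else is copied through unchanged.

```sas
data _null_;
  infile &dsin recfm=n;          /* read the input as a byte stream */
  file &dsout recfm=n;           /* write the output as a byte stream */
  input a $char1.;               /* one byte per iteration */
  retain open 0;
  if a = '"' then open = ^(open);   /* toggle on every double quote */
  /* inside quotes: skip CR and LF so they never reach the output */
  if open and a in ('0D'x, '0A'x) then delete;
  put a $char1.;                 /* explicitly write every other byte */
run;
```

Because nothing is written for the skipped bytes, the output file ends up shorter than the input, which is why this form cannot be combined with SHAREBUFFERS.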
Ksharp
Super User
Better post some sample data, so we can test the code for you .
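
If the real data cannot be shared, a small synthetic file is usually enough to reproduce the problem. The sketch below writes one; the macro variable name testfile, the file name crlf_test.txt, and the contents are all made up for illustration and are not the poster's data.

```sas
/* Hypothetical test file, placed in the directory of the WORK library */
%let testfile = "%sysfunc(pathname(work))/crlf_test.txt";

data _null_;
  file &testfile recfm=n;        /* byte stream: no line ends are added for us */
  put 'id|note' '0D0A'x;                                  /* header row */
  put '1|"first line' '0D0A'x 'second line"' '0D0A'x;     /* CR LF inside quotes */
  put '2|"plain"' '0D0A'x;                                /* ordinary row */
run;
```

Setting dsin to &testfile then exercises the case where a CR LF pair sits inside a quoted value.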
bhupeshpanwar
Fluorite | Level 6

I apologize for the delay as I got caught up with something else.

 

I have attached the sample data file and the output file being generated. As you would see that this is not the expected output. I am also pasting the log. What am I missing here?

 

========LOG ================

47650 /* Fixing CR LF */
47651 %let dsin= "X:\studies.txt";
47652 %let dsout="X:\links.txt";
47653 %let repA=''; /* replacement character for LF */
47654 %let repD=''; /* replacement character for CR */
47655 data _null_;
47656 infile &dsin recfm=n ;
47657 file &dsout recfm=n;
47658 input a $char1.;
47659 retain open 0;
47660 if a = '"' then open = ^(open);
47661 if open then do;
47662 if a = '0D'x then put &repD;
47663 else if a = '0A'x then put &repA;
47664 end;
47665 run;

NOTE: UNBUFFERED is the default with RECFM=N.
NOTE: The infile "X:\studies.txt" is:
Filename=X:\studies.txt,
RECFM=N,LRECL=256,File Size (bytes)=2100,
Last Modified=23Nov2020:10:11:32,
Create Time=23Nov2020:10:05:28

NOTE: UNBUFFERED is the default with RECFM=N.
NOTE: The file "X:\links.txt" is:
Filename=X:\links.txt,
RECFM=N,LRECL=256,File Size (bytes)=0,
Last Modified=23Nov2020:10:18:15,
Create Time=23Nov2020:10:11:59

NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

Tom
Super User

If you want to use SHAREBUFFERS then the input and output files have to be the same file.

bhupeshpanwar
Fluorite | Level 6

Thanks Tom. I knew I was doing something silly.

 

What exactly does SHAREBUFFERS do? Will it help with the processing time if I intend to use it?

Tom
Super User

@bhupeshpanwar wrote:

Thanks Tom. I knew I was doing something silly.

 

What exactly does SHAREBUFFERS do? Will it help with the processing time if I intend to use it?


It means that the INFILE and FILE statements use the same locations in memory to store the data being read from and written to the disk.  It could improve the performance.  But the risk is that if it doesn't work right you have corrupted the original file.

If the file is not large (smaller than, say, 20 gigabytes) then the performance increase is not worth the risk.
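
For reference, a sketch of the in-place form, where INFILE and FILE name the same physical file. The path X:\links_copy.txt is hypothetical and stands for a scratch copy made beforehand, so the original is not at risk. The replacement here is a single blank, on the assumption that an in-place update has to write exactly one byte for each byte it replaces.

```sas
/* Hypothetical path: a scratch copy of the file, NOT the original */
%let dsn = "X:\links_copy.txt";

data _null_;
  /* same file on INFILE and FILE, so the shared buffer is updated in place */
  infile &dsn recfm=n sharebuffers;
  file &dsn recfm=n;
  input a $char1.;               /* one byte per iteration */
  retain open 0;
  if a = '"' then open = ^(open);   /* toggle on every double quote */
  if open then do;
    /* overwrite the byte just read with one blank */
    if a = '0D'x then put ' ';
    else if a = '0A'x then put ' ';
  end;
run;
```

Bytes that are not touched by a PUT stay as they were, which is why this form does not need to rewrite every character.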

 

The code you posted will not work without using SHAREBUFFERS because it is not explicitly re-writing each character. Use the modified code I posted instead.
