I'm importing some data from the web using an infile statement in a data step. Some of the fields contain carriage returns / line feeds in the middle of the data that cause bad records to be returned. I'm trying to find the simplest was to clean up the culprit fields which have these characters so the resulting data set in my work file doesn't have these errors.
I was thinking of using the Compress function to remove the characters, but I don't know where to put it in the code.
Compress(name,'0D0A'x)
Here's a simplified version of my code that creates a work file but with errors.
filename notes url "http://www.website.com/....";
data WORK.NOTES;
infile NOTES
delimiter = ','
MISSOVER
DSD
firstobs=2;
informat name $150. ;
informat note $300. ;
format name $150. ;
format note $300. ;
input
name $
note $;
run;
Not tested, you may need to tweak,
filename notes url "http://www.website.com/....";
data WORK.NOTES;
infile NOTES
delimiter = ','
MISSOVER
DSD
firstobs=2
lrecl=1000;
informat name $150. ;
informat note $300. ;
format name $150. ;
format note $300. ;
input @1 @;
_infile_=compress(_infile_,'0D0A'x);
input
name $
note $;
run;
I got this error when inserting this code
ERROR: The INFILE statement MISSOVER option and the INPUT statement double trailing @ option,
are being used in an inconsistent manner. The execution of the DATA STEP is being
terminated to prevent an infinite loop condition.
Right, changing the double @@ to single @ should fix it.
input @1 @;
Previous post has been updated to reflect the change.
This allowed the code to run, but the problem persists. I will try to provide some sample data
Seems to me you have more/different issues than you have presented. So far, we have followed the path you set, while did not get to know exactly what you mean by 'error' or 'problem'. current process should take care of the 2 control characters you think you have, if not, you can try a full-blown version of it: COMPRESS(_INFILE_,,'C'). This will take out all of the control characters, but still may not tackle your 'error' or 'problem'.
Do you know if the issue is ONLY linefeed or is it Carriage Return Linefeed pair, or carriage return only?
What OS are you running on?
You may be able to set the TERMSTR= option on the infile to correct this but settings would depend on your OS and the actual content of the file.
I think the issue is carriage return only.
Using SAS 9.4 on Windows server 2012.
You might try TERMSTR=CRLF on the infile. This would set the requirement for both a carriage return and linefeed to appear to end a line. Note that if this works you will still have CR in the variable. That should be removed with a string function such as Translate though you'll likely need to use the Hex coded value in code.
I tried TERMSTR=CRLF and TERMSTR=CR, both produced zero records
TERMSTR=LF produced results, but with the bad records due to carriage returns
The FILENAME statement supports the TERMSTR= option for the URL method. It can't hurt to try it there. LF is the default.
Make it as a Stream File : filename notes url "http://www.website.com/...."; data WORK.NOTES; infile NOTES recfm=n dlm='<' dsd; input @'>' x : $2000. @@; run; After that you can remove it by compress(x,'0D0A'x);
Here's some sample data and simplified code that illustrates my issue. The imported 'notes' work file has 7 records instead of 5 due to the multiple lines in the 'notes' field for records 3 and 5 in the .txt file.
filename notes 'C:\NotesTest.txt';
proc import
datafile=notes
out=notes
dbms=tab replace;
getnames = yes;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.