I have data in a text file, and I'm trying to figure out how to specify that a special character (or any one of several characters) marks the end of a row of data. In the example below, ~ and * are the characters that signal that the row is complete and a new row should begin. The data set HAVE shows how I'm currently reading the data in. The data set WANT shows how I want it to look.
data HAVE;
infile datalines truncover;
input line $500.;
datalines ;
cat dog ~ mouse cat rat*
cat
snake
*
dog
bat
rat~cat~
;
run;
WANT
Obs line
1 cat dog ~
2 mouse cat rat*
3 cat snake*
4 dog bat rat~
5 cat~
Is this possible? Thanks in advance for any help.
Does the code below return what you're after?
%let external_file=%sysfunc(pathname(work))\mysource.txt;
data _null_;
file "&external_file";
infile datalines truncover;
input;
put _infile_;
datalines;
cat dog ~ mouse cat rat*
cat
snake
*
dog
bat
rat~cat~
;
data want;
infile "&external_file" dlm='*~' recfm=N lrecl=256;
input var :$256.;
/* replace CR and LF with a blank */
var=prxchange('s/[\n\r]+/ /',-1,strip(var));
if missing(var) then delete;
run;
proc print data=want;
run;
Do you want the delimiter to have meaning only when it is the last character on a line?
What if it appears in the middle of a line? Does it still mark the end of a row? Or do you want to ignore those occurrences? Or do you also want to split one line into multiple rows?
Are you trying to read from a TEXT file? Or do you already have the data in a dataset?
In either case, are you trying to generate a new text file or a new dataset?
Handling the delimiter only at the end of a line is much easier.
data want;
  length line $500 newline $2000;
  /* keep reading rows until one ends with * or ~ */
  do until(indexc('*~',char(line,length(line))));
    set have;
    /* append the current row to the record being built */
    newline=catx(' ',newline,line);
  end;
  drop line;
run;
Results
Obs newline
1 cat dog ~ mouse cat rat*
2 cat snake *
3 dog bat rat~cat~
Otherwise, perhaps you can just split each combined line back into individual lines at the delimiters.
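data want;
  length line $2000 newline $2000;
  /* combine rows until one ends with * or ~ */
  do until(indexc('*~',char(line,length(line))));
    set have;
    newline=catx(' ',newline,line);
  end;
  /* NEWLINE is padded with trailing blanks, which COUNTW counts as one extra word after the last delimiter, hence the -1 */
  do index=1 to countw(newline,'~*')-1;
    line=scan(newline,index,'~*');
    output;
  end;
  drop newline index;
run;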
Result
Obs line
1 cat dog
2  mouse cat rat
3 cat snake
4 dog bat rat
5 cat
Notice the leading space at the start of line 2.
Do you want that? If not, wrap the SCAN() function in a LEFT() function to remove the leading spaces.
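As a minimal sketch of that suggestion, the step below repeats the splitting logic with LEFT() wrapped around SCAN(). It assumes the data set HAVE and the variable line from the question, and it still drops the ~ and * characters themselves.

```sas
data want;
  length line $2000 newline $2000;
  /* combine rows until one ends with * or ~ */
  do until(indexc('*~',char(line,length(line))));
    set have;
    newline=catx(' ',newline,line);
  end;
  /* split the combined row at each delimiter; -1 skips the trailing blank padding */
  do index=1 to countw(newline,'~*')-1;
    /* LEFT() shifts leading blanks to the end of the value */
    line=left(scan(newline,index,'~*'));
    output;
  end;
  drop newline index;
run;
```

With this change, the second observation should start with "mouse" rather than a blank.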