edoyle1
Calcite | Level 5

I have data in a txt file. I'm trying to figure out how to specify that a special character (or one of several characters) signifies the end of a line of data. In the example below, ~ and * are the characters that signify that the row of data is complete and it's time for a new row. The data set HAVE is how I'm currently reading the data in. The data set WANT is how I want it to look.

 

data HAVE;
infile datalines truncover;
input line $500.;
datalines ;
cat dog ~ mouse cat rat*
cat
snake
*
dog
bat
rat~cat~
;
run;



 

WANT

Obs    line

 1     cat dog ~
 2     mouse cat rat*
 3     cat snake*
 4     dog bat rat~
 5     cat~

 

Is this possible? Thanks in advance for any help.

1 ACCEPTED SOLUTION
Patrick
Opal | Level 21

Does the code below return what you're after?

%let external_file=%sysfunc(pathname(work))\mysource.txt;
data _null_;
  file "&external_file";
  infile datalines truncover;
  input;
  put _infile_;
  datalines;
cat dog ~ mouse cat rat*
cat
snake
*
dog
bat
rat~cat~
;

data want;
  infile "&external_file" dlm='*~' recfm=N lrecl=256;
  input var :$256.;
  /* replace CR and LF with a blank */
  var=prxchange('s/[\n\r]+/ /',-1,strip(var));
  if missing(var) then delete;
run;

proc print data=want;
run;


3 REPLIES
Tom
Super User

Do you only want it to have meaning if it is the last character on the line?

 

What if it appears in the middle of a line? Does it still mark the end of a row? Or do you want to ignore those? Or do you also want to split one line into multiple lines?

 

Are you trying to read from a TEXT file? Or do you already have the data in a dataset?

In either case are you trying to generate a new text file or a new dataset?

 

Handling it only at the end of a line is much easier.

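data want;
 length line $500;
 /* keep reading until the last non-blank character of a line is * or ~ */
 do until(indexc('*~',char(line,length(line))));
   set have;
   length newline $2000 ;
   /* append the current line, separated by a single blank */
   newline=catx(' ',newline,line);
 end;
 drop line;
run;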

Results

Obs    newline

 1     cat dog ~ mouse cat rat*
 2     cat snake *
 3     dog bat rat~cat~

Otherwise perhaps you can just reprocess the new line back into individual lines.

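data want;
 length line $2000;
 /* keep reading until the last non-blank character of a line is * or ~ */
 do until(indexc('*~',char(line,length(line))));
   set have;
   length newline $2000 ;
   newline=catx(' ',newline,line);
 end;
 /* NEWLINE is blank-padded, so the blanks after the last terminator */
 /* count as one extra word; the -1 skips that empty piece           */
 do index=1 to countw(newline,'~*')-1;
   line=scan(newline,index,'~*');
   output;
 end;
 drop newline index;
run;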

Result

Obs    line

 1     cat dog
 2      mouse cat rat
 3     cat snake
 4     dog bat rat
 5     cat

Notice the leading space at the start of the second line.

Do you want that? If not, wrap the SCAN() call in a LEFT() function to remove the leading spaces.
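
A minimal sketch of that change, assuming the HAVE data set from the original question (repeated here so the example runs on its own). Only the SCAN() line differs from the step above, and the LENGTH statements are merged into one purely for readability.

```sas
/* HAVE as posted in the question */
data have;
 infile datalines truncover;
 input line $500.;
datalines;
cat dog ~ mouse cat rat*
cat
snake
*
dog
bat
rat~cat~
;
run;

data want;
 length line $2000 newline $2000;
 /* stack input lines until one ends in * or ~ */
 do until(indexc('*~',char(line,length(line))));
   set have;
   newline=catx(' ',newline,line);
 end;
 /* split the stacked text at each terminator */
 do index=1 to countw(newline,'~*')-1;
   /* LEFT() shifts leading blanks to the end of the value */
   line=left(scan(newline,index,'~*'));
   output;
 end;
 drop newline index;
run;

proc print data=want;
run;
```

As in the step above, the terminator characters themselves are not kept in LINE, because SCAN() returns only the text between delimiters.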

 

 


edoyle1
Calcite | Level 5
That did it, thanks!
