BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Ninja_turtle
Fluorite | Level 6

Hi All,

 

I am facing a challenge while reading a csv file with the data spread in multiple lines.

 

Sample:

John,Doe

EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087

Mitchelle, Stark

EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493

 

Can you please help me to read think in the following way:

Mitchelle, Stark EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644

Mitchelle, Stark EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087

Mitchelle, Stark EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390

Mitchelle, Stark EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493

 

Thanks,

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
BrunoMueller
SAS Super FREQ

Have a look at the code below. It shows how to determine the "type" of record and process it accordingly.

See also Testing a Condition Before Creating an Observation

 

data want;
  infile cards truncover;
  /*
   * define variables
   */
  length
    type $ 2
    name $ 25
    line $ 80
  ;
  /*
   * read a record 
   * and keep it in the buffer, the @ at the end
   */
  input 
    @1 type $2.
    @
  ;
  /*
   * check for type of record, the logic may vary depending
   * on the data
   *
   * if EW, then process detail
   * else process "header"   
   */
  if type = "EW" then do;
    input @1 line $80.;
    output;
  end;
  else do;
    retain name;
    input @1 name $25.;
  end;  
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
;

View solution in original post

4 REPLIES 4
mkeintz
PROC Star

If every instance is composed of three lines (line1 for the name, line2 and line3 with other data), you could issue three INPUT sand two PUTs, as in:

 

data _null_;
  input name $char25.;
  do i=1 to 2;
    input line $80.;  
    put name $25. line;
  end;
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
run;

 

Of course, the PUT above writes to the log,  You either want to make a new text file (2 lines per person) or a SAS dataset (2 obs per person).  In the latter case, you have to modify the INPUT statement inside the loop to read each individual variable in the data lines, and then use OUTPUT instead of PUT, as in:

 

data want;
  input name $char25.;
  do i=1 to 2;
    input ..... specify input variables here ...;  
    output;
  end;
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ninja_turtle
Fluorite | Level 6
Thank you for the response, but the challenge is the rows after name are not fixed to 2,they can be n number of rows for every name.
BrunoMueller
SAS Super FREQ

Have a look at the code below. It shows how to determine the "type" of record and process it accordingly.

See also Testing a Condition Before Creating an Observation

 

data want;
  infile cards truncover;
  /*
   * define variables
   */
  length
    type $ 2
    name $ 25
    line $ 80
  ;
  /*
   * read a record 
   * and keep it in the buffer, the @ at the end
   */
  input 
    @1 type $2.
    @
  ;
  /*
   * check for type of record, the logic may vary depending
   * on the data
   *
   * if EW, then process detail
   * else process "header"   
   */
  if type = "EW" then do;
    input @1 line $80.;
    output;
  end;
  else do;
    retain name;
    input @1 name $25.;
  end;  
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
;
Ksharp
Super User
data want;
infile cards truncover;
input col $200.;
length name value $ 200;
retain name;
if prxmatch('/[a-z]+,/i',col) then do;name=col;delete;end;
 else value=col;
drop col;
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2238 views
  • 1 like
  • 4 in conversation