BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.

Hi All,

 

I am facing a challenge while reading a csv file with the data spread in multiple lines.

 

Sample:

John,Doe

EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087

Mitchelle, Stark

EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493

 

Can you please help me to read think in the following way:

Mitchelle, Stark EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644

Mitchelle, Stark EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087

Mitchelle, Stark EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390

Mitchelle, Stark EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493

 

Thanks,

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
BrunoMueller
SAS Super FREQ

Have a look at the code below. It shows how to determine the "type" of record and process it accordingly.

See also Testing a Condition Before Creating an Observation

 

data want;
  infile cards truncover;
  /*
   * define variables
   */
  length
    type $ 2
    name $ 25
    line $ 80
  ;
  /*
   * read a record 
   * and keep it in the buffer, the @ at the end
   */
  input 
    @1 type $2.
    @
  ;
  /*
   * check for type of record, the logic may vary depending
   * on the data
   *
   * if EW, then process detail
   * else process "header"   
   */
  if type = "EW" then do;
    input @1 line $80.;
    output;
  end;
  else do;
    retain name;
    input @1 name $25.;
  end;  
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
;

View solution in original post

4 REPLIES 4
mkeintz
Jade | Level 19

If every instance is composed of three lines (line1 for the name, line2 and line3 with other data), you could issue three INPUT sand two PUTs, as in:

 

data _null_;
  input name $char25.;
  do i=1 to 2;
    input line $80.;  
    put name $25. line;
  end;
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
run;

 

Of course, the PUT above writes to the log,  You either want to make a new text file (2 lines per person) or a SAS dataset (2 obs per person).  In the latter case, you have to modify the INPUT statement inside the loop to read each individual variable in the data lines, and then use OUTPUT instead of PUT, as in:

 

data want;
  input name $char25.;
  do i=1 to 2;
    input ..... specify input variables here ...;  
    output;
  end;
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Manish_Dobriyal
Calcite | Level 5
Thank you for the response, but the challenge is the rows after name are not fixed to 2,they can be n number of rows for every name.
BrunoMueller
SAS Super FREQ

Have a look at the code below. It shows how to determine the "type" of record and process it accordingly.

See also Testing a Condition Before Creating an Observation

 

data want;
  infile cards truncover;
  /*
   * define variables
   */
  length
    type $ 2
    name $ 25
    line $ 80
  ;
  /*
   * read a record 
   * and keep it in the buffer, the @ at the end
   */
  input 
    @1 type $2.
    @
  ;
  /*
   * check for type of record, the logic may vary depending
   * on the data
   *
   * if EW, then process detail
   * else process "header"   
   */
  if type = "EW" then do;
    input @1 line $80.;
    output;
  end;
  else do;
    retain name;
    input @1 name $25.;
  end;  
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
;
Ksharp
Super User
data want;
infile cards truncover;
input col $200.;
length name value $ 200;
retain name;
if prxmatch('/[a-z]+,/i',col) then do;name=col;delete;end;
 else value=col;
drop col;
datalines;
John,Doe
EW202207070053899 FOR RESOLUTION W260560364 JUNGE NAT POB 098 169644
EW202207070055230 FOR RESOLUTION W263027005 RACE NAT PPS 114 161087
Mitchelle, Stark
EW202205020079881 SEE NOTES W268135093 LEE SGB FAX 738 177390
EW202206270042457 FOR RESOLUTION W255586930 RIGGS NAT FAX 754 737493
;
run;

SAS INNOVATE 2024

innovate-wordmarks-white-horiz.png

SAS is headed back to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team.

Interested in speaking? Content from our attendees is one of the reasons that makes SAS Innovate such a special event!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 519 views
  • 1 like
  • 4 in conversation